Connect to a specific server node

#1

It seems like gocb connects to one of the server nodes randomly when there are multiple nodes in a cluster (see the get*Ep() functions in bucket.go).

Say for example I have 3 server nodes in a cluster -
10.20.30.40
10.20.30.41
10.20.30.42
and use .42 as “master” (the only node that accepts writes).

This is a very simple version of how I connect -
cluster, _ := gocb.Connect("couchbase://10.20.30.42")
bucket, _ := cluster.OpenBucket("default", "")

When I do an operation, it uses one of the 3 nodes (randomly chosen), but I want to use .42 only.
Is there a way to use a specific node instead of a randomly chosen one?

#2

Couchbase is a master/master system with copies of data on other nodes.

In Couchbase the SDK is cluster-topology aware.

It gets a streaming vBucket (shard) map of a bucket. A bucket is nothing more than a logical namespace that holds your JSON.

Ex.
Bucket name is “great_bucket”.

“great_bucket” creates 1024 vBuckets (shards), i.e. files, and spreads them evenly over the cluster.

When the Go SDK wants to find or insert a key, e.g. key = "bob_bike_12345":

i.e.
FIND
cb.get("bob_bike_12345") → CRC32("bob_bike_12345") % 1024 = some number, let's say 2.
INSERT
cb.set("bob_bike_12345", "JSON_HERE") → CRC32("bob_bike_12345") % 1024 = some number, let's say 2.
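The hashing step can be sketched in Go using the simplified CRC32 % 1024 formula above. (The real SDK uses CRC-32C with some extra bit manipulation, so treat this as illustrative, not the exact gocb internals.)

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// vBucketID maps a key to one of numVBuckets shards using the
// simplified CRC32(key) % N scheme described in this post.
func vBucketID(key string, numVBuckets uint32) uint32 {
	return crc32.ChecksumIEEE([]byte(key)) % numVBuckets
}

func main() {
	// Every client that hashes the same key lands on the same vBucket,
	// which is what makes the cluster-map lookup below deterministic.
	fmt.Println(vBucketID("bob_bike_12345", 1024))
}
```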

The SDK then finds out where vBucket (shard) 2 lives in the cluster, via a cluster map that looks like the one below.

{
  "nodes": [
    "10.20.30.40",
    "10.20.30.41",
    "10.20.30.42"
  ],
  "vBucket": [
    0: [1, 0],
    1: [0, 2],
    2: [1, 0],
    ...
    1023: [2, 1]
  ]
}

So vBucket 2, according to the map entry 2:[1,0], lives on “10.20.30.41” (the active/main copy), and the replica copy lives on “10.20.30.40”.

Note: Couchbase Server auto-replicates within the cluster, so you do not have to do anything to write to a replica.
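Putting the map lookup into code, here is a minimal sketch mirroring the illustrative cluster map above. The function and variable names are hypothetical, not the gocb API; each vBucket entry holds [active, replica] node indices.

```go
package main

import "fmt"

// lookupNodes returns the active and replica node addresses for a
// vBucket ID, given a cluster map like the illustrative one above.
func lookupNodes(nodes []string, vBucketMap [][2]int, vb int) (active, replica string) {
	entry := vBucketMap[vb]
	return nodes[entry[0]], nodes[entry[1]]
}

func main() {
	nodes := []string{"10.20.30.40", "10.20.30.41", "10.20.30.42"}
	// Entries for vBuckets 0, 1, 2 as in the map above: [active, replica].
	vBucketMap := [][2]int{{1, 0}, {0, 2}, {1, 0}}

	active, replica := lookupNodes(nodes, vBucketMap, 2)
	fmt.Println("active:", active, "replica:", replica)
	// → active: 10.20.30.41 replica: 10.20.30.40
}
```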

The benefits of this approach are:

  1. You go directly to the machine that has your data.
  2. Your reads/writes are always consistent and fast.
  3. You can just add more nodes for more read/write throughput and capacity, in minutes.
  4. When a node goes down, the replicas on other nodes become active, so your application can still get data with very low impact.

The other SDKs have the ability to pass in an array of nodes in the cluster. The Go SDK is still young, but I hear they will have that option soon. That way, if you launch a new instance of the app and the CB node you listed has been removed, you can try the other IPs or hostnames.
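For reference, a sketch of what listing several bootstrap nodes could look like. Newer gocb releases accept a comma-separated host list in the connection string passed to gocb.Connect, but verify against your SDK version; the helper name here is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// connString builds a couchbase:// connection string listing several
// bootstrap hosts, so the SDK can fall back to another host if the
// first one is gone.
func connString(hosts []string) string {
	return "couchbase://" + strings.Join(hosts, ",")
}

func main() {
	hosts := []string{"10.20.30.40", "10.20.30.41", "10.20.30.42"}
	fmt.Println(connString(hosts))
	// → couchbase://10.20.30.40,10.20.30.41,10.20.30.42
}
```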

#3

Thank you very much for the explanation! It’s making more sense to me.

The reason I wanted to specify a node is that the master (.42 in the example above) is the same server the Go application lives on, so connecting locally could remove network latency. But if data is spread all over the cluster, then this cannot be done…

Using your cluster map example, what happens if there is a network problem talking to the other nodes (.40 and .41) from .42 for the vBucket 2 data? The main copy lives on .41 and the replica on .40.
Does that mean nothing can be done with the data, since .42 doesn’t have a copy?

#4

I’m not too clear on your question; could you give me a more detailed example?

#5

Using your earlier example:

If there’s a network issue between where I run the Go SDK and 10.20.30.40 & 10.20.30.41, but I can connect to 10.20.30.42, what will happen? (The server nodes are up; only connecting to those nodes is a problem.)

The data exists in 10.20.30.41 (main copy) and in 10.20.30.40 (replica) but not in 10.20.30.42 according to the map.

So my question was: if such a thing happens, can nothing be done with the data?
Or will 10.20.30.42 somehow have a copy it can get/update? (Or do all nodes have a replica copy? Or only one replica?)
*I’m using couchbase 4.0 community edition

Does this make sense? Sorry if this is too basic a question! I’m just not good at finding it in the documentation :slight_smile:

#6

In my example there is only the original (active) copy and one replica.
You can configure more or fewer replicas in the bucket config.

Below is what happens when a node is not reachable and auto-failover kicks in.

HERE is where you set the auto failover time.

#7

Thank you so much for your help!!! It really helped :slight_smile: