We have a 5-node cluster, three of which are data nodes. When we connect to the cluster we list all three data nodes. My understanding is that after connecting, the SDK builds the cluster map, so even if one node goes down later my query should still return data. But that is not what happened. Here is what we observed:
- We brought down one data node. Auto-failover kicked in. We also saw a rebalance kick in, which I was not expecting since auto-rebalance was turned off.
- We fired a query against the cluster with the three data nodes (one of which we brought down in step 1).
- We ran the test twice
a. In one run, the query returned the data, but it waited until the rebalance occurred (our understanding was that a rebalance would not start, since we had turned off auto-rebalancing)
b. In the 2nd run, the query failed, saying the cluster refused the connection
This is contrary to my understanding of how Couchbase works. My understanding was that
- Firstly, a rebalance should not happen automatically
- Secondly, in both cases I should have got the query results back, since the cluster map knows which nodes are healthy.
Did something change in Couchbase?
I found the reason behind this. The cluster map is refreshed every 2.5 seconds, so when the node went down it was probably a timing issue: the queries failed when the cluster map had not yet been updated. As for the rebalance, a rebalance did not actually happen. It was AUTO FAILOVER, but since the popup header said "Rebalance", we mistook it for a rebalance. The solution is to use getFromReplica for KV operations; for N1QL, we need a retry framework.
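A minimal sketch of what such a retry framework could look like, independent of any particular SDK. The `TransientError` type and `flaky_query` demo are hypothetical stand-ins for an SDK error (such as a refused connection while the cluster map is stale) and a N1QL call; they are not Couchbase APIs.

```python
import time

class TransientError(Exception):
    """Hypothetical stand-in for a transient failure, e.g. a refused connection."""

def with_retry(op, attempts=5, base_delay=0.05, retryable=(TransientError,)):
    """Run op(); on a retryable error, back off exponentially and try again.

    Retries up to `attempts` times in total, then re-raises the last error.
    """
    for attempt in range(attempts):
        try:
            return op()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(base_delay * (2 ** attempt))

# Demo: an operation that fails twice (stale cluster map), then succeeds.
calls = {"n": 0}
def flaky_query():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("connection refused")
    return ["row1"]
```

Here `with_retry(flaky_query)` succeeds on the third attempt; in a real application `op` would be a closure over the SDK's query call.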
Indeed, failover is different from rebalance. However, if you have a sufficient number of replicas, you should be back to full availability within seconds of failover. This is something we test all the time. You may have fewer replicas though, in which case a rebalance is needed to bring back some redundancy, or you can add a repaired/new node and then rebalance.
You can use getFromReplica if you are okay with the idea that in transient failure situations you may get an older copy of the data, yes. See the discussion of this in the docs. Also, I should note that with N1QL, SDKs from 3.x onward will automatically retry if it is safe to do so. You don't indicate which SDK you're using, but all of the modern SDKs retry by default up until the timeout, with the ability to change the behavior to best effort if you see fit.
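The replica-read fallback for KV can be sketched as below. This is not the SDK API itself: `get_active` and `get_replica` are injected placeholders for SDK calls (in the 3.x Python SDK these would be something like `collection.get(key)` and `collection.get_any_replica(key)`), and the in-memory stores exist only so the sketch runs standalone.

```python
class ActiveReadError(Exception):
    """Hypothetical stand-in for a KV failure against the active copy's node."""

def get_with_replica_fallback(get_active, get_replica, key):
    """Read the active copy; on failure, accept a possibly stale replica.

    Trades consistency for availability: in a transient-failure window the
    replica may lag behind the last write to the active copy.
    """
    try:
        return get_active(key)
    except ActiveReadError:
        return get_replica(key)  # may return an older revision of the document

# Demo with in-memory stand-ins: the active node is down, the replica answers.
replica_store = {"user::42": {"name": "Ada", "version": 1}}

def failing_active(key):
    raise ActiveReadError("node down")

def replica_read(key):
    return replica_store[key]
```

Calling `get_with_replica_fallback(failing_active, replica_read, "user::42")` returns the replica's copy even though the active read fails.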