Swap(?) rebalance in a single node failure scenario


#1

Hi All,
I know Swap Rebalance is a very useful feature when upgrading servers: you take one node down and replace it with another.
But let's look at the following scenario:
Given we have 3 nodes and a bucket with 2 replicas, each node holds a copy of the data.
When one node is down and we replace the failed node with a new one, does the cluster need a full rebalance? How does this affect the performance of the cluster?


#2

Hi Nekomatic,
Thanks for using our products.
A 3-node cluster can host a bucket with 2 replicas. This means the cluster can lose 2 nodes without losing data. Still, the failed node should be replaced as soon as possible to bring the cluster back to a stable state.
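To make the arithmetic explicit, here is a rough Python sketch. It is not a Couchbase API; the function names are made up for illustration, and it assumes that two copies of the same item are never placed on the same node.

```python
def copies_per_item(num_replicas: int, num_nodes: int) -> int:
    """Active copy plus replicas, capped by the node count.

    Assumption: two copies of the same item never share a node,
    so there cannot be more copies than nodes.
    """
    return min(num_replicas + 1, num_nodes)


def tolerated_failures(num_replicas: int, num_nodes: int) -> int:
    """Nodes that can be lost while one copy of every item survives."""
    return copies_per_item(num_replicas, num_nodes) - 1


def every_node_holds_all_data(num_replicas: int, num_nodes: int) -> bool:
    """True when the number of copies equals the number of nodes."""
    return copies_per_item(num_replicas, num_nodes) == num_nodes


# The scenario from this thread: 3 nodes, 2 replicas.
print(tolerated_failures(num_replicas=2, num_nodes=3))         # 2
print(every_node_holds_all_data(num_replicas=2, num_nodes=3))  # True
```

With a fourth node and the same 2 replicas, the sketch still reports 2 tolerated failures, but no single node holds the whole data set any more.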
In your case, if a node fails, there are 2 ways to proceed:

  1. Fail over the failed node and run a full rebalance. Then add a new node to the cluster and rebalance again to make the cluster stable.
  2. Fail over the failed node and add a new node to the cluster, then rebalance once. This rebalance automatically removes the failed node and adds the new node. At the end of the rebalance, you have a stable 3-node cluster again.
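
The difference between the two options is mainly how many rebalances are run. The toy Python model below only tracks cluster membership and counts rebalances; it does not talk to a real cluster, and `ToyCluster`, the node names, and both `option_*` functions are hypothetical names used for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Membership-only model; it moves no data (illustrative assumption)."""
    active: set
    failed_over: set = field(default_factory=set)
    pending_add: set = field(default_factory=set)
    rebalances: int = 0

    def failover(self, node: str) -> None:
        # Take the failed node out of service; it is ejected at the next rebalance.
        self.active.remove(node)
        self.failed_over.add(node)

    def add_node(self, node: str) -> None:
        # A new node only joins the active set after a rebalance.
        self.pending_add.add(node)

    def rebalance(self) -> None:
        # One rebalance ejects failed-over nodes and brings in pending ones.
        self.active |= self.pending_add
        self.pending_add.clear()
        self.failed_over.clear()
        self.rebalances += 1


def option_one() -> ToyCluster:
    """Fail over, rebalance, then add the new node and rebalance again."""
    cluster = ToyCluster(active={"node1", "node2", "node3"})
    cluster.failover("node3")
    cluster.rebalance()
    cluster.add_node("node4")
    cluster.rebalance()
    return cluster


def option_two() -> ToyCluster:
    """Fail over, add the new node, then rebalance once."""
    cluster = ToyCluster(active={"node1", "node2", "node3"})
    cluster.failover("node3")
    cluster.add_node("node4")
    cluster.rebalance()
    return cluster


print(option_one().rebalances)  # 2
print(option_two().rebalances)  # 1
```

Both paths end with the same three active nodes; the model says nothing about how much data each rebalance actually moves.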

If you have any questions, drop me a line at: thuan at couchbase dot com
Thanks.


#3

Hi @thuan. We do not actually use Couchbase yet :). I have been tasked with finding an optimal Couchbase cluster configuration that we could start with in production (a new project).
So, if I understand correctly, Swap Rebalance avoids expensive data reshuffling when replacing a healthy node. However, there is no way to avoid reshuffling when replacing a failed node, even if the number of nodes stays the same as before the failure; in other words, even if the node could be restored by simply copying data from the existing nodes, without a full rebalance. Is this correct?