One cluster with two nodes running a bucket named test. The bucket type is Couchbase. The bucket is set up to have one replica.
I upload 10K documents to the bucket. The cluster now divides the data across the nodes. Node 1: 5K active / 5K replica Node 2: 5K active / 5K replica.
If you remove one node and rebalance the replica of the still active node moves all documents from replica to active. Now the active node has 10K active and 0K replica. This is perfect behavior.
Now lets say I have both my nodes working with node 1 having 5K active / 5K replica and node 2 having 5K active / 5K replica).
Clients are now polling data from the cluster at lets say a rate of 2K gets per second. Then suddenly node 2 goes down. Now when the clients are polling data they will only get a result when the document is part of the 5K active on node 1. I would expect that the clients should get their data but with a little delay since it now needs to also get data from node one’s replica documents.
What happens now (at least when using the C# API) is that when one node goes down half of the data is unavailable to the clients until a rebalance is done. If you have lets say 1M documents you will have about 500K documents unavailable to the clients until rebalance is done.
Shouldn’t the cluster always return the same result as long as the data exist in active or replica?