What I am saying is that this should be a temporary state. Internally, the client will retry the operations on the other nodes.
You may see the message in the logs, but the operation itself should usually succeed. In other words, an individual attempt may fail internally, and that failure will be logged, but a subsequent retry should succeed. In the end, IOperationResult.Success should be true.
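If you want a safety net in your own code on top of the client's internal retries, a small wrapper along these lines is one option. This is only a sketch: `RetryHelper`, `ExecuteWithRetry`, the attempt count, and the delay are names and values I made up for illustration, not part of the client. The commented usage assumes the 1.3.X client's `ExecuteStore` method, which returns a result exposing `Success` and `Message`.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of the Couchbase client): re-runs an operation
// until the caller-supplied check reports success or the attempts run out.
public static class RetryHelper
{
    // Returns the last result so the caller can inspect it either way.
    public static T ExecuteWithRetry<T>(
        Func<T> operation,
        Func<T, bool> succeeded,
        int maxAttempts = 3,   // assumed default, tune for your workload
        int delayMs = 100)     // assumed default, tune for your workload
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        T result = operation();
        for (int attempt = 1; attempt < maxAttempts && !succeeded(result); attempt++)
        {
            // Simple linear backoff: wait a little longer before each new attempt.
            Thread.Sleep(delayMs * attempt);
            result = operation();
        }
        return result;
    }
}

// Usage sketch, assuming an already configured 1.3.X CouchbaseClient named `client`:
//
//   var result = RetryHelper.ExecuteWithRetry(
//       () => client.ExecuteStore(StoreMode.Set, "some-key", "some-value"),
//       r => r.Success);
//
//   if (!result.Success)
//   {
//       // Log result.Message here before giving up.
//   }
```

During a rebalance I would expect the first or second attempt to succeed in most cases, so keep the attempt count small to avoid piling extra load onto a cluster that is already busy.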
“in our test scenarios here with a 4 node cluster it is taking up to 10 minutes to rebalance when we drop a node, I understand that this is dependant on the volume of data. but for a client not to get to the data for this length of time seem a little strange to me?”
No, the client should still be performing operations successfully during this time, albeit at lower throughput, and you’ll definitely see some warning or even error messages in your logs as the client adjusts to the new topology.
What version of the client and server are you using?
Note that server versions 2.5 and later have a feature called CCCP, which mitigates many of the problems associated with swap/rebalance on the clients. Unfortunately, the 1.3.X client will not support it, but the forthcoming 2.X client will.