Failing to read from replica


#1

The setup is a 2-node cluster with a bucket that has 1 replica.
Everything is working.

When I suddenly shut down one node, Bucket.Get<T>(sKey) fails.

I’m calling:

Bucket.GetFromReplica<T>(sKey);

This also fails with “Failed to acquire a connection after 5 tries.” :frowning:

  1. What should I do to read from the replica?

After I restore the node, the app cannot continue unless I restart it (which is really bad). The errors are: “The operation has timed out” and “Failed to acquire a connection after 5 tries.”

  2. What should I change so that the app restores functionality without a restart?

#2

@itay -

Assuming you have re-added the node, these errors should be resolved once the client syncs with the cluster. Are you doing a remove node/failover from the management console, or literally stopping the server hosting Couchbase?

A bug with the existing replica reads was found when this ticket was being implemented: [NCBC-840][1]. It is fixed now and the fix will be in 2.1.0.

-Jeff

[1]: https://issues.couchbase.com/browse/NCBC-840


#3

Hi @jmorris,

  1. Shut down means a real shutdown, simulating a lost node.
    I thought that having 1 replica should immediately mitigate the unfortunate loss of a server.

  2. Restoring means making it run again, simulating a recovered node.
    I thought that if the node returns, it should seamlessly rejoin the cluster, which it does. However, the app is not able to restore communication.

Am I right?
Will 1 & 2 be fixed in 2.1.0?

Itay


#4

I have been seeing similar behavior. I had to resize my data partitions, and when I fail over a node, it seems that the SDK is not always reading the replica data. I wish I knew how to test this. I am using views to get the keys for the documents I need, so maybe that is where the failures are.


#5

Doing a failover from the management console


#6

@jmorris

I understand that 2.1.0 is due this week?


#7

@Itay -

It’s scheduled for next Tuesday, but it depends on QE, so it could be later.

-Jeff


#9

Does 2.1.0 fix the replica issue, or am I misunderstanding replicas?
Is there a doc about replicas?


#10

@itay -

2.1.0 will fix the replica issue.

-Jeff


#11

Can’t wait :sunglasses: