I have 6 Couchbase nodes with 3 replicas and auto-failover enabled. If I stop one node with “service couchbase-server stop”, the failover runs fine, and the Java client gets notified and removes the node from its internal map.
2015-04-10 15:38:50 c.c.client.core.node.Node [INFO] Disconnected from Node xxx
If, instead of stopping the couchbase-server service, I shut down the machine, the Java client never gets notified and keeps trying to hit the dead node. The auto-failover happens, but the client does not update its internal node map.
Any ideas on this?
I am using Couchbase Server 3.0.1 on 64-bit Ubuntu and 3.0.3 Enterprise.
Couchbase Java Client 2.1.2
What’s the workload? If you’re shutting down the server with “shutdown -h” or the like, the behavior should be the same, but simulating a failure is rather different. You would not get a TCP RST message, so it may take a certain amount of ‘failed’ workload for the client to update its internal map. There’s a backstop as well that should eventually get the client to update.
The scenario you describe is like a whole section of our testing, so I’m pretty confident it’s correct. Our test has a basic workload. More info on the scenario would be appreciated.
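To make the idea of "a certain amount of failed workload" concrete, here is a minimal sketch of a per-node failure counter that signals when a configuration refresh would be worth requesting. This is not how the Couchbase SDK implements it; the class name `NodeFailureTracker`, its methods, and the threshold value are all hypothetical and only illustrate the general technique.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the Couchbase SDK.
class NodeFailureTracker {
    private final int threshold;
    private final Map<String, AtomicInteger> consecutiveFailures = new ConcurrentHashMap<>();

    NodeFailureTracker(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
    }

    /** Records a timed-out request; returns true once the node has reached the threshold. */
    boolean recordTimeout(String node) {
        int count = consecutiveFailures
                .computeIfAbsent(node, n -> new AtomicInteger())
                .incrementAndGet();
        return count >= threshold;
    }

    /** A successful response resets the counter for that node. */
    void recordSuccess(String node) {
        consecutiveFailures.remove(node);
    }

    public static void main(String[] args) {
        NodeFailureTracker tracker = new NodeFailureTracker(3);
        for (int i = 1; i <= 3; i++) {
            boolean refresh = tracker.recordTimeout("10.10.9.135");
            System.out.println("timeout " + i + " -> refresh needed: " + refresh);
        }
    }
}
```

With a very light workload, a counter like this is slow to reach its threshold, which is why a time-based backstop matters as well.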
I am shutting it down by stopping the EC2 instance through the AWS management console.
The workload I am using in this particular case is low, just a few requests, because I was trying to test that particular scenario.
I tried with a heavier workload hitting couchbase with 800 get ops per second for 60 seconds without any success. The client does not update its configuration unless I restart it.
I think that it has something to do with the Carrier Publication. I disabled it and it worked as expected.
CouchbaseEnvironment environment = DefaultCouchbaseEnvironment
    .builder()
    .bootstrapCarrierEnabled(false)
    .bootstrapHttpDirectPort(8080)
    .build();
But this is a workaround.
When using the carrier publication the client never gets notified when a node shuts down in a “hard” fashion or if it has network issues.
Now I am testing it with a very simple scenario. One client and 2 couchbase nodes doing the failover manually.
Could you test the same scenario? An easy way to test it is by disconnecting one couchbase node from the network.
Here is some more information on the issue.
In the logs I see a keepalive request every 20-30 seconds, with neither errors nor responses.
2015-04-13 11:28:48 c.c.c.c.e.AbstractGenericHandler [DEBUG] [node-a/10.10.9.135:11210][KeyValueEndpoint]: KeepAlive fired.
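A keepalive that fires but never gets an answer is only useful if something notices the silence. As a rough sketch of that idea, the watchdog below marks an endpoint as suspect once no response has arrived within a deadline. It is not SDK code: the class `KeepAliveWatchdog`, its methods, and the 60-second limit are assumptions for illustration, and timestamps are passed in explicitly so the logic can be checked without waiting.

```java
// Hypothetical watchdog; names and thresholds are illustrative, not SDK code.
class KeepAliveWatchdog {
    private final long maxSilenceMillis;
    private long lastResponseMillis;

    KeepAliveWatchdog(long maxSilenceMillis, long nowMillis) {
        this.maxSilenceMillis = maxSilenceMillis;
        this.lastResponseMillis = nowMillis;
    }

    /** Call whenever any response (including a keepalive reply) arrives. */
    void onResponse(long nowMillis) {
        lastResponseMillis = nowMillis;
    }

    /** True when the endpoint has been silent for longer than allowed. */
    boolean isSuspect(long nowMillis) {
        return nowMillis - lastResponseMillis > maxSilenceMillis;
    }

    public static void main(String[] args) {
        // Allow two missed 30-second keepalive intervals before suspecting the node.
        KeepAliveWatchdog watchdog = new KeepAliveWatchdog(60_000, 0);
        watchdog.onResponse(30_000); // last reply seen at t = 30s
        System.out.println("suspect at 60s: " + watchdog.isSuspect(60_000));
        System.out.println("suspect at 120s: " + watchdog.isSuspect(120_000));
    }
}
```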
@yorugua is it possible for you to share the code that you are using and the steps to reproduce? That would greatly help. Also, if you can share TRACE level logging that would be great.
If you don’t want to share it publicly you can also drop me an email.
Document with id A in Node 1
Document with id B in Node 2
Api client with JAVA sdk 2.1.2
Get key A (OK)
Get key B (OK)
Unplug Node 2 network cable
Get key A (OK)
Get key B (FAIL expected until failover)
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:93) ~[java-client-2.1.2.jar:2.1.2]
Failover Node 2 (without doing the rebalance)
Get key A (OK)
Get Key B (FAIL not expected behavior)
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:93) ~[java-client-2.1.2.jar:2.1.2]
The same happens in the cloud when stopping the EC2 instance of Node 2.
However if I stop the couchbase-service in Node 2 doing a “service couchbase-server stop” it works as expected.
If I disable the bootstrap carrier, all scenarios work as expected. But that is not the goal.
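The steps above can be scripted so that each get reports OK or TIMEOUT instead of a stack trace. The sketch below is one possible shape for such a probe. It relies only on what the stack traces show, namely a RuntimeException whose cause is a java.util.concurrent.TimeoutException. The class `FailoverProbe` and the stand-in getter are hypothetical; against a real cluster you would pass something like `key -> bucket.get(key)` instead.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical test helper for the reproduction steps; not part of the SDK.
class FailoverProbe {

    /** Runs one get and classifies the result; unrelated failures are rethrown. */
    static String probe(Function<String, ?> getter, String key) {
        try {
            getter.apply(key);
            return "OK";
        } catch (RuntimeException e) {
            if (e.getCause() instanceof TimeoutException) {
                return "TIMEOUT";
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real bucket: key "B" lives on the unplugged node.
        Function<String, Object> getter = key -> {
            if ("B".equals(key)) {
                throw new RuntimeException(new TimeoutException());
            }
            return "doc-" + key;
        };

        for (String key : new String[] {"A", "B"}) {
            System.out.println("Get key " + key + ": " + probe(getter, key));
        }
    }
}
```

Running the probe in a loop before and after the manual failover would show how long key B keeps timing out, if it recovers at all.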
Code:
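import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.env.CouchbaseEnvironment;
import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;

public class CouchbaseRepository {
    private Cluster cluster;
    private Bucket bucket;

    public CouchbaseRepository() {
        // Initialization of cluster and bucket
        CouchbaseEnvironment environment = DefaultCouchbaseEnvironment
            .builder().requestBufferSize(16384)
            .build();
        cluster = CouchbaseCluster.create(environment, "10.10.8.189,10.10.9.135");
        bucket = cluster.openBucket("default");
    }

    // Blocking get; throws a RuntimeException wrapping a TimeoutException on timeout
    public JsonDocument getByKey(String key) {
        JsonDocument doc = bucket.get(key);
        return doc;
    }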