ConfigurationProvider - could not read proposed config when node is lost


#1

We have a Spring Boot app that runs a scheduled query (once per minute) against a bucket on a 3-node Couchbase cluster (v5.0.1) using the 2.6.2 SDK. The bucket has 2 replicas, and neither the number of docs nor the size of the bucket seems to matter (my last test was with 20 docs, nearly all single-line JSON of no more than 50 chars).

During normal operation everything works well and the query always succeeds. However, if we suffer a server outage, we have a long period (14 minutes) of the query failing with a RuntimeException, followed by intermittent RuntimeExceptions (246 out of 1076 total) until the server is restored/rebalanced. During the server failure, I see the Couchbase cluster auto-failover correctly, and the same query always works via the UI and cbq; it’s a simple select which returns a single result.

Shortly after the server failure, I see the following warning:

{"@timestamp":"2018-10-18T17:07:00.466+00:00","message":"Could not read proposed configuration, ignoring. Message: Could not parse configuration","logger_name":"com.couchbase.client.core.config.ConfigurationProvider","thread_name":"cb-computations-5","level":"WARN"}

My guess is that the proposed configuration contains cluster information which states the lost server is no longer part of the cluster?

Firstly, is there a way to fix/correct this parsing? Secondly, if not, is there something we can ‘re-bootstrap’ to achieve the same effect?
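To make the ‘re-bootstrap’ idea concrete, here is a minimal sketch of what I have in mind, written without any SDK types so it stays generic. `ReconnectingHolder`, its `factory`/`closer` arguments and the failure threshold are all hypothetical names of mine, not part of the Couchbase SDK. In our app the factory would presumably create the cluster and open the bucket, and the closer would disconnect it, but that wiring is an assumption and is not shown here.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of any SDK): holds one connection and
 * rebuilds it after a number of consecutive failed operations.
 */
final class ReconnectingHolder<C> {
    private final Supplier<C> factory;   // creates a fresh connection (assumed to bootstrap from scratch)
    private final Consumer<C> closer;    // releases an old connection
    private final int failureThreshold;  // consecutive failures before rebuilding
    private C connection;
    private int consecutiveFailures;

    ReconnectingHolder(Supplier<C> factory, Consumer<C> closer, int failureThreshold) {
        if (failureThreshold < 1) {
            throw new IllegalArgumentException("failureThreshold must be >= 1");
        }
        this.factory = factory;
        this.closer = closer;
        this.failureThreshold = failureThreshold;
        this.connection = factory.get();
    }

    /** Runs the operation; calls are serialized so a rebuild never races with a query. */
    synchronized <R> R call(Function<C, R> operation) {
        try {
            R result = operation.apply(connection);
            consecutiveFailures = 0; // any success resets the counter
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                try {
                    reconnect();
                } catch (RuntimeException reconnectFailure) {
                    // Keep the old connection and report both problems.
                    e.addSuppressed(reconnectFailure);
                }
            }
            throw e; // the caller still sees the original failure
        }
    }

    private void reconnect() {
        C fresh = factory.get(); // may throw; the old connection stays in place if it does
        C old = connection;
        connection = fresh;
        consecutiveFailures = 0;
        try {
            closer.accept(old);
        } catch (RuntimeException ignored) {
            // Best effort: the old connection is being discarded anyway.
        }
    }
}
```

The scheduled job would then go through `holder.call(...)` instead of touching the bucket directly. Whether rebuilding the connection actually makes the client pick up the post-failover configuration sooner is exactly the part I am unsure about.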

Many thanks


#2

I rebuilt the Couchbase cluster using 5.1.1 and updated the SDK to version 2.7.0, and this appeared to improve things. Queries worked reliably from ~3 minutes after the server loss. I still get the warning about not being able to parse the config. Ideally I’d like to be in a position where the queries never fail.
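As a stopgap for the remaining failures, I am considering wrapping the scheduled query in a small retry with backoff. This is only a rough sketch: `RetryingQuery` and `withRetry` are hypothetical names of mine, not SDK API, and the `Supplier` stands in for whatever code actually runs the query. It would hide short intermittent failures but obviously not a multi-minute outage.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of any SDK): retries a task on RuntimeException. */
final class RetryingQuery {

    private RetryingQuery() {
    }

    /**
     * Runs the task up to maxAttempts times, sleeping between attempts and
     * doubling the delay each time. Rethrows the last failure if all attempts fail.
     */
    static <T> T withRetry(Supplier<T> task, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        long delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no point sleeping after the final attempt
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    break;
                }
                delay *= 2; // simple exponential backoff
            }
        }
        throw last;
    }
}
```

For example, `RetryingQuery.withRetry(() -> runQuery(), 3, 500L)` would try up to three times with 500 ms and then 1 s pauses, where `runQuery()` is a placeholder for the existing query method.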