Java client throws "com.couchbase.client.core.config.ConfigurationException: Could not open bucket." when there is no Couchbase running


#1

I’ve stumbled on this issue trying to setup my integration test environment. I’m writing Couchbase client in Scala using Java SDK v2.5.1

So when I don’t have Couchbase running at all Couchbase Java SDK client throws exception on openBucket call, while it should try to reconnect to the server first.

For example cluster.clusterManager().hasBucket("mybucket") behaves correctly and tries to reconnect until it reaches managementTimeout.

Reproducing client is extremely simple (I’m using Scala, but it’s pretty clear what it’s doing).

  private val env = DefaultCouchbaseEnvironment.builder()
    .connectTimeout(60000)
    .build()

  val cluster = CouchbaseCluster.create(env, "localhost")

  try {
    val bucket = cluster.openBucket("mybucket")
    readChar()
  }
  finally {
    cluster.disconnect()
  }

In logs you will see that there is attempt to reconnect, but it’s ruined by this exception. So i think you have a race condition in the code or something. Last change list for 2.5.1 contained similar description, but then it’s not fixed yet.

Client log
Oct 31, 2017 2:08:52 PM com.couchbase.client.core.CouchbaseCore <init>
INFO: CouchbaseEnvironment: {sslEnabled=false, sslKeystoreFile='null', sslKeystorePassword=false, sslKeystore=null, bootstrapHttpEnabled=true, bootstrapCarrierEnabled=true, bootstrapHttpDirectPort=8091, bootstrapHttpSslPort=18091, bootstrapCarrierDirectPort=11210, bootstrapCarrierSslPort=11207, ioPoolSize=8, computationPoolSize=8, responseBufferSize=16384, requestBufferSize=16384, kvServiceEndpoints=1, viewServiceEndpoints=12, queryServiceEndpoints=12, searchServiceEndpoints=12, configPollInterval=2500, ioPool=NioEventLoopGroup, kvIoPool=null, viewIoPool=null, searchIoPool=null, queryIoPool=null, coreScheduler=CoreScheduler, memcachedHashingStrategy=DefaultMemcachedHashingStrategy, eventBus=DefaultEventBus, packageNameAndVersion=couchbase-java-client/2.5.1 (git: 2.5.1, core: 1.5.1), retryStrategy=BestEffort, maxRequestLifetime=75000, retryDelay=ExponentialDelay{growBy 1.0 MICROSECONDS, powers of 2; lower=100, upper=100000}, reconnectDelay=ExponentialDelay{growBy 1.0 MILLISECONDS, powers of 2; lower=32, upper=4096}, observeIntervalDelay=ExponentialDelay{growBy 1.0 MICROSECONDS, powers of 2; lower=10, upper=100000}, keepAliveInterval=30000, continuousKeepAliveEnabled=true, keepAliveErrorThreshold=4, keepAliveTimeout=2500, autoreleaseAfter=2000, bufferPoolingEnabled=true, tcpNodelayEnabled=true, mutationTokensEnabled=false, socketConnectTimeout=1000, callbacksOnIoPool=false, disconnectTimeout=25000, requestBufferWaitStrategy=com.couchbase.client.core.env.DefaultCoreEnvironment$2@3712b94, certAuthEnabled=false, coreSendHook=null, queryTimeout=75000, viewTimeout=75000, searchTimeout=75000, analyticsTimeout=75000, kvTimeout=2500, connectTimeout=10000, dnsSrvEnabled=false}
Oct 31, 2017 2:08:53 PM com.couchbase.client.core.endpoint.AbstractEndpoint$2 onSuccess
WARNING: [null][KeyValueEndpoint]: Could not connect to remote socket.
Oct 31, 2017 2:08:53 PM com.couchbase.client.core.endpoint.AbstractEndpoint$2 onSuccess
WARNING: [null][KeyValueEndpoint]: Could not connect to remote socket.
Oct 31, 2017 2:08:53 PM com.couchbase.client.core.endpoint.AbstractEndpoint$2$1 onError
WARNING: Error during reconnect: 
com.couchbase.client.deps.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:11210
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at com.couchbase.client.deps.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:259)
	at com.couchbase.client.deps.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
	at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
	at com.couchbase.client.deps.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused
	... 11 more

Oct 31, 2017 2:08:53 PM com.couchbase.client.core.endpoint.AbstractEndpoint$2 onSuccess
WARNING: [null][KeyValueEndpoint]: Could not connect to endpoint, retrying with delay 32 MILLISECONDS: 
com.couchbase.client.deps.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:11210
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at com.couchbase.client.deps.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:259)
	at com.couchbase.client.deps.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
	at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
	at com.couchbase.client.deps.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused
	... 11 more

Oct 31, 2017 2:08:53 PM com.couchbase.client.core.endpoint.AbstractEndpoint$2 onSuccess
WARNING: [null][ConfigEndpoint]: Could not connect to remote socket.
Oct 31, 2017 2:08:53 PM com.couchbase.client.core.endpoint.AbstractEndpoint$2 onSuccess
WARNING: [null][ConfigEndpoint]: Could not connect to remote socket.
Oct 31, 2017 2:08:53 PM com.couchbase.client.core.endpoint.AbstractEndpoint$2$1 onError
WARNING: Error during reconnect: 
com.couchbase.client.deps.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8091
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at com.couchbase.client.deps.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:259)
	at com.couchbase.client.deps.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
	at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
	at com.couchbase.client.deps.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused
	... 11 more

Exception in thread "main" com.couchbase.client.core.config.ConfigurationException: Could not open bucket.
	at com.couchbase.client.core.config.DefaultConfigurationProvider$6.call(DefaultConfigurationProvider.java:293)
	at com.couchbase.client.core.config.DefaultConfigurationProvider$6.call(DefaultConfigurationProvider.java:290)
	at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4.onError(OperatorOnErrorResumeNextViaFunction.java:140)
	at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onError(OnSubscribeDoOnEach.java:87)
	at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onError(OnSubscribeDoOnEach.java:87)
	at rx.internal.operators.OnSubscribeMap$MapSubscriber.onError(OnSubscribeMap.java:88)
	at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onError(OnSubscribeDoOnEach.java:87)
	at rx.internal.operators.OperatorTake$1.onError(OperatorTake.java:65)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.reportError(OperatorMerge.java:266)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.emitLoop(OperatorMerge.java:658)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.emit(OperatorMerge.java:568)
	at rx.internal.operators.OperatorMerge$InnerSubscriber.onError(OperatorMerge.java:852)
	at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$4$1.onError(OperatorOnErrorResumeNextViaFunction.java:122)
	at rx.internal.operators.OnSubscribeMap$MapSubscriber.onError(OnSubscribeMap.java:88)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.reportError(OperatorMerge.java:266)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.checkTerminate(OperatorMerge.java:818)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.emitLoop(OperatorMerge.java:579)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.emit(OperatorMerge.java:568)
	at rx.internal.operators.OperatorMerge$InnerSubscriber.onError(OperatorMerge.java:852)
	at rx.internal.operators.OnSubscribeMap$MapSubscriber.onError(OnSubscribeMap.java:88)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.reportError(OperatorMerge.java:266)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.checkTerminate(OperatorMerge.java:818)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.emitLoop(OperatorMerge.java:579)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.emit(OperatorMerge.java:568)
	at rx.internal.operators.OperatorMerge$MergeSubscriber.onError(OperatorMerge.java:276)
	at rx.internal.operators.OnSubscribeMap$MapSubscriber.onError(OnSubscribeMap.java:88)
	at rx.subjects.SubjectSubscriptionManager$SubjectObserver.onError(SubjectSubscriptionManager.java:227)
	at rx.subjects.AsyncSubject.onError(AsyncSubject.java:116)
	at com.couchbase.client.core.service.AbstractOnDemandService$1.onError(AbstractOnDemandService.java:56)
	at rx.observers.SafeSubscriber._onError(SafeSubscriber.java:153)
	at rx.observers.SafeSubscriber.onError(SafeSubscriber.java:115)
	at rx.subjects.SubjectSubscriptionManager$SubjectObserver.onError(SubjectSubscriptionManager.java:227)
	at rx.subjects.AsyncSubject.onError(AsyncSubject.java:116)
	at com.couchbase.client.core.endpoint.AbstractEndpoint$2.onSuccess(AbstractEndpoint.java:416)
	at com.couchbase.client.core.endpoint.AbstractEndpoint$2.onSuccess(AbstractEndpoint.java:377)
	at rx.internal.operators.SingleOperatorOnErrorResumeNext$2.onSuccess(SingleOperatorOnErrorResumeNext.java:63)
	at rx.internal.operators.SingleTimeout$TimeoutSingleSubscriber.onSuccess(SingleTimeout.java:79)
	at com.couchbase.client.core.endpoint.AbstractEndpoint$4$1.operationComplete(AbstractEndpoint.java:354)
	at com.couchbase.client.core.endpoint.AbstractEndpoint$4$1.operationComplete(AbstractEndpoint.java:338)
	at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
	at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
	at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
	at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
	at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
	at com.couchbase.client.deps.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:278)
	at com.couchbase.client.deps.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:294)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:634)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)
	at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
	at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
	at com.couchbase.client.deps.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.couchbase.client.deps.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8091
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at com.couchbase.client.deps.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:259)
	at com.couchbase.client.deps.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:291)
	... 7 more
Caused by: java.net.ConnectException: Connection refused
	... 11 more

Let me know if you need debug logs, they are very big and I haven’t found how to attach file here.


#2

Hm I’m not sure if I fully understand the scenario. The way the code should work is on openBucket, if it can at least find one node to bootstrap off keep going then on the subset of downed nodes. but if it can’t find any bucket, bail out with an exception and stop retrying. Since there is a semantical difference between being able to bootstrap but only operate partially while on the other hand not being able to service the user at all.

Or am I missing something?


#3

I expect that at least blocking version should retry to reconnect to nodes until it reaches ConnectionTimeout time. As client is not connected it’s strange to throw configuration Exception about bucket. Bucket is configured right, you just cannot connect.

As I mentioned cluster.clusterManager().hasBucket("mybucket") behaves exactly the way I described (tries to reconnect until reaching timeout and then gives TimeoutException) and it looks right for me. So behavior at least inconsistent between different calls.

You can also see in logs that Netty do reconnect attempts and if you retry openBucket you will end up with many attempts of connection