TimeoutException when trying to connect to a remote cluster

I set up a Couchbase + Sync Gateway cluster as described in Traun Leyden’s post here: http://tleyden.github.io/blog/2014/12/15/running-a-sync-gateway-cluster-under-coreos-on-aws/

The cluster seems to be running fine: I can access the web console on all my machines.
I also opened ports 8092 and 11209-11211 to all IP addresses. Now I’m trying to connect to this cluster using the Java SDK (2.1).

cluster = CouchbaseCluster.create(config.nodes)
bucket = cluster.openBucket(config.bucket, config.password)

But the second call gives me a TimeoutException:

[main] INFO com.couchbase.client.core.CouchbaseCore - CoreEnvironment: {sslEnabled=false, sslKeystoreFile='null', sslKeystorePassword='null', queryEnabled=false, queryPort=8093, bootstrapHttpEnabled=true, bootstrapCarrierEnabled=true, bootstrapHttpDirectPort=8091, bootstrapHttpSslPort=18091, bootstrapCarrierDirectPort=11210, bootstrapCarrierSslPort=11207, ioPoolSize=8, computationPoolSize=8, responseBufferSize=16384, requestBufferSize=16384, kvServiceEndpoints=1, viewServiceEndpoints=1, queryServiceEndpoints=1, ioPool=NioEventLoopGroup, coreScheduler=CoreScheduler, eventBus=DefaultEventBus, packageNameAndVersion=couchbase-java-client/2.1.0 (git: 2.1.0), dcpEnabled=false, retryStrategy=BestEffort, maxRequestLifetime=75000, retryDelay=com.couchbase.client.core.time.ExponentialDelay@574b560f, reconnectDelay=com.couchbase.client.core.time.ExponentialDelay@ba54932, observeIntervalDelay=com.couchbase.client.core.time.ExponentialDelay@28975c28, keepAliveInterval=30000}
[cb-io-1-3] INFO com.couchbase.client.core.node.Node - Connected to Node ec2-node1.compute-1.amazonaws.com
[cb-io-1-1] INFO com.couchbase.client.core.node.Node - Connected to Node ec2-node2.compute-1.amazonaws.com
[cb-io-1-2] INFO com.couchbase.client.core.node.Node - Connected to Node ec2-node3.compute-1.amazonaws.com
Exception in thread "main" java.lang.RuntimeException: java.util.concurrent.TimeoutException
	at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:93)
	at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:108)
	at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:99)
	at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:89)
	at com.couchbase.client.java.Cluster$openBucket.call(Unknown Source)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:128)
	at CouchbaseService.<init>(CouchbaseService.groovy:14)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
	at org.codehaus.groovy.reflection.CachedConstructor.invoke(CachedConstructor.java:77)
	at org.codehaus.groovy.runtime.callsite.ConstructorSite$ConstructorSiteNoUnwrapNoCoerce.callConstructor(ConstructorSite.java:102)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallConstructor(CallSiteArray.java:57)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:230)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:242)
	at WebServer.main(WebServer.groovy:12)
Caused by: java.util.concurrent.TimeoutException
	... 19 more

How can I check whether my Couchbase server cluster is correctly configured to allow remote access? Any idea what I forgot?

It would help to add timestamps to your logs, raise the log level to TRACE, and post the logs again (for example as a gist). Then we can see how far the bootstrap process gets and where it starts hanging.
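In case it is useful, here is a minimal sketch of one way to get timestamps and TRACE output. It assumes the log lines above come from slf4j-simple (the "[thread] LEVEL logger - message" format suggests that, but it is a guess); with a different logging backend the settings will differ. The class and method names are made up for illustration, and the properties must be set before the first logger is created, so before CouchbaseCluster.create is called.

```java
public class TraceLogging {

    // Assumption: slf4j-simple is the logging backend. These system
    // properties are read once, when the first logger is initialized.
    static void enableTraceWithTimestamps() {
        // TRACE level only for the Couchbase client packages
        System.setProperty("org.slf4j.simpleLogger.log.com.couchbase.client", "trace");
        // Prefix every log line with a timestamp
        System.setProperty("org.slf4j.simpleLogger.showDateTime", "true");
        System.setProperty("org.slf4j.simpleLogger.dateTimeFormat", "yyyy-MM-dd HH:mm:ss.SSS");
    }

    public static void main(String[] args) {
        enableTraceWithTimestamps();
        // Create the cluster and open the bucket after this point.
    }
}
```

The same three settings can also be passed as -D options on the JVM command line instead of being set in code.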

Here is the gist with the entire log: https://gist.github.com/sarbogast/ab794c0ee82137230420

Here is something I found.

The server config seems to use the private IP addresses for the nodes in the cluster (/10.13.169.218), but you are bootstrapping against node1.compute-1.amazonaws.com/ip1 and the others.
Could it be that the private addresses are not reachable from your clients? If I’m not mistaken, when running on EC2 you want your nodes to listen on the external IP addresses.

Can you verify that?
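One way to check this from the client machine is a plain TCP connect test against the address the server config advertises. The sketch below is only an illustration: the class and method names are hypothetical, the default host is the private address from the log, and the ports are the ones mentioned in this thread (8091 and 11210 from the CoreEnvironment line, 8092 from the question). A successful connect only shows that the port is reachable, not that the bucket can be opened.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Refused, timed out, unresolvable host, or invalid port
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the address to test as the first argument; the default is the
        // private address that shows up in the server config in the log.
        String host = args.length > 0 ? args[0] : "10.13.169.218";
        int[] ports = {8091, 8092, 11210};
        for (int port : ports) {
            String state = canConnect(host, port, 3000) ? "reachable" : "NOT reachable";
            System.out.println(host + ":" + port + " " + state);
        }
    }
}
```

If the public hostnames are reachable on these ports but the private address is not, that would fit the picture: the client bootstraps fine and then times out trying to use the addresses from the config.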

Indeed, that seems to be the issue. Traun is testing a way to make his setup listen to public IPs instead of private ones.

Was there a solution for this? I’m running into the same issue.