"The bucket requested does not exist" from a particular server

Hi,

We currently have 6 servers in an auto-scaling group in AWS.

We’ve noticed that we get a “The bucket requested does not exist” error from time to time from just one server.

I’ve created a very simple test script and ran it on two servers. Server A works just fine, but Server B does not. We think that both servers have the exact same configuration.

This is my test script (test.php)

<?php

$cluster = new CouchbaseCluster('mycouchbase');
$bucket = $cluster->openBucket('mybucket');

Server A: strace -e trace=connect php test.php

connect(3, {sa_family=AF_LOCAL, sun_path="/tmp/.newrelic.sock"}, 22) = 0
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.xxx.8.2")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".103")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".126")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".185")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".212")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".222")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".246")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".30")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".35")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".35")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr(".35")}, 16) = 0
+++ exited with 0 +++

Server B: strace -e trace=connect php test.php

connect(3, {sa_family=AF_LOCAL, sun_path="/tmp/.newrelic.sock"}, 22) = 0
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.xxx.8.2")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.222")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.246")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.30")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.35")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.103")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.126")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.185")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.212")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.246")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("xx.xxx.xx.246")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.201.8.2")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.212")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.222")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.246")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.30")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.35")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.103")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.126")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.185")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.246")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("xx.xxx.xx.246")}, 16) = 0

Fatal error: Uncaught exception 'CouchbaseException' with message 'The bucket requested does not exist' in [CouchbaseNative]/CouchbaseBucket.class.php:74
Stack trace:
#0 [CouchbaseNative]/CouchbaseBucket.class.php(74): _CouchbaseBucket->__construct('.hue.r...', 'catalog', '')
#1 [CouchbaseNative]/CouchbaseCluster.class.php(61): CouchbaseBucket->__construct('co.r...', 'catalog', '')
#2 /home/a/test.php(4): CouchbaseCluster->openBucket('catalog')
#3 {main}
  thrown in [CouchbaseNative]/CouchbaseBucket.class.php on line 74
+++ exited with 255 +++

It looks like Server B also tries port 8091 and then fails.

Could someone please shed some light on this?
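
One way to compare the two traces mechanically is to pull out only the connects that return EINPROGRESS. This is a minimal Node.js sketch, not part of the original test. `tcpTargets` is a hypothetical helper name, and the sketch assumes that the EINPROGRESS lines are the client's real non-blocking TCP connects, while the AF_UNSPEC pairs are most likely the resolver probing each address.

```javascript
// Matches strace connect() lines for non-blocking TCP connects (EINPROGRESS).
const CONNECT_RE =
  /sin_port=htons\((\d+)\), sin_addr=inet_addr\("([^"]*)"\)\}, \d+\) = -1 EINPROGRESS/;

// Return [{ port, address }] for every EINPROGRESS connect in the trace text.
function tcpTargets(straceText) {
  const targets = [];
  for (const line of straceText.split('\n')) {
    const match = CONNECT_RE.exec(line);
    if (match) {
      targets.push({ port: Number(match[1]), address: match[2] });
    }
  }
  return targets;
}
```

Run against the two traces above, this would report Server A going to the address ending in .35 on 11210, and Server B going to the address ending in .246 on both 11210 and 8091, so the two clients appear to have bootstrapped from different nodes.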

@moon0326 can you tell us which version the two servers are on, and make sure that all the ports are open and reachable from the clients? That is 11210, 8091, 8092 and 8093.

Hi @daschl,

They are all 4.1. I need to check when I get into the office tomorrow, but as you can see from the trace I pasted, the ports are open.

Thanks,
Moon

It seems weird. Also double-check the bucket names and passwords as well as the IP addresses. From an SDK point of view there is no difference, and the SDKs fall back to loading the config over 8091 when it is not accessible over 11210. So something on one side is preventing the client from fetching a proper configuration the way it does on the other.

Yeah, I found it very weird as well. Both servers are configured exactly the same as far as I’m aware (we use a script to do so).

From both servers, telnet output is the same.

telnet xxxxx.xxx.xxxx 11210 - connected
telnet xxxxx.xxx.xxxx 8091 - connected
telnet xxxxx.xxx.xxxx 8092 - connected
telnet xxxxx.xxx.xxxx 8093 - connection refused

As far as I know, 8093 does not have to be open unless I want to connect over http, which I don’t want to. Am I correct on that?

I don’t think the log (?) I pasted shows any connection issues. That is why I thought it was weird.

@daschl is there anything more I can provide for you? This is outside my area, and I don’t see any reason why it’s behaving as I described above. I would really appreciate your help.

@moon0326 one other thing: could you try a simple hello world with a Java application (or alternatively Go, .NET or Node.js)? Just to rule out anything client-specific. Note also that a new PHP SDK has been released, which would be good to test too.

If you want to make N1QL queries you need to have port 8093 open. 11210 -> KV & configs, 8091 -> fallback configs & management, 8092 -> views, 8093 -> N1QL.
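
The port roles listed above can be turned into a small reachability check, similar to the telnet test but scripted. This is only a sketch using Node's built-in `net` module. `PORT_ROLES`, `checkPort` and `checkAll` are hypothetical names, and the 2000 ms timeout is an arbitrary choice.

```javascript
const net = require('net');

// Port roles as described in the post above.
const PORT_ROLES = {
  11210: 'key-value operations and configs',
  8091: 'fallback configs and management',
  8092: 'views',
  8093: 'N1QL queries',
};

// Try a TCP connect and resolve to { port, role, open }; never rejects.
function checkPort(host, port, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (open) => {
      socket.destroy();
      resolve({ port, role: PORT_ROLES[port] || 'unknown', open });
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once('connect', () => finish(true));
    socket.once('error', () => finish(false));
  });
}

// Check every known port on one host in parallel.
function checkAll(host) {
  const ports = Object.keys(PORT_ROLES).map(Number);
  return Promise.all(ports.map((port) => checkPort(host, port)));
}
```

Calling `checkAll('myserver.rest').then(console.table)` from each client would print one row per port. A successful connect only shows that the port is reachable, not that the node behind it serves the bucket.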

@daschl I tested with the Node SDK and received the same error message. Well, I’m not even making N1QL queries in the test script…

@moon0326 then it looks like it’s environmental and not an SDK issue if both can’t connect. Something must be different in your environment. Can you crank up the logging level for your Node app as described here: http://developer.couchbase.com/documentation/server/4.5/sdk/nodejs/collecting-information-and-logging.html

and show us the output?

@daschl thank you for the follow-up. I will test with a higher logging level when I get a chance. We’ve removed the test servers and the rest of the servers work fine for now.

This is the log from Node:

0ms [I0] {1594} [INFO] (instance - L:347) Version=2.5.6, Changeset=fa6ce043d715a3b0fd25712673c4763f782b08ee
0ms [I0] {1594} [INFO] (instance - L:348) Effective connection string: http://myserver.rest/catalog?console_log_level=5&client_string=couchnode%2F2.2.1. Bucket=catalog
0ms [I0] {1594} [DEBUG] (instance - L:219) Applying initial cntl client_string=couchnode/2.2.1
0ms [I0] {1594} [DEBUG] (instance - L:66) Adding host myserver.rest:8091 to initial HTTP bootstrap list
0ms [I0] {1594} [DEBUG] (instance - L:66) Adding host myserver.rest:11210 to initial CCCP bootstrap list
0ms [I0] {1594} [DEBUG] (confmon - L:89) Preparing providers (this may be called multiple times)
0ms [I0] {1594} [DEBUG] (confmon - L:99) Provider FILE is DISABLED
0ms [I0] {1594} [DEBUG] (confmon - L:96) Provider CCCP is ENABLED
0ms [I0] {1594} [DEBUG] (confmon - L:96) Provider HTTP is ENABLED
0ms [I0] {1594} [DEBUG] (confmon - L:99) Provider MCRAW is DISABLED
0ms [I0] {1594} [TRACE] (confmon - L:292) Start refresh requested
0ms [I0] {1594} [TRACE] (confmon - L:271) Current provider is CCCP
0ms [I0] {1594} [INFO] (cccp - L:118) Requesting connection to node myserver.rest:11210 for CCCP configuration
0ms [I0] {1594} [DEBUG] (lcbio_mgr - L:416) <myserver.rest:11210> (HE=0x13a8b50) Creating new connection because none are available in the pool
0ms [I0] {1594} [DEBUG] (lcbio_mgr - L:321) <myserver.rest:11210> (HE=0x13a8b50) Starting connection on I=0x13f5f70
0ms [I0] {1594} [INFO] (connection - L:450) <myserver.rest:11210> (SOCK=0x13aa9b0) Starting. Timeout=2000000us
1ms [I0] {1594} [TRACE] (connection - L:344) <myserver.rest:11210> (SOCK=0x13aa9b0) Received completion handler. Status=0. errno=0
1ms [I0] {1594} [INFO] (connection - L:116) <myserver.rest:11210> (SOCK=0x13aa9b0) Connected
1ms [I0] {1594} [DEBUG] (connection - L:123) <myserver.rest:11210> (SOCK=0x13aa9b0) Successfuly set TCP_NODELAY
1ms [I0] {1594} [DEBUG] (lcbio_mgr - L:271) <myserver.rest:11210> (HE=0x13a8b50) Received result for I=0x13f5f70,C=0x13aa9b0; E=0x0
1ms [I0] {1594} [DEBUG] (lcbio_mgr - L:223) <myserver.rest:11210> (HE=0x13a8b50) Assigning R=0x1391c20 SOCKET=0x13aa9b0
1ms [I0] {1594} [DEBUG] (ioctx - L:101) <myserver.rest:11210> (CTX=0x13df450,unknown) Pairing with SOCK=0x13aa9b0
2ms [I0] {1594} [WARN] (negotiation - L:468) <myserver.rest:11210> (SASLREQ=0x13df1b0) SASL auth failed with STATUS=0x20
2ms [I0] {1594} [ERROR] (negotiation - L:182) <myserver.rest:11210> (SASLREQ=0x13df1b0) Error: 0x2, SASL Step Failed
2ms [I0] {1594} [ERROR] (cccp - L:133) <NOHOST:NOPORT> Got I/O Error=0x2
2ms [I0] {1594} [INFO] (confmon - L:202) Provider 'CCCP' failed
2ms [I0] {1594} [DEBUG] (confmon - L:236) Will try next provider in 0us
2ms [I0] {1594} [DEBUG] (ioctx - L:151) <myserver.rest:11210> (CTX=0x13df450,sasl) Destroying. PND=0,ENT=1,SORC=1
2ms [I0] {1594} [TRACE] (confmon - L:271) Current provider is HTTP
2ms [I0] {1594} [TRACE] (htconfig - L:388) Starting HTTP Configuration Provider 0x13b3b10
2ms [I0] {1594} [INFO] (connection - L:450) <myserver.rest:8091> (SOCK=0x13aa9b0) Starting. Timeout=2000000us
2ms [I0] {1594} [TRACE] (connection - L:344) <myserver.rest:8091> (SOCK=0x13aa9b0) Received completion handler. Status=0. errno=0
2ms [I0] {1594} [INFO] (connection - L:116) <myserver.rest:8091> (SOCK=0x13aa9b0) Connected
2ms [I0] {1594} [DEBUG] (connection - L:123) <myserver.rest:8091> (SOCK=0x13aa9b0) Successfuly set TCP_NODELAY
2ms [I0] {1594} [DEBUG] (htconfig - L:339) Successfuly connected to REST API myserver.rest:8091
2ms [I0] {1594} [DEBUG] (ioctx - L:101) <myserver.rest:8091> (CTX=0x13df1b0,unknown) Pairing with SOCK=0x13aa9b0
3ms [I0] {1594} [TRACE] (htconfig - L:235) <myserver.rest:8091> Received 185 bytes on HTTP stream
3ms [I0] {1594} [WARN] (htconfig - L:146) <myserver.rest:8091> Got 404 on config stream. Assuming terse URI not supported on cluster
3ms [I0] {1594} [TRACE] (htconfig - L:235) <myserver.rest:8091> Received 31 bytes on HTTP stream
3ms [I0] {1594} [TRACE] (htconfig - L:235) <myserver.rest:8091> Received 216 bytes on HTTP stream
3ms [I0] {1594} [ERROR] (htconfig - L:138) <myserver.rest:8091> Got 404 on config stream. Assuming bucket does not exist as we've tried both URL types
3ms [I0] {1594} [ERROR] (htconfig - L:159) <myserver.rest:8091> Got non-success HTTP status code 404
3ms [I0] {1594} [DEBUG] (ioctx - L:151) <myserver.rest:8091> (CTX=0x13df1b0,bc_http) Destroying. PND=0,ENT=1,SORC=1
3ms [I0] {1594} [INFO] (confmon - L:202) Provider 'HTTP' failed
3ms [I0] {1594} [TRACE] (confmon - L:226) Maximum provider reached. Resetting index
3ms [I0] {1594} [ERROR] (bootstrap - L:111) Failed to bootstrap client=0x1382900. Code=0xa, Message=No more bootstrap providers remain

Hi @daschl

Do you have any clues from my log?

Thanks,
Moon

From this log it looks like authentication failed! Double-check your bucket name / passwords!

The latter message indicates your bucket name is probably wrong :slight_smile:
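
To see the two failures side by side, the Node log above can be reduced to its WARN and ERROR lines. The sketch below is a rough filter, not a libcouchbase tool. The line format is inferred only from the log pasted in this thread, and `parseLine` and `problems` are hypothetical names.

```javascript
// Line shape inferred from the log above:
//   "<N>ms [I0] {pid} [LEVEL] (subsystem - L:line) message"
const LOG_RE = /^(\d+)ms \[I\d+\] \{\d+\} \[(\w+)\] \((\w+) - L:\d+\) (.*)$/;

// Parse one log line; returns null for lines that do not match the shape.
function parseLine(line) {
  const m = LOG_RE.exec(line.trim());
  if (!m) return null;
  return { ms: Number(m[1]), level: m[2], subsystem: m[3], message: m[4] };
}

// Keep only WARN and ERROR entries, in order of appearance.
function problems(logText) {
  return logText
    .split('\n')
    .map(parseLine)
    .filter((e) => e !== null && (e.level === 'WARN' || e.level === 'ERROR'));
}
```

Applied to the log above, this leaves the SASL failure on 11210 (STATUS=0x20, which as far as I know is the authentication-error status in the memcached binary protocol), the 404 responses on 8091, and the final bootstrap error.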

Hi @daschl

Thank you for the reply.

The problem is… that’s the whole reason why I posted this question. As my original post noted, I’m using the SAME script on two different servers. As you can see, the script is really simple: connect, then open the bucket.

  • I’m using the same script. I’m 99.9% positive that I’m not making any mistake with my test script. It is three lines.
  • They are connecting to the same server. I hard-coded the server address for testing.
  • There is no password on the bucket.
  • Again, same bucket name. I hard-coded it.

I even tested the same script in node and php.

maybe @mnunberg, @avsej or @brett19 can help you dig deeper on this.

Alternatively I can help you triage with the Java SDK and trace logging enabled. I think we need to go to the packet level on this to figure out what’s going on. Another option would be for you to capture a tcpdump and provide it to us so we can look at it with Wireshark?

I will try Java and tcpdump tomorrow. Thank you for the help :slight_smile:

So if you try Java, use a logger and enable TRACE level logging. If you do this you’ll see the raw packet info in the logs, and that’s what we need :slight_smile:

Are you sure Server B is part of the same cluster? :slight_smile:
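
One way to test that question: in the strace output the hostname resolves to eight addresses, and the two clients appear to end up on different ones (.35 on Server A, .246 on Server B). The sketch below asks every resolved address for the bucket over the REST port. It assumes the bucket-details path `/pools/default/buckets/<name>` answers without credentials for a passwordless bucket, which may not hold on every setup. `bucketUrl`, `probe` and `probeAllNodes` are hypothetical names.

```javascript
const dns = require('dns').promises;
const http = require('http');

// Build the bucket-details URL for one node on the management port.
function bucketUrl(address, bucket) {
  return `http://${address}:8091/pools/default/buckets/${encodeURIComponent(bucket)}`;
}

// GET the URL and resolve to { url, status }; status is null on network errors.
function probe(url, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const req = http.get(url, { timeout: timeoutMs }, (res) => {
      res.resume(); // discard the body, only the status code matters here
      resolve({ url, status: res.statusCode });
    });
    req.on('timeout', () => req.destroy(new Error('timeout')));
    req.on('error', (err) => resolve({ url, status: null, error: err.message }));
  });
}

// Resolve the hostname and probe every IPv4 address behind it.
async function probeAllNodes(hostname, bucket) {
  const addresses = await dns.resolve4(hostname);
  return Promise.all(addresses.map((a) => probe(bucketUrl(a, bucket))));
}
```

Running `probeAllNodes('myserver.rest', 'catalog').then(console.table)` from Server B would show whether any single address answers 404 for the bucket while the others do not, which would point at a node behind the DNS name that does not belong to the cluster holding the bucket.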