Client-Side timeout after bucket has been open for some time

I am using v2.3.2

I have several Node.js API servers running. After a period of idle time, when a request comes in and a bucket.get() is performed, I get the error "Client-Side timeout exceeded for operation. Inspect network conditions or increase the timeout".

In these APIs the bucket is opened on the first request and then cached for use in subsequent requests.

I have enabled debug logging with LCB_LOGLEVEL=5, and the log is:

6193764ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 2500 ms. This is not an error

// API Request is received

Tue, 11 Apr 2017 00:53:24 GMT sess:middleware Valid JWT:  { session: '1dd08ad66780a76482d7893dc6229dabec0023fdf9b05282f6c64a2ce693e104f2db57c65579aead0d1472e6988a26785111f2b5c2a7b726ad841d08ea1401c2',
  iat: 1491871993 }
6196267ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 1249 ms. This is not an error

// Attempt to get document from the bucket

6197517ms [I0] {1} [WARN] (server - L:343) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Failing command (pkt=0x7faa380639c0, opaque=18, opcode=0x0) with error LCB_ETIMEDOUT (0x17)
Tue, 11 Apr 2017 00:53:26 GMT sess:middleware Invalid session:  Could not find: session::1dd08ad66780a76482d7893dc6229dabec0023fdf9b05282f6c64a2ce693e104f2db57c65579aead0d1472e6988a26785111f2b5c2a7b726ad841d08ea1401c2 CouchbaseError: Client-Side timeout exceeded for operation. Inspect network conditions or increase the timeout


6197524ms [I0] {1} [TRACE] (confmon - L:252) Refreshing current cluster map
6197524ms [I0] {1} [ERROR] (server - L:418) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Server timed out. Some commands have failed
6197524ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 2500 ms. This is not an error
6197524ms [I0] {1} [INFO] (confmon - L:145) Not applying configuration received via CCCP. No changes detected. A.rev=23, B.rev=23
6197524ms [I0] {1} [TRACE] (confmon - L:239) Attempting to retrieve cluster map via CCCP
6197524ms [I0] {1} [INFO] (cccp - L:137) Re-Issuing CCCP Command on server struct 0x7faa3800e240 (couchbase-dev:11210)
6199524ms [I0] {1} [ERROR] (cccp - L:160) <NOHOST:NOPORT> Could not get configuration: LCB_ETIMEDOUT (0x17)
6199524ms [I0] {1} [INFO] (confmon - L:177) Provider 'CCCP' failed
6199524ms [I0] {1} [TRACE] (confmon - L:201) Maximum provider reached. Resetting index
6200023ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 1 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [TRACE] (server - L:422) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Scheduling next timeout for 0 ms. This is not an error
6200024ms [I0] {1} [WARN] (server - L:343) <couchbase-dev:11210> (SRV=0x7faa3800e240,IX=0) Failing command (pkt=0x7faa380639c0, opaque=19, opcode=0xb5) with error LCB_ETIMEDOUT (0x17)
6200024ms [I0] {1} [INFO] (bootstrap - L:164) Not requesting a config refresh because of throttling parameters. Next refresh possible in 7499ms or 99 errors. See LCB_CNTL_CONFDELAY_THRESH and LCB_CNTL_CONFERRTHRESH to modify the throttling settings

Perhaps there’s a firewall somewhere that’s black-holing idle connections? Since this is Node.js, a quick workaround is to issue a simple command every few seconds or so (or better yet, a stat command, which is broadcast to all the servers). This keeps the connections from sitting idle long enough to be dropped.
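
A minimal sketch of that workaround, assuming the callback-style bucket.get(key, callback) of the 2.x Node.js SDK. The key name keepalive::ping, the 5-second interval, and the helper names pingOnce and startKeepAlive are hypothetical, chosen only for illustration. Note that a get on a single key only reaches the server that owns that key, so on a multi-node cluster this does not exercise every connection the way a broadcast command would.

```javascript
// Keep-alive sketch for a cached Couchbase bucket (2.x callback-style API).
// KEEPALIVE_KEY and KEEPALIVE_INTERVAL_MS are illustrative assumptions.
const KEEPALIVE_KEY = 'keepalive::ping';
const KEEPALIVE_INTERVAL_MS = 5000;

// Error code 13 (0x0D) is LCB_KEY_ENOENT, "key not found", in libcouchbase 2.x.
const KEY_NOT_FOUND = 13;

// Issue one cheap read so the connection carries some traffic.
// The document does not need to exist: a "key not found" reply still
// makes the round trip to the server, which is all the keep-alive needs.
function pingOnce(bucket, onError) {
  bucket.get(KEEPALIVE_KEY, (err) => {
    if (err && err.code !== KEY_NOT_FOUND && onError) {
      onError(err);
    }
  });
}

// Start pinging on a fixed interval. Returns a function that stops the pings.
function startKeepAlive(bucket, intervalMs = KEEPALIVE_INTERVAL_MS, onError = console.error) {
  const timer = setInterval(() => pingOnce(bucket, onError), intervalMs);
  // unref() so the keep-alive timer alone does not hold the process open.
  timer.unref();
  return () => clearInterval(timer);
}
```

Call startKeepAlive(bucket) once, right after the bucket is opened and cached, and keep the returned function if the pings ever need to be stopped. The interval should be shorter than whatever idle cutoff the network applies; that cutoff is not known from the log above.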

Thanks. I implemented a polling process to keep the connection alive, and it is working.
It appears there is an issue with the Docker overlay network and idle connections.