Latency (16s) on Client Responses

We are support customers…

We’re seeing integration issues under load where our clients do not get a response for 16s. Performance analysis produced the following trace:

CouchbaseClient.KeyExists:unknown (0ms self time, 16349 ms total time)
CouchbaseClient.Observe:unknown (0ms self time, 16349 ms total time)
ObserveHandler.HandleMasterOnlyInCache:unknown (0ms self time, 16349 ms total time)
CouchbaseNode.ExecuteObserveOperation:unknown (0ms self time, 16349 ms total time)
CouchbaseNode.Execute:unknown (0ms self time, 16349 ms total time)
SocketPool.Release:unknown (0ms self time, 16349 ms total time)
Monitor.Enter:unknown (16349ms self time, 16349 ms total time)

It appears that the code is blocked on the same lock as SocketPool.Dispose(), which runs a loop that waits 2^1 + … + 2^13 ms = 16.382s in total. That likely explains the 16349 ms in the trace. In any event, SocketPool should never be disposed during normal processing; our service is not going down.
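As a quick check of that arithmetic, here is a minimal Python sketch that sums the doubling waits. It assumes the waits are in milliseconds and run from 2^1 through 2^13 as stated above; the actual loop inside the client library is not reproduced here.

```python
# Sum of doubling waits 2^1, 2^2, ..., 2^13, assumed to be in milliseconds.
# The exponent range comes from the post; the loop shape is an assumption.
delays_ms = [2 ** k for k in range(1, 14)]
total_ms = sum(delays_ms)

print(total_ms)         # 16382
print(total_ms / 1000)  # 16.382
```

The total is within about 33 ms of the 16349 ms reported for Monitor.Enter in the trace, which is consistent with the caller waiting on the lock for nearly the whole loop.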

Can you please provide additional info, especially:

  • Which version are you using exactly?
  • What kind of load are you applying?
  • Are you doing any kind of rebalance, or is the cluster in an unhealthy state (node failed over, …)?

Thank you!

We are using Client 1.3.9 (also saw this issue with 1.3.8).

Load is heavy in request frequency but light in response size. We are hitting Couchbase approximately 50-100 times per second. The data we are pulling from Couchbase is fairly static and can be large, so the code prefers to verify the CAS value of the existing (locally cached) data before pulling the full data element, which can vary greatly in size (2k - 600k).
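The check-CAS-before-fetch pattern described above can be sketched roughly as follows. This is an illustration only: `FakeStore`, `CachedItem`, and `read_through` are hypothetical names, the store is an in-memory stand-in, and none of it is the Couchbase client API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CachedItem:
    cas: int      # version stamp of the value
    value: bytes  # full payload (can be large)


class FakeStore:
    """Hypothetical in-memory stand-in for the remote store (not Couchbase)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, bytes]] = {}
        self.full_reads = 0  # counts how often the full payload was fetched

    def put(self, key: str, value: bytes) -> None:
        # Every write bumps the CAS so readers can detect the change.
        old_cas = self._data[key][0] if key in self._data else 0
        self._data[key] = (old_cas + 1, value)

    def get_cas(self, key: str) -> Optional[int]:
        # Cheap metadata-only lookup: returns the CAS, never the payload.
        entry = self._data.get(key)
        return entry[0] if entry else None

    def get(self, key: str) -> Optional[CachedItem]:
        # Expensive lookup: returns CAS and full payload.
        self.full_reads += 1
        entry = self._data.get(key)
        return CachedItem(*entry) if entry else None


def read_through(store: FakeStore, cache: Dict[str, CachedItem],
                 key: str) -> Optional[bytes]:
    """Return the value for key, fetching the payload only if the CAS changed."""
    remote_cas = store.get_cas(key)
    if remote_cas is None:
        cache.pop(key, None)  # key no longer exists remotely
        return None
    local = cache.get(key)
    if local is not None and local.cas == remote_cas:
        return local.value    # cached copy is still current
    fresh = store.get(key)
    if fresh is None:
        cache.pop(key, None)  # key vanished between the two calls
        return None
    cache[key] = fresh
    return fresh.value


store, cache = FakeStore(), {}
store.put("doc", b"x" * 2048)
read_through(store, cache, "doc")  # first read fetches the payload
read_through(store, cache, "doc")  # second read only compares CAS values
```

With fairly static data, most calls stop at the cheap CAS comparison. The trade-off is that every request still makes one round trip to the store, so anything that stalls that call stalls the request as well.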

We are NOT doing any sort of rebalance / failover / etc. https://www.couchbase.com/issues/browse/NCBC-655 describes similar behavior, but we are not performing any rebalance or health-related operations.