Couchbase java client v2.2.3 performance issue

query
connections

#1

Hi,

We are running Couchbase Java client v1.4.5 and are trying to upgrade to v2.2.3. However, we observe some performance degradation with v2.2.3. Specifically, we see a 20% throughput drop for write-only, read-only, and 90% read / 10% write workloads.

With v2.2.3, we use the bucket.get, bucket.insert, and bucket.upsert APIs to send the workloads. We run a multi-threaded application and make sure a single connection is shared by all the threads, rather than opening one connection per thread.

Has anyone done similar performance tests? What factors might cause this degradation?

Thanks a lot.


#2

Can you tell us a bit about the code? Are you using asynchronous operations? Also, can you tell us a bit more about the workload generation?

First, about the code, note that in the 1.4 client some operations like .get() are synchronous, whereas other operations like .set() are asynchronous (returning a Future). If you port from 1.4 to 2.2 and replace simple .set() calls with simple .upsert() calls, you’re going from asynchronous to synchronous operations, and that would make a big difference.
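To make that difference concrete, here is a rough sketch that uses only JDK classes. Nothing in it is a Couchbase API: `fakeStore` is a hypothetical stand-in for one network round trip, and the latency, operation count, and pool size are made-up values. It compares a blocking loop, a loop that waits on each future right away, and a loop that issues everything first and waits at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SyncVsAsyncSketch {
    // Hypothetical stand-in for one network round trip (not a Couchbase call).
    static boolean fakeStore(String key, long latencyMillis) {
        try {
            TimeUnit.MILLISECONDS.sleep(latencyMillis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Blocking style: each operation finishes before the next one starts.
    static long blockingLoopMillis(int ops, long latencyMillis) {
        long start = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            fakeStore("key" + i, latencyMillis);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Future style, but waiting on every future immediately:
    // only one operation is ever in flight, so it behaves like the blocking loop.
    static long awaitEachMillis(int ops, long latencyMillis, ExecutorService io) {
        long start = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            final String key = "key" + i;
            CompletableFuture.supplyAsync(() -> fakeStore(key, latencyMillis), io).join();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Future style, issuing all operations first and waiting afterwards:
    // the round trips overlap, limited by the size of the pool.
    static long pipelinedMillis(int ops, long latencyMillis, ExecutorService io) {
        long start = System.nanoTime();
        List<CompletableFuture<Boolean>> pending = new ArrayList<>();
        for (int i = 0; i < ops; i++) {
            final String key = "key" + i;
            pending.add(CompletableFuture.supplyAsync(() -> fakeStore(key, latencyMillis), io));
        }
        for (CompletableFuture<Boolean> f : pending) {
            f.join();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        ExecutorService io = Executors.newFixedThreadPool(20); // assumed pool size
        try {
            System.out.println("blocking loop:  " + blockingLoopMillis(20, 5) + " ms");
            System.out.println("await each:     " + awaitEachMillis(20, 5, io) + " ms");
            System.out.println("issue then wait: " + pipelinedMillis(20, 5, io) + " ms");
        } finally {
            io.shutdown();
        }
    }
}
```

The first two variants should take roughly ops × latency, while the third overlaps the waits. So a port that swaps fire-and-collect-later futures for blocking calls can lose throughput even if per-operation latency is unchanged.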

In 2.x, we separated the synchronous and asynchronous APIs pretty cleanly because the 1.x API would frequently confuse people. The new SDK is clearer about what you’re doing.

Second, we have been doing our own performance testing and one thing that we’ve noted is that compared to the 1.4 client we do have some intermittently higher latencies. The median and 95th percentile latencies are pretty much the same.

The reason I asked about this in the context of the workload is that if the workload is a set of simple loops doing synchronous operations and it occasionally hits these latencies, you’d see a drop in throughput. If, on the other hand, you’re doing event-driven operations or you’re running a more normal app server with an interactive workload of a lot of users occasionally making calls, the overall difference would be negligible since it’d be a small bit of latency that occasionally affects one user. To give you an idea of this difference, it’s 595µsec compared to 797µsec at the 99th percentile, but it gets up to 36ms at the max. These are things a real person probably wouldn’t notice, but a tight loop in a test could see a big swing in throughput.
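As a back-of-the-envelope illustration of why a tight loop is sensitive to this, here is a small sketch. In a closed loop each thread starts the next call only after the previous one returns, so throughput is roughly threads / mean latency. The 500µsec typical latency, the 0.1% slow fraction, and the thread count are assumptions picked for illustration, not measurements; only the 36ms figure is taken from the max quoted above.

```java
class ClosedLoopThroughput {
    // Mean latency when a small fraction of calls take the slow path.
    static double meanLatencyMicros(double typicalMicros, double slowMicros, double slowFraction) {
        return (1.0 - slowFraction) * typicalMicros + slowFraction * slowMicros;
    }

    // Closed loop: each thread completes 1 / meanLatency operations per unit of time.
    static double opsPerSecond(int threads, double meanLatencyMicros) {
        return threads * 1_000_000.0 / meanLatencyMicros;
    }

    public static void main(String[] args) {
        int threads = 10;             // assumed number of load-generating threads
        double typical = 500.0;       // assumed typical latency in microseconds
        double slow = 36_000.0;       // 36ms, the max latency quoted above
        double slowFraction = 0.001;  // assumed: 1 call in 1000 hits the slow path

        double baseline = opsPerSecond(threads, typical);
        double withOutliers = opsPerSecond(threads, meanLatencyMicros(typical, slow, slowFraction));

        System.out.printf("no outliers:   %.0f ops/s%n", baseline);
        System.out.printf("with outliers: %.0f ops/s%n", withOutliers);
        System.out.printf("drop:          %.1f%%%n", 100.0 * (1.0 - withOutliers / baseline));
    }
}
```

Even with the median unchanged, rare slow calls raise the mean, and the mean is what a synchronous loop pays for on every iteration.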

We have some thoughts in mind on how to get back to consistently low latencies, but it will probably take a bit of time.


#3

Thanks for the prompt reply.

We are using synchronous calls in v2.2.3 and async calls in v1.4.5.

One thing I am not sure about is the performance: even with an asynchronous call in v1.4.5, we wait for the result to be returned, like the following. I guess that if I wait for the result of an async call, it is the same as making a sync call.

=============== below is our environment ===============
We are using the YCSB benchmark framework. The v1.4.5 and v2.2.3 clients are wrapped as plugins in YCSB. And yes, it is essentially a loop that sends synchronous requests.

We are using the synchronous APIs. 10 threads send requests as fast as possible.


#4

Here is the code for using those clients.

v1.4.5

v2.2.3