getBulk performance issue


#1

Hello,

I’m experiencing strange behaviour with getBulk() that has a significant impact on performance.

I’m working with a 3-node Couchbase (V2.5.1 state id: 033-010-125) cluster. I’ve created a bucket with a per-node RAM quota of 2048MB and 2 replica copies (so basically every document is replicated on each node).

I’m running couchbase-client V1.4.8 using JDK1.7.0_65.

I’ve created eight small documents (~200 bytes each) and observed the following performance issue:

import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import com.couchbase.client.CouchbaseClient;

List<URI> baseList = new ArrayList<URI>();
baseList.add(new URI("http://myCbServer_node1:8091/pools"));

CouchbaseClient client = null;
try {
	client = new CouchbaseClient(baseList,"myBucket","","");
	
	long t0 = System.currentTimeMillis();
	client.get("someKey");
	System.out.println(System.currentTimeMillis() - t0);
} finally {
	if(client != null) {
		client.shutdown();
	}
}

The client.get("someKey") here returns in 1-2ms (< 10ms), which is what I’m expecting. However, if I simply do:

long t0 = System.currentTimeMillis();
client.getBulk(new String[]{"someKey"});
System.out.println(System.currentTimeMillis() - t0);

the client.getBulk(new String[]{"someKey"}) returns in 200ms, which is clearly unacceptable.

I’ve then tried something like this:

long t0 = System.currentTimeMillis();
client.getBulk(new String[]{"someKey","someOtherKey"});
System.out.println(System.currentTimeMillis() - t0);

and I get a 1-3ms response time…

This behaviour is deterministic for a particular key (it happens every time). According to what I saw on the internet, my cluster doesn’t seem to be in bad shape.
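To make this per-key determinism easier to verify across all eight documents, a small timing helper can be looped over each key. This is only a sketch: the helper itself is concrete, but the `client` and the key names in the comments are assumptions about the setup described above.

```java
import java.util.concurrent.TimeUnit;

public class GetTimer {
    // Converts a System.nanoTime() start mark into elapsed microseconds.
    static long elapsedMicros(long startNanos) {
        return TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - startNanos);
    }

    public static void main(String[] args) {
        // With a connected CouchbaseClient, each key would be timed on both paths:
        //   long t = System.nanoTime();
        //   client.get(key);
        //   System.out.println(key + " get:     " + elapsedMicros(t) + "us");
        //   t = System.nanoTime();
        //   client.getBulk(new String[]{key});
        //   System.out.println(key + " getBulk: " + elapsedMicros(t) + "us");
        System.out.println(elapsedMicros(System.nanoTime()));
    }
}
```

Printing both numbers per key would show immediately whether the 200ms penalty is tied to specific keys (and therefore specific vbuckets/nodes) or to the getBulk path in general.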

Here are the client logs in DEBUG:

for the client.get() (9ms):

17:38:24.902 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - No selectors ready, interrupted: false
17:38:24.902 [main] DEBUG c.c.client.CouchbaseConnection - Added Cmd: 0 Opaque: 24 Key: CONF_local_AITA-FMG to {QA sa=..., #Rops=0, #Wops=0, #iq=1, topRop=null, topWop=null, toWrite=0, interested=0}
17:38:24.902 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Handling queue
17:38:24.903 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Setting interested opts to 4
17:38:24.903 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
17:38:24.903 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 1000ms
17:38:24.903 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selected 1, selected 1 keys
17:38:24.903 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Handling IO for:  sun.nio.ch.SelectionKeyImpl@1e86471 (r=false, w=true, c=false, op={QA sa=..., #Rops=0, #Wops=1, #iq=0, topRop=null, topWop=Cmd: 0 Opaque: 24 Key: someKey, toWrite=0, interested=4})
17:38:24.903 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.GetOperationImpl - Transitioned state from WRITE_QUEUED to WRITING
17:38:24.903 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - After copying stuff from Cmd: 0 Opaque: 24 Key: someKey: java.nio.DirectByteBuffer[pos=43 lim=16384 cap=16384]
17:38:24.904 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.GetOperationImpl - Transitioned state from WRITING to READING
17:38:24.904 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Finished writing Cmd: 0 Opaque: 24 Key: someKey
17:38:24.904 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Wrote 43 bytes
17:38:24.904 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Setting interested opts to 1
17:38:24.904 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
17:38:24.904 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 1000ms
17:38:24.905 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selected 1, selected 1 keys
17:38:24.905 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Handling IO for:  sun.nio.ch.SelectionKeyImpl@1e86471 (r=true, w=false, c=false, op={QA sa=..., #Rops=1, #Wops=0, #iq=0, topRop=Cmd: 0 Opaque: 24 Key: someKey, topWop=null, toWrite=0, interested=1})
17:38:24.905 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Read 260 bytes
17:38:24.905 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.GetOperationImpl - Reading 24 header bytes
17:38:24.905 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.GetOperationImpl - Reading 236 payload bytes
17:38:24.907 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.GetOperationImpl - Transitioned state from READING to COMPLETE
17:38:24.907 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Completed read op: Cmd: 0 Opaque: 24 Key: someKey and giving the next 0 bytes
17:38:24.907 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Setting interested opts to 0

and the client.getBulk() (216ms):

17:38:24.908 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
17:38:24.908 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 1000ms
17:38:24.910 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - No selectors ready, interrupted: false
17:38:24.911 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Handling queue
17:38:24.911 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Setting interested opts to 4
17:38:24.911 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
17:38:24.911 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 1000ms
17:38:24.911 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selected 1, selected 1 keys
17:38:24.911 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Handling IO for:  sun.nio.ch.SelectionKeyImpl@1e86471 (r=false, w=true, c=false, op={QA sa=..., #Rops=0, #Wops=1, #iq=0, topRop=null, topWop=Cmd: -1 Opaque: -1 Keys: CONF_local_AITA-FMG, toWrite=0, interested=4})
17:38:24.912 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.MultiGetOperationImpl - Transitioned state from WRITE_QUEUED to WRITING
17:38:24.912 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - After copying stuff from Cmd: -1 Opaque: -1 Keys: someKey: java.nio.DirectByteBuffer[pos=67 lim=16384 cap=16384]
17:38:24.912 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.MultiGetOperationImpl - Transitioned state from WRITING to READING
17:38:24.912 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Finished writing Cmd: -1 Opaque: -1 Keys: someKey
17:38:24.912 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Wrote 67 bytes
17:38:24.912 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Setting interested opts to 1
17:38:24.912 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
17:38:24.912 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 1000ms
17:38:24.913 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selected 1, selected 1 keys
17:38:24.914 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Handling IO for:  sun.nio.ch.SelectionKeyImpl@1e86471 (r=true, w=false, c=false, op={QA sa=..., #Rops=1, #Wops=0, #iq=0, topRop=Cmd: -1 Opaque: -1 Keys: someKey, topWop=null, toWrite=0, interested=1})
17:38:24.914 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Read 260 bytes
17:38:24.914 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.MultiGetOperationImpl - Reading 24 header bytes
17:38:24.914 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.MultiGetOperationImpl - Reading 236 payload bytes
17:38:24.914 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Setting interested opts to 1
17:38:24.914 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
17:38:24.914 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 1000ms
17:38:25.122 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selected 1, selected 1 keys
17:38:25.122 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Handling IO for:  sun.nio.ch.SelectionKeyImpl@1e86471 (r=true, w=false, c=false, op={QA sa=..., #Rops=1, #Wops=0, #iq=0, topRop=Cmd: -1 Opaque: -1 Keys: someKey, topWop=null, toWrite=0, interested=1})
17:38:25.122 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Read 24 bytes
17:38:25.122 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.MultiGetOperationImpl - Reading 24 header bytes
17:38:25.122 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.MultiGetOperationImpl - Reading 0 payload bytes
17:38:25.123 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.binary.MultiGetOperationImpl - Transitioned state from READING to COMPLETE
17:38:25.123 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Completed read op: Cmd: -1 Opaque: -1 Keys: someKey and giving the next 0 bytes
17:38:25.123 [Memcached IO over {MemcachedConnection to ...}] DEBUG n.s.m.p.b.BinaryMemcachedNodeImpl - Setting interested opts to 0
17:38:25.123 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
17:38:25.123 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 1000ms

The issue seems to be located here:

17:38:24.914 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 1000ms
17:38:25.122 [Memcached IO over {MemcachedConnection to ...}] DEBUG c.c.client.CouchbaseConnection - Selected 1, selected 1 keys

which seems related to an NIO select in MemcachedConnection (the same as the get() operation?):

...
getLogger().debug("Selecting with delay of %sms", delay);
assert selectorsMakeSense() : "Selectors don't make sense.";
int selected = selector.select(delay);
...
getLogger().debug("Selected %d, selected %d keys", selected,

However, if we look at the logs carefully, it seems the document itself is returned promptly with a good response time, but an extra read with an empty payload is performed afterwards, and that read takes 200ms.

Do you know what could have gone wrong, and how to investigate this issue?

Regards


#2

In your code there is a certain amount of startup time involved, since the client needs to create its connections and everything related.

Can you please run the two ops one after the other in a loop with a delay in between and extract the numbers? In my tests, bulk loading was not slower than single gets.
Basically something along the lines of:

    final CouchbaseClient client = new CouchbaseClient(hosts, bucket, password);

    while(true) {
        Thread.sleep(1000);

        long st = System.nanoTime();
        client.get("foo");
        long en = System.nanoTime();

        System.err.println(TimeUnit.NANOSECONDS.toMicros(en - st));

        st = System.nanoTime();
        client.getBulk(new String[]{"foo"});
        en = System.nanoTime();

        System.err.println(TimeUnit.NANOSECONDS.toMicros(en - st));
    }

Please let me know what numbers you get on a longer timeframe.


#3

We have a similar issue.
We have a 5-node cluster, and when we run a test from the app server, response times vary considerably during the load test.
Initially response times are in the 10-50 millisecond range, but as the load increases we see that the time to get a document from Couchbase increases.
To check whether response times are consistent, we wrote a small test and ran it as a single test.

Here is the test code:

public boolean checkIfExist(String key) {
    for (int i = 0; i < 50000; i++) {
        if (!tagRepository.exists(key)) { // We use Spring Repository
            throw ApiServerException.notFound(ApiErrorCode.KEY_NOT_FOUND, "Key with id=%s not found", key);
        }
    }
    return true;
}

Response times vary from 90ms to 4000ms.
We use Spring Data as a wrapper around the Couchbase client.
Spring Data version: 1.2.1 release
Couchbase client: 1.4.4
Couchbase Server: 3.0.2

We create the client connections during startup using Spring configuration.

When we checked with our DBA team, they shared the CB stats, and they all show there is nothing wrong with the Couchbase cluster itself, as all get calls were served within microseconds to milliseconds.
Is this an issue with the client SDK? Or is there a tweak needed at the TCP layer on the app servers?


#4

Well, your code is different from the original poster’s, because exists on the repository does a single-get lookup and not a bulk operation.

With the information you’ve provided it is very hard to determine what’s going on. I recommend you try the vanilla Couchbase SDK as well, to see whether the issue is in the Spring module or somewhere else. A prime candidate is also a flaky network; you might want to look much closer at that. Also, run with GC logging enabled and see if you have long GC pauses that give you this variability.

If you have more information to share please do!
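To compare a vanilla-SDK run against the Spring Data run on equal terms, one approach is to collect per-call latencies and summarize them. A minimal sketch: the summarize helper below is concrete, while the client calls shown in the comments are assumptions about the setup described in post #3.

```java
import java.util.Arrays;

public class LatencyStats {
    // Summarizes latency samples (in microseconds): {min, median, max}.
    static long[] summarize(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return new long[]{sorted[0], sorted[sorted.length / 2], sorted[sorted.length - 1]};
    }

    public static void main(String[] args) {
        // Against a real cluster, samples[i] would come from timing the vanilla
        // CouchbaseClient directly rather than the Spring repository, e.g.:
        //   long t0 = System.nanoTime();
        //   client.get(key);
        //   samples[i] = TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - t0);
        long[] demo = {1500, 90, 4000, 1200, 300};
        System.out.println(Arrays.toString(summarize(demo))); // prints "[90, 1200, 4000]"
    }
}
```

If the vanilla client shows a tight distribution while the Spring run shows the 90ms-4000ms spread, the problem is likely in the wrapper layer; if both show the spread, look at the network or GC as suggested above.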


#5

Thanks for your reply. I’ve performed the test in a loop with a delay and got the following results (in microseconds, alternating get and getBulk):

22317
218174
1584
201142
1626
202431
1537
199914
2479
203590
1414
201280
1225
204847
1309
209374
1275
209520
1235
210954
2440
205339

So the issue doesn’t seem related to startup time.

Any ideas?


#6

@kenny_strazz are you able to run it against a local CB server and see if you get the same issues? Trying to nail it down to some root causes.


#7

I did not have the issue locally last time I checked; however, I was using a single node, so I’m not sure this is relevant.

I’ve tried with a Couchbase 3.0.2 cluster (with the same client library) and response times are much better (average < 5ms), and thankfully get and getBulk have similar response times, so I think I’ll upgrade my clusters.

I’ll try to set up a local 3-node cluster, see if I can reproduce the issue with 2.5.1, and let you know.