KV Pool not using all event loop threads

Hi,

Machine on which running the test:

RHEL 7.1 
CPU: 1 Core 
RAM: 4 GB

I am using the Java SDK 2.7.16 with Vert.x and Guice, and I supply custom event loop groups for both the IO pool and the KV IO pool:

        // Use the native epoll transport when available; ensure at least 3 threads per pool.
        log.info("{}: {}", "Couchbase native transport", Epoll.isAvailable());

        int ioPoolSize = Math.max(DefaultCoreEnvironment.IO_POOL_SIZE, 3);
        ThreadFactory ioPoolThreadFactory = new DefaultThreadFactory("cb-io", true);
        EventLoopGroup ioPoolEventLoop = Epoll.isAvailable()
                ? new EpollEventLoopGroup(ioPoolSize, ioPoolThreadFactory)
                : new NioEventLoopGroup(ioPoolSize, ioPoolThreadFactory);

        int kvIoPoolSize = Math.max(DefaultCoreEnvironment.KEYVALUE_ENDPOINTS, 3);
        ThreadFactory kvIoPoolThreadFactory = new DefaultThreadFactory("cb-kv-io", true);
        EventLoopGroup kvIoPoolEventLoop = Epoll.isAvailable()
                ? new EpollEventLoopGroup(kvIoPoolSize, kvIoPoolThreadFactory)
                : new NioEventLoopGroup(kvIoPoolSize, kvIoPoolThreadFactory);

        couchbaseEnvironment = DefaultCouchbaseEnvironment
                .builder()
                .autoreleaseAfter(Duration.ofMinutes(5).toMillis())
                .ioPool(ioPoolEventLoop, new IoPoolShutdownHook(ioPoolEventLoop))
                .kvIoPool(kvIoPoolEventLoop, new IoPoolShutdownHook(kvIoPoolEventLoop))
                .callbacksOnIoPool(true)
                .build();

I can see in the logs that three KV threads are created:

09:01:16.088 [cb-kv-io-3-1] INFO  com.couchbase.client.core.node.Node - Connected to Node pdcbcl110035.test.com
09:01:16.294 [cb-kv-io-3-1] INFO  c.c.c.c.config.ConfigurationProvider - Selected network configuration: default
09:01:16.359 [cb-kv-io-3-1] INFO  c.c.c.c.config.ConfigurationProvider - Opened bucket pace
09:01:16.365 [cb-kv-io-3-1] INFO  com.couchbase.client.core.node.Node - Disconnected from Node pdcbcl110035.test.com
09:01:16.786 [cb-kv-io-3-2] INFO  com.couchbase.client.core.node.Node - Connected to Node lpdospdbb50430.test.com
09:01:16.835 [cb-kv-io-3-3] INFO  com.couchbase.client.core.node.Node - Connected to Node lpdospdbb50431.test.com
09:01:16.871 [cb-kv-io-3-1] INFO  com.couchbase.client.core.node.Node - Connected to Node lpdospdbb50432.test.com

I fetch documents from the bucket using:

public Flowable<JsonDocument> get(String key, String traceId) {
    final long startTime = System.currentTimeMillis();
    return RxJavaInterop.toV2Flowable(primaryAsyncBucket.get(key)
            .doOnNext(t -> log.info("name=\"get\" event_id=\"{}\" key=\"{}\" timeTaken={}",
                    traceId, key, System.currentTimeMillis() - startTime)));
}

I ran the test with 100 users using wrk, and for all requests the logs show the same event loop thread being used:

09:17:45.305 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=23
09:17:45.305 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=23
09:17:45.305 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=23
09:17:45.305 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=23
09:17:45.305 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=23
09:17:45.306 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=24
09:17:45.306 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=15
09:17:45.306 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=15
09:17:45.306 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=15
09:17:45.306 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=14
09:17:45.306 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=14
09:17:45.306 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=14
09:17:45.306 [cb-kv-io-3-2] INFO  c.a.p.p.couchbase.CouchbaseRead - name="get" event_id="test_5" key="ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1" timeTaken=14

KV thread counts from the server log:

[pace]$ grep 'cb-kv-io-3-1' out.log | wc -l
5
[pace]$ grep 'cb-kv-io-3-2' out.log | wc -l
57093
[pace]$ grep 'cb-kv-io-3-3' out.log | wc -l
1

Is this the expected behaviour?

Could someone please help with this?

@graham.pople @daschl

@himanshu.mps How many Couchbase Server nodes are you connecting to, and how many services (only KV?) are you using? We do not control the selection of the event loop directly; this is done by Netty under the covers.

@daschl I have a 3-node cluster with 8 CPUs and 64 GB RAM per node, and I am only doing KV operations. My question is more along these lines: if three or more epoll threads are created, shouldn't the bundled Netty use round robin or some other algorithm to distribute the load? The issue I am facing is that one event loop can serve 100 concurrent users at around 60 ms; as the number of users increases, KV operations start taking longer.
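One workaround I am considering is to open more KV endpoints per node, so Netty has more channels to spread across the event loop threads. A sketch, assuming the SDK 2.x builder API (the value 3 is illustrative only):

```java
// Sketch: more KV endpoints per node means more Netty channels, and each
// channel can land on a different event loop thread. kvEndpoints is the
// SDK 2.x builder setting; the count here is illustrative, not tuned.
CouchbaseEnvironment env = DefaultCouchbaseEnvironment
        .builder()
        .kvEndpoints(3)
        .build();
```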

If I fetch different keys, all event loops are utilized.
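That is consistent with how keys map to connections: each key hashes to one of 1024 vbuckets, each vbucket lives on exactly one node, and with a single KV endpoint per node every request for that node rides one pinned channel (and therefore one event loop thread). A simplified, self-contained sketch of the idea; the exact shift/mask here is an assumption for illustration, not the SDK's actual vbucket code:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class VbucketSketch {
    // Simplified illustration: Couchbase hashes each key to one of 1024
    // vbuckets; each vbucket lives on exactly one node, and by default the
    // SDK keeps one KV connection (and thus one Netty event loop) per node.
    // The shift/mask below is an assumption for illustration only.
    static int vbucketOf(String key, int numVbuckets) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) ((crc.getValue() >> 16) & (numVbuckets - 1));
    }

    public static void main(String[] args) {
        String key = "ARNG_CMP_ARR::440873b7-de9e-4adb-aeb7-ddd6cda41dd6::V1";
        // The same key always maps to the same vbucket, hence the same
        // node, connection, and event loop thread:
        System.out.println(vbucketOf(key, 1024) == vbucketOf(key, 1024)); // true
        // Different keys spread across vbuckets (and therefore threads):
        System.out.println(vbucketOf("key-a", 1024));
        System.out.println(vbucketOf("key-b", 1024));
    }
}
```

This explains both observations: hammering one key from 100 users serializes onto one thread, while a mixed key set spreads across all of them.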