Couchbase Threads are in BLOCKED state

We are load testing our application, which uses Couchbase. We take thread dumps at regular intervals, and in every dump the threads that have just made a call to Couchbase are in the BLOCKED state. We do not understand this behaviour.

Below is the configuration of the Couchbase environment:

    CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
            .kvTimeout(5000)
            .connectTimeout(7500)
            .viewTimeout(7500)
            .queryTimeout(7500)
            .disconnectTimeout(10000)
            .retryStrategy(BestEffortRetryStrategy.INSTANCE)
            .build();

And we are doing simple CRUD operations like

    JsonDocument doc = JsonDocument.create("1236",
            JsonObject.create().put("name", "test"));
    bucket.insert(doc);

Below is the stack trace that we obtained from jstack:

Thread 4257: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=215 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(int, long) @bci=139, line=1037 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(int, long) @bci=25, line=1328 (Compiled frame)
 - java.util.concurrent.CountDownLatch.await(long, java.util.concurrent.TimeUnit) @bci=10, line=277 (Compiled frame)
 - com.couchbase.client.java.util.Blocking.blockForSingle(rx.Observable, long, java.util.concurrent.TimeUnit) @bci=32, line=72 (Compiled frame)
 - com.couchbase.client.java.CouchbaseBucket.get(java.lang.String, long, java.util.concurrent.TimeUnit) @bci=17, line=118 (Compiled frame)
 - com.couchbase.client.java.CouchbaseBucket.get(java.lang.String) @bci=9, line=113 (Compiled frame)

All our documents are a few KB in size. We have a cluster with 3 nodes. Java SDK version: 2.3.4.

Please guide us in understanding the behaviour.

Thanks

From what I see there, that’s actually fairly normal. You’re using the synchronous API, which blocks during IO. If you don’t want to block that thread, there is an asynchronous API.

Part of the reason I say “it’s fairly normal” is that with nearly any process on a computer, when you profile you’ll find the thread waiting on IO. That’s because IO is one of the slower things on computers. In our SDK, the IO is handled in a separate thread for efficiency, so the thread is blocking waiting for that IO to be completed by a different thread.

If you are trying to achieve higher throughput or more efficient resource usage, you may want to see if you can add parallelism. That’s where the async API will help you.
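
For illustration, here is a minimal sketch of the async API (the document id is just a placeholder, and error handling is kept to a bare minimum):

    // Sketch: the async API returns an Observable instead of parking the calling thread.
    // The IO still happens on the SDK's IO threads; your thread simply doesn't wait for it.
    bucket.async()
            .get("some-doc-id")
            .subscribe(
                    doc -> System.out.println("got " + doc.id()),   // invoked on an SDK thread
                    err -> err.printStackTrace());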

Thanks for the input.
We changed the way we were using the SDK; everything is now async.
The TPS of the application has risen to a very good level.

Below are the code snippets; please suggest any further improvements that can be made.

For upserting

 Observable.just(metaId)
                    .flatMap(docId -> bucket.async()
                            .upsert(JsonDocument.create(metaId, JsonObject.fromJson(docContent))))
                    .toBlocking().singleOrDefault(null); 

For checking exists

Observable.just(formattedId).flatMap(docId -> bucket.async().exists(docId))
                .toBlocking().singleOrDefault(Boolean.FALSE);

For removing

 Observable.just(counterDocId).flatMap(documentId -> bucket.async().counter(documentId, -1))
                .toBlocking().singleOrDefault(null);

For bulk operation (get)

Observable.from(metaIds).flatMap(id -> bucket.async().get(id)).map(document ->
        {
            // code to convert the JSON document to an entity
            
        }).toList().toBlocking().singleOrDefault(Collections.emptyList());

Note that everything except the bulk get is exactly the same as our blocking operations in the first place, so I expect a performance improvement only on the bulk get (since that is where you execute N operations in parallel and benefit from the async execution).
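
To make that concrete, the single upsert above is functionally equivalent to the plain blocking call (a sketch using the same metaId / docContent variables from your snippet):

    // Equivalent blocking form of the wrapped single upsert above; wrapping one operation
    // in Observable.just(...).flatMap(...).toBlocking() adds no parallelism.
    JsonDocument upserted = bucket.upsert(
            JsonDocument.create(metaId, JsonObject.fromJson(docContent)));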

One other thing you can improve: instead of using JsonObject.fromJson(docContent), you can use RawJsonDocument, which accepts docContent as a raw JSON string in the first place. That saves an unnecessary encoding and decoding step, and you can of course use the same for gets as well (bucket.async().get(id, RawJsonDocument.class)): less GC, better performance :)
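
For example, something along these lines (a sketch reusing the metaId / docContent variables from your snippets):

    // Upsert the raw JSON string directly, skipping the JsonObject encode step.
    bucket.async()
            .upsert(RawJsonDocument.create(metaId, docContent))
            .toBlocking().singleOrDefault(null);

    // Read it back as a raw JSON string, skipping the decode into a JsonObject.
    RawJsonDocument raw = bucket.async()
            .get(metaId, RawJsonDocument.class)
            .toBlocking().singleOrDefault(null);
    String json = (raw != null) ? raw.content() : null;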


Hi,

We are currently on Java SDK version 2.5.9 and are observing a hang with the following stack trace:

    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
    at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:73)
    at com.couchbase.client.java.CouchbaseBucket.query(CouchbaseBucket.java:659)
    at com.couchbase.client.java.CouchbaseBucket.query(CouchbaseBucket.java:581)

We observe performance degradation as we increase parallelism, and it seems the SDK is configured with a limited number of threads. Could you please explain the policy for IO thread creation? Are IO threads created on demand, i.e. does each thread issuing IO get its own IO thread in the SDK, or is there some other driver policy?

Hi @medvedev,

The IO threads are limited to the number of cores in the system, so parallelism beyond the system’s capacity will not show better results. By default, the IO-pool threads are shared between the KV and the other services. It is also possible to split the thread pool per service via setters on the environment builder and see if that helps; an epoll event loop group can also give better results on Linux systems.

    kvIoPool(new NioEventLoopGroup(ioPoolSize() / 2, new DefaultThreadFactory("cb-kv-io", true), .. ))
    queryIoPool(new NioEventLoopGroup(ioPoolSize() / 2, new DefaultThreadFactory("cb-query-io", true), .. ))
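
For reference, here is a minimal sketch of plugging a custom, named IO pool into the environment (this uses the single shared ioPool setter; the kvIoPool / queryIoPool setters shown above can be used the same way to split per service, and the Netty classes are the shaded ones that ship with the 2.x SDK):

    import com.couchbase.client.deps.io.netty.channel.nio.NioEventLoopGroup;
    import com.couchbase.client.deps.io.netty.util.concurrent.DefaultThreadFactory;
    import com.couchbase.client.java.env.CouchbaseEnvironment;
    import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;

    // Size the IO pool explicitly and give the threads a recognizable name.
    // On Linux, an EpollEventLoopGroup could be substituted for NioEventLoopGroup.
    int ioThreads = Runtime.getRuntime().availableProcessors();
    CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
            .ioPool(new NioEventLoopGroup(ioThreads, new DefaultThreadFactory("cb-io", true)))
            .build();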