CouchbaseCluster vs CouchbaseAsyncCluster

dhawalschumi · September 23, 2015, 9:24am

Hello,

I am using Couchbase SDK 2.1.2. I am planning to move from CouchbaseCluster to CouchbaseAsyncCluster.

The difference I found is in opening a bucket. CouchbaseCluster.openBucket(username, password)
returns a Bucket, from which we used to get an async bucket by calling bucket.async(), whereas CouchbaseAsyncCluster gives me an Observable when I open a bucket.

So what is the difference between bucket.async() and a bucket obtained via CouchbaseAsyncCluster, i.e. an Observable?

Also, when I dug into the code, I found a fixed thread pool (cb-core) of 2 threads for CouchbaseAsyncCluster, called disruptorExecutor, which is in turn used by the request and response disruptors (something similar to the LMAX Disruptor).

I need a bit of explanation of how CouchbaseCluster and CouchbaseAsyncCluster work internally. Also, what performance overhead or gain should I expect when switching from CouchbaseCluster to CouchbaseAsyncCluster?

Thanks,
Dhawal Patel

daschl · September 23, 2015, 9:33am

@dhawalschumi the only difference is that you can deal with an observable right away. The CouchbaseCluster is actually just a sync wrapper around CouchbaseAsyncCluster, so functionally there is no difference. Both are available and you can pick what fits your needs.
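
A minimal sketch of the "sync wrapper around an async API" idea described above, using only java.util.concurrent. AsyncStore and SyncStore are hypothetical stand-ins for illustration, not Couchbase SDK classes, and the SDK's actual implementation may well differ:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SyncWrapperSketch {

    // Hypothetical async client: returns immediately, the value arrives later.
    static class AsyncStore {
        private final ExecutorService io = Executors.newFixedThreadPool(2);

        CompletableFuture<String> get(String key) {
            return CompletableFuture.supplyAsync(() -> "value-for-" + key, io);
        }

        void shutdown() {
            io.shutdown();
        }
    }

    // Hypothetical blocking facade: every call delegates to the async client
    // and simply waits for the result on the caller thread.
    static class SyncStore {
        private final AsyncStore async;

        SyncStore(AsyncStore async) {
            this.async = async;
        }

        String get(String key) {
            return async.get(key).join();
        }

        // Same idea as bucket.async(): hand back the wrapped async client.
        AsyncStore async() {
            return async;
        }
    }

    public static void main(String[] args) {
        AsyncStore asyncStore = new AsyncStore();
        SyncStore syncStore = new SyncStore(asyncStore);

        System.out.println(syncStore.get("a"));                // value-for-a
        System.out.println(syncStore.async().get("b").join()); // value-for-b

        asyncStore.shutdown();
    }
}
```

Under this assumption both entry points end up on the same async code path, which is why no throughput difference would be expected from the choice alone.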

The two threads you found there are internal request and response handler threads, nothing that needs your concern directly as an app developer (unless you want to learn the internals). There is no overhead/gain by choosing one or the other because of the reasoning written above.

Does that answer your question?

dhawalschumi · September 23, 2015, 9:45am

Hello,

Not quite sure, because my cb-core threads are totally idle while the cb-io threads are always live. So will moving to AsyncCluster help or not? Also, the cb-core threads are used only by AsyncCluster.

Thanks,
Dhawal Patel

daschl · September 23, 2015, 11:16am

@dhawalschumi it is fine if they are idle; that means you are not putting much traffic over them. The cb-io threads are alive because they use NIO underneath, which has a different model of “sleeping” on channels.

It is not quite true that the cb-core threads are used only by AsyncCluster: they are used by the underlying core-io module, which happens to talk to the AsyncCluster. Again, the CouchbaseCluster is just a synchronous wrapper around the async one.

What problem are you trying to solve?

dhawalschumi · September 23, 2015, 1:58pm

Thanks,

I am performing a load test on my application and found that the cb-core threads are running idle, so I went into the details of the SDK. I also found that the computations thread pool is idle (mainly because we don't actually run N1QL/queries over it).

So I basically want to utilise those threads. What would be a better approach for this? Should I observe on the computations pool for the internal computations I do before and after fetching the data, and let the fetch work on the io pool?

Thanks,
Dhawal Patel

daschl · September 23, 2015, 4:27pm

Looks like there is a different bottleneck somewhere in your system. Can you share your load test code?

dhawalschumi · September 28, 2015, 9:06am

Hi,

Below is the load test code.

LegacyDocument legacyDocument = couchManager.getDataBucket(requestType)
    .flatMap(new Func1<AsyncBucket, Observable<? extends LegacyDocument>>() {
        @Override
        public Observable<? extends LegacyDocument> call(AsyncBucket bucket) {
            return bucket.get(key, LegacyDocument.class);
        }
    })
    .timeout(couchbaseGetTimeOut, TimeUnit.MILLISECONDS)
    .onErrorReturn(new Func1<Throwable, LegacyDocument>() {
        @Override
        public LegacyDocument call(Throwable t1) {
            LoggerUtil.logErrorMessage(logger,
                "TimeOut Occured while SingleGet from Couchbase as LegacyDocument");
            return null;
        }
    })
    .map(new Func1<LegacyDocument, LegacyDocument>() {
        @Override
        public LegacyDocument call(LegacyDocument doc) {
            return doc;
        }
    })
    .toBlocking()
    .single();

dhawalschumi · September 28, 2015, 9:13am

Above is the code that we use to do gets from Couchbase. It gets executed on the cb-io thread pool, and my cb-core pool is always unused/idle. So what needs to be done here? I want to utilize (add more traffic to) those threads as well.

daschl · September 28, 2015, 9:27am

Okay, so first, the code above (the callbacks) will be executed on the cb-computations pool. The cb-io threads will handle the actual IO load but will pass it over to the cb-computations threads once done, and this is where the callbacks are executed. Keep in mind that it's not about utilizing specific threads as much as possible. The cb-core threads are not doing much work, and if they are close to idle it's fine. It's just an indication that this codepath is not doing the bulk of the work.

Does getDataBucket always return the same Bucket instance, i.e. do you share it across your application?
What purpose does the last map serve?

Btw, in your case here there is absolutely no benefit over executing it with a blocking request right away (since you are loading the doc and then blocking on it). If your app code really only does this, there is no need to tap into the async world.

Since you are basically blocking on each call, how many threads (your app threads) are you firing this stuff into?
If you want to achieve better batching, you can think about using Observable.from(): pass in a list of IDs and get back the list of documents. This is one of the easiest ways to get better performance.
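
A rough sketch of the batching idea, written with plain CompletableFuture rather than RxJava so that it runs without the SDK. The getAsync method is a hypothetical stand-in for an async bucket get. The point is only that all requests are issued before anything blocks, instead of blocking after each single get:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

class BatchGetSketch {

    // Hypothetical async lookup standing in for an async bucket get.
    static CompletableFuture<String> getAsync(String key) {
        return CompletableFuture.supplyAsync(() -> "value-for-" + key);
    }

    // Puts every request in flight first, then waits for the results.
    static Map<String, String> getAll(List<String> keys) {
        Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
        for (String key : keys) {
            pending.put(key, getAsync(key)); // does not block
        }

        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, CompletableFuture<String>> entry : pending.entrySet()) {
            result.put(entry.getKey(), entry.getValue().join()); // blocks only here
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> docs = getAll(Arrays.asList("a", "b", "c"));
        System.out.println(docs); // {a=value-for-a, b=value-for-b, c=value-for-c}
    }
}
```

With the SDK itself, the equivalent would presumably be built from Observable.from() over the key list combined with the async get, collecting the results before a single blocking call.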

dhawalschumi · September 28, 2015, 10:20am

Hi,

getDataBucket always returns same bucket instance.
The app fires around 200 threads under peak load. Average 120-140 threads
We get list of keys to fetch data. And we return a map of key value. So the return map is the map of request keys and values against the keys.

Now i think we are getting to the bottom of the issue.

Observations -

In my current system (Sync Cluster), the RxComputationThreadPool and cb-computations thread pools are unused, and both are in an idle/parked state.

Regarding your previous comment, I am not able to see any kind of work done by the computation pool.

Thanks,
Dhawal

daschl · September 28, 2015, 10:44am

Can you either put breakpoints or print inside the callback threads where they are executed (Thread.currentThread().getName())?
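
For illustration, a small self-contained sketch of that check using only java.util.concurrent. The pool name "demo-computation-1" is made up for the example and is not an SDK thread name. The callback reports the name of the thread it runs on, which can then be compared with the caller thread:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CallbackThreadSketch {

    // Runs a callback on a single-thread pool with a recognizable thread name
    // and returns the name of the thread that executed the callback.
    static String callbackThreadName(String poolThreadName) {
        ExecutorService pool =
            Executors.newSingleThreadExecutor(r -> new Thread(r, poolThreadName));
        try {
            return CompletableFuture
                .supplyAsync(() -> "doc", pool)
                // The callback: record where it is actually executed.
                .thenApplyAsync(doc -> Thread.currentThread().getName(), pool)
                .join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("caller:   " + Thread.currentThread().getName());
        System.out.println("callback: " + callbackThreadName("demo-computation-1"));
    }
}
```

In the load test code, the same Thread.currentThread().getName() call can be placed inside each Func1 callback to see which pool actually runs it.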
Also, are you configuring the CouchbaseEnvironment differently than the default settings somehow?

dhawalschumi · September 28, 2015, 11:29am

Hi,

So I think the Sync Cluster is not using the computations pool.

Couchbase Environment Config -

couchbaseEnvironment = DefaultCouchbaseEnvironment.builder()
    .kvEndpoints(kvServiceEndpoints)             // 4 endpoints
    .bufferPoolingEnabled(false)
    .ioPoolSize(ioPoolSize)                      // 4 threads
    .computationPoolSize(cmoputationPoolSize)    // 4 threads
    .retryStrategy(FailFastRetryStrategy.INSTANCE)
    .keepAliveInterval(keepAliveInterval * 1000)
    .build();

Observations after printing out the thread names -

AsyncCluster is doing computations on the cb-computations pool.
SyncCluster is doing computations on the caller thread.

Below is the code that we used for the Sync Cluster

LegacyDocument legacyDocument = couchManager.getDataBucket(requestType).async()
    .get(key, LegacyDocument.class)
    .timeout(couchbaseGetTimeOut, TimeUnit.MILLISECONDS)
    .onErrorReturn(new Func1<Throwable, LegacyDocument>() {
        @Override
        public LegacyDocument call(Throwable t1) {
            LoggerUtil.logErrorMessage(logger, "TimeOut Occured while SingleGet from Couchbase as LegacyDocument");
            return null;
        }
    })
    .map(new Func1<LegacyDocument, LegacyDocument>() {
        @Override
        public LegacyDocument call(LegacyDocument doc) {
            return doc;
        }
    })
    .toBlocking()
    .single();
if (legacyDocument != null) {
    response = (String) legacyDocument.content();
}

The initial finding is that the .async() wrapper is not using the computations pool, whereas RxJava computes the callbacks on the defined computations pool (cb-computations in our case).

Thanks,
Dhawal Patel