TimeoutException on bucket.get()

For a simple bucket.get(), we are getting TimeoutExceptions.

java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:75) ~[java-client-2.2.5.jar:na]
at com.couchbase.client.java.CouchbaseBucket.get(CouchbaseBucket.java:129) ~[java-client-2.2.5.jar:na]
at com.couchbase.client.java.CouchbaseBucket.get(CouchbaseBucket.java:124) ~[java-client-2.2.5.jar:na]

We increased the kvTimeout to 10 seconds:
CouchbaseEnvironment couchbaseEnvironment = DefaultCouchbaseEnvironment
    .builder()
    .autoreleaseAfter(couchbaseAutoReleaseAfter) // milliseconds
    .kvTimeout(TIMEOUT_10_SEC)                   // kvTimeout is in milliseconds, so 10000 here
    .build();
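For reference, the failing call is just a plain blocking get; the SDK also offers a get overload that takes an explicit per-call timeout (a sketch, the key name is illustrative):

// Plain blocking get: bounded by the environment's kvTimeout (10 s here).
JsonDocument doc = bucket.get("some-key");

// Same call with an explicit per-operation timeout (java-client 2.x overload).
JsonDocument doc2 = bucket.get("some-key", 10, TimeUnit.SECONDS);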

We are still getting Timeouts.

What can you say about the underlying environment? Are the app servers and the cluster in the same LAN? Are any of them VMs? In particular, are any on cloud instances like EC2 micro instances?

Another recommendation is that you can increase the log level if you’d like to see details as things progress. That may help you identify the issue.

The timeouts are happening in the production environment.
Where should the log level be increased?

Can you reproduce the problem in isolation and capture a detailed SDK log of it?
See http://developer.couchbase.com/documentation/server/4.5/sdk/java/collecting-information-and-logging.html

You'd have to configure the logging level to be the most detailed, TRACE in Log4j for instance.

Since it is the production environment, we don't want to increase the log level. Thinking off on a tangent, we are using roughly this chain:

bucket.async()
    .upsert(...)
    .flatMap(...)
    .toList()
    .toBlocking()
    .single();

Does toBlocking() cause the timeout?
I tried removing the toBlocking() and used a forEach() instead (the documentation says forEach is blocking). But I see that the performance is higher with forEach (tested on a local environment). Can I substitute forEach for toBlocking? Am I missing something?

forEach in this case is non-blocking and asynchronous, meaning that the rest of your code will continue executing immediately…
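To illustrate with a generic RxJava sketch (not your code; the only imports needed are rx.Observable and java.util.concurrent.TimeUnit):

// forEach is just subscribe: it returns immediately and the items print later on another thread.
Observable.interval(100, TimeUnit.MILLISECONDS)
        .take(3)
        .forEach(i -> System.out.println("async: " + i));
System.out.println("reached before the async items are printed");

// toBlocking().forEach(...) really does block until the source completes.
Observable.interval(100, TimeUnit.MILLISECONDS)
        .take(3)
        .toBlocking()
        .forEach(i -> System.out.println("blocking: " + i));
System.out.println("reached only after the three blocking items");

Thread.sleep(500); // give the first, asynchronous subscription time to finish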
What is the flatMap used for?

Roughly:

Observable.from(keyMap)
    .flatMap(...)   // bucket.async().upsert(...)
    .flatMap(...)   // update via a N1QL query
    .flatMap(...)   // bucket.async().upsert(...)
    .groupBy(...)
    .forEach(...);

When I removed toBlocking(), I saw that the records were not upserted in Couchbase (I checked in Couchbase about 2 minutes after the code ran). That was expected, because it was async. When I added the groupBy and forEach, I could see the records were inserted immediately. I thought it was because forEach was blocking.
So is using forEach better than toBlocking? Does forEach have any other catches?

The groupBy seems to group all the records, so does that mean all my keys from the map have been processed by Couchbase by the time the forEach is done?

I need a result set of whether each upsert was a success or a failure, so I can store it and use it later to save the result in Couchbase.

I don’t think groupBy is used correctly / doing what you think it’s doing here. It’ll create an Observable of GroupedObservables according to a criterion that produces a “key” for each group.
You then need to subscribe to all of these GroupedObservables, on which you can call getKey() to see which group they represent.
The caveat is that GroupedObservable can cache values, so you need to subscribe to all of them even if you’re not interested in a particular group.

However, they will emit values as they come, won’t wait for all values to be emitted before emitting something.

This short example demonstrates the use of groupBy (which, again, I don’t think you need):

Observable.interval(0, 1, TimeUnit.SECONDS)
        .take(10)
        .groupBy(i -> i % 2)
        .flatMap(group -> group.getKey() == 0L ? group : group.ignoreElements())
        .subscribe(System.out::println,
            e -> e.printStackTrace(), () -> System.out.println("DONE"));

Thread.sleep(10000);

This produces two groups: even numbers and odd numbers. It then subscribes to the two groups (using flatMap) but ignores the “odd” group’s elements. Prints out:

0 //immediately
2 //after 2 seconds (1 is emitted after 1sec but ignored in the flatMap)
4
6
8 //after 8 seconds...
// 9 is emitted after 9s, ignored, and then all the streams complete
DONE

:warning: Note that forEach is just an alias for the more commonly used subscribe!
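If what you actually need is to block until every upsert has been attempted and to know which ones failed, something along these lines is probably closer than groupBy (a sketch only: documents stands for a List<JsonDocument> built from your keyMap, and error handling is simplified):

List<String> failedIds = Observable
        .from(documents)
        .flatMap(doc -> bucket.async()
                .upsert(doc)
                .map(upserted -> "")              // success: emit an empty marker
                .onErrorReturn(e -> doc.id()))    // failure: emit the document id instead
        .filter(id -> !id.isEmpty())
        .toList()
        .toBlocking()                             // wait until the whole batch has been attempted
        .single();

toBlocking().single() only returns once every inner Observable has completed, so at that point each upsert has either succeeded or had its id recorded in failedIds.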

We are trying to reproduce the problem and capture the logs in the lower environments.

We use logback. Is setting TRACE level for the class that sets up the DefaultCouchbaseEnvironment good enough?

Do you also want us to add runtimeMetricsCollectorConfig, and also .defaultMetricsLoggingConsumer(false, CouchbaseLogLevel.TRACE)?

Yes, TRACE level is good. Using .defaultMetricsLoggingConsumer(true, CouchbaseLogLevel.TRACE) on the environment could also dump useful information.
Hopefully you can reproduce your original issue in 1 quick run. Don’t forget to attach the code you used for testing :wink:
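For reference, a minimal environment sketch with that consumer enabled (the SDK's loggers all live under the com.couchbase.client package, so that is the logger to raise to TRACE in your logback configuration):

// imports (roughly): com.couchbase.client.core.logging.CouchbaseLogLevel,
// com.couchbase.client.java.env.CouchbaseEnvironment, com.couchbase.client.java.env.DefaultCouchbaseEnvironment
CouchbaseEnvironment env = DefaultCouchbaseEnvironment
        .builder()
        .kvTimeout(10000)                                             // ms, as you already set
        .defaultMetricsLoggingConsumer(true, CouchbaseLogLevel.TRACE) // log runtime + latency metrics
        .build();
// runtimeMetricsCollectorConfig / networkLatencyMetricsCollectorConfig can additionally
// tune how often those metrics are emitted (hourly by default).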

1) We have the network metrics and runtime metrics printed every hour. In one of the clusters, we see "UpsertRequest" entries for only 3 of the 4 Couchbase Server nodes. In the other 2 clusters, we see "UpsertRequest" entries for all 4 nodes. Is that something to be looked at?

2) We see this log often. Is it something to be looked at?
08:59:40.664 [cb-core-3-2] DEBUG c.c.c.c.c.refresher.CarrierRefresher - Could not fetch config from bucket "bucket-name" against "server-name"/162.140.107.13.
com.couchbase.client.core.RequestCancelledException: Could not dispatch request, cancelling instead of retrying.
at com.couchbase.client.core.retry.RetryHelper.retryOrCancel(RetryHelper.java:51) [core-io-1.2.5.jar:na]
at com.couchbase.client.core.RequestHandler.onEvent(RequestHandler.java:206) [core-io-1.2.5.jar:na]
at com.couchbase.client.core.RequestHandler.onEvent(RequestHandler.java:76) [core-io-1.2.5.jar:na]
at com.couchbase.client.deps.com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128) [core-io-1.2.5.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1157) [na:1.7.0]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627) [na:1.7.0]
at com.couchbase.client.deps.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) [core-io-1.2.5.jar:na]
at java.lang.Thread.run(Thread.java:801) [na:1.7.0]
08:59:40.664 [putations-50] DEBUG c.c.c.c.config.ConfigurationProvider - New Bucket "bucket-name" config proposed.
08:59:40.665 [putations-50] DEBUG c.c.c.c.c.refresher.CarrierRefresher - Completed refreshing config for bucket "bucket-name"

08:59:40.664 [cb-core-3-2] DEBUG c.c.c.c.c.refresher.CarrierRefresher - Could not fetch config from bucket "bucket-name" against "server-name"/162.140.107.13. com.couchbase.client.core.RequestCancelledException: Could not dispatch request, cancelling instead of retrying.

I think you should look deeper into that particular server you don’t see requests for in your 4 node setup. I bet it is the same one as the server mentioned in the exception above, and that there is something wrong with it.

Have you had a look at that particular server’s logs?

In the metrics I do see requests going to that server/node, though. Where can I post the logs?

What exact log are we looking for on that server?

Could you post an example of the metrics dump here? (If you want, edit out the IPs/hostnames by replacing each with its own fake hostname.)

I’m not sure which logs exactly would be best to look at… maybe @ingenthr or @daschl have an idea?

hostname3/555.555.555.13:11210":{“BINARY”:{“TouchRequest”:{“SUCCESS”:{“metrics”:{“min”:362,“max”:364,“count”:1,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:364,“90.0”:364,“95.0”:364,“99.0”:364,“99.9”:364}}}},“KeepAliveRequest”:{“SUCCESS”:{“metrics”:{“min”:268,“max”:827,“count”:61,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:440,“90.0”:548,“95.0”:573,“99.0”:577,“99.9”:827}}}},“GetBucketConfigRequest”:{“SUCCESS”:{“metrics”:{“min”:344,“max”:2932,“count”:119,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:561,“90.0”:675,“95.0”:720,“99.0”:1531,“99.9”:2932}}}}}},“hostname1/555.555.555.11:11210”:{“BINARY”:{“TouchRequest”:{“SUCCESS”:{“metrics”:{“min”:466,“max”:468,“count”:1,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:468,“90.0”:468,“95.0”:468,“99.0”:468,“99.9”:468}}}},“KeepAliveRequest”:{“SUCCESS”:{“metrics”:{“min”:278,“max”:1179,“count”:60,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:438,“90.0”:536,“95.0”:1028,“99.0”:1146,“99.9”:1179}}}},“GetBucketConfigRequest”:{“SUCCESS”:{“metrics”:{“min”:380,“max”:3588,“count”:116,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:569,“90.0”:679,“95.0”:733,“99.0”:3162,“99.9”:3588}}}}}},“event”:{“name”:“NetworkLatencyMetrics”,“type”:“METRIC”},“hostname3/555.555.555.13:8093”:{“QUERY”:{“KeepAliveRequest”:{“SUCCESS”:{“metrics”:{“min”:378,“max”:2113,“count”:119,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:557,“90.0”:675,“95.0”:790,“99.0”:1466,“99.9”:2113}}}}}},“hostname3/555.555.555.13:8092”:{“VIEW”:{“KeepAliveRequest”:{“SUCCESS”:{“metrics”:{“min”:724,“max”:1785,“count”:120,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:1003,“90.0”:1155,“95.0”:1245,“99.0”:1564,“99.9”:1785}}}}}},“hostname1/555.555.555.11:8092”:{“VIEW”:{“KeepAliveRequest”:{“SUCCESS”:{“metrics”:{“min”:765,“max”:1654,“count”:120,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:987,“90.0”:1163,“95.0”:1179,“99.0”:1253,“99.9”:1654}}}}}},“hostname2/555.555.555.12:11210”:{“BINARY”:{“KeepAliveRequest”:{“SUCCESS”:{“metrics”:{“min”:230,“max”:1318,“count”:119,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:399,“90.0”:481,“95.0”:507,“99.0”:643,“99.9”:1318}}}},“ObserveRequest”:{“SUCCESS”:{“metrics”:{“min”:305,“max”:307,“count”:1,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:307,“90.0”:307,“95.0”:307,“99.0”:307,“99.9”:307}}}},“UpsertRequest”:{“SUCCESS”:{“metrics”:{“min”:483,“max”:933,“count”:2,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:485,“90.0”:933,“95.0”:933,“99.0”:933,“99.9”:933}}}},“GetRequest”:{“SUCCESS”:{“metrics”:{“min”:415,“max”:417,“count”:1,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:417,“90.0”:417,“95.0”:417,“99.0”:417,“99.9”:417}}}}}},“hostname0/555.555.555.10:8092”:{“VIEW”:{“KeepAliveRequest”:{“SUCCESS”:{“metrics”:{“min”:679,“max”:1818,“count”:120,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:962,“90.0”:1073,“95.0”:1114,“99.0”:1802,“99.9”:1818}}}}}},“hostname2/555.555.555.12:8093”:{“QUERY”:{“KeepAliveRequest”:{“SUCCESS”:{“metrics”:{“min”:372,“max”:2162,“count”:119,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:548,“90.0”:626,“95.0”:696,“99.0”:888,“99.9”:2162}}}}}},“hostname0/555.555.555.10:11210”:{“BINARY”:{“ReplaceRequest”:{“SUCCESS”:{“metrics”:{“min”:311,“max”:380,“count”:2,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:313,“90.0”:380,“95.0”:380,“99.0”:380,“99.9”:380}}}},“KeepAliveRequest”:{“SUCCESS”:{“metrics”:{“min”:203,“max”:1368,“count”:60,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:411,“90.0”:485,“95.0”:532,“99.0”:634,“99.9”:1368}}}},“ObserveRequest”:{“SUCCESS”:{“metrics”:{“min”:282,“max”:284,“count”:1,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:284,“90.0”:28
4,“95.0”:284,“99.0”:284,“99.9”:284}}}},“UpsertRequest”:{“SUCCESS”:{“metrics”:{“min”:389,“max”:557,“count”:3,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:518,“90.0”:557,“95.0”:557,“99.0”:557,“99.9”:557}}}},“GetRequest”:{“SUCCESS”:{“metrics”:{“min”:236,“max”:514,“count”:6,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:309,“90.0”:444,“95.0”:514,“99.0”:514,“99.9”:514}}}},“GetBucketConfigRequest”:{“SUCCESS”:{“metrics”:{“min”:323,“max”:5406,“count”:125,“timeUnit”:“MICROSECONDS”,“percentiles”:{“50.0”:532,“90.0”:663,“95.0”:720,“99.0”:3620,“99.9”:5406}}}}}}}

We have this kind of log printed every hour.