KV upsert throwing TimeoutException [Couchbase 4.5]

java
query

#1
  1. Java SDK version: 2.3.1
    Couchbase Server version: 4.5.0
    Startup parameters: -Xms50G -Xmx50G

  1. DefaultCouchbaseEnvironment

    CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
        .kvTimeout(15000)
        .keepAliveInterval(30000)
        .maxRequestLifetime(50000)
        .socketConnectTimeout(3000)
        .retryStrategy(BestEffortRetryStrategy.INSTANCE)
        .requestBufferSize(256 * 1024)
        .responseBufferSize(256 * 1024)
        .kvEndpoints(5)
        .connectTimeout(10000)
        .build();


  1. Exception details

    2016-08-04 14:15:25.838 [pool-4-thread-48] ERROR MyCouchbaseConnection:210 - java.util.concurrent.TimeoutException key:CITY_RESULT_CNTF110000_201608041415
    java.lang.RuntimeException: java.util.concurrent.TimeoutException
    at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:71) ~[java-client-2.3.1.jar:?]
    at com.couchbase.client.java.CouchbaseBucket.upsert(CouchbaseBucket.java:354) ~[java-client-2.3.1.jar:?]
    at com.couchbase.client.java.CouchbaseBucket.upsert(CouchbaseBucket.java:349) ~[java-client-2.3.1.jar:?]
    at cn.com.cennavi.mfs.couchbase.MyCouchbaseConnection.set(MyCouchbaseConnection.java:208) [MAFS.jar:?]
    at cn.com.cennavi.mfs.core.Encoder.CNTFEncoder.execute(CNTFEncoder.java:962) [MAFS.jar:?]
    at cn.com.cennavi.mfs.core.listener.EncodeListener.run(EncodeListener.java:76) [MAFS.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]


The data size for this key is 3379632 bytes (about 3.2 MB).


THANKS


#2

Can you reproduce this with a single upsert? If so, I think it could help to get the SDK logs, at TRACE level. This will include a dump of all packets sent/received, which could indicate if the server responded, and where time was spent.
See the SDK documentation on logging.
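For SDK 2.x, TRACE corresponds to FINEST when the client falls back to JDK logging (i.e. no SLF4J binding on the classpath). A minimal sketch of raising those loggers under that assumption (`TraceLogging` is a hypothetical helper name):

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch, assuming no SLF4J binding is on the classpath: the 2.x SDK then
// falls back to JDK logging, whose loggers live under the
// "com.couchbase.client" namespace. Raising that logger (and the root
// console handler) to FINEST surfaces the TRACE-level packet dumps.
public class TraceLogging {

    static void enableTrace() {
        Logger logger = Logger.getLogger("com.couchbase.client");
        logger.setLevel(Level.FINEST);
        // the default ConsoleHandler sits on the root logger and also
        // filters by level, so it must be raised as well
        for (Handler handler : Logger.getLogger("").getHandlers()) {
            if (handler instanceof ConsoleHandler) {
                handler.setLevel(Level.FINEST);
            }
        }
    }
}
```

Call `TraceLogging.enableTrace()` once at startup, before connecting the cluster.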


#3

Thank you for your reply.
I have solved the problem by changing the DefaultCouchbaseEnvironment (kvEndpoints and observeIntervalDelay):
.kvTimeout(15000)
.keepAliveInterval(30000)
.maxRequestLifetime(50000)
.socketConnectTimeout(3000)
.observeIntervalDelay(Delay.exponential(TimeUnit.MILLISECONDS, 1000, 300,100))
.retryStrategy(BestEffortRetryStrategy.INSTANCE)
.requestBufferSize(163848)
.responseBufferSize(163848)
.kvEndpoints(60)
.connectTimeout(10000)
.build();

thanks


#4

Ah, were you using ReplicateTo and/or PersistTo (other than NONE) when doing the upsert?


#5

Sorry, this is the first time I've used the Java SDK. I just use it like this (I think it is NONE):

public int set(String strKey, int iTimeOut, Serializable obj) {
	int iRet = 1;
	SerializableDocument tem = null;
	try {
		SerializableDocument s = SerializableDocument.create(strKey, iTimeOut, obj);
		tem = this.cbBucket.upsert(s);
	} catch (Exception e) {
		log.error(e.getMessage()+" key:"+strKey, e);
		iRet = 0;
	}
	return iRet;
}

right?


#6

yeah, that code defaults to PersistTo.NONE and ReplicateTo.NONE. So I’m puzzled as to why the change to observeIntervalDelay would change anything, as that delay isn’t used…

Also, 60 kv endpoints seem a little high. It means 60 TCP connections, where 1 should usually be able to handle several thousand upserts / second easily.

I’m still interested in the TRACE log.

Also since this is a Serializable object you’re dealing with, I wonder if the time it takes to serialize it could grow very large. Could you try measuring / profiling the time it takes for the object to be serialized and deserialized?

The TranscodingUtils class has serialize and deserialize methods that you could measure in a small unit test for example, with an instance of one of your large objects that timed out.
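As a stand-in while measuring, plain JDK serialization can be timed directly; this sketch (hypothetical class and method names, not the SDK's internal code path, though SerializableDocument is, as far as I know, based on standard Java serialization) gives a rough idea of where the time goes:

```java
import java.io.*;

// Sketch: times a plain JDK serialization round-trip. Feed timeRoundTrip()
// one of your real 2-3 MB objects instead of a stand-in payload.
public class SerializationTiming {

    static byte[] serialize(Serializable obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(obj);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static void timeRoundTrip(Serializable payload) {
        long t0 = System.nanoTime();
        byte[] serialized = serialize(payload);
        long t1 = System.nanoTime();
        deserialize(serialized);
        long t2 = System.nanoTime();
        System.out.printf("serialize: %d ms, deserialize: %d ms, %d bytes%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, serialized.length);
    }
}
```

For example, `SerializationTiming.timeRoundTrip(yourLargeObject)` with an instance that timed out.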


#7

Ah, thanks for your explanation of observeIntervalDelay.

I set 60 kv endpoints because we have 50-70 threads; within 10 seconds each thread consumes 2-3 MB of data from the Couchbase cluster and inserts 2-3 MB of data into it.

In public int set(String strKey, int iTimeOut, Serializable obj), obj is a byte[] with a data size of 2-3 MB.

At first obj was a Map, but its size was bigger than 20 MB, so I changed obj to a String with zip compression.
byte[] serialization and deserialization are very fast, less than 100 ms.
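The zip-compression step described above could be sketched with GZIP from the JDK (`ZipCodec` is a hypothetical helper name, not code from this thread):

```java
import java.io.*;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch of compressing a large value before storing it, so the document
// body shrinks from ~20 MB to a few MB. The compressed byte[] can then be
// wrapped in a SerializableDocument (or a ByteArrayDocument) as usual.
public class ZipCodec {

    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static byte[] decompress(byte[] zipped) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(zipped))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```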

thanks again.


#8

Getting these too, all the time. Since they are just INFO-level log entries (with a stack trace, sadly), I just capture them and have a simple retry algorithm in place. It's all pretty random.
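A simple retry wrapper of the kind described might look like this (hypothetical names; the one assumption taken from the thread is that the SDK's blocking API surfaces the timeout as a RuntimeException wrapping a TimeoutException, as the stack trace in #1 shows):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Sketch of a retry loop around a blocking SDK call, e.g.
// withRetries(() -> bucket.upsert(doc), 3). Only timeouts are retried;
// other failures propagate immediately.
public class UpsertRetry {

    static <T> T withRetries(Supplier<T> op, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                if (!(e.getCause() instanceof TimeoutException)) {
                    throw e; // only retry timeouts
                }
                last = e;
                try {
                    // simple linear backoff between attempts
                    TimeUnit.MILLISECONDS.sleep(100L * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
        throw last; // all attempts timed out
    }
}
```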