RequestCancelledInFlight exceptions periodically on view queries


#1

I have some code using the Couchbase Java client (version 2.0.2) that makes a view query every 5 minutes against a cluster of 3 nodes, so 12 queries per hour. Of those 12, I’d say about a quarter fail with the following error:

com.couchbase.client.core.RequestCancelledException: Request cancelled in-flight.
at com.couchbase.client.core.endpoint.AbstractGenericHandler.handleOutstandingOperations(AbstractGenericHandler.java:241)
at com.couchbase.client.core.endpoint.AbstractGenericHandler.handlerRemoved(AbstractGenericHandler.java:223)
at com.couchbase.client.core.endpoint.view.ViewHandler.handlerRemoved(ViewHandler.java:456)
at com.couchbase.client.deps.io.netty.channel.DefaultChannelPipeline.callHandlerRemoved0(DefaultChannelPipeline.java:526)
at com.couchbase.client.deps.io.netty.channel.DefaultChannelPipeline.callHandlerRemoved(DefaultChannelPipeline.java:520)
at com.couchbase.client.deps.io.netty.channel.DefaultChannelPipeline.remove0(DefaultChannelPipeline.java:350)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.teardown0(AbstractChannelHandlerContext.java:104)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.teardown(AbstractChannelHandlerContext.java:89)
at com.couchbase.client.deps.io.netty.channel.DefaultChannelPipeline.teardownAll(DefaultChannelPipeline.java:753)
at com.couchbase.client.deps.io.netty.channel.DefaultChannelPipeline.fireChannelUnregistered(DefaultChannelPipeline.java:742)
at com.couchbase.client.deps.io.netty.channel.AbstractChannel$AbstractUnsafe$6.run(AbstractChannel.java:606)
at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
at com.couchbase.client.deps.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)

None of my nodes are currently down, nor were any of them down when these exceptions happened. I do use a timeout of 10 seconds on the Observable from the view query, but I don’t think that’s the issue. Any thoughts on what’s causing this?


#2

I’m quite sure this goes away with 2.0.3. Can you try it? We now retry those explicitly for views, but I wonder why they happen so often for you… Can you run it for a while with DEBUG enabled and post the logs? Maybe we can draw some conclusions from them.


#3

@daschl, I plan on upgrading to 2.0.3 and deploying a patch to prod tomorrow. I’ll also try to enable debug logging to see if that provides any additional clarity in case the upgrade does not fix the issue. I’ll report back after we deploy tomorrow and let you know how things look.
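
For the debug logging mentioned above, here is a minimal sketch of one way to turn it on. It assumes the application has no SLF4J or Log4j binding on the classpath, so the client falls back to java.util.logging, and that the logger name matches the `com.couchbase.client` package prefix seen in the stack trace. The class and method names are made up for illustration.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class CouchbaseDebugLogging {

    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a level set on an unreferenced logger can be lost after GC.
    private static final Logger SDK_LOGGER = Logger.getLogger("com.couchbase.client");

    /** Lowers the level for the (assumed) SDK logger namespace to FINE, the JUL counterpart of DEBUG. */
    static void enableDebug() {
        SDK_LOGGER.setLevel(Level.FINE);
        // The root console handler defaults to INFO and would otherwise drop FINE records.
        for (Handler handler : Logger.getLogger("").getHandlers()) {
            if (handler instanceof ConsoleHandler) {
                handler.setLevel(Level.FINE);
            }
        }
    }

    public static void main(String[] args) {
        enableDebug();
        // Create the cluster and bucket after this point so that
        // connection setup is logged as well.
    }
}
```

If the application already routes logging through SLF4J or Log4j, the same effect would come from raising that framework's level for the same namespace instead.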


#4

Sounds great. If you’re curious here is the change in 2.0.3: https://github.com/couchbase/couchbase-java-client/blob/2.0.3/src/main/java/com/couchbase/client/java/view/ViewRetryHandler.java#L78
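
For anyone stuck on an older version, the general idea of retrying a cancelled request can be sketched in plain Java. This is not the code from the linked handler, just an illustration: `CancelledException` is a hypothetical stand-in for the client's `RequestCancelledException`, and the attempt count and delay are arbitrary.

```java
import java.util.function.Supplier;

public class CancelledRetrySketch {

    /** Hypothetical stand-in for the client's RequestCancelledException. */
    static class CancelledException extends RuntimeException {
        CancelledException(String message) {
            super(message);
        }
    }

    /**
     * Runs the task and retries only when it fails with CancelledException.
     * Any other exception propagates immediately. After maxAttempts
     * cancellations the last one is rethrown.
     */
    static <T> T retryOnCancel(Supplier<T> task, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        CancelledException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (CancelledException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMillis);
                    } catch (InterruptedException interrupted) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        throw last;
                    }
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated query: cancelled twice, then succeeds.
        String result = retryOnCancel(() -> {
            if (++calls[0] < 3) {
                throw new CancelledException("Request cancelled in-flight.");
            }
            return "rows";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In real code the supplier would wrap the blocking view query. Retrying is only safe here because a view query is a read, so repeating it has no side effects.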


#5

@daschl, after deploying the patch to production, it appears this issue has gone away. I’m going to keep an eye on it for a little longer just to make sure it doesn’t crop up again. Thanks for the help with this.