Possible bug: ArithmeticException in Java Core Client

java.lang.ArithmeticException: / by zero
	at com.couchbase.client.core.service.strategy.RoundRobinSelectionStrategy.select(RoundRobinSelectionStrategy.java:38)
	at com.couchbase.client.core.service.PooledService.send(PooledService.java:282)
	at com.couchbase.client.core.service.ManagerService.send(ManagerService.java:27)
	at com.couchbase.client.core.node.Node.send(Node.java:383)
	at com.couchbase.client.core.node.RoundRobinLocator.dispatchUntargeted(RoundRobinLocator.java:207)
	at com.couchbase.client.core.node.RoundRobinLocator.dispatch(RoundRobinLocator.java:124)
	at com.couchbase.client.core.Core.send(Core.java:339)
	at com.couchbase.client.core.Core.send(Core.java:312)
	at com.couchbase.client.scala.manager.ManagerUtil$.$anonfun$sendRequest$1(ManagerUtil.scala:43)
	at reactor.core.scala.publisher.SMono$.$anonfun$defer$1(SMono.scala:1491)
	at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:44)
	at reactor.core.publisher.Mono.subscribe(Mono.java:4400)
	at reactor.core.publisher.Mono.subscribeWith(Mono.java:4515)
	at reactor.core.publisher.Mono.toFuture(Mono.java:4920)
	at reactor.core.scala.publisher.SMono.toFuture(SMono.scala:1376)
	at reactor.core.scala.publisher.SMono.toFuture$(SMono.scala:1374)
	at reactor.core.scala.publisher.ReactiveSMono.toFuture(ReactiveSMono.scala:8)
	at com.couchbase.client.scala.manager.bucket.AsyncBucketManager.create(AsyncBucketManager.scala:34)
	at com.couchbase.client.scala.manager.bucket.BucketManager.create(BucketManager.scala:35)

Related code:

Couchbase.Cache.get()
      .scalaCluster
      .buckets
      .create(CreateBucketSettings(
        name = randomizedBucketName,
        ramQuotaMB = 1024,
        flushEnabled = Some(false),
        numReplicas = Some(0),
        replicaIndexes = Some(false),
        bucketType = Some(BucketType.Couchbase),
        ejectionMethod = Some(EjectionMethod.ValueOnly),
        maxTTL = None,
        compressionMode = None,
        conflictResolutionType = None,
        minimumDurabilityLevel = Some(Durability.Disabled)
      )).get

Library versions: [image]

JDK: [image]

Scala version 2.13.8.

Code running against an installation of Couchbase EE 7.0.3 on Kubernetes.

The issue is not reproducible on subsequent runs; it only happens “sometimes”. Even if the trigger is a transient network error of some kind, I suspect this is still a bug.

Hi @zoltan.zvara
That looks to be happening because there’s no endpoint (node connection) to send the request to at the point of RoundRobinSelectionStrategy.select(). On a quick inspection of the code it’s unclear how this can happen, since there is protection against it at higher layers: this method should not even be called if there are no endpoints. We’ll need to investigate further to know for sure. But it’s certainly a bug to see an ArithmeticException. I’ve raised JVMCBC-1073 for it, and thanks for bringing this to our attention.
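To illustrate the failure mode described above, here is a minimal sketch of how a round-robin selection over an empty endpoint list produces "/ by zero", together with one possible guard. This is not the SDK's actual source: the class RoundRobinSketch and its methods selectUnguarded and selectGuarded are hypothetical names made up for this example.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is NOT the SDK's RoundRobinSelectionStrategy source.
final class RoundRobinSketch {
    private final AtomicInteger counter = new AtomicInteger();

    // Unguarded selection: an empty list makes the modulo divide by zero.
    <T> T selectUnguarded(List<T> endpoints) {
        int next = counter.getAndIncrement() & Integer.MAX_VALUE; // keep the counter non-negative
        return endpoints.get(next % endpoints.size());            // ArithmeticException when size() == 0
    }

    // Guarded selection: work on a snapshot so the size cannot change between check and use.
    <T> Optional<T> selectGuarded(List<T> endpoints) {
        List<T> snapshot = List.copyOf(endpoints);
        if (snapshot.isEmpty()) {
            return Optional.empty(); // caller decides whether to retry or fail the request
        }
        int next = counter.getAndIncrement() & Integer.MAX_VALUE;
        return Optional.of(snapshot.get(next % snapshot.size()));
    }
}
```

The snapshot matters if endpoints can disappear concurrently: a check on the live list at a higher layer can pass, and the list can still be empty by the time the modulo runs.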

As for workarounds, calling cluster().waitUntilReady(Duration.ofSeconds(30)) first, to ensure the endpoint connections are definitely established, might help. But I doubt it, since I think the issue is actually endpoints disappearing. As you say, I suspect the underlying cause is a transient network error (though of course we should handle such things gracefully). If you turn up logging, do you see (perhaps occasionally) lines about connection states changing from CONNECTED to DISCONNECTED?

@graham.pople Yes, we do sometimes see CONNECTED/DISCONNECTED state changes. This issue happens in a testing environment where CI/CD runs about 1300 tests in parallel against the same database. The database could be momentarily overloaded and is most probably dropping one or two connections.

We wait until the cluster is ready before moving on to accessing collections.
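Since the failure is transient and does not recur on subsequent runs, one stopgap for a test suite could be to retry the failing call a few times with a short backoff. Nobody in this thread proposed this; it is an assumption on my part, and it hides the symptom rather than fixing the bug. The class TransientRetry and its retry method are hypothetical names, and the sketch deliberately retries only on ArithmeticException.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper; not part of any Couchbase SDK.
final class TransientRetry {
    // Runs op, retrying on ArithmeticException up to maxAttempts times,
    // doubling the delay after each failed attempt.
    static <T> T retry(Supplier<T> op, int maxAttempts, Duration initialDelay) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelay.toMillis();
        for (int attempt = 1; ; attempt++) {
            try {
                return op.get();
            } catch (ArithmeticException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the original failure
                }
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw e;
                }
                delayMs *= 2;
            }
        }
    }
}
```

The bucket creation call would be wrapped in the Supplier passed to retry. Any other exception type propagates immediately, so genuine errors such as invalid bucket settings are not retried.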

FYI, we have just experienced the same issue when running a load test on a Java app that reads from Couchbase.
Java version: 11
Couchbase client version: 3.2.6
Couchbase server version: Community Edition 6.5.1

It’s worth checking the server logs, especially memcached.log, for clues as to why the connection is being dropped, since that looks the most likely root cause (though as above, the SDK should be able to handle that gracefully). You could also try the latest server version, 7.1.0.