Cbes-consul groups error

icon-kj · April 22, 2022, 11:46am

I get the following error when running “cbes-consul groups” in k8s from the connector container.
cbes runs fine.

2022-04-22 11:36:26,384 main DEBUG LoggerContext[name=8bcc55f, org.apache.logging.log4j.core.LoggerContext@119f1f2a] started OK.
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while running command (com.couchbase.connector.elasticsearch.cli.GroupsCommand@3d51f06e): com.orbitz.consul.ConsulException: Error connecting to Consul
at picocli.CommandLine.executeUserObject(CommandLine.java:1948)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:2559)
at picocli.CommandLine.parseWithHandler(CommandLine.java:2494)
at com.couchbase.connector.elasticsearch.cli.ConsulCli.main(ConsulCli.java:110)
Caused by: com.orbitz.consul.ConsulException: Error connecting to Consul
at com.orbitz.consul.AgentClient.ping(AgentClient.java:67)
at com.orbitz.consul.Consul$Builder.build(Consul.java:685)
at com.couchbase.connector.cluster.consul.ConsulContext.<init>(ConsulContext.java:44)
at com.couchbase.connector.elasticsearch.cli.GroupsCommand.run(ConsulCli.java:302)
at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
… 8 more
Caused by: com.orbitz.consul.ConsulException: Error pinging Consul: Service Unavailable
at com.orbitz.consul.AgentClient.ping(AgentClient.java:63)
… 12 more

cbes runs fine, and its connections to Elasticsearch and Couchbase are fine.
Any tips on how the connector in k8s should communicate with Consul in the same k8s cluster?
Versions:
ES = 7.13
CB = 6.6
Connector = 4.3.5

david.nault · April 22, 2022, 6:09pm

Hi icon-kj,

That’s interesting… I can’t think of a reason why cbes-consul groups would fail while other cbes-consul commands would succeed. I would look into whether the Consul Agent running on the node is healthy, and whether the Consul cluster itself is healthy. “Service Unavailable” looks like it came from an HTTP response, which would indicate a Consul server/agent is running, but unable to service requests.
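If it helps, here is a rough Python sketch of that health check, run from somewhere that can reach the agent (for example the connector pod). It queries two standard Consul HTTP API endpoints: `/v1/agent/self`, which I believe is what the client's ping uses, and `/v1/status/leader`. The address passed to `check_consul` and the helper names are placeholders of mine, not anything the connector provides, and treating an empty leader string as "no leader" is my assumption.

```python
import json
import urllib.error
import urllib.request


def http_get(url, timeout=5):
    """Return (status_code, body_text). HTTP error responses are returned, not raised.

    Connection-level failures (urllib.error.URLError) still propagate to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as e:
        return e.code, e.read().decode("utf-8", errors="replace")


def describe_leader(status, body):
    """Turn a /v1/status/leader response into a short human-readable summary."""
    if status != 200:
        return f"status endpoint returned HTTP {status}: {body.strip()}"
    leader = json.loads(body)  # the body is a JSON string such as "10.0.0.1:8300"
    if not leader:
        return "no elected leader"  # assumption: empty string means no leader
    return f"leader is {leader}"


def check_consul(consul_addr):
    """Print the agent and leader status for the agent at consul_addr."""
    status, _ = http_get(f"{consul_addr}/v1/agent/self")
    print("agent/self -> HTTP", status)
    status, body = http_get(f"{consul_addr}/v1/status/leader")
    print("status/leader ->", describe_leader(status, body))
```

Calling `check_consul("http://localhost:8500")` (substitute whatever address the connector is configured to use) should show whether the agent answers at all and whether the cluster reports a leader.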

As a workaround, you can see the groups by browsing the Consul Key/Value service and looking for keys that start with “couchbase/cbes/” and end with “/config”.
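If you would rather script it than browse, something like the following Python sketch should work. It uses Consul's standard KV endpoint with the `?keys` parameter to list key names under the prefix, then keeps the ones ending in “/config”. The function names and the Consul address are placeholders of mine, and I am assuming that whatever sits between the prefix and the suffix is the group name.

```python
import json
import urllib.request

KEY_PREFIX = "couchbase/cbes/"
KEY_SUFFIX = "/config"


def group_names(keys):
    """Extract names from keys shaped like couchbase/cbes/<name>/config.

    Assumption: the text between the prefix and the suffix is the group name.
    """
    names = []
    for key in keys:
        if key.startswith(KEY_PREFIX) and key.endswith(KEY_SUFFIX):
            name = key[len(KEY_PREFIX):-len(KEY_SUFFIX)]
            if name:  # skip a key where prefix and suffix overlap
                names.append(name)
    return sorted(names)


def list_groups(consul_addr):
    """Fetch all key names under the prefix and return the group names.

    Consul answers 404 when no keys match, which urlopen raises as HTTPError.
    """
    url = f"{consul_addr}/v1/kv/{KEY_PREFIX}?keys"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return group_names(json.load(resp))
```

For example, `list_groups("http://localhost:8500")`, with the address replaced by your agent's, returns the sorted list of names found.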

Thanks,
David

icon-kj · April 24, 2022, 9:41pm

I got it to run with cbes-consul.

Now I am getting a Consul health.service error. It looks like it cannot find the leader.

21:40:22.782 [OkHttp http://r1-consul-server:8500/…] ERROR c.o.c.c.ConsulCache - Error getting response from consul for health.service “dev3-es7”, will retry in 10000 MILLISECONDS
com.orbitz.consul.ConsulException: Consul cluster has no elected leader
at com.orbitz.consul.cache.ConsulCache$1.onComplete(ConsulCache.java:159) [consul-client-1.3.3.jar:?]
at com.orbitz.consul.util.Http$1.onResponse(Http.java:79) [consul-client-1.3.3.jar:?]
at retrofit2.OkHttpCall$1.onResponse(OkHttpCall.java:129) [retrofit-2.5.0.jar:?]
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203) [okhttp-3.12.12.jar:?]
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) [okhttp-3.12.12.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
21:40:28.178 [nioEventLoopGroup-2-2] INFO c.c.c.d.CheckpointService - Getting current seqnos took 2.822 ms

Any ideas on this one?

david.nault · April 25, 2022, 4:48am

Hi icon-kj. This GitHub issue suggests disabling streaming as a workaround: “com.orbitz.consul.ConsulException: Consul cluster has no elected leader” (Issue #457, rickfast/consul-client).

Thanks,
David

icon-kj · May 3, 2022, 11:01pm

Thanks. I tried this, but it does not seem to take effect on Consul when applied via Helm.
I will raise a ticket with Consul.

Do you know why, when I have more than one pod, the non-leader cbes-consul connector pods reboot continuously? Is this by design?