Cbimport fails to connect with alternate address

Hello,

I am trying to insert data into a Couchbase server running in Kubernetes. I have set up alternate addresses on my CB server and can access the cluster from outside the Kubernetes network using the web interface and the Java SDK.

However, when I try to use the cbimport tool inside a Docker container on my local machine, I get connection errors such as:

Trying to connect with k8s-ip:mgmt

failed to connect to any of the specified hosts -- plan.(*data).execute() at data.go:72

Trying to connect with k8s-ip:kv

WARN: Fail to create Couchbase sink, could not get Couchbase Server version -- couchbase.CreateCouchbaseSink() at sink.go:38
Json import failed: EOF

The mgmt and kv ports are what I get from the /pools/default/nodeServices endpoint. And I can connect to the same CB using Java SDK. So I am sure that the Kubernetes CB is set up correctly. The same cbimport command works against a bare-metal CB instance.
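For anyone checking the same thing, the external ports can be pulled out of the nodeServices response with a few lines of Python. This is only a sketch: the payload below is a made-up example, its shape (`nodesExt`, `alternateAddresses.external`) is an assumption about what a 6.5 cluster returns, and `external_ports` is a hypothetical helper name.

```python
import json


def external_ports(node_services):
    """Return (hostname, ports) for every node that advertises an
    external alternate address in a nodeServices-style payload."""
    result = []
    for node in node_services.get("nodesExt", []):
        external = node.get("alternateAddresses", {}).get("external")
        if external is None:
            continue  # this node has no external alternate address
        result.append((external.get("hostname"), external.get("ports", {})))
    return result


# Hypothetical payload; the IP and port numbers are placeholders.
sample = json.loads("""
{
  "rev": 42,
  "nodesExt": [
    {
      "services": {"mgmt": 8091, "kv": 11210},
      "thisNode": true,
      "alternateAddresses": {
        "external": {
          "hostname": "203.0.113.10",
          "ports": {"mgmt": 30091, "kv": 31210}
        }
      }
    }
  ]
}
""")

for hostname, ports in external_ports(sample):
    print(hostname, ports.get("mgmt"), ports.get("kv"))
```

If the list comes back empty, the alternate address was never registered on that node, which is worth ruling out before blaming the client.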

Does cbimport support the alternate addresses introduced in 6.5?

cbimport doesn’t support alternate addressing. In fact, IP-based alternate addressing with NodePorts is actively discouraged because it is insecure and not highly available.

If you can have your cbimport instance use a DNS resolver that can forward DNS requests for your cluster to Kubernetes’ DNS server, then it will work. See https://docs.couchbase.com/operator/2.0/tutorial-remote-dns.html for a tutorial on how you could configure this.

Thanks for your response. This is part of our automation testing pipeline. We install our API + Couchbase to Kubernetes for a branch-based isolated testing environment. We also wanted to seed the Couchbase with some data for the automation testing to run with. cbimport was the first choice.

However, since it does not work with alternate addresses, we wrote a Python script that runs in the pipeline and inserts data into the CB node using the Python SDK and IP-based alternate addressing.
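A minimal sketch of what such a seeding script can look like. The names `connection_string`, `seed`, and `RecordingBucket` are hypothetical, and the `network=external` connection string option is an assumption about how the libcouchbase-based SDK is told to use alternate addresses. The only SDK call relied on is `bucket.upsert(key, value)`; the in-memory stand-in lets the logic run without a cluster.

```python
def connection_string(host, kv_port):
    """Build a couchbase:// connection string for an externally exposed KV port.

    network=external is assumed to select the alternate addresses.
    """
    return "couchbase://{}:{}?network=external".format(host, kv_port)


def seed(bucket, docs, key_field="id"):
    """Upsert each document under the value of its key_field.

    Returns the number of documents written.
    """
    count = 0
    for doc in docs:
        bucket.upsert(str(doc[key_field]), doc)
        count += 1
    return count


class RecordingBucket:
    """In-memory stand-in for an SDK bucket, for dry runs only."""

    def __init__(self):
        self.docs = {}

    def upsert(self, key, value):
        self.docs[key] = value


# Dry run with placeholder data. With the real SDK, the bucket would be
# opened from connection_string(...) instead of using RecordingBucket.
bucket = RecordingBucket()
written = seed(bucket, [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
print(written, sorted(bucket.docs))
```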

This setup is mostly trivial with other databases because NodePort-to-port mapping does not break their clients. However, CB was really hard to set up. Will you keep supporting IP-based alternate addressing? Or should we start looking for an alternative way to connect our local machines/GitLab CI to the Couchbase running in K8s?

I expect other databases don’t do client-side sharding, and therefore suffer performance penalties. That’s the key challenge here, and has been since I started.

Have you considered a more cloud native approach to seeding? You can run a cURL command in an init container within a Couchbase pod, then in the main container just run cbimport with the downloaded data. This then just becomes a Kubernetes Job that you can poll for completion.
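As a rough sketch of that idea, the Job can be generated as a Python dict and piped to `kubectl apply -f -` as JSON. Everything specific here is an assumption for illustration: the `seed_job` helper, the `curlimages/curl` image for the init container, a Secret with `username` and `password` keys, the `/seed` mount path, and a dataset in list format with an `id` field used as the key.

```python
import json


def seed_job(name, data_url, cluster, bucket, secret,
             image="couchbase:enterprise-6.5.1"):
    """Build a Job manifest: an init container downloads the dataset into a
    shared emptyDir, then the main container runs cbimport against it."""
    mount = {"name": "seed", "mountPath": "/seed"}

    def from_secret(env_name, key):
        # Read credentials from a Secret instead of hard-coding them.
        return {"name": env_name,
                "valueFrom": {"secretKeyRef": {"name": secret, "key": key}}}

    pod_spec = {
        "restartPolicy": "Never",
        "volumes": [{"name": "seed", "emptyDir": {}}],
        "initContainers": [{
            "name": "fetch",
            "image": "curlimages/curl",
            "command": ["curl", "-fsSL", "-o", "/seed/data.json", data_url],
            "volumeMounts": [mount],
        }],
        "containers": [{
            "name": "import",
            "image": image,
            "env": [from_secret("CB_USERNAME", "username"),
                    from_secret("CB_PASSWORD", "password")],
            # $(VAR) references are expanded by Kubernetes from env.
            "command": ["cbimport", "json",
                        "-c", cluster,
                        "-u", "$(CB_USERNAME)", "-p", "$(CB_PASSWORD)",
                        "-b", bucket,
                        "-d", "file:///seed/data.json",
                        "-f", "list", "-g", "%id%"],
            "volumeMounts": [mount],
        }],
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {"backoffLimit": 2, "template": {"spec": pod_spec}},
    }


# Placeholder values; the URL, service name, and Secret name are made up.
job = seed_job("seed-couchbase", "https://example.com/seed.json",
               "couchbase://couchbase", "default", "cb-auth")
print(json.dumps(job, indent=2))
```

Because the Job runs inside the cluster, cbimport talks to the in-cluster service name and no alternate addresses are involved. Completion can then be polled with something like `kubectl wait --for=condition=complete job/seed-couchbase`.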

Hello @YigitcanUCUM_TY ,

Simon's recommendation to run everything inside a container is the best way forward.

That said, I wanted to understand why cbimport does not work in your environment. cbimport should work with alternate addresses. To help me debug this further, could you explain how the environment is set up, including the network, where cbimport was executed, which version of cbimport you used, and the full command executed?

Kind regards,
Patrick

Thanks to both of you for the answers. I am open to suggestions on how we can make our setup better. This is what our situation looks like so far:

  • We have an automation test project inside Gitlab. This is where the data we want to insert to CB resides.
  • Gitlab CI pipeline is triggered manually or when there is a change to the API we are testing. There are 2 jobs inside this pipeline regarding Couchbase:
    1-) A job that uses kubectl to deploy a custom couchbase:enterprise-6.5.1 image to our Kubernetes cluster. It exposes every CB port as a NodePort through a Service, and then calls /node/controller/setupAlternateAddresses/external with the external Kubernetes IP and NodePorts.
    2-) A job that uses the couchbase:enterprise-6.5.1 image and tries to run cbimport against the external Kubernetes IP and the NodePorts of the mgmt/kv services.

These are the calls:
(Job #1) CI Runner --(set up alternate address with REST API)--> Kubernetes External IP + NodePort of 8091 (works)
(Job #2) CI Runner --(cbimport)--> Kubernetes External IP + NodePort of 8091,11210 (fails)
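For context, the Job #1 call can be sketched in Python roughly like this. It only builds the request and does not send it. The helper name `alternate_address_request`, the credentials, the IP, and the port numbers are placeholders, and the form field names (`hostname`, `mgmt`, `kv`) are an assumption based on the service names that nodeServices reports.

```python
import base64
import urllib.parse
import urllib.request


def alternate_address_request(mgmt_url, user, password, hostname, ports):
    """Build (but do not send) the PUT that registers an external
    alternate address and its port mapping on one node."""
    fields = {"hostname": hostname}
    fields.update({name: str(port) for name, port in ports.items()})
    body = urllib.parse.urlencode(fields).encode("ascii")

    url = mgmt_url.rstrip("/") + "/node/controller/setupAlternateAddresses/external"
    req = urllib.request.Request(url, data=body, method="PUT")

    # HTTP basic auth with the cluster admin credentials.
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    return req


req = alternate_address_request(
    "http://203.0.113.10:30091", "Administrator", "password",
    hostname="203.0.113.10", ports={"mgmt": 30091, "kv": 31210})
print(req.get_method(), req.full_url)
print(req.data.decode())
# To actually send it: urllib.request.urlopen(req)
```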

CI Runners here are inside another Kubernetes cluster. The second job fails because it cannot connect to the CB cluster. The errors we get are included in my first message. I checked the cbimport version inside the enterprise-6.5.1 image and it outputs: “cbimport version 6.5.1-6299 (unknown)”

I tried different combinations while trying to make cbimport work e.g.:

cbimport json --cluster "kube-ip:node-port-8091"
cbimport json --cluster "http://kube-ip:node-port-8091"
cbimport json --cluster "couchbase://kube-ip:node-port-8091"
cbimport json --cluster "kube-ip:node-port-11210"

None of these combinations worked. Our current fix is to use the Python SDK (couchbase==2.5.10) instead of cbimport. The same setup works once cbimport is replaced with the Python SDK. Job #2 becomes:

(Job #2) CI Runner --(Python SDK)–> Kubernetes External IP + NodePort 11210

As for using a cloud native approach, we have multiple projects like this. Each has its own data set, which our QA engineers manage by hand. This data lives alongside the test code they write and may be versioned. So we would have to make these data sets available to the Kubernetes Job/Pods.

Since the CI pipeline already has the latest data, it seemed easier to just send the data with a CB tool.