Running the Kafka Connector in distributed mode


#1

I have the Couchbase Kafka connector working in standalone mode on a 3-node Kafka cluster. I am trying to understand how to make the connector resilient to the loss of a node. If I close the ssh session in which I started the standalone connector, my consumer no longer receives messages.

I assume I need to start the connector in distributed mode, but I am unclear as to the best way to accomplish that and how to keep the connector running if a node of the cluster goes down. Any guidance would be appreciated.


#2

Can you share the scripts that you are using to start the Kafka Connect cluster and the couchbase-connector?


#3

This is what I am running, which is pretty much straight from the example on the Couchbase Kafka connector page.

./connect-standalone.sh /opt/kafka/config/connect-standalone.properties /etc/kafka-connect-couchbase/quickstart-couchbase.properties

I have modified the quickstart-couchbase.properties to connect to the Couchbase cluster and that is working ok.

It seems like what I need to do is run connect-distributed.sh (i.e. connect-distributed /etc/kafka/connect-distributed.properties) and then somehow start the Couchbase connector. I am not sure whether you need to use the REST interface to do that and, if you do, I don’t know the command to post to get it started.

I can set up a startup script to run connect-distributed.sh so that it starts when a node starts, but I am also unclear as to whether anything needs to be done to keep the Couchbase connector up. I guess that as long as some of the nodes of the cluster are up and running the distributed connector service, the Couchbase connector should be fine, and when a node is restarted it will simply rejoin and share the work automatically.
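One way to check this guess in practice is the status endpoint of the Kafka Connect REST interface (`GET /connectors/<name>/status`), which reports the state of the connector and of each task along with the worker it currently runs on. The following is only a minimal sketch: the base URL and the connector name `test-couchbase` are placeholders, and the helper names are made up for illustration.

```python
import json
import urllib.request


def fetch_status(base_url, name):
    """GET /connectors/<name>/status from any worker in the Connect cluster."""
    with urllib.request.urlopen(f"{base_url}/connectors/{name}/status") as resp:
        return json.load(resp)


def summarize(status):
    """Return (connector_state, {task_id: (task_state, worker_id)})."""
    connector_state = status["connector"]["state"]
    tasks = {
        t["id"]: (t["state"], t["worker_id"])
        for t in status.get("tasks", [])
    }
    return connector_state, tasks


# Hypothetical usage (placeholder address, requires a running worker):
# print(summarize(fetch_status("http://xx.xx.xx.xx:8083", "test-couchbase")))
```

Running this before and after stopping one worker should show whether the tasks were reassigned to the remaining workers.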


#4

Yes, you have to run the connect-distributed.sh script on every node you want to be part of the Kafka Connect cluster. After that, you need to use the REST interface (very unfriendly, by the way :frowning:) to start the connector in the Kafka Connect cluster. Fortunately, there is a CLI tool (https://github.com/datamountaineer/kafka-connect-tools) that helps us interact with the REST interface.
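For the workers on different nodes to form one Connect cluster, each worker's properties file has to point at the same Kafka brokers and use the same `group.id` and the same three internal storage topics. As a rough sketch of the settings involved (the topic names and group id below are just commonly used example values, the broker addresses are placeholders, and `render_worker_properties` is a made-up helper), the file could be generated like this:

```python
def render_worker_properties(bootstrap_servers, group_id="connect-cluster"):
    """Render a minimal distributed-worker properties file as text.

    All values are illustrative; adjust them to your own cluster.
    """
    props = {
        "bootstrap.servers": ",".join(bootstrap_servers),
        # Workers sharing the same group.id join the same Connect cluster.
        "group.id": group_id,
        # Internal topics where connector configs, offsets and status are kept.
        "config.storage.topic": "connect-configs",
        "offset.storage.topic": "connect-offsets",
        "status.storage.topic": "connect-status",
        "key.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


# Placeholder broker addresses for a 3-node cluster.
print(render_worker_properties(["kafka1:9092", "kafka2:9092", "kafka3:9092"]))
```

The same file can then be used on every node when starting connect-distributed.sh.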


#5

Thanks for the reply and the tip on the kafka-connect-tools. For others who may be wondering, here is what you need to post to the REST interface to get the connector running.

[image: the POST request referred to above; not preserved]
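The image itself is not reproduced as text here. As a general sketch, and not a reconstruction of that image: the Kafka Connect REST interface creates a connector via `POST /connectors` with a JSON body of the form `{"name": ..., "config": {...}}`, so a standalone properties file that already works can be converted into that body. The parser below is deliberately simplistic (no line continuations or escapes), and the function names and base URL are assumptions for illustration.

```python
import json
import urllib.request


def properties_to_request(text):
    """Turn standalone connector properties into a POST /connectors body."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    # The connector name comes from the "name" property of the file.
    return {"name": config["name"], "config": config}


def create_connector(base_url, body):
    """POST the body to <base_url>/connectors and return the parsed reply."""
    request = urllib.request.Request(
        f"{base_url}/connectors",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Hypothetical usage (placeholder address, requires a running worker):
# with open("/etc/kafka-connect-couchbase/quickstart-couchbase.properties") as f:
#     create_connector("http://xx.xx.xx.xx:8083", properties_to_request(f.read()))
```

The request can be sent to any one worker; the connector is then registered for the whole Connect cluster.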

Also, one thing I noticed: if you do a GET and pull back the connector's config (xx.xx.xx.xx:8083/connectors/test-couchbase/config), it returns the password of the password-protected bucket you may have specified in the connector.
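If the returned config is going to be logged or shown anywhere, one client-side precaution is to mask values whose keys look sensitive before displaying them. This does nothing about the REST endpoint itself, which still returns the password; it is only a sketch, and the key-name hints and the example key `connection.password` are assumptions.

```python
SENSITIVE_HINTS = ("password", "secret")


def redact(config):
    """Return a copy of a connector config with sensitive-looking values masked."""
    return {
        key: "****" if any(hint in key.lower() for hint in SENSITIVE_HINTS) else value
        for key, value in config.items()
    }


# Example with made-up values.
print(redact({"name": "test-couchbase", "connection.password": "hunter2"}))
```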


#6

@cbarrett Regarding the exposure of the bucket password, this is a known issue with the Kafka Connect REST server. There’s some discussion here; might be worth weighing in.


#7

As a workaround for the password exposure issue in the Kafka Connect framework, version 3.3.0 of Kafka Connect Couchbase supports specifying the password using an environment variable: KAFKA_COUCHBASE_PASSWORD

If set, this environment variable takes precedence over the password in the connector config.
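The precedence rule can be pictured roughly as follows. This is an illustration of the described behavior, not the connector's actual code; the config key `connection.password` and the function name are assumptions.

```python
import os

ENV_VAR = "KAFKA_COUCHBASE_PASSWORD"


def effective_password(connector_config, environ=os.environ):
    """Pick the password the way the rule above describes."""
    env_value = environ.get(ENV_VAR)
    if env_value is not None:
        # The environment variable wins whenever it is set.
        return env_value
    # Otherwise fall back to the value in the connector config, if any.
    return connector_config.get("connection.password")
```

Note that the variable would have to be set in the environment of every worker process that may run the connector's tasks.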