Couchbase kafka-connector environments

Hi,

We have been implementing a solution using the couchbase-kafka-connector, and have been eagerly awaiting the next release after 1.0.0. I have just upgraded to 1.1.0, and found that the static factory methods for creating the CouchbaseKafkaConnector class have changed significantly, which is causing us some problems.

In version 1.0.0 the main factory method for creating a connector was:

CouchbaseKafkaConnector.create(DefaultCouchbaseKafkaEnvironment environment, String couchbaseHost, String bucketName, String bucketPass, String zookeeperHost, String topic)

This allowed you to create multiple connectors that shared a single Couchbase environment.
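
For illustration, this is the pattern we have been relying on (hostnames, bucket names and topics are placeholders, and the no-argument environment factory is my assumption of the 1.0.0 API):

```java
// One shared environment for every connector (1.0.0-style API).
// DefaultCouchbaseKafkaEnvironment.create() is assumed here; check the 1.0.0 javadoc.
DefaultCouchbaseKafkaEnvironment env = DefaultCouchbaseKafkaEnvironment.create();

// Several connectors, one per bucket/topic pair, all sharing `env`.
CouchbaseKafkaConnector users = CouchbaseKafkaConnector.create(
        env, "couchbase.example.com", "users", "secret", "zookeeper.example.com", "users-topic");
CouchbaseKafkaConnector orders = CouchbaseKafkaConnector.create(
        env, "couchbase.example.com", "orders", "secret", "zookeeper.example.com", "orders-topic");
```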

In version 1.1.0 this method has been removed outright (while another construction method was merely marked as deprecated, this one got no deprecation cycle at all), and the only remaining way to create a connector is:

CouchbaseKafkaConnector.create(CouchbaseKafkaEnvironment environment)

and all of the bucket credentials, host, etc. have been moved inside this environment. This means that if you want to use the connector for multiple buckets and/or topics (bearing in mind the one-to-one mapping of bucket to topic within a connector) you are required to create multiple environments. Yet throughout the Couchbase documentation, creating multiple environments is discouraged and said to cause issues downstream. I have a situation where I could be initiating a very large number of connectors (on the order of tens to hundreds), so creating that many environments seems unlikely to be a good idea.
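
For comparison, creating one connector in 1.1.0 looks roughly like this (the builder method names are taken from the connector documentation as I understand it and may not match exactly; hosts and buckets are placeholders):

```java
// 1.1.0: each connector needs its own fully configured environment.
CouchbaseKafkaEnvironment env = DefaultCouchbaseKafkaEnvironment.builder()
        .couchbaseNodes("couchbase.example.com")
        .couchbaseBucket("users")
        .kafkaZookeeperAddress("zookeeper.example.com")
        .kafkaTopic("users-topic")
        .dcpEnabled(true)
        .build();

CouchbaseKafkaConnector connector = CouchbaseKafkaConnector.create(env);
// ...and this whole block repeats for every bucket/topic pair,
// creating tens to hundreds of environments in my case.
```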

A further problem is that I would like to reuse the same Couchbase environment that I use for my Couchbase drivers, so it would be good if the DefaultCouchbaseKafkaEnvironment could be constructed from an instance of a CoreEnvironment.

Could someone from Couchbase please advise me on:
a) The reason for this architecture change
b) The reasons for advising only a single environment, and what issues I may face from having multiple, particularly with regard to compute resources; e.g. is my use case now invalid/impossible with this architecture?
c) The best way for me to proceed

Many thanks,
Tom

Hi Tom - I’ve asked the developer to respond. It may take a bit of time for him to get to this.

The reason was to simplify client initialization: the assumption was that the connector will be used in a separate process, or even in sharded mode, so it is unlikely to listen to multiple buckets.

It is still possible, and in version 1.2.0 I will add a corresponding constructor to override the Couchbase and Kafka credentials. But to really share resources with the Java environment (which, like the Kafka environment object, inherits from the core environment), you have to manually transfer the pools and schedulers. We do not have an easy way to clone only the core part between environments.
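
To illustrate what "manually transfer pools and schedulers" amounts to, here is a generic, library-free sketch (not the Couchbase API; all class names here are invented for illustration) of two environment-like components accepting one externally owned scheduler instead of each allocating their own:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SharedSchedulerSketch {

    // Stand-in for an environment object that would otherwise
    // allocate its own thread pool internally.
    static class Env {
        final ScheduledExecutorService scheduler;
        Env(ScheduledExecutorService scheduler) { this.scheduler = scheduler; }
    }

    public static void main(String[] args) throws Exception {
        // One pool owned by the application, handed to every environment.
        ScheduledExecutorService shared = Executors.newScheduledThreadPool(4);

        Env envA = new Env(shared);
        Env envB = new Env(shared);

        // Both environments submit work to the same threads.
        envA.scheduler.submit(() -> System.out.println("task from envA"));
        envB.scheduler.submit(() -> System.out.println("task from envB"));

        shared.shutdown();
        shared.awaitTermination(5, TimeUnit.SECONDS);

        // The application, not the environments, owns the pool's lifecycle.
        System.out.println("same pool: " + (envA.scheduler == envB.scheduler));
    }
}
```

The design choice this sketches is inversion of ownership: the caller creates and shuts down the pool, and every environment merely borrows it, which is what sharing resources between the Java SDK environment and the Kafka environment would require today.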