Multiple Clusters and Cluster Connection

DChee · November 6, 2020, 4:10am

I'm seeking advice on whether multiple clusters are supported in Community Edition via XDCR with bidirectional replication, and whether there are options to connect to and work with multiple clusters via the .NET SDK.

Is it possible to set up a load-balanced, multi-cluster hostname record and use it in the Cluster connection string, instead of listing the seed node IP addresses or hostnames of every cluster in the bidirectional replication setup?

Example:
Instead of:
var cluster = await Cluster.ConnectAsync("seednodeAcluster1.example.com,seednodeBcluster2.example.com", "username", "password");
is it possible to use:
var cluster = await Cluster.ConnectAsync("loadbalanced-mclusterhostname.example.com", "username", "password");
where
loadbalanced-mclusterhostname.example.com is network load-balanced and will resolve to either
seednodeAcluster1.example.com
or
seednodeBcluster2.example.com

Appreciate any advice or information.

Thank you.

AV25242 · November 6, 2020, 8:39pm

@DChee As far as I can tell, on the SDK connectivity side you will have to manage separate Cluster objects, one for each cluster.

jmorris · November 6, 2020, 8:48pm

I'm not sure about a load balancer specifically, but you can use DNS SRV records and connect using them; see "Managing Connections" in the Couchbase Docs.

-Jeff
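For readers unfamiliar with DNS SRV bootstrapping, here is a rough sketch of what such records might look like for a single cluster. The hostnames, TTL, and zone name are hypothetical, and the exact requirements should be checked against the Couchbase documentation. Note that the records should list nodes from one cluster only, not nodes from several replicated clusters.

```
; Hypothetical zone fragment: seed nodes of ONE cluster (cluster1)
_couchbase._tcp.cluster1.example.com.  3600  IN  SRV  0  0  11210  node1.cluster1.example.com.
_couchbase._tcp.cluster1.example.com.  3600  IN  SRV  0  0  11210  node2.cluster1.example.com.
```

With records like these in place, the connection string would name only the SRV domain, for example couchbase://cluster1.example.com, with a single hostname and no port, so that the SDK can attempt the SRV lookup.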

btburnett3 · November 6, 2020, 8:59pm

@DChee

Based on your description, I think you’re looking for an automated failover between two separate clusters. This is not something currently supported directly by the .NET SDK. You must allocate separate Cluster objects for each cluster you are connecting to, and any failover handling would need to be your own internal logic.

Note: You should not put nodes from two different clusters in the same cluster connection string, even if they are replicated. The SDK would end up connected to one cluster or the other at random. This would then tend to cause data consistency issues, as replication between the clusters isn't instantaneous.

DChee · November 16, 2020, 12:48am

Thanks for the advice. I was wondering:
Suppose the internal failover logic uses a load-balancer hostname in front of two or more clusters with bidirectional replication, and that hostname always returns the connection details of only one cluster for use in the connection string. Would that work, and would it avoid the data inconsistency issues caused by non-instantaneous replication?

btburnett3 · November 16, 2020, 5:03pm

No, I don’t think that would be sufficient. The idea of a multi-cluster failover is actually very complicated, and far beyond the scope of my knowledge. I just know that it isn’t nearly as simple as that.

In particular, for most use cases, you shouldn't have different instances of your application talking to different clusters that are replicated. All instances should point to the same cluster, or replication delays will cause data inconsistencies. Couchbase treats the mutation of an individual document as atomic, and applications generally expect this behavior. But as soon as you are connecting to multiple clusters, document mutations are no longer atomic from the application's point of view. One app instance could write the document to one cluster, and another could read it from the other cluster and get the old value. Any failover strategy needs to be based on all apps connecting to a primary cluster and falling back to the secondary cluster on failure, rather than randomly picking one or the other.
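The "primary first, secondary only on failure" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not a complete failover design: PrimaryFirst and ExecuteAsync are hypothetical names, not part of the Couchbase SDK, and the sketch deliberately leaves out the hard parts (deciding when to fail back, coordinating all app instances, resolving conflicting writes).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, NOT part of the Couchbase SDK.
public static class PrimaryFirst
{
    // Runs the attempts in priority order (primary first). Moves on to the
    // next attempt only when the caller-supplied predicate classifies the
    // exception as "cluster unavailable"; any other exception propagates.
    public static async Task<T> ExecuteAsync<T>(
        IReadOnlyList<Func<Task<T>>> attemptsInPriorityOrder,
        Func<Exception, bool> isClusterUnavailable)
    {
        var failures = new List<Exception>();
        foreach (var attempt in attemptsInPriorityOrder)
        {
            try
            {
                return await attempt();
            }
            catch (Exception ex) when (isClusterUnavailable(ex))
            {
                // Remember the failure and fall through to the next cluster.
                failures.Add(ex);
            }
        }
        throw new AggregateException("All clusters were unavailable.", failures);
    }
}
```

In an application, each attempt would wrap an operation against a separate Cluster object (one per cluster, as described above), and the predicate would encode whatever the application treats as a cluster-level outage, for example a timeout. An ordinary error such as a missing document should not trigger the fallback, which is why the predicate is required.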

Additionally, the way Couchbase bootstraps would prevent the load balancer approach from working. The SDK would pick one cluster or the other on application startup. There would be no failover if that cluster later went down, unless you restarted your application.