No single point of failure?

md1 · January 14, 2015, 1:02am

I was testing XDCR in 3.0.2 and had a 2-node cluster with 2-way XDCR to another 2-node cluster.

I think this may be a single point of failure, and I'd like to open it up for discussion.

2 nodes each: (Cluster #1) A/B <-> (Cluster #2) A/B

Now when I create the XDCR link, the data does catch up during the copy, but is XDCR limited to a single IP address for the destination?

This means a single node in cluster #1 is sending to a single node in cluster #2 and vice versa.

So what happens when any node on either side of XDCR goes down? Now we have lost XDCR completely. Does the backup plan require a third cluster, so that there are ultimately 2 paths to each cluster: the direct path, and the indirect path through the non-failed link and the intermediate cluster?

Or is there a way to set up XDCR so that any node in cluster #1 will XDCR to any node in cluster #2?

If not, we have a single point of failure in a 2-cluster XDCR setup.

Your comments appreciated,

cihangirb · January 14, 2015, 1:53am

Hi @md1,
Couchbase and all SDKs can take a single IP address and discover all the other nodes within a cluster. So in XDCR, when you enter a single IP, we use it to discover the cluster topology, but we don’t use it as a funnel for replication. If a node fails, auto-failover typically kicks in and fails over the node, and XDCR picks up intelligently from where it left off with the new cluster topology.
Hope that clears up the confusion.
thanks
-cihan
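
To make the reply above concrete, here is a rough Python sketch of the general idea: one seed address is used only to learn the node list, and a failed seed does not stop rediscovery as long as some other known node still answers. This is not XDCR's actual implementation. `FakeCluster`, `discover_nodes`, and `refresh_targets` are hypothetical names, the host names are made up, and the response shape (a `nodes` list with `hostname` and `clusterMembership`) is only modeled on what the Couchbase `/pools/default` REST endpoint returns.

```python
from typing import Callable, Dict, List

# Assumed response shape, modeled on the cluster REST output:
# {"nodes": [{"hostname": "...", "clusterMembership": "active"}, ...]}
Topology = Dict[str, List[Dict[str, str]]]


class FakeCluster:
    """In-memory stand-in for a remote cluster (hypothetical, no network I/O)."""

    def __init__(self, hosts: List[str]) -> None:
        self.membership = {host: "active" for host in hosts}
        self.down = set()

    def fail_over(self, host: str) -> None:
        # The node stops answering and is marked as failed over.
        self.down.add(host)
        self.membership[host] = "inactiveFailed"

    def fetch(self, host: str) -> Topology:
        # Stand-in for an HTTP GET of the cluster map from one node.
        if host in self.down:
            raise ConnectionError(f"{host} is unreachable")
        return {
            "nodes": [
                {"hostname": h, "clusterMembership": m}
                for h, m in self.membership.items()
            ]
        }


def discover_nodes(seed: str, fetch: Callable[[str], Topology]) -> List[str]:
    """Ask one seed node for the cluster map and return all active nodes."""
    topology = fetch(seed)
    return [
        node["hostname"]
        for node in topology["nodes"]
        if node["clusterMembership"] == "active"
    ]


def refresh_targets(known: List[str], fetch: Callable[[str], Topology]) -> List[str]:
    """Rediscover the topology through any known node that still answers."""
    for host in known:
        try:
            return discover_nodes(host, fetch)
        except ConnectionError:
            continue  # this node is down, try the next known node
    raise ConnectionError("no known node in the remote cluster is reachable")


if __name__ == "__main__":
    cluster2 = FakeCluster(["cluster2-a:8091", "cluster2-b:8091"])
    targets = discover_nodes("cluster2-a:8091", cluster2.fetch)
    print("initial targets:", targets)

    cluster2.fail_over("cluster2-a:8091")  # the original seed node fails
    print("after failover:", refresh_targets(targets, cluster2.fetch))
```

In this toy model the seed address matters only for the first lookup. Once the full node list is known, losing the seed leaves the remaining node as a valid target, which is the behavior the reply describes.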