How does getFromReplica work: between nodes or between data centers?

ekaterina.mayer6 · September 21, 2018, 1:00am

I have one data center with 3 nodes,
and another data center with a 3-node cluster hosted in AWS, for example.
DC1 and DC2 are linked with bi-directional replication.

When bucket.get(key) fails with a connection timeout (am I right that it fails only if all 3 nodes have failed?),
we can then try to get the object with bucket.getFromReplica(key, ReplicaMode). Where does it try to get the object from:
another data center, or another node inside the current data center?
I think it is the former, since auto-failover already works within one cluster.
But that raises another question: we do not specify the hosts of the other DC, only the hosts of the 3 nodes inside one cluster. So how does it find the host of the replica in the other data center?

matthew.groves · September 28, 2018, 8:36pm

Hi @ekaterina.mayer6,

If document X lives on a node that goes down, then a get for document X will fail. getFromReplica will attempt to get a replica copy from another node within the same cluster. It will not attempt to get it from a different data center. Here’s an example in the documentation to help get you started: https://docs.couchbase.com/java-sdk/2.6/failure-considerations.html#devguide-replica-read
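
To make the replica read concrete, here is a minimal sketch of the fallback pattern in Java. It is not taken from the SDK docs: `ReplicaFallback` and `readWithFallback` are hypothetical names, and the two reads are passed in as suppliers so the snippet compiles without the Couchbase dependency. With Java SDK 2.x the suppliers would typically wrap `bucket.get(key)` and `bucket.getFromReplica(key, ReplicaMode.ALL)`.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

public class ReplicaFallback {

    /**
     * Tries the active copy first; if that read throws, falls back to the replicas.
     *
     * Assumed wiring with Java SDK 2.x (not compiled here):
     *   activeRead  = () -> bucket.get(key)
     *   replicaRead = () -> bucket.getFromReplica(key, ReplicaMode.ALL)
     */
    public static <T> Optional<T> readWithFallback(Supplier<T> activeRead,
                                                   Supplier<List<T>> replicaRead) {
        try {
            // A null result means the document does not exist; that is not a failure.
            return Optional.ofNullable(activeRead.get());
        } catch (RuntimeException activeFailure) {
            // Catching RuntimeException is deliberately broad for this sketch;
            // real code would probably narrow it to timeout-style failures.
            List<T> replicas = replicaRead.get();
            return replicas.isEmpty()
                    ? Optional.empty()
                    : Optional.ofNullable(replicas.get(0));
        }
    }
}
```

Keep in mind that a replica can lag behind the active copy, so the value returned by the fallback branch may be slightly stale.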

If an entire data center goes down, then your application can switch over to another data center. This is known as “multi-cluster awareness”, and you can read more about it here: https://blog.couchbase.com/couchbase-high-availability-disaster-recovery-java-multi-cluster-aware-client/

ekaterina.mayer6 · September 29, 2018, 6:02pm

Sorry, I do not get it.
Why do we need to invoke “getFromReplica” when one node fails, if auto-failover takes care of it by itself? We just specify all hosts during cluster/bucket creation.

So, as I understand it, auto-failover resolves such a failure within the cluster,
and I do not get why we need “getFromReplica” then.
Thanks

matthew.groves · September 29, 2018, 7:43pm

@ekaterina.mayer6,

You are correct. After auto-failover is finished, you don’t need to use getFromReplica.

ekaterina.mayer6 · October 1, 2018, 12:53am

Then I am totally confused.

Why do we need getFromReplica at all?
In which cases won’t auto-failover help?
What exactly is the difference between these two approaches?
Thanks for the help.

matthew.groves · October 1, 2018, 1:09pm

@ekaterina.mayer6,

Auto-failover is not turned on by default, and not everyone uses it. Also, auto-failover is not instantaneous, so during the failover period you may still need or want to use getFromReplica to avoid disruption.

ekaterina.mayer6 · October 1, 2018, 5:59pm

Thank you, Matthew. You really helped me, I appreciate it.
Where can I find examples of how to turn it on in Java?

“Also, auto-failover is not instantaneous, so during the failover period”

Where can I find more detailed information about this?
I could not find this nuance in Automatic Failover | Couchbase Docs

matthew.groves · October 1, 2018, 6:41pm

@ekaterina.mayer6,

What are you trying to turn on in Java? Auto-failover? I’m not sure if you can manage failover with the SDK… maybe @daschl would know?

But you can turn on auto-failover in any of these ways:

With the REST API: Auto-Failover | Couchbase Docs
or through the CLI: setting-autofailover | Couchbase Docs
or through the UI: General | Couchbase Docs

From the Configure Auto-Failover section:

Timeout. The number of seconds that must elapse, after a node or group has become unavailable, before auto-failover is triggered. The default is 120.

In the latest versions of Couchbase, you can enable “fast failover”, which means you can take the timeout value down to 5 seconds (see this blog on fast failover).
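
If you want to drive the REST API option from Java rather than curl, a rough sketch could look like the following. It assumes the `POST /settings/autoFailover` endpoint on the admin port 8091 with `enabled` and `timeout` form parameters, so please check that against the REST API docs for your server version. The class name `AutoFailoverSettings`, the host, and the credentials are placeholders, and the snippet needs Java 11+ for `java.net.http`.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AutoFailoverSettings {

    /** Builds the form body, e.g. "enabled=true&timeout=120". */
    public static String formBody(boolean enabled, int timeoutSeconds) {
        return "enabled=" + enabled + "&timeout=" + timeoutSeconds;
    }

    /**
     * Builds (but does not send) the request that changes the auto-failover settings.
     * host, user and password are placeholders for your own cluster admin details.
     */
    public static HttpRequest buildRequest(String host, String user, String password,
                                           boolean enabled, int timeoutSeconds) {
        // HTTP Basic auth header built from the admin credentials
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(
                        URI.create("http://" + host + ":8091/settings/autoFailover"))
                .header("Authorization", "Basic " + credentials)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(enabled, timeoutSeconds)))
                .build();
    }
}
```

To actually apply the setting you would send the request, for example with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, and check the response status. Until the timeout has elapsed and failover has completed, reads of documents on the failed node can still fail, which is the window where getFromReplica is useful.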