How to connect to multiple buckets and sync data between them


#1

Hi all,
As the title says, my application needs to connect to multiple buckets and sync data between them.
Thanks!


#2

Hey, I don’t get the use case; you would just be making the data redundant.
If you want this for fault tolerance, you can simply use the replica settings of the bucket.
Could you please describe the use case? It would be very helpful.


#3

@shiv4nsh
Currently, my application connects to a single bucket, but the bucket is sometimes down, so my application can’t save data to it or read data from it. I want to connect to several buckets and sync data between them, so that if one bucket is down, I can use another bucket that holds the synced data.
Thanks


#4

What do you mean exactly by “the bucket sometimes down”? If the bucket is down, the same is probably true for the other buckets on the same node (a node failure rather than a strange bucket-level failure), in which case, as @shiv4nsh stated, what you probably need is node-level replication to ensure availability of your data.
Have a look at replica configuration.

Note: The Java SDK can optionally even read from a replica directly rather than waiting for the ops to failover a node in trouble.


#5

@simonbasle thanks for your suggestion. As you mentioned:

So, is this process automatic?
Thanks!


#6

No, this is not automatic. Here is what should happen when a node crashes, on a cluster where inter-node replication has been set to 1, 2 or 3:

  • A node crashes: SDK sees exceptions (most probably timeouts or connection exceptions)
  • Someone or something should fail over the node (mark it as bad). It can be the ops, an in-house monitoring script or (in the case of a first failure and with a minimum delay of 30s) the auto-failover feature baked into Couchbase…
  • From that point, one of the replicas is promoted and starts serving the dataset of the crashed node in its place (no more errors in the SDK)
  • The node should be repaired or replaced and the cluster rebalanced by the ops so that it is in a homogeneous, healthy state again (all data is replicated at the configured level, all subsets of data are equally replicated, etc…)

The use for getFromReplica() is to manually cover the window at the beginning, where a node has crashed but hasn’t yet been failed over (so it still has the full range of replicas, none of them having been promoted). So you catch the errors and fall back to getFromReplica().
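
To make the catch-and-fall-back idea concrete, here is a rough sketch in plain Java. The DocumentStore interface and the ReplicaFallback helper are hypothetical stand-ins, not the SDK's API. The real getFromReplica() signature and the exception types you would catch depend on the SDK version, so treat this only as the shape of the pattern:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a bucket client; this is NOT the Couchbase SDK API.
interface DocumentStore {
    // Reads the active copy; assumed to throw if the owning node is unreachable.
    String get(String id);

    // Returns whatever replica copies respond; may be empty or slightly stale.
    List<String> getFromReplica(String id);
}

final class ReplicaFallback {
    // Try the active copy first; if that fails, use the first replica that answered.
    static Optional<String> read(DocumentStore store, String id) {
        try {
            return Optional.ofNullable(store.get(id));
        } catch (RuntimeException e) {
            // Window where the node has crashed but is not failed over yet:
            // no replica has been promoted, so replica reads can still succeed.
            List<String> replicas = store.getFromReplica(id);
            return replicas.isEmpty()
                    ? Optional.empty()
                    : Optional.ofNullable(replicas.get(0));
        }
    }

    public static void main(String[] args) {
        // Simulates a crashed node: active reads time out, replicas still answer.
        DocumentStore crashed = new DocumentStore() {
            public String get(String id) {
                throw new RuntimeException("timeout");
            }
            public List<String> getFromReplica(String id) {
                return Arrays.asList("replica-copy");
            }
        };
        System.out.println(read(crashed, "user::1").get()); // prints "replica-copy"
    }
}
```

Keep in mind that a replica read can return data slightly older than the active copy, so this is best limited to reads where that is acceptable.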

@daschl any gotchas or anything to add?