How the number of replicas applies to the RAM

nekomatic · September 28, 2018, 12:08pm

Hi All,
Assuming I have a bucket which is set up to keep 2 replicas (or 3 copies), does this also apply to the RAM? If yes, does committing to memory only mean that, before the acknowledgement is returned to the client, Couchbase ensures the data is distributed across 3 machines’ RAM?

matthew.groves · September 28, 2018, 9:07pm

Hi @nekomatic,

I’m going to need more information on what you mean by “memory only”. At the SDK level you can specify durability options ‘replicate_to’ and ‘persist_to’.

But at a high level, Couchbase has a memory-first architecture. Before anything is written anywhere else, it is first written to RAM. If you specify those durability options, the SDK will do some extra checks/polling to confirm whether the document has been persisted/replicated to your specifications. However, if those checks fail, it doesn’t necessarily mean that the replication/persistence won’t ultimately succeed.

nekomatic · October 1, 2018, 10:11am

Hi @matthew.groves, I need to find out whether setting a number of replicas is equivalent to storing the same number of copies of the committed document on separate nodes when we choose the replicate_to approach. Does Couchbase wait for all replicas to be complete before sending the acknowledgement back to the client?
I assume replicate_to does not wait for the data to be persisted to disk?

matthew.groves · October 1, 2018, 1:26pm

@nekomatic,

This is my understanding, which I wrote about in the “durability” section of this blog post on ACID properties of Couchbase. Maybe @ingenthr can clarify further details:

When you create a bucket, you specify the number of replica copies that you want across your cluster. Couchbase Server itself will take care of the replication, no matter what you tell the SDK when you’re mutating data.

At the SDK level, if you specify a ‘replicate_to’, then the SDK will do some polling to make sure the replicas have been copied. It will return an error if the polling cannot confirm, before it times out, that the specified number of replicas have been made.

If you don’t specify a replicate_to, Couchbase will still attempt the replication. But the SDK won’t do any polling after the data has been acknowledged.

If you do specify a replicate_to, technically it doesn’t wait for all replicas to be complete. It will keep checking and return success when it sees all the replicas, but it will time out if it keeps coming up empty. Couchbase Server itself will keep trying replication though, and may eventually succeed (despite the SDK initially returning an error).

And yes, your assumption is correct: replicate_to only waits for replication, not for the data to be persisted to disk. If you also want to check for persistence, you can use the persist_to option.
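To make the polling behaviour described above a bit more concrete, here is a minimal Python sketch. It is not the Couchbase SDK's code and does not call any Couchbase API: `ObserveResult`, `DurabilityTimeout`, `wait_for_durability`, and the `observe` callback are all hypothetical names made up for illustration, and the timeout and interval defaults are arbitrary. It only shows the general idea: the write is already acknowledged, the client then keeps checking, and a timeout on the client side says nothing about whether the server eventually finishes replicating.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ObserveResult:
    """Hypothetical snapshot of where an acknowledged mutation currently lives."""
    replicated: int  # replica nodes that hold the mutation in RAM
    persisted: int   # nodes that have written the mutation to disk


class DurabilityTimeout(Exception):
    """Raised when the requirement was not observed before the deadline."""


def wait_for_durability(
    observe: Callable[[], ObserveResult],
    replicate_to: int = 0,
    persist_to: int = 0,
    timeout: float = 2.5,
    interval: float = 0.01,
) -> ObserveResult:
    """Poll `observe` until the requested counts are seen or the timeout expires.

    The server is assumed to keep replicating on its own, so a timeout here
    only means the client stopped watching, not that replication failed.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = observe()
        if state.replicated >= replicate_to and state.persisted >= persist_to:
            return state
        if time.monotonic() >= deadline:
            raise DurabilityTimeout(
                f"saw replicated={state.replicated}, persisted={state.persisted}; "
                f"wanted replicate_to={replicate_to}, persist_to={persist_to}"
            )
        time.sleep(interval)


if __name__ == "__main__":
    # Simulated cluster: one more replica has the document on each poll.
    polls = {"count": 0}

    def simulated_observe() -> ObserveResult:
        polls["count"] += 1
        return ObserveResult(replicated=min(polls["count"] - 1, 2), persisted=0)

    state = wait_for_durability(simulated_observe, replicate_to=2)
    print(f"replicated to {state.replicated} nodes after {polls['count']} polls")
```

With `persist_to=0` the loop never looks at disk, which mirrors the point that replicate_to on its own does not wait for persistence.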

ingenthr · October 2, 2018, 4:02am

Correct, the API waits for all replicas to receive a copy before returning control to your application. If this doesn’t complete, or some kind of failure interrupts things, control is returned to the app, which has more context on how to compensate.

Part of the philosophy here is that “durable” has potentially different meanings on a distributed system. For instance, if one node has it on disk, but that node isn’t available or is on a non-accessible network drive, would someone wait to get it or switch to a replica node?

With persist_to and replicate_to, control is up to the app developer/deployer, who likely has more context about what is important for that data modification and works with their own system.