How to guarantee persistence


#1

I have a scenario where a record needs to be guaranteed to be in its latest state or fail entirely.

  1. It seems persist_to could guarantee this if the number specified is equal to the number of nodes in the cluster, is that correct?
  2. Does this also hold for XDCR?
  3. And finally, this solution would require “persist_to” to be set to the total number of nodes, which means the program needs to know the cluster size. Is there a more elegant solution that doesn’t require the program to know how many nodes are in the cluster?

I imagine that if persist_to is anything lower than the number of nodes, there exists a possibility that all the nodes that persisted the change fail while the nodes that didn’t replicate it at all serve an old state of the record, which we never want to happen once Couchbase sends an “OK” to our program, right?


#2

Hi @Hankman,

Couchbase is a strongly consistent database. If you are using the “get” operation (as opposed to “get from replica” or an index), the document you get will always be in its latest state.

My understanding is that behind the scenes, persist_to is implemented with polling. The SDK will check whether the document was persisted (saved to disk) on the number of nodes you specified. It is entirely possible, though unlikely, that persist_to reports success=false while the document still ends up persisted.
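To make that concrete, here is a rough simulation of that polling pattern — not the actual SDK implementation (which is intentionally unspecified), and `check_persisted` is a made-up stand-in for the SDK’s internal observe-style check:

```python
import time

def wait_for_persistence(check_persisted, persist_to, timeout=2.5, interval=0.01):
    """Poll until at least `persist_to` nodes report the write as
    persisted, or the timeout expires.

    `check_persisted` is a stand-in for the SDK's internal check; it
    returns the number of nodes that have persisted the write so far.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check_persisted() >= persist_to:
            return True
        time.sleep(interval)
    # Timing out does NOT mean the write failed: persistence can still
    # complete after we stop polling, which is why success=false can
    # coexist with a document that eventually lands on disk.
    return False

# Simulated check: the second node finishes persisting after a few polls.
calls = {"n": 0}
def fake_check():
    calls["n"] += 1
    return 2 if calls["n"] >= 3 else 1

print(wait_for_persistence(fake_check, persist_to=2))  # True
```

Note the asymmetry: a `True` here means the persistence condition was observed, but a `False` only means it wasn’t observed in time.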

I don’t think persist_to (or replicate_to) works with XDCR. It only applies to the cluster you are writing to.

The most replicas you can have is 3, so persist_to maximum is 4. It’s not strictly related to the number of nodes you have. You could have 100 nodes, and the maximum you can set persist_to is still 4. So it’s more important that you know how many replicas you are expecting to be created (which is a bucket setting).
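In other words, the ceiling is the active copy plus one per configured replica, capped by the bucket’s replica setting rather than by cluster size. A tiny helper to make that explicit (my own sketch, not an SDK function):

```python
def max_persist_to(num_replicas):
    """Highest valid persist_to for a bucket: the active copy plus one
    per configured replica. Couchbase buckets allow at most 3 replicas."""
    if not 0 <= num_replicas <= 3:
        raise ValueError("Couchbase buckets support 0-3 replicas")
    return num_replicas + 1

print(max_persist_to(3))  # 4, regardless of how many nodes the cluster has
```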


#3

Note that we’re intentionally ambiguous about how the check for persistence/replication is implemented as we may change it. Also, a replicate_to_remote or some similar XDCR request has come up before.

Question: how would you use that @Hankman? Latencies may be very high at times, particularly when there are intermittent failures or congestion. Would you want your application to block for that long? Relatedly, since XDCR is asynchronous, what would you have your application do if the upsert() operation succeeds locally but doesn’t replicate within the amount of time you’re willing to wait?

Thanks in advance for any more information-- it helps us when we’re planning out future features like this.


#4

Hey, thank you for your help.

Our use case is as follows:
We have a server that generates a sequence of events at a high rate.
The events are sent to other servers and asynchronously stored to the DB.
Under no circumstances may we ever emit different events with the same sequence number (or skip a sequence number).
To achieve this we use a lockfile that is generated on startup, and deleted on graceful server shutdown (it waits for all the in-flight events to finish being written to DB).
If the lockfile does not already exist on startup, we have to be 100% sure that the DB will either deliver each and every one of the previous events or give us an error (so we can fix the whole system through manual intervention).
If the lockfile does exist already on startup, we know that the DB might not contain all entries and we need manual intervention on the whole system.
On startup we retrieve the last event with REQUEST_PLUS to be sure the index picked it up as well.
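In pseudocode, our lockfile logic looks roughly like this (the path and the flush hook are illustrative, not our real code):

```python
import os
import tempfile

# Illustrative location; the real lockfile lives somewhere deliberate.
LOCKFILE = os.path.join(tempfile.mkdtemp(), "event_server.lock")

def startup():
    """Return True if the previous run shut down cleanly, then
    (re)create the lockfile for the current run."""
    clean = not os.path.exists(LOCKFILE)
    # A leftover lockfile means the last run died before flushing its
    # in-flight events, so the DB may be missing entries.
    with open(LOCKFILE, "w") as f:
        f.write(str(os.getpid()))
    return clean

def graceful_shutdown(flush_in_flight_events):
    """Delete the lockfile only after every pending event write has been
    confirmed by the DB (e.g. returned success with persist_to)."""
    flush_in_flight_events()  # must block until all writes are confirmed
    os.remove(LOCKFILE)

# Clean first start, clean shutdown, clean second start:
assert startup() is True
graceful_shutdown(lambda: None)
assert startup() is True  # lockfile was removed by the clean shutdown
```

A crash between `startup()` and `graceful_shutdown()` leaves the lockfile behind, so the next `startup()` returns False and we fall back to manual intervention.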

This “100% or die” requirement is what I am trying to solve.
If Couchbase’s automatic failover is deactivated, then storing the events on shutdown with persist_to=1 and waiting for them all to return “success” before deleting the lockfile would be enough to fulfill the requirement, right?
What if automatic failover is activated? Does persist_to=4 then fulfill the requirement (if the replica count is set to 3)?

And finally, if the requirement can be fulfilled, we could replace the lockfile with a “performed clean shutdown” entry in the database, which is removed during startup before operation resumes, so no system-state information is stored on the servers’ local disks.
Is there a way to guarantee that an XDCR setup has all the events persisted, with the same “100% or die” requirement, before the “performed clean shutdown” entry is replicated to the other datacenter?