Low-level error checking and correction in Couchbase (specifically ForestDB)

Nirvana · June 17, 2015, 12:32am

Greetings–

My question is about replication and the possibility of data errors. Specifically, consider these hypotheticals (assume a 5-machine cluster with n=3, i.e. two replicas, and non-ECC RAM):

– A cosmic ray hits a node while an update to a document is still queued in RAM, before it has been persisted.
– An uncaught error occurs while reading a document from disk (say a ray hits the read buffer).
– One of these events happens on a replica of the data rather than on the master node.

Is there any kind of error detection and correction code embedded with the data as it is persisted to the drive? Something like Reed-Solomon, or maybe something as simple as a hash of the data stored alongside the data. (Then if a bit is flipped, the stored hash no longer matches and you'll know there's been a problem.)
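To make the "hash stored with the data" idea concrete, here is a minimal sketch in Python. It is only an illustration of the general technique, not a description of how ForestDB lays out its blocks: the record layout (4-byte length, 4-byte CRC-32, then payload) and the names `pack_record`, `unpack_record` and `flip_bit` are all made up for this example.

```python
import struct
import zlib

HEADER = struct.Struct(">II")  # hypothetical layout: payload length, CRC-32


def pack_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC-32 before writing it out."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def unpack_record(record: bytes) -> bytes:
    """Return the payload, or raise ValueError if the checksum does not match."""
    length, stored_crc = HEADER.unpack(record[:HEADER.size])
    payload = record[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != stored_crc:
        raise ValueError("checksum mismatch: record is corrupt")
    return payload


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate a single-bit upset (the 'cosmic ray') at the given bit position."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


if __name__ == "__main__":
    record = pack_record(b'{"type": "user", "name": "nirvana"}')
    print(unpack_record(record))          # intact record reads back fine
    damaged = flip_bit(record, 8 * 10)    # flip one bit inside the payload
    try:
        unpack_record(damaged)
    except ValueError as err:
        print("detected:", err)
```

Note that a scheme like this only detects corruption that happens after the checksum is computed. A bit flipped in RAM before that point would be checksummed and persisted as if it were valid data, which is why the non-ECC case worries me. It also only detects; correcting would need something like Reed-Solomon or a good copy elsewhere.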

Secondly, if there is such a feature, what happens when a problem is detected? Is the document compared to replicas?
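As a sketch of the kind of behaviour I am asking about (purely hypothetical, I do not know whether Couchbase or ForestDB does anything like this): on a checksum failure, the reader could fall back to the other copies, return the first one that verifies, and report which copies need to be rewritten. The `Copy` type and the `read_with_repair` function are invented for this illustration.

```python
import zlib
from typing import List, Optional, Sequence, Tuple

# Hypothetical in-memory stand-in for one stored copy of a document:
# (payload bytes, CRC-32 that was stored alongside the payload).
Copy = Tuple[bytes, int]


def is_intact(copy: Copy) -> bool:
    """A copy is intact if its payload still matches its stored CRC-32."""
    payload, stored_crc = copy
    return zlib.crc32(payload) == stored_crc


def read_with_repair(copies: Sequence[Copy]) -> Tuple[Optional[bytes], List[int]]:
    """Return the first payload that verifies, plus indexes of corrupt copies.

    Index 0 is taken to be the master copy and the rest the replicas.
    The payload is None if no copy passes verification.
    """
    good: Optional[bytes] = None
    corrupt: List[int] = []
    for index, copy in enumerate(copies):
        if is_intact(copy):
            if good is None:
                good = copy[0]
        else:
            corrupt.append(index)
    return good, corrupt


if __name__ == "__main__":
    doc = b"hello"
    crc = zlib.crc32(doc)
    # The master copy has been damaged on disk; both replicas are fine.
    payload, corrupt = read_with_repair([(b"hellp", crc), (doc, crc), (doc, crc)])
    print(payload, "copies needing repair:", corrupt)
```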

If this is answered in the documentation somewhere, I apologize; please just point me to it. I looked at the ForestDB GitHub page, but that was about it.

Thanks

pvarley · June 17, 2015, 12:44am

Hey Nirvana,

I believe some of the answers to your questions are in the ForestDB documentation.