Data consistency

yev · October 25, 2016, 11:37pm

Without a “transaction” when data is committed to the data store, especially across multiple entities, an exception or similar failure can leave the data in an inconsistent state. What approaches do people take to handle these scenarios?

matthew.groves · October 28, 2016, 2:25pm

This is a very common question coming from developers who are used to RDBMS and are considering moving to document databases.

When writing or updating a single document, that operation is atomic, so you don’t need anything special to handle it. Often, a data model that spans multiple relational tables can be collapsed/denormalized into a single document, so you need to rely on transactions less often.
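To illustrate the denormalization point, here is a minimal sketch. The `store` dictionary and `upsert` helper are hypothetical stand-ins for a document store, not a real SDK, and the order/line-item data is invented for the example. An order and its lines, which would be two tables in an RDBMS, become one document written under one key.

```python
import json

# Hypothetical in-memory stand-in for a document store: key -> JSON string.
store = {}

def upsert(key, doc):
    """Write one document under one key (illustrative helper, not an SDK call)."""
    # Serialize first: if this raises, the store is never touched.
    payload = json.dumps(doc)
    # A single-key write: the whole document is replaced or nothing is.
    store[key] = payload

# An order and its line items collapsed into a single document.
order = {
    "type": "order",
    "customer": "c42",
    "lines": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}
upsert("order::1001", order)
```

Because the line items travel with the order, there is no window in which the order exists without its lines.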

If you really do need a multi-document transaction, the only approach I can think of currently is a two-phase commit pattern (here’s an example: http://docs.couchbase.com/developer/dev-guide-3.0/transactional-logic.html).

Note that this isn’t just true for Couchbase, but for all the popular document databases and NoSQL databases that I can think of.
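The two-phase commit pattern mentioned above can be sketched roughly as follows. This is a generic illustration, not the code from the linked page: the in-memory `store`, the account-transfer scenario, the state names, and every helper are assumptions made for the example. A real implementation would also need CAS/optimistic locking on each write, which is omitted here.

```python
# Hypothetical in-memory store. Each assignment to store[key] stands in for
# one atomic single-document write.
store = {
    "acct::a": {"balance": 100, "pending": []},
    "acct::b": {"balance": 20, "pending": []},
}

def set_state(txn_id, state):
    store[txn_id] = {**store[txn_id], "state": state}

def apply_once(key, txn_id, delta):
    """Apply delta to a document unless this transaction already touched it."""
    doc = store[key]
    if txn_id not in doc["pending"]:
        store[key] = {
            "balance": doc["balance"] + delta,
            "pending": doc["pending"] + [txn_id],
        }

def clear_pending(key, txn_id):
    doc = store[key]
    store[key] = {**doc, "pending": [t for t in doc["pending"] if t != txn_id]}

def run(txn_id, crash_at=None):
    """Run or resume a transaction. crash_at only simulates a failure."""
    if store[txn_id]["state"] == "initial":
        set_state(txn_id, "pending")
    if store[txn_id]["state"] == "pending":
        txn = store[txn_id]
        apply_once(txn["src"], txn_id, -txn["amount"])
        if crash_at == "mid":
            raise RuntimeError("simulated crash between the two writes")
        apply_once(txn["dst"], txn_id, txn["amount"])
        set_state(txn_id, "committed")
    if store[txn_id]["state"] == "committed":
        txn = store[txn_id]
        clear_pending(txn["src"], txn_id)
        clear_pending(txn["dst"], txn_id)
        set_state(txn_id, "done")

# The transaction itself is a document, so its progress survives a crash.
store["txn::1"] = {"state": "initial", "src": "acct::a", "dst": "acct::b", "amount": 30}

try:
    run("txn::1", crash_at="mid")   # source debited, destination not yet credited
except RuntimeError:
    pass

run("txn::1")                        # recovery: resume from the recorded state
```

The key idea is that the half-finished state is detectable (the transaction document is still `pending`) and each step is safe to repeat, so a recovery process can finish the work or, in a fuller version, roll it back.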

yev · October 28, 2016, 3:41pm

Hi Matthew, thanks for the response. I certainly understand the gap between relational and NoSQL. My case is definitely not unique; perhaps it’s just a matter of massaging the workflow or looking at the data again.

I have the following:

Parent object with some metadata and a summary of its children of type A. It also has children of types B and C.
Object A with its own metadata. There could be millions of these overall, though each parent may have ~100. I made each of these its own document.
Object B and Object C, each with its own metadata. There could be lots of these overall, but maybe 10-50 of them as children of any one Parent. B and C are also their own documents.
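For readers following along, the layout described above might look something like the sketch below. Every key format and field name here is hypothetical (the post does not give them), including the assumption that each child points back at its parent via a `parent_id` field.

```python
# All keys and field names are hypothetical, chosen only for illustration.
parent_id = "parent::1"

# Each child is its own document and refers back to its parent.
a_docs = {
    f"a::{i}": {"type": "A", "parent_id": parent_id, "size": 10 * (i + 1)}
    for i in range(3)
}
b_docs = {f"b::{i}": {"type": "B", "parent_id": parent_id} for i in range(2)}
c_docs = {f"c::{i}": {"type": "C", "parent_id": parent_id} for i in range(2)}

# The parent holds its own metadata plus a summary of its type-A children.
parent = {
    "type": "parent",
    "name": "example",
    "a_summary": {
        "count": len(a_docs),
        "total_size": sum(d["size"] for d in a_docs.values()),
    },
}

# Creating the parent means writing every one of these documents separately.
documents = {parent_id: parent, **a_docs, **b_docs, **c_docs}
```

With this shape, creating one parent is 1 + N independent single-document writes, so a failure partway through can leave only some of them stored.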

During creation of the Parent, its children also get created, and this is where the potential issue comes in, while all the documents are being created. I could bring some of the objects into the Parent, but I’d rather not risk the 20 MB limit, and it’s easier to manage queries and aggregations on documents than on collection items.

I am open though.