Couchbase, Distributed Data and Transactional Issues


#1

I am debating moving an architecture from MySQL to Couchbase, and I had some questions that I was hoping could be answered here.

The system in question has a set of nodes that run in Amazon and scale dynamically depending upon the load on the system. To proxy the database and absorb most of the read load, we use Hazelcast. The Hazelcast cluster gets loaded with data from the RDBMS mostly on demand. When data gets altered, the change is wrapped in a transaction and written to the database, and the updated versions are then stored back in Hazelcast, i.e.:

  1. Read X
  2. Read Y
  3. Lock X and Y
  4. Start DB Transaction
  5. Modify X and Y
  6. Commit the Transaction
  7. If there is an error, roll back the transaction and evict the records from Hazelcast.
  8. If it succeeds, put the newly saved versions in Hazelcast.
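
The steps above can be sketched roughly as follows. This is just an illustration of the flow, not our actual code: Hazelcast and MySQL are stubbed out with in-memory maps, and all the names are made up.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Minimal sketch of the write path: read through the cache, lock, apply
// both writes transactionally, then publish or evict. The cache stands in
// for Hazelcast, the db map for MySQL.
public class WritePathSketch {
    static final Map<String, Integer> cache = new ConcurrentHashMap<>(); // stand-in for Hazelcast
    static final Map<String, Integer> db = new ConcurrentHashMap<>();    // stand-in for MySQL
    static final ReentrantLock lockX = new ReentrantLock();
    static final ReentrantLock lockY = new ReentrantLock();

    static boolean updateXY(int newX, int newY, boolean failCommit) {
        // 1-2. Read X and Y (cache-aside: fall back to the DB on a miss)
        cache.computeIfAbsent("X", db::get);
        cache.computeIfAbsent("Y", db::get);
        // 3. Lock X and Y
        lockX.lock();
        lockY.lock();
        try {
            // 4-6. "Transaction": apply both writes or neither
            if (failCommit) {
                // 7. On error, roll back and evict the records from the cache
                cache.remove("X");
                cache.remove("Y");
                return false;
            }
            db.put("X", newX);
            db.put("Y", newY);
            // 8. On success, publish the newly saved versions to the cache
            cache.put("X", newX);
            cache.put("Y", newY);
            return true;
        } finally {
            lockY.unlock();
            lockX.unlock();
        }
    }

    public static void main(String[] args) {
        db.put("X", 1);
        db.put("Y", 1);
        System.out.println(updateXY(2, 2, false)); // true: DB and cache both hold 2
        System.out.println(updateXY(3, 3, true));  // false: rolled back, cache evicted
        System.out.println(db.get("X"));           // 2
    }
}
```

The key property is in the failure branch: the DB never sees a partial write, and the cache entries are evicted so the next read repopulates from the authoritative rows.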

Although the system works fairly well, there are some sticking points. For example, any query that is not indexed has to read the keys of the candidate entities from the database and then load the entities on demand. There is no way to query Hazelcast and have that flow straight through to the database.
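
To make the sticking point concrete, here is a rough sketch of that two-step query path: the un-indexed predicate runs against the DB to get candidate keys, then each entity is resolved through the cache with a fallback load. Both stores are again stubbed with in-memory maps, and the prefix predicate is purely illustrative.

```java
import java.util.*;
import java.util.stream.*;

// Sketch of the two-step, non-indexed query path: step 1 asks the DB for
// matching keys only; step 2 loads each entity on demand, caching as it goes.
public class QueryPathSketch {
    static final Map<Long, String> db = new HashMap<>();    // stand-in for MySQL
    static final Map<Long, String> cache = new HashMap<>(); // stand-in for Hazelcast

    static List<String> findByPrefix(String prefix) {
        // Step 1: the DB answers the un-indexed predicate with keys only
        List<Long> ids = db.entrySet().stream()
                .filter(e -> e.getValue().startsWith(prefix))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
        // Step 2: resolve each entity through the cache, loading on a miss
        return ids.stream()
                .map(id -> cache.computeIfAbsent(id, db::get))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        db.put(1L, "alpha");
        db.put(2L, "beta");
        db.put(3L, "alps");
        System.out.println(findByPrefix("al")); // [alpha, alps]
        System.out.println(cache.size());       // 2: only the matches got cached
    }
}
```

Note that the predicate itself can never be answered by the cache; every such query pays at least one DB round trip for the key scan.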

There are compelling features of Couchbase that might be applicable to our situation, but I see some potential problems.

  1. Without transactions that span documents, I don't think this could be done safely. I have to guarantee that X and Y are both changed, or not at all. Furthermore, X and Y cannot be fused into the same document in any manner; they must be entirely independent yet transactional. I cannot conceive of how this can be done without cross-document transactions.

  2. The Hazelcast cluster keeps essential data in memory and dramatically speeds up the application, since 90% of accesses to the system are reads. I worry that without a memory cache we might face bottlenecks in a strongly real-time system.
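
To show exactly what worries me in point 1, here is a toy illustration of what can happen when the two writes are independent: a failure between them leaves X updated and Y stale. The "documents" are simulated with a plain map and the failure is injected deliberately.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of the cross-document atomicity concern: without a
// transaction spanning both writes, a crash between them leaves the
// two documents permanently inconsistent.
public class PartialWriteSketch {
    static final Map<String, Integer> docs = new HashMap<>(); // stand-in for two documents

    static void updateBoth(int value, boolean failAfterFirst) {
        docs.put("X", value);               // first document write succeeds
        if (failAfterFirst) {
            // simulated node/network failure between the two writes
            throw new IllegalStateException("failed between writes");
        }
        docs.put("Y", value);               // second write never happens
    }

    public static void main(String[] args) {
        docs.put("X", 1);
        docs.put("Y", 1);
        try {
            updateBoth(2, true);
        } catch (IllegalStateException e) {
            // X and Y now disagree: exactly the state a transaction prevents
        }
        System.out.println(docs.get("X") + " " + docs.get("Y")); // 2 1
    }
}
```

With the RDBMS, step 6/7 of our current flow makes this state unreachable; that is the guarantee I don't see how to reproduce without cross-document transactions.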

Can anyone offer any insight into these issues? Thanks for your time.