Storing indexes on temporary storage

#1

Anyone hosting in the cloud (EC2 or Azure) knows that page blob storage and Couchbase aren’t exactly a match made in heaven :wink: We have an application that relies heavily on views, and with CB 4.0 now released, we only see our index usage increasing over time with SQL for Documents.

Azure provides temporary local storage as both SAS disks and SSD, both of which have much lower latency than page blob. We are very interested in configuring the index path on the temporary storage, but we don’t know how Couchbase will react if the indexes don’t exist after a reboot (which is the case during a dealloc/alloc event on the Azure side).

With the Azure Linux Agent, I can ensure that the appropriate directories exist and the right permissions are assigned, and I can even create empty files if that would help. Has anyone tried this, or does anyone know if it’s on the Couchbase development roadmap to make indexes semi-permanent (i.e. auto-rebuild if missing when the couchbase service starts)?
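As a rough sketch of the boot-time step described above: a small script, run before the couchbase service starts, could recreate the index directory on the temp disk and fix its permissions. The function name `ensure_index_dir`, the example path, and the `couchbase` account name are assumptions for illustration, not something confirmed in this thread.

```python
import os
import shutil


def ensure_index_dir(path, owner=None, mode=0o750):
    """Create the index directory if it is missing and set its permissions.

    Returns True when the directory had to be created, which suggests
    the temporary disk was wiped since the last boot.
    """
    created = not os.path.isdir(path)
    os.makedirs(path, exist_ok=True)
    # makedirs is subject to the process umask, so set the mode explicitly.
    os.chmod(path, mode)
    if owner is not None:
        # Changing ownership requires root. The account name is an assumption;
        # check which user your Couchbase install actually runs as.
        shutil.chown(path, user=owner, group=owner)
    return created


# Hypothetical usage at boot (path and owner are assumptions):
# wiped = ensure_index_dir("/mnt/resource/couchbase/index", owner="couchbase")
```

The return value could be logged so you can tell after a restart whether the indexes were lost and a rebuild should be expected.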

Speaking in terms of probability/uptime, the “temporary” storage really isn’t very “temporary”. Azure only wipes the temp storage on a de-allocation/allocation event or a hardware failure, both of which will force a reboot of our couchbase nodes anyway.

If you really wanted to extend this functionality, the couchbase services could have a nightly (scheduled) job that persists the temp index storage to page blob storage. Then during the Couchbase service start-up sequence, if the index files were missing, it could automatically restore from the persistent copies, saving time by only re-indexing what changed since the last backup.
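To make the proposal concrete, here is a minimal sketch of the snapshot/restore logic using plain file copies. All names and paths are hypothetical, and this is not an existing Couchbase feature. It also assumes that Couchbase would accept restored index files and catch up incrementally, which nothing in this thread confirms, and it ignores the consistency of files copied while the service is running.

```python
import os
import shutil


def index_is_missing(index_dir):
    """True when the index directory is absent or empty."""
    return not os.path.isdir(index_dir) or not os.listdir(index_dir)


def snapshot_index(index_dir, backup_dir):
    """Nightly job: copy the index files to persistent storage.

    Copies into a staging directory first so a failed copy never
    replaces the previous good snapshot.
    """
    staging = os.path.normpath(backup_dir) + ".tmp"
    shutil.rmtree(staging, ignore_errors=True)
    shutil.copytree(index_dir, staging)
    shutil.rmtree(backup_dir, ignore_errors=True)
    os.rename(staging, backup_dir)


def restore_if_missing(index_dir, backup_dir):
    """Pre-start step: restore the snapshot only when the index is gone."""
    if not index_is_missing(index_dir):
        return "present"
    if not os.path.isdir(backup_dir):
        # Nothing to restore; leave an empty directory for a full rebuild.
        os.makedirs(index_dir, exist_ok=True)
        return "empty"
    shutil.copytree(backup_dir, index_dir, dirs_exist_ok=True)
    return "restored"
```

`snapshot_index` would run from a scheduler such as cron, and `restore_if_missing` would run before the service starts. Note that `dirs_exist_ok` requires Python 3.8 or later.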

Thoughts? The only other option is premium storage which is great, but considerably more expensive. Storing indexes on temp storage would be a great use of existing resources that you already get for free with each provisioned VM.

#2

Hi @jeffhoward001, I believe temp storage is removed with every reboot on Azure. Reboots happen quite a bit, given the fabric upgrades that sweep through. These upgrades move the VMs around, so TEMP is reset. Do you know if this has changed?

If you enable auto failover, a restart of a node would cause that node to fail over. If you add the node back in through “add node” or with “add back with full recovery”, we will rebuild the data replicas and views. However, this method would impact both the views and the data that reside on the node, and you won’t be able to isolate the impact to just views.

Would that work?

#3

Thanks for the reply cihangirb. So if I understand what you’re saying, using auto-failover would hard-fail the node if Azure forced a shutdown. Then we could re-add the node using the “add back with full recovery” option.

To confirm, the “add back with full recovery” option basically deletes and recreates all the data on the failed node using replica data from the other nodes. That would work, but we find that the node recovery/re-balance operations are very expensive in our environment (meaning they take a long time, and there is downtime for write operations during the recovery).

Also, there’s an ever so slight chance that all our nodes in Azure could go down at the same time. Azure says that’s not supposed to happen if you properly group your servers in a cloud service, however it indeed happened to us in the West region last fall.

To make sure I understand how the indexing engine works: if the couchbase service is restarted, and the index files don’t exist when the couchbase service is starting, will couchbase attempt to rebuild them or will the service hang?

#4

I worked on Azure during its incubation at Microsoft, so it has been a while since I looked under the hood, but failure of all nodes at once should not happen as long as the upgrade and failure domains are assigned correctly to your service. I know there are simpler concepts in the new fabric that let you express this more easily as well, but I am likely out of date on the latest developments on Azure.

We do not test this condition heavily in our suites. The method I described is a supported path to rebuild the index file. If we have lost the directories, the engine will try to compensate and silently fix and rebuild but I would not recommend relying on this at all times.
Thanks,
-cihan