Bulk / batch update for millions of records and handling exceptions

Hi CB,
I am doing some analysis where I have to update a single field in millions of records fetched with a N1QL query.

  • For performance reasons, I would prefer not to update one document at a time, so I am thinking of performing a bulk / batch update.
  • What happens if a CB exception occurs while updating the documents? For example, if the update is being performed on 5 records and an exception occurs, do we need to re-run the whole operation, or is there a way to handle it?
  • One more scenario: if another process inserts or updates the same document at the same time, can we lock the document …?

So far I have read up on this and found the info below, but I am not sure about it.
Any suggestions or input would be really appreciated.
https://developer.couchbase.com/documentation/server/3.x/developer/java-2.0/documents-bulk.html

Hi Mehul,

The Java SDK approach documented in the link will provide great performance and, as you have rightly pointed out, will need to be expanded to handle any update exceptions. We plan to expand the example in the future.
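
Until the documented example is expanded, here is a rough, language-neutral sketch (written in Python, not with the Couchbase SDK) of one way per-document failures in a bulk update could be handled: process the ids in batches, record which ids failed, and retry only those instead of re-running the whole operation. The names `bulk_update`, `update_one`, `batch_size` and `max_attempts` are hypothetical and chosen for illustration; `update_one` stands in for whatever single-document update call your client provides.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def chunked(ids: List[str], size: int) -> Iterable[List[str]]:
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def bulk_update(
    ids: List[str],
    update_one: Callable[[str], None],  # hypothetical: updates one doc or raises
    batch_size: int = 1000,
    max_attempts: int = 3,
) -> Tuple[List[str], Dict[str, Exception]]:
    """Update every id, retrying only the ids that failed.

    Returns (succeeded_ids, {failed_id: last_exception}).
    """
    succeeded: List[str] = []
    failed: Dict[str, Exception] = {}
    for batch in chunked(ids, batch_size):
        pending = batch
        errors: Dict[str, Exception] = {}
        for _ in range(max_attempts):
            errors = {}
            for doc_id in pending:
                try:
                    update_one(doc_id)
                    succeeded.append(doc_id)
                except Exception as exc:  # a real client would catch narrower types
                    errors[doc_id] = exc
            if not errors:
                break
            pending = list(errors)  # next attempt covers only the failures
        failed.update(errors)  # whatever still fails after the last attempt
    return succeeded, failed


# Usage: "doc3" fails once with a simulated timeout, then succeeds on retry.
store = {f"doc{i}": {"status": "old"} for i in range(5)}
already_failed = set()


def flaky_update(doc_id: str) -> None:
    if doc_id == "doc3" and doc_id not in already_failed:
        already_failed.add(doc_id)
        raise TimeoutError("simulated transient failure")
    store[doc_id]["status"] = "new"


ok, bad = bulk_update(list(store), flaky_update, batch_size=2)
```

With this shape, the 5-record case from the question would only retry the record that raised, and anything left in `bad` after the last attempt can be logged or queued for later inspection.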

However, there is another approach that you can consider. It is shown in our Couchbase performance dashboard, which we run on a weekly basis. Here is the blog post that provides more detail about these tests: https://blog.couchbase.com/ycsb-json-benchmarking-json-databases-by-extending-ycsb/
The specific performance test case that is similar to what you are looking for is N1QL/YCSB/Workload-E: http://showfast.sc.couchbase.com/#/timeline/Linux/n1ql/ycsb/Plasma

In this test, the range query is randomized between 0…20M docs, and each returned document is then UPSERTed by meta().id. Tests are run for both Memory Optimized Index (MOI) and Plasma (Standard Storage Engine) with different cluster size configurations.
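
To make the access pattern concrete, here is a toy, in-memory illustration in Python. It is not the benchmark code and does not use N1QL or the SDK; it only mimics the two steps described above: a range scan starting at a random key, followed by an upsert of each returned document by its id. The function names, the `scan_len` parameter and the dict-based `store` are assumptions made for the sketch.

```python
import random
from bisect import bisect_left
from typing import Dict, List


def scan_ids(sorted_ids: List[str], start_id: str, limit: int) -> List[str]:
    """Return up to `limit` ids >= start_id, in key order (a range scan)."""
    pos = bisect_left(sorted_ids, start_id)
    return sorted_ids[pos:pos + limit]


def scan_and_upsert(
    store: Dict[str, dict],
    sorted_ids: List[str],
    field: str,
    value: object,
    scan_len: int,          # assumed scan length, not taken from the benchmark
    rng: random.Random,
) -> List[str]:
    """Pick a random start key, scan a short range, upsert each hit by id."""
    start_id = rng.choice(sorted_ids)
    hits = scan_ids(sorted_ids, start_id, scan_len)
    for doc_id in hits:
        doc = dict(store.get(doc_id, {}))  # copy so the write is explicit
        doc[field] = value
        store[doc_id] = doc                # upsert keyed by document id
    return hits


# Usage: 100 toy documents, one randomized scan of up to 10 ids.
store = {f"user{i:04d}": {"status": "old"} for i in range(100)}
sorted_ids = sorted(store)
touched = scan_and_upsert(store, sorted_ids, "status", "new", 10, random.Random(0))
```

Because each write is addressed by document id, a failure on one document does not invalidate the others, which is what makes this pattern easy to combine with per-document retry.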

Please reach out to me directly if you need any additional information.

-binh

Thank you so much. We will get back to you.