we will need to do regular updates on data, and thinking to use UPDATE query to do so. that involves the number of documents from 10 to 20k or more per query. the number of docs will vary. is there any concerns or suggestions or issue with performance that anyone can think of or know about?
would it be any performance issue if we run the UPDATE query with 20k or more documents per time? or we should batch them into say, 5000 per query…
thanks in advance!
example of query can be:
set archived = true
where published_date = ‘20220101’;
N1QL UPDATE will be full document update (i.e. get the whole document KV, compute , write again with cas). Depends on load it can take time.
If 10K it should be fine. If need you can use LIMIT and repeat with request_plus same until 0 mutations.
If you have huge documents and with EE you can use eventing Examples: Using the Eventing Service | Couchbase Docs
Thanks for your prompt reply, @vsr1. If the document sizes are not that big, is any benefits to use eventing service? or using N1QL is fine?
N1QL should be fine. try out
@vsr1 just curious. a JSON field, which definition is better for index wise? will “null” work well for indexing? thanks.
where ifmissingornull(message, 0) != 0
where ifmissingornull(message, null) is not null
First you choose xxx such that it is not part of the documents. example you expect message is string then choose number. avoid null because it can be part of documents.
If you can use index condition then use where ifmissingornull(message, xxx) != xxx
If it must be in index key then use ifmissingornull(message, xxx)