Issues with USE KEYS at high load

Hello,
If I have an array of doc IDs, which of the options below is better?

  1. Create a for loop to get all docIds using bucket.Get
  2. Use N1QL with USE KEYS array
  3. Use N1QL with meta().id IN array

We recently moved from option 1 to option 2, assuming Couchbase's internal implementation would be better than ours. However, at peak load we are getting a lot of errors like the one below.

log.Println("query error: ", err.Error())

2021/11/11 20:30:32 query error | {"statement":"select meta().id, type, title, subtitle, coverImage, `language`, shortDescription, amount, isUpcoming, isDraft, pages, copiesTaken, paperBookURL, showPaperBookInApp  FROM vidya USE KEYS $keys","client_context_id":"ba4b3711-e57d-4a46-a438-d525466c05a9"}

Is there a better way to find out why there is a query error?

Below is the vmstat output from the time we get the error (plenty of CPU and RAM to spare):

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 1691320 325504 8293900    0    0     0  1004 5993 8817  8 11 75  0  5

We are using Couchbase 6.6.6 CE and GoSDK 2.x

Please guide us on how we can avoid this issue during a sudden burst.

If you already have the document keys and are not doing aggregates or joins, then use option 1 with asynchronous gets in parallel.

With option 2, the data needs to go through 2 hops (Data node to Query node to client), versus 1 hop with option 1 (Data node to client).
Option 3 requires an index scan in addition to everything option 2 does.
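A minimal sketch of what "option 1 with gets in parallel" could look like, using a bounded pool of goroutines. It uses only the standard library so it runs as-is: getFunc and newMapGetter are hypothetical stand-ins, and with the Go SDK 2.x the getter would wrap a call such as collection.Get(id, nil). The worker count of 2 is an arbitrary choice for the demo, not a recommendation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// getFunc is a hypothetical stand-in for a key-value lookup.
// With gocb v2 it would wrap collection.Get(id, nil) and decode the content.
type getFunc func(id string) (string, error)

// result pairs a document ID with its value or the error from the lookup.
type result struct {
	ID    string
	Value string
	Err   error
}

// newMapGetter builds a getFunc backed by an in-memory map (demo only).
func newMapGetter(store map[string]string) getFunc {
	return func(id string) (string, error) {
		if v, ok := store[id]; ok {
			return v, nil
		}
		return "", errors.New("document not found")
	}
}

// fetchAll looks up every ID using at most `workers` concurrent calls to get.
// Results come back in the same order as ids.
func fetchAll(ids []string, workers int, get getFunc) []result {
	if workers < 1 {
		workers = 1
	}
	results := make([]result, len(ids))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				v, err := get(ids[i])
				// Each goroutine writes to a distinct index, so no lock is needed.
				results[i] = result{ID: ids[i], Value: v, Err: err}
			}
		}()
	}
	for i := range ids {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	get := newMapGetter(map[string]string{"book::1": "A", "book::2": "B"})
	for _, r := range fetchAll([]string{"book::1", "book::404", "book::2"}, 2, get) {
		fmt.Println(r.ID, r.Value, r.Err)
	}
}
```

Capping the number of workers keeps a sudden burst from opening an unbounded number of in-flight requests, and a missing key only fails its own entry instead of the whole batch.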

You did not mention what the error is.
Also, the number of cores for the Query service is limited to 4 in the CE version, which can increase query latency. With option 1 there is no such limit, and it also frees up the Query service for other queries.

https://docs.couchbase.com/server/current/introduction/editions.html

Query Node: The Community Edition comes with limited concurrency and parallelism and supports a maximum of 4 cores per node.

Thanks for quick reply.

Even I am not sure what the error is. I am printing err.Error(), where err is returned from the cluster.Query function of the Couchbase Go SDK.
That err.Error() prints just the statement. How do I get the actual error?

Currently max parallelism for Query is set to 1. Does that mean it will only use 1 core?

No.
The Query service is a multithreaded application and uses 4 cores. You can run many statements in parallel. Check out GOMAXPROCS in the Go runtime package.
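For reference, a small sketch of what GOMAXPROCS looks like from inside any Go program. It only shows the runtime setting that caps how many OS threads execute Go code at once; how the CE Query service applies its 4-core limit internally is not shown here, and schedulerInfo is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerInfo reports the logical CPUs visible to the process and the
// current GOMAXPROCS value. Passing 0 queries the setting without changing it.
func schedulerInfo() (cpus, maxProcs int) {
	return runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	cpus, procs := schedulerInfo()
	fmt.Printf("logical CPUs: %d, GOMAXPROCS: %d\n", cpus, procs)
}
```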

max parallelism is different. It tells the service, where possible, to run certain operations of a given query in parallel. Even a single query runs certain operations in parallel.

For example, depending on the query, the service divides the work into various units (check out the operators in EXPLAIN: Filter, InitialProject, …). Each operator runs on a separate thread, and at any given time an operator works on a single document, but all operators run in parallel. max_parallelism tells how many copies of an operator can run simultaneously; the default is 1.
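To make the operator idea concrete, here is a conceptual sketch in Go. It is not the Query service's code: the doc type and the fetch, filter and project functions are hypothetical, and the `copies` argument only plays the role that max_parallelism plays in the description above. Each stage runs in its own goroutine and hands one document at a time to the next stage.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// doc is a hypothetical document shape used only for this illustration.
type doc struct {
	ID    string
	Title string
	Draft bool
}

// fetch emits documents one at a time on its own goroutine.
func fetch(docs []doc) <-chan doc {
	out := make(chan doc)
	go func() {
		defer close(out)
		for _, d := range docs {
			out <- d
		}
	}()
	return out
}

// filter runs `copies` goroutines of the same operator over one input stream.
func filter(in <-chan doc, copies int, keep func(doc) bool) <-chan doc {
	if copies < 1 {
		copies = 1
	}
	out := make(chan doc)
	var wg sync.WaitGroup
	for i := 0; i < copies; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for d := range in {
				if keep(d) {
					out <- d
				}
			}
		}()
	}
	// Close the output only after every copy has drained the input.
	go func() { wg.Wait(); close(out) }()
	return out
}

// project keeps only the title of each document.
func project(in <-chan doc) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for d := range in {
			out <- d.Title
		}
	}()
	return out
}

// runPipeline wires the three operators together and collects the output.
func runPipeline(docs []doc, copies int) []string {
	var titles []string
	notDraft := func(d doc) bool { return !d.Draft }
	for t := range project(filter(fetch(docs), copies, notDraft)) {
		titles = append(titles, t)
	}
	sort.Strings(titles) // arrival order is not deterministic when copies > 1
	return titles
}

func main() {
	docs := []doc{{"b1", "Gita", false}, {"b2", "Draft", true}, {"b3", "Veda", false}}
	fmt.Println(runPipeline(docs, 1))
	fmt.Println(runPipeline(docs, 3))
}
```

With copies set to 1 the stages still overlap in time; raising it only adds more instances of the one stage, which is the distinction between "uses 1 core" and "max parallelism 1".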


This is maybe an oversight in the SDK, where we just return “query error” if we cannot parse the response body in the expected JSON format when query responds with an error HTTP status code. I actually ran into this problem myself a couple of weeks ago whilst working on something and I have already raised an issue for it. This is currently scheduled for the December SDK release but I’ll try to get it into this week’s release if I can.

If you can enable SDK verbose (DEBUG) logging then you will at least be able to see the status code that query is returning to us.
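Separately from logging, the general Go pattern for getting at structured detail instead of the flat err.Error() string is errors.As. The sketch below uses a hypothetical stand-in type, queryError, so that it runs with the standard library alone; its fields are assumptions for illustration. As far as I know gocb v2 exports its own QueryError type that would be the target in real code, but check the SDK reference for its exact fields. In the case described above the detail list would simply come back empty, since the response body could not be parsed.

```go
package main

import (
	"errors"
	"fmt"
)

// queryError is a HYPOTHETICAL stand-in for an SDK query error type.
// The field names are assumptions, not the SDK's definition.
type queryError struct {
	Statement       string
	ClientContextID string
	Details         []string // server-side error messages, if any were parsed
}

func (e *queryError) Error() string {
	return "query error | " + e.Statement
}

// describe unwraps err and reports whatever structured detail is available.
func describe(err error) string {
	var qe *queryError
	if errors.As(err, &qe) {
		if len(qe.Details) == 0 {
			return fmt.Sprintf("query failed with no server detail (context id %s)", qe.ClientContextID)
		}
		return fmt.Sprintf("query failed: %v (context id %s)", qe.Details, qe.ClientContextID)
	}
	return "non-query error: " + err.Error()
}

func main() {
	base := &queryError{Statement: "SELECT 1", ClientContextID: "abc"}
	// errors.As still finds the typed error after it has been wrapped.
	wrapped := fmt.Errorf("loading books: %w", base)
	fmt.Println(describe(wrapped))
	fmt.Println(describe(errors.New("timeout")))
}
```

Logging the client context ID alongside the message makes it easier to match a failed call against the SDK debug log.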


Hi @prvaghasiya,

Although you currently use CE, in EE there is an additional service called Eventing, which by default responds to every mutation (INSERT, UPDATE, and DELETE) in real time on any document in a given bucket (or collection in >= 7.0) by running a JavaScript lambda.

You say you're using 6.6, so here is an Eventing Function and some typical 6.6 performance numbers (the differences can be quite dramatic). The numbers are from a single 12-core on-prem node running all services.

// 2020-09-28T06:29:06.674-07:00 [INFO] "Deleted from travel-sample via src_bucket KV op took 1 ms."
// 2020-09-28T06:29:07.094-07:00 [INFO] "Deleted from travel-sample via src_bucket N1QL USE KEYS 420 ms."
// 2020-09-28T06:29:08.063-07:00 [INFO] "Deleted from travel-sample via src_bucket N1QL where clause on type and id 969 ms."
// 2020-09-28T06:29:09.008-07:00 [INFO] "Deleted from travel-sample via src_bucket N1QL where clause on id 945 ms."

function OnUpdate(doc, meta) {
    if (doc.type !== 'route') return;
    var tbeg; // start timestamp for each timed operation

    tbeg = Date.now();
    delete src_bkt["route_10001"]; // 1 ms.
    log(`Deleted from travel-sample via src_bucket KV op took ${Date.now() - tbeg} ms.`);

    tbeg = Date.now();
    N1QL("DELETE FROM `travel-sample` USE KEYS ('route_10002')"); // 420 ms.
    log(`Deleted from travel-sample via src_bucket N1QL USE KEYS ${Date.now() - tbeg} ms.`);

    tbeg = Date.now();
    N1QL("DELETE FROM `travel-sample` WHERE type = 'route' AND id = " + 10003); // 969 ms.
    log(`Deleted from travel-sample via src_bucket N1QL where clause on type and id ${Date.now() - tbeg} ms.`);

    tbeg = Date.now();
    N1QL("DELETE FROM `travel-sample` WHERE id = " + 10004); // 945 ms.
    log(`Deleted from travel-sample via src_bucket N1QL where clause on id ${Date.now() - tbeg} ms.`);
}

Eventing supports direct lookup by keys in 6.6 via an alias to a bucket that is exposed as a simple JavaScript map (see src_bkt in the example above) and also integrates N1QL, giving you the “best of both worlds” if your application is architected to leverage the asynchronous, real-time nature of the Eventing service.

By all means look at the 6.6 Eventing documentation, "Eventing Service: Fundamentals" in the Couchbase Docs (and read through an example or two), to determine if the flexibility of Eventing might be a reason to upgrade. If you have any lingering questions feel free to DM me.

Best

Jon Strabala
Principal Product Manager - Server