Optimize N1ql query to match embedded array elements

ashish_pathak · March 9, 2021, 11:57am

Hi,

I have documents like below:
{
“type”: “tp1”,
“colors”: [
“Blue:10”,
“Yellow:20”,
“Green:*”,
],
}
The elements in the colors array are not fixed and may increase or decrease.

I am creating below N1QL query:

SELECT b1.* from bucket as b1 where type = “tp1” AND (ARRAY_CONTAINS(b1.colors, “Blue:10”) OR ARRAY_CONTAINS(b1.colors, “Blue:")) AND (ARRAY_CONTAINS(b1.colors, “Yellow:10”) OR ARRAY_CONTAINS(b1.colors, "Yellow:”)) AND (ARRAY_CONTAINS(b1.colors, “Green:10”) OR ARRAY_CONTAINS(b1.colors, “Green:*”))

The index created for this is:
CREATE INDEX idx ON bucket(colors) WHERE (type = “tp1”)

Couchbase version 6.5.1

This query is slow and I want to know if there is a faster way to get the documents?
In addition, I have the privilege of modifying ‘colors’ array to optimize performance. Can you suggest?

vsr1 · March 9, 2021, 2:14pm

CREATE INDEX `idx` ON `bucket` ( DISTINCT `colors` ) WHERE ( `type` = “tp1”);
SELECT b1.* 
FROM  bucket AS b1
WHERE b1.type = "tp1" AND ANY v IN b1.colors SATISFIES v  IN [ "Blue:10", "Blue"] END

https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/indexing-arrays.html

ashish_pathak · March 9, 2021, 4:44pm

let’s say we have docs like
{
“type”: “tp1”,
“colors”: [
“Blue:10”,
“Yellow:20”,
“Green:*”,
],
}

and another doc as
{
“type”: “tp1”,
“colors”: [
“Blue:10”,
“Yellow:20”,
“Green:20”,
],
}

and in API i/p we have Blue:10, Yellow:20 and Green:40. since we query as
SELECT b1.* from bucket as b1 where type = “tp1” AND (ARRAY_CONTAINS(b1.colors, “Blue:10”) OR ARRAY_CONTAINS(b1.colors, “Blue: ")) AND (ARRAY_CONTAINS(b1.colors, “Yellow:20”) OR ARRAY_CONTAINS(b1.colors, "Yellow: ”)) AND (ARRAY_CONTAINS(b1.colors, “Green:40”) OR ARRAY_CONTAINS(b1.colors, “Green:*”)) we expect the 1st document to be retrieved.

We basically need to match the whole array, and want a more efficient query.

vsr1 · March 9, 2021, 8:01pm

CREATE INDEX `idx` ON `bucket` ( DISTINCT `colors` ) WHERE ( `type` = “tp1”);

SELECT b1.* 
FROM  bucket AS b1
WHERE b1.type = "tp1"
      AND ANY v IN b1.colors SATISFIES v  LIKE "Blue:%" END
     AND EVERY v1 IN  ["Blue:%","Yellow:%", "Green:%"] SATISFIES (ANY c IN b1.cloros SATISFIES c LIKE v1 END) END;

use one of the color in ANY v IN b1.colors SATISFIES v LIKE “Blue:%” END use prefix search, So that it can use indexScan. After that use EVERY clause to match all the possible colors to search. Adjust conditions based on what you want

ashish_pathak · March 10, 2021, 7:15am

Hi,

Thanks you for the reply.

We checked the query above and it fails to satisfy our requirement as it returns both documents.
Since we have Green: 40 in input, our requirement is to retrieve only the doc that has Green:*.
* doesn’t mean any field value but a field that actually exists with *.

vsr1 · March 10, 2021, 3:36pm

CREATE INDEX `idx` ON `bucket` ( DISTINCT `colors` ) WHERE ( `type` = "tp1");

SELECT b1.* 
FROM  bucket AS b1
WHERE b1.type = "tp1"
      AND ANY v IN b1.colors SATISFIES v  LIKE "Blue:%" END
      AND (ARRAY_CONTAINS(b1.colors, “Blue:10”) OR ARRAY_CONTAINS(b1.colors, “Blue: ")) AND (ARRAY_CONTAINS(b1.colors, “Yellow:20”) OR ARRAY_CONTAINS(b1.colors, "Yellow: ”)) AND (ARRAY_CONTAINS(b1.colors, “Green:40”) OR ARRAY_CONTAINS(b1.colors, “Green:*”))