Query on indexed array gives unexpected result

Dear all,

I have two similar queries for which I get different (unexpected) result.
I try to be clear as possible. The 2 queries have the same field “hit by” the where condition, that is an array called searckKeyA.

query 1 - results OK, but slow
select * from products where "photofast" in searchKeyA;

query 2 - results KO, but fast, indexed
select * from products where ANY v IN searchKeyA satisfies v = "photofast" end;

With the query word “photofast” the first query fetches 3 documents (that’s OK), the second only 1.
I double-checked to be sure that the array field “searchKeyA” contains “photofast” in every of the 3 documents.

I think the reason may be hidden beyond the array index I built on searchKeyA.

create index idx on products(all array v FOR v IN searchKeyA END, searchKeyA) using gsi;

(older index, same result - using "distinct")
create index idx on products(distinct array v FOR v IN searchKeyA END, searchKeyA) using gsi;

The plans of the 2 queries are different because of this specific index I described above.
Query 1 is executed as a PrimaryScan; Query 2 has the following plan instead:

"plan": {
  "#operator": "Sequence",
  "~children": [
    {
      "#operator": "DistinctScan",
      "scan": {
        "#operator": "IndexScan",
        "index": "idx",
        "index_id": "ee3c38798f820ce6",
        "keyspace": "products",
        "namespace": "default",
        "spans": [
          {
            "Exact": true,
            "Range": {
              "High": [
                "successor(\"photofast\")"
              ],
              "Inclusion": 1,
              "Low": [
                "\"photofast\""
              ]
            }
          }
        ],
        "using": "gsi"
      }
    },
    {
      "#operator": "Fetch",
      "keyspace": "products",
      "namespace": "default"
    },
    {
      "#operator": "Parallel",
      "~child": {
        "#operator": "Sequence",
        "~children": [
          {
            "#operator": "Filter",
            "condition": "any `v` in (`products`.`searchKeyA`) satisfies (`v` = \"photofast\") end"
          },
          {
            "#operator": "InitialProject",
            "result_terms": [
              {
                "expr": "self",
                "star": true
              }
            ]
          },
          {
            "#operator": "FinalProject"
          }
        ]
      }
    }
  ]
},
"text": "\nselect *\nfrom products\nwhere ANY v IN searchKeyA satisfies v = \"photofast\" end;"

Shouldn’t the result set be the same for the 2 queries?
Can you tell me what makes this difference? Any suggestion would be appreciated.

PS: the 2 queries give the same result set (3 rows, correct) when I drop “idx”

Thank you, my best regards.

Stefano

This may help you.
I queried for searchKeyA of the 3 documents I need to fetch. When using “idx”, only the 3rd document is retrieved.

[
  {
	"searchKeyA": [
	  "nilox",
	  "chiavette",
	  "usb",
	  "chiavetta",
	  "photofast",
	  "memoria",
	  "esterna",
	  "per",
	  "smartphone",
	  "128gb",
	  "mcg3u3bk128gb"
	]
  },
  {
	"searchKeyA": [
	  "nilox",
	  "chiavette",
	  "usb",
	  "chiavetta",
	  "photofast",
	  "memoria",
	  "esterna",
	  "per",
	  "smartphone",
	  "32gb",
	  "mcg3u3bk32gb"
	]
  },
  {
	"searchKeyA": [
	  "nilox",
	  "lettori",
	  "memory",
	  "card",
	  "lettore",
	  "photofast",
	  "per",
	  "smartphone",
	  "4kireader"
	]
  }
]

Check see if index caught up and no pending items.
Check the indexer.log any skipped items due too size of index keys.

Dear @vsr1,

Thank you for your suggestion.
If I try to access the indexer.log file I get a permission denied error, even using linux bash.

Stefano

Check with user who had permission to that files. @deepkaran.salooja

Hi @deepkaran.salooja ,

I don’t have any permission to read file starting from path /opt/couchbase/var/ , even if folder has read+write+execute permission on it.

No problems at all on other folders inside /opt/couchbase/ : I can easily read, download, copy files, etc…

Let me know if it’s better to move the topic into a new one, so we can help other devs…

Thank you,

Stefano

Look for any skipped documents in the indexer. It will log documents meta().id and with info

Post the sample document, Index,Query

Hi @vsr1,

Unfortunately I have 2 replicas and in the first checks on log files I didn’t see anything because my index was on node :31 and indexer.log logs only node:35.
Sorry about that, I made you lose your time.

Now, I moved the index also on node:35 and I can finally see 7762 Error rows:

[Error] Encoded array key too long. Length of key = 31028, Limit = 30978

Can I increase index dimensions? :slight_smile:

My very best regards, thank you again for your support.

Stefano

@deepkaran.salooja will help here how to increase keysize.

One option will be make remove serachKeyA from index. This may make query uncovered.
create index idx on products(distinct array v FOR v IN searchKeyA END) using gsi;

1 Like

@stefano.redaelli, what version of Couchbase are you using?

Hello @deepkaran.salooja,

Thank you for your reply, I’m using 4.6.2-3905 Enterprise Edition (build-3905)

Stefano

@stefano.redaelli, In that case you can remove searchKeyA from the index as separate key. Checkout Implicit Covered Array Index https://developer.couchbase.com/documentation/server/current/n1ql/n1ql-language-reference/indexing-arrays.html

Thank you @deepkaran.salooja,

I’m trying it now.

Stefano

Dear @vsr1, @deepkaran.salooja,

Now the query result is ok and fast :smile:
Thank you for making me aware of index dimensions and how to check the log.

My best regards, have a nice day

Stefano