N1QL - Array of string that are keys to another documents

scheok · April 19, 2018, 12:46am

Hi all,

I went through the forum (and Google) trying to look for an example query but was not successful. The closest is related to a single value retrieved from an array. However, I want the entire array used for the key lookup.

Here is an example:

{
a : [“111”, “222”, “333”]
}

“111”, “222”, and “333” are document ID for other documents.

If I want to:
select * from default use keys (select raw a from default …);

I will receive the error:
“Missing or invalid primary key [111 222 333] of type []interface {}.”

If I then run:
select * from default use keys (select raw a[0] from default …);

Then this is equivalent to:
select * from default use keys (“111”);

Is there any way to use the entire array?

Thank you.
Regards, Steve

atom_yang · April 19, 2018, 4:17am

try this

SELECT META(a).id, a.* 
  FROM default b 
  JOIN default a ON KEYS (b.a)
 WHERE b.a IS NOT MISSING

vsr1 · April 19, 2018, 4:26am

@atom_yang suggestion is best one.

Your query returns error because a is ARRAY and subquery can produce multiple rows, So Subquery results are ARRAY of ARRAYs. USE KEYS expecting string or ARRAY of strings. You can FLATTEN ARRAY of ARRAYs to ARRAY as follows

select * from default use keys ARRAY_FLATTEN((select raw a from default …),2);

scheok · April 19, 2018, 4:01pm

Thank you both @atom_yang and @vsr1.

I tried @atom_yang suggestion, but do I need an index? (I’m trying to avoid that…) This was my intended follow-up question on whether or not I can simply JOIN the results together. However, I wasn’t sure if I would need an index on b.a (using the above example where the keys are). Since the query didn’t work, I did create a simple index (create index idx_test on default (a);) and that works. Is there a way to avoid the index?

@vsr1 suggestion will work without the index.

vsr1 · April 19, 2018, 4:13pm

Your bucket might have different type of documents. You need to aware of the what you are doing. Without secondary index it uses primary index to get keys (i.e subquery). It may impact the performance

scheok · April 19, 2018, 4:37pm

Correct, that bucket will indeed have different documents, and I do see your point.

If I were to change what @atom_yang provided to:

 SELECT META(a).id, a.* 
  FROM default b USE KEYS 'myKey1'
  JOIN default a ON KEYS (b.a)
 WHERE b.a IS NOT MISSING

The above will then work without an index.

Both query have different explain plan. @atom_yang query requires a join and filter. Your query uses an array function but not really reflected in the explain plan. Which would be more efficient? In the RDBMS side of things, we usually try to avoid function calls as it is not always scalable (e.g., when you start to have too many b.a (from above or just too many values) whereas JOINs will usually get your answer quicker/more efficient.

vsr1 · April 19, 2018, 4:40pm

EXPLAIN will not include subqueries at present (Unless those are in FROM clause).
I would prefer @atom_yang approach, But you can try both and see which one performs better for your data model and size of data in the bucket.