Array operation slow of n1ql

s.tedrahedron · March 13, 2018, 2:58am

Hi
I don’t know why my query is so slow when I unnest an array. I am using couchbase-server 5.0.
I have a bucket called “bkarr”. The structure of data is very simple.

{
  "s":  an interger 
  "arr":[  {"t":1, "v":2}, {"t":2, "v":9} ... ]
}

The length of “arr” is 1000,000. Yes, it’s somehow large.
An index is created on “s”.
I inserted 100 entries.
However the following query is not as fast as I excepted.

select s, array i for i in arr when i.t < 10 end
from bkarr
where s is not missing
limit 1

It took 4 seconds.

Also the query blow never give a response.

select count(items),sum(items.v)
from bkarr
unnest arr as items
where bkarr.s = 60

I should select the “arr” from the row with s = 60 right?
Calculating the sum should not be that slow.

vsr1 · March 13, 2018, 3:10am

ARRAY construct is looping operation. i.e. it is needs to loop Million times and construct new array depends on qualifying condition. That is why it takes long.

Unnest is nothing but JOIN between original document with each element.
In your case the original document need to repeat Million times and add the new field one at a time.
As your original document itself so big repeating same document again Million times will take time on top you are doing aggregation.

Is bkarr.s = 60 is fixed or the value changes time to time.

s.tedrahedron · March 13, 2018, 3:15am

bkarr.s = 60, 60 is a variable, it is an input value.

vsr1 · March 13, 2018, 3:20am

Try this without Unnest

CREATE INDEX ix1 ON bkarr(s);
select SUM(ARRAY_COUNT(a)), SUM(ARRAY_SUM(a))
FROM bkarr
LET a = ARRAY i.v FOR i IN arr END
WHERE bkarr.s = 60;

s.tedrahedron · March 13, 2018, 3:26am

Yes, It becomes much faster.
However my goal is to get “group by” against the array, like below:

select count(items),sum(items.v)
from bkarr
unnest arr as items
where bkarr.s = 60
group by round(item.t / 1000)

I don’t know how to achieve it without using unnest. Is there any solution?

vsr1 · March 13, 2018, 3:48am

Using subquery expressions without Unnest. Doing grouping of array for each row and then doing across rows.

CREATE INDEX ix1 ON bkarr(s);
SELECT d.g, SUM(d.a) , SUM(d.s)
FROM ARRAY_FLATTEN(  (SELECT  RAW (SELECT g, COUNT(1) AS a , SUM(a.v) AS s
                                 FROM b.arr AS a LET g = ROUND(a.t/100)
                                 GROUP BY g)
                     FROM bkarr AS b
                     WHERE  b.s = 60)
     , 2) AS d
GROUP BY d.g;

s.tedrahedron · March 13, 2018, 3:52am

Thanks, it works fine now.
Anyhow it seems that array is not a ideal solution for Millions datum.