Array operation slow of n1ql

n1ql

#1

Hi
I don’t know why my query is so slow when I unnest an array. I am using couchbase-server 5.0.
I have a bucket called “bkarr”. The structure of data is very simple.

{
  "s":  an interger 
  "arr":[  {"t":1, "v":2}, {"t":2, "v":9} ... ]
}

The length of “arr” is 1000,000. Yes, it’s somehow large.
An index is created on “s”.
I inserted 100 entries.
However the following query is not as fast as I excepted.

select s, array i for i in arr when i.t < 10 end
from bkarr
where s is not missing
limit 1

It took 4 seconds.

Also the query blow never give a response.

select count(items),sum(items.v)
from bkarr
unnest arr as items
where bkarr.s = 60

I should select the “arr” from the row with s = 60 right?
Calculating the sum should not be that slow.


#2

ARRAY construct is looping operation. i.e. it is needs to loop Million times and construct new array depends on qualifying condition. That is why it takes long.

Unnest is nothing but JOIN between original document with each element.
In your case the original document need to repeat Million times and add the new field one at a time.
As your original document itself so big repeating same document again Million times will take time on top you are doing aggregation.

Is bkarr.s = 60 is fixed or the value changes time to time.


#3

bkarr.s = 60, 60 is a variable, it is an input value.


#4

Try this without Unnest

CREATE INDEX ix1 ON bkarr(s);
select SUM(ARRAY_COUNT(a)), SUM(ARRAY_SUM(a))
FROM bkarr
LET a = ARRAY i.v FOR i IN arr END
WHERE bkarr.s = 60;

#5

Yes, It becomes much faster.
However my goal is to get “group by” against the array, like below:

select count(items),sum(items.v)
from bkarr
unnest arr as items
where bkarr.s = 60
group by round(item.t / 1000)

I don’t know how to achieve it without using unnest. Is there any solution?


#6

Using subquery expressions without Unnest. Doing grouping of array for each row and then doing across rows.

CREATE INDEX ix1 ON bkarr(s);
SELECT d.g, SUM(d.a) , SUM(d.s)
FROM ARRAY_FLATTEN(  (SELECT  RAW (SELECT g, COUNT(1) AS a , SUM(a.v) AS s
                                 FROM b.arr AS a LET g = ROUND(a.t/100)
                                 GROUP BY g)
                     FROM bkarr AS b
                     WHERE  b.s = 60)
     , 2) AS d
GROUP BY d.g;

#7

Thanks, it works fine now.
Anyhow it seems that array is not a ideal solution for Millions datum.