Performance on the IN operator

n1ql

#1

select * from bucket where field in [“a”,“b”,“c”]

vs

select * from bucket where field in [“a”,“b”,“c”, “d”, “e”, “f”, “g”, “h”, “i”, “j”, “k”, “l”, “m”, “n”, “o”, “p”, “q”, “r”, “s”, “t”, “u”, “v”, “w”, “x”, “y”, “z”, “1”, “2”, “3”, “4”, “5”, “6”, “7”, “8”, “9”, “10”]

has an substantial increase in performance, from 10ms to 100ms. is there any other way to optimize this query or write it differently?


#2

You can create index on field.
CREATE INDEX ix1 ON bucket(field);

If number of fields increased IndexScan needs to look for those values in index and results can increase which in turn increases latency. In addition In evaluated one by one.


#3

I already have an index, where its like this:
create index ix1 on bucket(blah,blahblah,some,other,things,field,morefield,something)

where field is the one im using for the in operator.

Are you saying i should have a separate index on field itself?

PS: The query is more like this:
select something from bucket where blah = ‘1’ and blahblah = ‘2’ and some = ‘3’ and other = ‘4’ and things = ‘5’ and field in [“a”,“b”,“c”, “d”, “e”, “f”, “g”, “h”, “i”, “j”, “k”, “l”, “m”, “n”, “o”, “p”, “q”, “r”, “s”, “t”, “u”, “v”, “w”, “x”, “y”, “z”, “1”, “2”, “3”, “4”, “5”, “6”, “7”, “8”, “9”, “10”] and morefield = ‘6’


#4

If you already have like above index is good. You can check EXPLAIN and profiling. There is no further optimization.

https://blog.couchbase.com/optimize-n1ql-performance-using-request-profiling/


#5

For those interested, i found an optimization that seems to work for now:

use CONTAINS

i.e.
select something from bucket where blah = ‘1’ and blahblah = ‘2’ and some = ‘3’ and other = ‘4’ and things = ‘5’ and CONTAINS(“abcdefghijklmnopqrstuvwxyz0123456789", field) and morefield = ‘6’


#6

If field is single character and want to match ascii character or number you can also try this
(filed BETWEEN “a” AND “z” OR field BETWEEN “0” AND “9”)