N1QL - Array with Function Base Index


#1

Using the travel-sample, here is portion of the route document:

{
  "airline": "AS",
  "airlineid": "airline_439",
  "destinationairport": "PBG",
  "distance": 322.6135705757482,
  "equipment": "SF3",
  "id": 11600,
  "schedule": [
    {
      "day": 0,
      "flight": "AS783",
      "utc": "22:15:00"
    },
    {
      "day": 0,
      "flight": "AS337",
      "utc": "13:03:00"
    },
    {
      "day": 0,
      "flight": "AS834",
      "utc": "16:09:00"
    },
    {
      "day": 1,
      "flight": "AS228",
      "utc": "23:53:00"
    },
...
...
...
}

What I want to do is index both the “day” and “flight” but “flight” needs to be case insensitive. What I have tried is to create this index:

create index index2 ontravel-sample(distinct array [day, lower(flt.flight)] for flt in schedule END) where type = "route" using gsi;

However, this query does not return anything (e.g., not even the above document):

select * fromtravel-sampleuse index (index2 using gsi) where (any flt in schedule satisfies [day, lower(flt.flight)] = [0, "as783"] END) and type = "route";

Also, what if I want to extend this so that the flight can be partially searched? For example, the N1QL would then look similar to:

select * fromtravel-sampleuse index (index2 using gsi) where (any flt in schedule satisfies [day, lower(flt.flight)] = [0, "as7%"] END) and type = "route";

It’s like I’m trying to use a “LIKE” but in a situation that doesn’t allow me to?


#2

both in create index and query it needs flt.day because day coming from flt.

create index index2 on `travel-sample`(distinct array [flt.day, lower(flt.flight)] for flt in schedule END)
 where type = "route";
select * from `travel-sample` use index (index2 using gsi) 
where (any flt in schedule satisfies [flt.day, lower(flt.flight)] = [0, "as783"] END) and type = "route";

The strategy used above can be used doing array comparison and you can’t use for LIKE. LIKE can be used only on strings.

If you use following IndexScan done on wide range of keys but also applying on individually it eliminates any false positives.

x LIKE "as7%"  means  x >="as7" AND  x < "as8"
x >=  like_prefix("as7%") AND x <  like_stop("as7%")
select like_prefix("as7%"), like_stop("as7%");

select * from `travel-sample` use index (index2 using gsi)
 where (any flt in schedule satisfies [flt.day, lower(flt.flight)] >=  [0, "as7"] AND  [flt.day, lower(flt.flight)] <  [0,"as8"] AND flt.day = 0 AND lower(flt.flight) LIKE  "as7%" END) and type = "route";

As array index can be used as single key index you can index one key from array others can be appiled post index scan.

 create index index2 on `travel-sample`(distinct array lower(flt.flight) for flt in schedule END)
     where type = "route";
    select * from `travel-sample` use index (index2 using gsi) 
    where (any flt in schedule satisfies flt.day= 0 AND  lower(flt.flight) LIKE "as7%" END) and type = "route";

#3

Thanks @vsr1.

The trick with the x >= "as7 " and x < "as8" is quite neat (or even extend that to non-numbers/alphabets) and even the explain plan is similar for the index scan. However, that might make the query quite long/hard to maintain. What was the reason for the inclusion of the last AND lower(flt.flight) LIKE "as7%"?


#4

Array comparison vs individual field comparisons are different you don’t want to get wrong results.
Example: [1,“a3”] > [0,“a7”] is true because when comparing arrays it compares 0th elements if condition is true/false it never compares rest of elements (only it moves to next element when equal)
In this case the index keys and values that pushed to indexer needs to be superset and other condition applies individual parts and eliminates those extra items produced by indexscan

x = 1 , y = "a3"
[x,y] >= [0,"a7"] is true
x >= 1 AND y > "a7" is false
Index key is as array  the array comparison predicate is used to do indexselection/indexscan
which will have  false positives and individual predicates will eliminate those false positives.

Also check out WHEN clause (Example C2,Q2 in the following link) https://developer.couchbase.com/documentation/server/current/n1ql/n1ql-language-reference/indexing-arrays.html