I am mapping usecase to a publicly executable example using bucket
travel-sample. I am only considering documents with
type="route" for my usecase.
Let’s say that I have a Web portal where I am looking at all the flights. I have given following abilities to the user:
- Select any airline
- Select time range of the flight
- Order by minimum distance of the route possible for the flight
I want to show only one record of a flight. Let’s say I just fetch flight code and minimum distance for now. The query looks like
SELECT s.flight, MIN(t.distance) AS minimumDistance
FROM `travel-sample` AS t
USE INDEX (abc)
UNNEST schedule AS s
WHERE t.type = "route"
AND s.utc < "12:00:00"
GROUP BY s.flight
ORDER BY MIN(t.distance)
Running this command without any new index takes
5.6s. I created an optimized index for the query as follows, new run time is
CREATE INDEX abc ON `travel-sample`(ALL(schedule), airline, distance) WHERE type="route";
Is there a way to further optimize my query?
Will we be unable to index object elements inside an array? In the above example, you can see that I want to search for “utc” of the schedule. In the documentation, I see that we can only using element matches on nested arrays, not elements inside an object inside the array.
Have you tried running the index advisor on your query in the playground @vsr1 linked to?
A suggested index may be:
CREATE INDEX adv_ALL_schedule_utc_airline_type ON `travel-sample`(ALL ARRAY `s`.`utc` FOR s IN `schedule` END,`airline`) WHERE `type` = 'route'
which does index the object fields of the array elements.
Does such an index help with your query’s performance?
No, this index does not help the query. The time taken is still 2.1s.
The main problem is that the index suggestion is taking care of individual element in the array, effectively using array comparison. I need to use more than 1 element at once. I am filtering by utc, but grouping by flight. In my actual usecase, I am also using multiple filters on the array elements.
At present Array Indexing key can only index one element only.
If you need more than one those must be done post indexScan.
Also checkout Composite index from an array of objects
Thanks. This suggestion worked enough to control the timing, although I wouldn’t say that the query is fully optimized.