Indexing of deeply nested object


#1

I have deeply nested object. I calculate bunch of stats based on that object. => multiple separate N1QL queries in Eventing ( around 100 queries). The performance is very slow ~15-30 sec to create the output document. I am looking into indexes to improve performance.

Question:

  1. Is it better to create separate index for each query vs creating 1 index that covers all attributes and nested items?
  2. What is the best way to inspect performance of Eventing function, time it runs?

Thank you


#2

The following articles have information.

https://blog.couchbase.com/create-right-index-get-right-performance/
https://blog.couchbase.com/a-guide-to-n1ql-features-in-couchbase-5-5-special-edition/


#3
  1. Eventually, one index should be shared by more than one query. But, during experimentation create the ideal index for each query and then combine them using the principles (specially rule #11) in the article above.
  2. Best way to inspect performance of individual queries is to enable profiling and do it via the couchbase web-console. It has graphical way to see the performance easily.

More stuff on the way for analysis within N1QL in the upcoming release of Couchbase.

https://blog.couchbase.com/on-par-with-window-functions-in-n1ql/
https://blog.couchbase.com/json-to-insights-fast-and-easy/


#4

Golf sample is interesting. I see only 1 primary index in the article.

Question:

  1. Were there additional indexes to improve performance of that query?
    I have similar deeply nested structure and trying to boost the performance with Covered indexes. The challenge is that when query is complex and nested only primary index is picked up.

Query I am trying to add covered index:

SELECT RAW ROUND(AVG(result.speed),2) 
    FROM array_flatten(
		(SELECT RAW (SELECT  shots.speed
        FROM p.shot shots
        WHERE p.context.tournament.tournYear = 2019
			AND p.context.tournament.Id = 999
			AND p.context.match.tourId = 'MS023'
			AND shots.shotType='SERVE'
			AND p.type='point'
			AND p.shot is not missing)
    FROM points p
    ),2) as result;

Index command:

CREATE INDEX avg_speed ON points (context.tournament.tournYear, context.tournament.Id, context.match.tourId,
(ARRAY {s.shotType} FOR s in shot END)) WHERE (`type` = 'point' AND shot IS NOT MISSING);

When I run explain command, I see only primary index being used.
If I use explain command on inner part of query, I do see new index being used:

SELECT  shot
        FROM points p
        WHERE p.context.tournament.tournYear = 2019
			AND p.context.tournament.Id = 999
			AND p.context.match.tourId = 'MS023'
			AND p.type='point'
			AND p.shot is not missing

Question:
2. Is it possible to use index other than primary on deeply nested queries with deeply nested objects?


#5

You don’t have WHERE clause on FROM points p without it it uses primary index.

CREATE INDEX avg_speed ON points (context.tournament.tournYear, context.tournament.Id, context.match.tourId,
ARRAY_SUM(ARRAY shots FOR shots IN shot WHEN shots.shotType='SERVE' END)) WHERE (`type` = 'point' AND shot IS NOT MISSING)

SELECT  RAW ROUND(AVG(ARRAY_SUM(ARRAY shots FOR shots IN p.shot WHEN shots.shotType='SERVE' END)),2)
FROM points p
WHERE p.context.tournament.tournYear = 2019
      AND p.context.tournament.Id = 999
      AND p.context.match.tourId = 'MS023'
      AND p.type='point'
      AND p.shot is not missing);