Index specific items inside an array not working

patrick.kramer · February 14, 2019, 5:03pm

I am trying to index specific fields inside of an array instead of the entire array. I have the following index:

create index `idx_aggregates` ON `ug01`
  (
  	array
  	{
  		a.avg, 
  		a.`start`, 
  		a.`end`,
  		a.measurement
  		
  	} for a in aggregates when a.measurement = 't' end	
  )
  where (type = "aggregates" and aggregates is valued);

select meta(doc).id as docId,
array { a.avg, a.measurement} for a in aggregates end as aggs
from ug01 doc
where doc.type = 'aggregates' and doc.aggregates is valued and a.measurement = 't'

When running that query, I am getting : "No index available on keyspace ug01 that matches your query. "

If I add the entire “aggregates” array to the index, the query works, but I was trying to avoid doing that.

What am I doing wrong here?

vsr1 · February 14, 2019, 5:17pm

The index does not qualify. Index key must be present in the predicate. This is complex not sure there is easy way to do that.

CREATE INDEX `idx_aggregates` ON `ug01` (
        ARRAY_LENGTH(ARRAY 1 FOR a IN aggregates WHEN a.measurement = 't' END),
        ARRAY { a.avg, a.`start`, a.`end`, a.measurement } FOR a IN aggregates WHEN a.measurement = 't' END
  ) WHERE (type = "aggregates");

SELECT META(doc).id AS docId,
       ARRAY { a.avg, a.`start`, a.`end`, a.measurement } FOR a IN doc.aggregates WHEN a.measurement = 't' END
FROM ug01 AS doc
WHERE doc.type = 'aggregates' AND
      ARRAY_LENGTH(ARRAY 1 FOR a IN doc.aggregates WHEN a.measurement = 't' END) > 0;

patrick.kramer · February 14, 2019, 5:28pm

Is there a better way to do what I am trying to do? It seems like this would be a common thing to do without the array_length code

vsr1 · February 14, 2019, 5:35pm

Index selection is based on predicate you need to write predicate that can be compared to relation operator to produce bool. There is way you can do with your ARRAY key.

If you don’t want covering index you can do this.

CREATE INDEX `idx_aggregates` ON `ug01` (   DISTINCT ARRAY a.measurement FOR a IN aggregates END  ) WHERE (type = "aggregates");

SELECT META(doc).id AS docId,
       ARRAY { a.avg, a.`start`, a.`end`, a.measurement } FOR a IN doc.aggregates WHEN a.measurement = 't' END
FROM ug01 AS doc
WHERE doc.type = 'aggregates' AND   ANY a IN doc.aggregates SATISFIES  a.measurement = 't' ;

Check out Understanding IndexScans https://blog.couchbase.com/n1ql-practical-guide-second-edition/

patrick.kramer · February 14, 2019, 5:59pm

Thanks, I’ll read through that.

The examples don’t really explain what this is really doing: DISTINCT ARRAY a.measurement FOR a IN aggregates END

Does that create an index with an array that only has the values of “a.measurement” in it? Like:
[ { measurement : “t” }]

It is similar to this on the N1QL docs page: CREATE INDEX idx_sched ON travel-sample ( DISTINCT ARRAY v.flight FOR v IN schedule END );

vsr1 · February 14, 2019, 6:16pm

Yes. It is called Array index. https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/indexing-arrays.html

If you want you can use this approach which is more efficient. First index key will be used to index selection, 2nd index key will be used for projection and it is covered.

CREATE INDEX `idx_aggregates` ON `ug01` (
        FIRST true FOR a IN aggregates WHEN a.measurement = 't' END,
        ARRAY { a.avg, a.`start`, a.`end`, a.measurement } FOR a IN aggregates WHEN a.measurement = 't' END
  ) WHERE (type = "aggregates");

SELECT META(doc).id AS docId,
       ARRAY { a.avg, a.`start`, a.`end`, a.measurement } FOR a IN doc.aggregates WHEN a.measurement = 't' END
FROM ug01 AS doc
WHERE doc.type = 'aggregates' AND
      (FIRST true FOR a IN aggregates WHEN a.measurement = 't' END) == true;

patrick.kramer · February 14, 2019, 6:27pm

Wow, those are super fast. I don’t understand them completely but thank you.