Meta().id filtering as part of indexing

unhuman · January 10, 2018, 6:27pm

We have a document key structure that looks something like: env::application::datatype::id for many of our documents. Many of our N1QL indexes, however, merely inspect document contents for indexing. My question is would it help us (performance wise) to split on the meta().id as part of there where clause… So, something like:

CREATE INDEX index_idx ON bucketName(x, y, z) WHERE (split(meta().id), “::”)[1] == ‘desiredApplication’) AND (split(meta().id), “::”)[2] == ‘desiredType’) AND type.application == ‘desiredApplication’ AND type.dataType == ‘desiredType’

versus

CREATE INDEX index_idx ON bucketName(x, y, z) WHERE type.application == ‘desiredApplication’ AND type.dataType == ‘desiredType’

Yes, I realize that the example has duplicate checks and is a bit contrived, but hoping to illustrate what I’m after…

Thanks!

vsr1 · January 11, 2018, 3:53am

It depends on data model and workload. Any predicate as part of the Index WHERE clause needs to be present in Query WHERE clause for index to chose.

If Index Condition is complex then query WHERE clause also makes complex.

You can also try something like below.

CREATE INDEX index_idx ON bucketName(x, y, z) WHERE META().id LIKE "env::application::datatype::%";
CREATE INDEX index_idx ON bucketName(x, y, z) WHERE META().id LIKE "%::application::datatype::%";

If you have too many mutations, documents are big index where clause uses fields inside document, Projector might use high CPU because it needs to UnMarshall document and evaluate fields. If it is META().id No UnMarshall is required.

unhuman · January 11, 2018, 4:48pm

Thanks for the reply. We have indeed found that building an index always makes our data nodes go nuts for CPU. We tend to split the index because it seems to work better than a leading %.