I am trying to understand how Couchbase index choice works, because a specific use case is acting in a confusing manner.
The desired behavior is to provide prefix and suffix wildcard searching on document names for documents of a given subtype (case insensitive). Our doc ID format is
<doc type>---<doc name>. An example query would be for
meta().id like 'template---%option%', where I would want it to match
template---Connor option 1,
template--- option 2,
template---Trevor Option, etc.
We initially tried to use this index:
CREATE INDEX doc_ids ON bucket ( DISTINCT (ARRAY array_element FOR array_element IN SUFFIXES(LOWER(META().id)) END) ) WHERE (LOWER(META().id) LIKE "template---%")) AND NOT CONTAINS(META().id, "<AUDIT>")
select meta().id from bucket where lower(meta().id) like "template---%options%" and not contains(meta().id, "<AUDIT>")
However, it does not use the secondary index, and instead uses the primary index.It also takes a full second to run. I don’t understand why the secondary index isn’t chosen. Furthermore, if I change the select statement to
select * from bucket, OR if I just add
use index (doc_ids), it happily uses the secondary index and runs in 5 ms, indicating that the index is sufficient.
When I changed the index definition statement to add the full meta().id:
CREATE INDEX doc_ids ON bucket ( meta().id, DISTINCT ( ARRAY array_element FOR array_element IN SUFFIXES(LOWER(META().id)) END) ) WHERE (LOWER(META().id) LIKE "template---%")) AND NOT CONTAINS(META().id, "<AUDIT>")
it works as I expect: all queries use the secondary index.
- How is the select clause affecting the index choice?
- Why is it choosing a less efficient index?
- Why do I need to add meta().id to the index’ definition?
Any help or explanatory articles appreciated. Thanks!