Slow N1QL performance on covered queries

Hello!

We have a single node cluster and about 160,000 documents in our bucket. All of our N1QL queries are running slowly. The following query, although it is fully covered with an index, takes much longer to run than expected (~2 seconds).

The query:

SELECT per.nameFirst, per.nameLast, per.dateOfBirth
FROM deco_db_deco per
WHERE per.type = ‘person’ AND LOWER(per.nameLast) like '%smith%'
ORDER BY per.nameLast desc

Here is the explain:

EXPLAIN:
[
{
“plan”: {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Sequence”,
"~children": [
{
"#operator": “IndexScan”,
“covers”: [
“cover ((per.type))”,
“cover ((per.nameFirst))”,
“cover ((per.nameLast))”,
“cover ((per.dateOfBirth))”,
“cover ((per.ssn))”,
“cover ((per.client_ids))”,
“cover ((per.address))”,
“cover ((meta(per).id))”,
“cover ((meta(per).id))”
],
“filter_covers”: {
“cover ((per.type))”: “person”
},
“index”: “search_persons_all_index”,
“index_id”: “9fd4b2dae726acfe”,
“keyspace”: “deco_db_deco”,
“namespace”: “default”,
“spans”: [
{
“Exact”: true,
“Range”: {
“High”: [
“successor(“person”)”
],
“Inclusion”: 1,
“Low”: [
"“person”"
]
}
}
],
“using”: “gsi”
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Filter”,
“condition”: “((cover ((per.type)) = “person”) and (lower(cover ((per.nameLast))) like “%smith%”))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “cover ((per.nameFirst))”
},
{
“expr”: “cover ((per.nameLast))”
},
{
“expr”: “cover ((per.dateOfBirth))”
}
]
}
]
}
}
]
},
{
"#operator": “Order”,
“sort_terms”: [
{
“desc”: true,
“expr”: “cover ((per.nameLast))”
}
]
},
{
"#operator": “FinalProject”
}
]
},
“text”: “SELECT per.nameFirst, per.nameLast, per.dateOfBirth \n\tFROM deco_db_deco per \n\tWHERE per.type = ‘person’ AND LOWER(per.nameLast) like ‘%smith%’\n\tORDER BY per.nameLast desc”
}
]

Does anyone have any thoughts or ideas as to why this seemingly simple and indexed query takes so long to run?

Thanks.

  1. The query uses LIKE with leading % on LOWER(per.nameLast) and can’t push the predicate to the Indexer. It means it needs to get all the qualified items of type = “person” and apply the predicate
  2. Also it is doing sort
  3. The index is wide.

If you are using 4.5.1 or above check out SUFFIXES() for LIKE queries https://blog.couchbase.com/n1ql-functionality-enhancements-in-couchbase-server-4-5-1/
https://dzone.com/articles/a-couchbase-index-technique-for-like-predicates-wi

CREATE INDEX ix1 ON default(DISTINCT ARRAY s FOR s IN SUFFIXES(LOWER(nameLast)) END, nameLast, nameFirst, dateOfBirth) WHERE type = "person";

SELECT per.nameFirst, per.nameLast, per.dateOfBirth
FROM default per
WHERE per.type = "person" AND ANY s IN SUFFIXES(LOWER(per.nameLast)) SATISFIES s LIKE "smith%" END
ORDER BY per.nameLast desc;

Thank you for your quick reply, much appreciated. So I tried changing the query to not use a wildcard in it–

SELECT per.nameFirst, per.nameLast, per.dateOfBirth
FROM deco_db_deco per
WHERE per.type = ‘person’ AND per.nameLast = ‘Smith’

This query is also fully covered by an index, but takes 2 seconds to run. We removed the wildcards. Any thoughts or ideas why it takes so long?

Thanks

Couchbase Indexer based on b-tree and only leading keys can pushed to indexer when they are present in the predicate, If any leading key not present in the predicate, any trailing keys in predicates can’t pushed to indexer and need to apply after Full Index Scan on non-pushed keys.

The predicate is on type and nameLast but index keys have type, nameFirst, nameLast… i.e it needs to get all the items of type = “person” this same as before.
Right index will be

CREATE INDEX ix1 ON default(nameLast, nameFirst, dateOfBirth) WHERE type = "person";

The following book will has useful information.

Wow thank you that really helped, that decreased the query speed to 46ms (from 2s)! So the order of the keys in the index matters more than we initially thought?

it matters. The book has articles how to optimize the queries and design index.