Can a n1ql query span multiple indexes?

I’m trying to find ways to improve performance of n1ql queries. The couchbase documentation on http://developer.couchbase.com/documentation/server/current/indexes/gsi-for-n1ql.html describes “Partitioning with GSI” by dividing the index among multiple nodes.

I’ve split the index for the time field by month like this on every node (for redundancy):
Definition: CREATE INDEX bwidx_bwecl_sp_startTime_10_17_4_155_2016_1 ON bwecl(serviceProviderUid,startTime) WHERE (startTime between 1454306400000 and 1456811999999) USING GSI
Definition: CREATE INDEX bwidx_bwecl_sp_startTime_10_17_4_155_2016_2 ON bwecl(serviceProviderUid,startTime) WHERE (startTime between 1456812000000 and 1459486799999) USING GSI
Definition: CREATE INDEX bwidx_bwecl_sp_startTime_10_17_4_155_2016_3 ON bwecl(serviceProviderUid,startTime) WHERE (startTime between 1459486800000 and 1462078799999) USING GSI
… and so on

The query encapsulating jan-march
EXPLAIN SELECT * FROM bwecl AS ecl WHERE serviceProviderUid = ‘SP2200000000’ AND startTime BETWEEN 1454306400000 AND 1462078799999 ORDER BY serviceProviderUid, startTime DESC;
returns
"errors": [
{
“code”: 4000,
“msg”: “No primary index on keyspace bwecl. Use CREATE PRIMARY INDEX to create one.”
}
]

the query encapsualting all of january
EXPLAIN SELECT * FROM bwecl AS ecl WHERE serviceProviderUid = ‘SP2200000000’ AND startTime BETWEEN 1454306400000 AND 1456811999999 ORDER BY serviceProviderUid, startTime DESC;

returns
{ "requestID": "409ec16e-10fa-4ba5-922d-158547fc61dd", "signature": "json", "results": [ { "#operator": "Sequence", "~children": [ { "#operator": "Sequence", "~children": [ { "#operator": "IndexScan", "index": "bwidx_bwecl_sp_startTime_10_17_4_155_2016_1", "keyspace": "bwecl", "namespace": "default", "spans": [ { "Range": { "High": [ "\"SP2200000000\"", "1456811999999" ], "Inclusion": 3, "Low": [ "\"SP2200000000\"", "1454306400000" ] } } ], "using": "gsi" }, { "#operator": "Parallel", "~child": { "#operator": "Sequence", "~children": [ { "#operator": "Fetch", "as": "ecl", "keyspace": "bwecl", "namespace": "default" }, { "#operator": "Filter", "condition": "(((ecl.serviceProviderUid) = \"SP2200000000\") and ((ecl.startTime) between 1454306400000 and 1456811999999))" }, { "#operator": "InitialProject", "result_terms": [ { "expr": "self", "star": true } ] } ] } } ] }, { "#operator": "Order", "sort_terms": [ { "expr": "(ecl.serviceProviderUid)" }, { "desc": true, "expr": "(ecl.startTime)" } ] }, { "#operator": "FinalProject" } ] } ], "status": "success", "metrics": { "elapsedTime": "89.747684ms", "executionTime": "89.568326ms", "resultCount": 1, "resultSize": 3173 } }

this is the exact same query, but the time span for the first one should hit multiple indexes and the time span for the first hits just one.

You can write a query with multiple query blocks with UNION ALL.
Each query block can have predicates spanning individual range from the index WHERE clause.

Right now, this is not done automatically.

@keshav_m Could this perhaps be a planned improvement?

@jberglund: We would have this at some point. We’d have to recognize the contiguous ranges defined within the indices (like partitioned index) and do it automatically.

Has anything changed since the original post in rgards to multiple indexes in a query?

In Couchbase 5.5, we’ve introduced hash partitioned index. Queries will use multiple sub-indexes, do index-pruning, etc automatically.

We still don’t have range partitioned index yet.

… but this can be achieved manually at the moment.

You can create multiple indexes over different ranges manually, but would have to be careful to write queries to be addressed by a single partition or write multiple queries and UNION them.

The idea of range partitioned is that optimizer will figure out the right ranges to scan and do it automatically.

1 Like