Can a n1ql query span multiple indexes?


#1

I’m trying to find ways to improve performance of n1ql queries. The couchbase documentation on http://developer.couchbase.com/documentation/server/current/indexes/gsi-for-n1ql.html describes “Partitioning with GSI” by dividing the index among multiple nodes.

I’ve split the index for the time field by month like this on every node (for redundancy):
Definition: CREATE INDEX bwidx_bwecl_sp_startTime_10_17_4_155_2016_1 ON bwecl(serviceProviderUid,startTime) WHERE (startTime between 1454306400000 and 1456811999999) USING GSI
Definition: CREATE INDEX bwidx_bwecl_sp_startTime_10_17_4_155_2016_2 ON bwecl(serviceProviderUid,startTime) WHERE (startTime between 1456812000000 and 1459486799999) USING GSI
Definition: CREATE INDEX bwidx_bwecl_sp_startTime_10_17_4_155_2016_3 ON bwecl(serviceProviderUid,startTime) WHERE (startTime between 1459486800000 and 1462078799999) USING GSI
… and so on

The query encapsulating jan-march
EXPLAIN SELECT * FROM bwecl AS ecl WHERE serviceProviderUid = ‘SP2200000000’ AND startTime BETWEEN 1454306400000 AND 1462078799999 ORDER BY serviceProviderUid, startTime DESC;
returns
"errors": [
{
“code”: 4000,
“msg”: “No primary index on keyspace bwecl. Use CREATE PRIMARY INDEX to create one.”
}
]

the query encapsualting all of january
EXPLAIN SELECT * FROM bwecl AS ecl WHERE serviceProviderUid = ‘SP2200000000’ AND startTime BETWEEN 1454306400000 AND 1456811999999 ORDER BY serviceProviderUid, startTime DESC;

returns
{ "requestID": "409ec16e-10fa-4ba5-922d-158547fc61dd", "signature": "json", "results": [ { "#operator": "Sequence", "~children": [ { "#operator": "Sequence", "~children": [ { "#operator": "IndexScan", "index": "bwidx_bwecl_sp_startTime_10_17_4_155_2016_1", "keyspace": "bwecl", "namespace": "default", "spans": [ { "Range": { "High": [ "\"SP2200000000\"", "1456811999999" ], "Inclusion": 3, "Low": [ "\"SP2200000000\"", "1454306400000" ] } } ], "using": "gsi" }, { "#operator": "Parallel", "~child": { "#operator": "Sequence", "~children": [ { "#operator": "Fetch", "as": "ecl", "keyspace": "bwecl", "namespace": "default" }, { "#operator": "Filter", "condition": "(((ecl.serviceProviderUid) = \"SP2200000000\") and ((ecl.startTime) between 1454306400000 and 1456811999999))" }, { "#operator": "InitialProject", "result_terms": [ { "expr": "self", "star": true } ] } ] } } ] }, { "#operator": "Order", "sort_terms": [ { "expr": "(ecl.serviceProviderUid)" }, { "desc": true, "expr": "(ecl.startTime)" } ] }, { "#operator": "FinalProject" } ] } ], "status": "success", "metrics": { "elapsedTime": "89.747684ms", "executionTime": "89.568326ms", "resultCount": 1, "resultSize": 3173 } }

this is the exact same query, but the time span for the first one should hit multiple indexes and the time span for the first hits just one.


#2

You can write a query with multiple query blocks with UNION ALL.
Each query block can have predicates spanning individual range from the index WHERE clause.

Right now, this is not done automatically.


#3

@keshav_m Could this perhaps be a planned improvement?


#4

@jberglund: We would have this at some point. We’d have to recognize the contiguous ranges defined within the indices (like partitioned index) and do it automatically.


#5

Has anything changed since the original post in rgards to multiple indexes in a query?


#6

In Couchbase 5.5, we’ve introduced hash partitioned index. Queries will use multiple sub-indexes, do index-pruning, etc automatically.

We still don’t have range partitioned index yet.


#7

… but this can be achieved manually at the moment.


#8

You can create multiple indexes over different ranges manually, but would have to be careful to write queries to be addressed by a single partition or write multiple queries and UNION them.

The idea of range partitioned is that optimizer will figure out the right ranges to scan and do it automatically.