Index for query with order by


#1

Hello,

I am trying to leverage the Couchbase indexes to get better performance in a query with an ORDER BY. Here is an example of a query:

SELECT users.*
FROM default users
WHERE users.type="User"
AND users.someId="someId"
AND users.status="following"
AND users.active=true
ORDER BY users.creationDate DESC
LIMIT 10

I have created the following index:

CREATE INDEX idx_user ON default(type, someId, status, active) WHERE type="User"

which has great performance for when there is no ORDER BY. What I am trying to achieve now, is to get an index (or indexes) that will allow me to execute the query with:

ORDER BY users.creationDate DESC (by default) OR ORDER BY users.creationDate ASC.

This query will be executed in the context of a listing, hence the need to have both sorting orders. How can I adjust my index to improve performance on the query?

Thank you in advance


#2

All predicates are equality predicates you can try in this. EXPLAIN the query check there should not be any Order Operator and “limit”:10 should present in IndexScan operator

In 5.0.0

CREATE INDEX idx_user ON default( someId, status, active, creationDate DESC) WHERE type="User";
  SELECT users.*
    FROM default users
    WHERE users.type="User"
    AND users.someId="someId"
    AND users.status="following"
    AND users.active=true
    ORDER BY users.creationDate DESC
    LIMIT 10

Pre 5.0.0

CREATE INDEX idx_user ON default(someId, status, active, -MILLIS(creationDate)) WHERE type="User";

SELECT users.*
FROM default users
WHERE users.type="User"
AND users.someId="someId"
AND users.status="following"
AND users.active=true
ORDER BY -MILLIS(users.creationDate) 
LIMIT 10

#3

I have created the index. Without the ORDER BY I get <100ms response times, and with the ORDER BY + the index I get ~2s. The performance is similar to the index without the creationDate. Could I be doing something wrong? The index seems to be used according to the EXPLAIN.

Also, will this work for both the ASC and DESC cases or will I have to create two distinct indexes to support this use case?


#4

Please post the EXPLAIN it should not take more time.
We don’t do reverse Scan MB-19917. If you need ASC/DESC you need to have two indexes. As query can pick any index you might have to use USE INDEX


#5

Please disregard my previous comment, I may have created the index incorrectly. I now have a result set produced in under 100ms with the provided index, thank you.

Will the use of USE INDEX be required for 5.0 as well? We are using 4.6.2 at the moment, I just wanted to know if we will need to update the query logic we implemented or if the update to 5.0 would sort that out


#6

5.0.0 also required USE INDEX.


#7

Hi, i’m having the same problem.
I’m using CouchBase 5.1
I create this index: CREATE INDEX idx_tenant_type_publication_instant ON CONTENT (tenant, main.type, main.sysAttrs.publicationInstant DESC) using GSI;

for this query:

SELECT CONTENT.*
FROM CONTENT use index (idx_tenant_type_publication_instant)
WHERE main.type IN [“foo”,“bar”,“axe”] AND tenant IN [“ox”]
ORDER BY main.sysAttrs.publicationInstant LIMIT 4

Without the order by my query is <100ms but with it goes >500 ms almost a 1,5s

this is my plan text:
{
#operator”: “Sequence”,
#stats”: {
#phaseSwitches”: 2,
“execTime”: “1.538µs”,
“kernTime”: “469.802618ms”
},
“~children”: [
{
#operator”: “Authorize”,
#stats”: {
#phaseSwitches”: 4,
“execTime”: “2.334µs”,
“kernTime”: “468.913966ms”,
“servTime”: “718.769µs”
},
“privileges”: {
“List”: [
{
“Target”: “default:CONTENT”,
“Priv”: 7
}
]
},
“~child”: {
#operator”: “Sequence”,
#stats”: {
#phaseSwitches”: 3,
“execTime”: “2.184µs”,
“kernTime”: “468.908077ms”
},
“~children”: [
{
#operator”: “Sequence”,
#stats”: {
#phaseSwitches”: 2,
“execTime”: “1.217µs”,
“kernTime”: “468.806533ms”
},
“~children”: [
{
#operator”: “IndexScan2”,
#stats”: {
#itemsOut”: 2971,
#phaseSwitches”: 11887,
“execTime”: “4.149843ms”,
“kernTime”: “287.400432ms”,
“servTime”: “10.354564ms”
},
“index”: “idx_tenant_type_publication_instant”,
“index_id”: “f19975e5ad13e5db”,
“index_projection”: {
“primary_key”: true
},
“keyspace”: “CONTENT”,
“namespace”: “default”,
“spans”: [
{
“exact”: true,
“range”: [
{
“high”: ““jht””,
“inclusion”: 3,
“low”: ““jht””
},
{
“high”: ““interview””,
“inclusion”: 3,
“low”: ““interview””
}
]
},
{
“exact”: true,
“range”: [
{
“high”: ““jht””,
“inclusion”: 3,
“low”: ““jht””
},
{
“high”: ““live””,
“inclusion”: 3,
“low”: ““live””
}
]
},
{
“exact”: true,
“range”: [
{
“high”: ““jht””,
“inclusion”: 3,
“low”: ““jht””
},
{
“high”: ““story””,
“inclusion”: 3,
“low”: ““story””
}
]
}
],
“using”: “gsi”,
#time_normal”: “00:00.0145”,
#time_absolute”: 0.014504407
},
{
#operator”: “Fetch”,
#stats”: {
#itemsIn”: 2971,
#itemsOut”: 2971,
#phaseSwitches”: 11903,
“execTime”: “4.837686ms”,
“kernTime”: “10.607591ms”,
“servTime”: “434.135885ms”
},
“keyspace”: “CONTENT”,
“namespace”: “default”,
#time_normal”: “00:00.4389”,
#time_absolute”: 0.438973571
},
{
#operator”: “Sequence”,
#stats”: {
#phaseSwitches”: 5,
“execTime”: “1.967µs”,
“kernTime”: “468.7965ms”
},
“~children”: [
{
#operator”: “Filter”,
#stats”: {
#itemsIn”: 2971,
#itemsOut”: 2971,
#phaseSwitches”: 11889,
“execTime”: “117.121154ms”,
“kernTime”: “351.468913ms”
},
“condition”: “((((CONTENT.main).type) in [“interview”, “live”, “story”]) and ((CONTENT.tenant) in [“jht”]))”,
#time_normal”: “00:00.1171”,
#time_absolute”: 0.117121154
},
{
#operator”: “InitialProject”,
#stats”: {
#itemsIn”: 2971,
#itemsOut”: 2971,
#phaseSwitches”: 11891,
“execTime”: “139.025981ms”,
“kernTime”: “329.761066ms”
},
“result_terms”: [
{
“expr”: “CONTENT”,
“star”: true
}
],
#time_normal”: “00:00.1390”,
#time_absolute”: 0.139025981
}
],
#time_normal”: “00:00.0000”,
#time_absolute”: 0.000001967
}
],
#time_normal”: “00:00.0000”,
#time_absolute”: 0.0000012170000000000002
},
{
#operator”: “Order”,
#stats”: {
#itemsIn”: 2971,
#itemsOut”: 4,
#phaseSwitches”: 5955,
“execTime”: “7.174595ms”,
“kernTime”: “461.682339ms”
},
“limit”: “4”,
“sort_terms”: [
{
“expr”: “(((CONTENT.main).sysAttrs).publicationInstant)”
}
],
#time_normal”: “00:00.0071”,
#time_absolute”: 0.007174595
},
{
#operator”: “Limit”,
#stats”: {
#itemsIn”: 4,
#itemsOut”: 4,
#phaseSwitches”: 21,
“execTime”: “2.903µs”,
“kernTime”: “468.868924ms”
},
“expr”: “4”,
#time_normal”: “00:00.0000”,
#time_absolute”: 0.000002903
},
{
#operator”: “FinalProject”,
#stats”: {
#itemsIn”: 4,
#itemsOut”: 4,
#phaseSwitches”: 23,
“execTime”: “3.682µs”,
“kernTime”: “468.891346ms”
},
#time_normal”: “00:00.0000”,
#time_absolute”: 0.000003682
}
],
#time_normal”: “00:00.0000”,
#time_absolute”: 0.000002184
},
#time_normal”: “00:00.0007”,
#time_absolute”: 0.000721103
},
{
#operator”: “Stream”,
#stats”: {
#itemsIn”: 4,
#itemsOut”: 4,
#phaseSwitches”: 15,
“execTime”: “2.76µs”,
“kernTime”: “469.79477ms”
},
#time_normal”: “00:00.0000”,
#time_absolute”: 0.00000276
}
],
“~versions”: [
“2.0.0-N1QL”,
“5.1.0-5552-enterprise”
],
#time_normal”: “00:00.0000”,
#time_absolute”: 0.000001538
}


#8

ORDER BY requires produce all qualified items and sort the results even you have limit (Last item might be first in the sort list).
Checkout “#itemsIn”,“#itemsOut” for each “#operator”.


#9

Sorry, I don´t understand, what are you mean with Checkout “#itemsIn”,“#itemsOut” for each “#operator”.??


#10

If you take look Order operator has to consume 2971 documents produce 4 documents.

https://blog.couchbase.com/optimize-n1ql-performance-using-request-profiling/


#12

Ok, but there is a way on my query or my index to reduce this consume?


#13

The query requires a sort and doesn’t use index key order or query processing changes order it needs to sort.
That is is the case here (multiple IN elements). You can checkout this article for more details https://blog.couchbase.com/offset-keyset-pagination-n1ql-query-couchbase/