N1QL use startkey instead of OFFSET


Is it possible to apply paging on a N1QL query using a startkey instead of having to use OFFSET?

I’ve got the following query:

 SELECT * FROM `default` WHERE 
    type = 'record' AND 
    customer_id = 'cust:1' AND 
    project_id = '1:proj:1' AND 
    updated_on > '2019-03-05T17:29:25.732Z' 
    ORDER BY updated_on, META().id LIMIT 100;

Which works perfectly fine for the first results, however, the issues start when I want to get the second page.
Basically our system will receive the document key from the last returned results, I’d like to use this key as the ‘start’ of the second page query, but I can’t seem to find any examples of this. Is it actually possible?

Our document IDs for the records look like this:

I’ve tried using the AND META().id > '1:record:398d277c6da346038538c8165154902b' filter on the query, but the problems arrive when we have the following dataset:

id  updated_on
"1:record:e6e35c835a864666af224eaca3ecb06b"	"2019-03-05T17:29:26.733Z"
"1:record:0e284b6830f04cfdad3efa9729075c5c"	"2019-03-05T17:29:28.733Z"
"1:record:11689be1f87b492994e23c4a7c902712"	"2019-03-05T17:29:28.733Z"
"1:record:19324ddadbe24f1d9b92bd08a2160697"	"2019-03-05T17:29:28.733Z"
"1:record:24b11c05a0f34f9e8f690280bc4669a7"	"2019-03-05T17:29:28.733Z"
"1:record:398d277c6da346038538c8165154902b"	"2019-03-05T17:29:28.733Z"
"1:record:f611b3f51fab43e182ac9d5722464bb2"	"2019-03-05T17:29:28.733Z"
"1:record:fa6290d876f5436d9d887a6b0060bccc"	"2019-03-05T17:29:28.733Z"
"1:record:faac4358c5374765b4d0e04002617d8f"	"2019-03-05T17:29:28.733Z"
"1:record:e6d4acb7505d4c4bac6b8bd6f8a10d54"	"2019-03-05T17:29:29.733Z"

This will make the first record (1:record:e6e35c835a864666af224eaca3ecb06b) get returned as well, when we take 1:record:398d277c6da346038538c8165154902b as the latest returned one for example (as updated_on is larger than the predefined one, and the key predicate is also fulfilled,however this document was already returned in the previous set.

I’ve created an index like this:

CREATE INDEX `idx_updated_records` ON `default`(`customer_id`,`project_id`,`updated_on`,META().id) WHERE (`type` = "record")

You have ORDER BY more than one expression. You need to change both of them in next execution (note: last order key must be > and all others >= otherwise you will miss rows if LIMIT satisfies on first key duplicates).

Each time you execute query u need to change updated_on >= $newval AND META().id > $newval1
newval, newval1 are last row from previous execution
Please take look https://blog.couchbase.com/offset-keyset-pagination-n1ql-query-couchbase/ and follow this.


That’s what we thought as well, I tried it just to be sure.
The query still doesn’t work correctly, what I’ve done:

Kept the index as posted in the initial post
Changed updated_on with the value of the last row from the previous results, changed that filter to >=
Added a META().id filter with the value of the previous result.

All seems to work just fine, although some more testing revealed the following results:
Dataset, currently sorted by updated_on:

META().id  updated_on
"1:record:e6e35c835a864666af224eaca3ecb06b"	"2019-03-05T17:21:26.733Z"
"1:record:19324ddadbe24f1d9b92bd08a2160697"	"2019-03-05T17:31:28.733Z"
"1:record:398d277c6da346038538c8165154902b"	"2019-03-05T17:32:48.733Z"
"1:record:f611b3f51fab43e182ac9d5722464bb2"	"2019-03-05T17:35:28.733Z"
"1:record:faac4358c5374765b4d0e04002617d8f"	"2019-03-05T17:38:28.733Z"
"1:record:e6d4acb7505d4c4bac6b8bd6f8a10d54"	"2019-03-05T17:40:29.733Z

Using the initial query, with a limit of 3, works just fine, 1:record:398d277c6da346038538c8165154902b will be the last returned value, a second query will returned the next 3 results.

However, take a limit of 4, and 1:record:f611b3f51fab43e182ac9d5722464bb2 is the last returned value. When we use this key to filter the next results, the last result in the dataset will NOT be returned, as the key to filter on is larger (string compared).
The format of our document keys is ‘customer:type:guid’ where the guid has it’s - stripped.

EDIT: Added query for clarity

SELECT updated_on, META().id FROM `default` USE INDEX(idx_updated_records) WHERE
      type = 'record' AND
      project_id = '1' AND
      customer_id = '1' AND 
      updated_on >= '2019-03-05T17:35:28.733Z' AND
      META().id > '1:record:f611b3f51fab43e182ac9d5722464bb2' 
      ORDER BY updated_on, META().id LIMIT 4

When you have multiple order keys and duplicate the above query is not right. Because a > =5 and b > 4 condition makes false when a is 6 and b 3 that is what happening in above query. Instead you need to modify the query take this account as follows.

SELECT updated_on, id FROM ( SELECT updated_on, META().id FROM `default` USE INDEX(idx_updated_records)
                             WHERE type = 'record' AND project_id = '1' AND customer_id = '1' AND
                                   updated_on >= '2019-03-05T17:35:28.733Z'
                                   ORDER BY updated_on, META().id) AS d
WHERE d.updated_on > '2019-03-05T17:35:28.733Z' OR d.id > '1:record:f611b3f51fab43e182ac9d5722464bb2'