How to improve IndexScan time performance

n1ql

#1

I have a N1QL query which is covered and the predicate contains one array with (ANY x IN array SATISFIES x = “string”) and 4 range predicates.

The index contains ~600k documents.

Now the query performance is heavily dependant on the string I am querying for. For one string (a lot of documents contain that string in the array) the query completes in 368.26ms (index scan time: 00:00.3592). For another string (not so much documents contain that string in the array) it is 12.33s (index scan time: 00:12.3154).

How do I improve the IndexScan time? I thought index nodes need memory, number of cores is not that important. But the index resides completely in memory. Should I increase the number of cores or use faster cores or something different?


#2

Post the query and index definition. Check this out https://blog.couchbase.com/create-right-index-get-right-performance/


#3

I read the recommended blog post and did some optimizations like order of keys in the index.

This is my index definition:

CREATE INDEX idx_score_desc 
ON `bucket`(score DESC, upvotes, creationTime, creator.followerCount,countComments, ALL DISTINCT ARRAY t for t IN tags END) 
WHERE (`_class` = "some.class.package.Class") AND score > 0 
WITH {"nodes": [ "192.168.0.1:8091",  "192.168.0.2:8091", "192.168.0.3:8091" ], "defer_build":true, "num_replica": 2};

My query:

SELECT RAW meta().id FROM `bucket` 
WHERE `_class` = "some.class.package.Class"  
AND creationTime BETWEEN 1501413511 AND 1532949511 
AND upvotes BETWEEN 0 AND 29639
AND creator.followerCount BETWEEN -1 AND 16969739
AND countComments BETWEEN -1 AND 10000000  
AND score > 0  
AND ANY t IN tags SATISFIES t = "tag1" END 
ORDER BY score DESC, upvotes, creationTime, creator.followerCount, countComments 
LIMIT 16 OFFSET 0

#4

It is pagination query. When query satisfies pagination it stops. If you have lot of t = “tag1” it satisfies early so less index disk reads and scans. If not many more index disk reads/scans.

One option you can try is Move ALL DISTINCT index to leading index key (Rule #11). Check if the query using index order. If not the current query plan is optimal


#5

I will give that a try, thank you.

But back to my main question: What do I need to scale to improve pure IndexScan performance (IndexScan time): Number of cores, faster cores, memory, disks (faster SSDs)?