DISTINCT processing is done by the index. However, you do need to specify the following in the WHERE clause, as GSI does not index MISSING or NULL values:
select distinct type from default where type is not missing and type is not null;
We are adding covering indexes to address this issue. As @cihangirb indicated, here is the correct query, which will automatically benefit when we add covering indexes:
select distinct type from default where type is not null;
A slight correction to @cihangirb's note: GSI does not index MISSING values, but it does index NULL values.
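To illustrate (the index name here is hypothetical, and this assumes a bucket named default with a type field, as in the queries above), a secondary index like the following is what would back these queries. Because GSI never indexes MISSING entries, the IS NOT MISSING / IS NOT NULL predicate is what allows the scan to be answered from the index:

```sql
-- Hypothetical secondary index on the type field of the default bucket
CREATE INDEX idx_type ON default(type) USING GSI;

-- The predicate excludes MISSING (never indexed) and NULL,
-- matching what the index can actually serve
SELECT DISTINCT type FROM default WHERE type IS NOT NULL;
```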
If most documents in the bucket have a type field and only a few are missing it, the index adds an additional hop, so scanning the bucket seems more efficient.
What is the total number of documents being returned by the query? If the number of documents is large, then the fetch between the query engine and the Couchbase server could be the bottleneck. If that is the case, we would try increasing some of the system parameters to raise the fetch throughput.
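As a sketch of what tuning those parameters might look like (the specific setting names and values here are assumptions about the query service's throughput knobs, not something stated in this thread), the query engine exposes settings such as pipeline-batch and pipeline-cap that control how many items are batched and buffered between operators:

```json
{
  "pipeline-batch": 64,
  "pipeline-cap": 1024
}
```

Such settings can typically be posted to the query service's admin settings endpoint; check the Couchbase documentation for the exact names and defaults in your version before changing them.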
The result set is currently only about 20-30 rows; not much at all.
We have renamed nickel to nyquill (nyql) until we get proper indexes from Couchbase. Joking aside, N1QL is very useful and we would love to see Couchbase push to massively improve performance. We'll keep using it in the meantime, but under the nyquill name, since it does put a few devs to sleep.
We ran a test where we created an index for every single query used in the mentioned script, used EXPLAIN to double-check that the indexes were actually being used, and guess what: script execution time went from 45 seconds to 66 seconds. We then deleted all the indexes.
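For anyone reproducing this kind of check, EXPLAIN is the way to confirm whether a query hits an index or falls back to a primary scan (the query below reuses the earlier example; your own queries would go in its place):

```sql
-- The plan output shows an IndexScan operator (with the index name and
-- scan spans) when a secondary index is used, or a PrimaryScan when the
-- query falls back to scanning the whole bucket
EXPLAIN SELECT DISTINCT type FROM default WHERE type IS NOT NULL;
```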
Well, that nyquill for you is like Red Bull for us. What is the total number of documents in the bucket? Even if the final result set is small, the query could still be fetching a large number of documents.
You could also try using view indexes and see if that improves things. I doubt that it would, but it's worth a try. To create an index backed by a view, replace USING GSI with USING VIEW in your CREATE INDEX statement.
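Concretely, the two variants differ only in the USING clause (index names here are hypothetical, reusing the type field from the earlier queries):

```sql
-- GSI-backed secondary index
CREATE INDEX idx_type_gsi ON default(type) USING GSI;

-- The same index definition backed by a view instead
CREATE INDEX idx_type_view ON default(type) USING VIEW;
```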