Performance of single vs multiple n1ql queries for small data sets

damianChapman · August 17, 2017, 4:02pm

Hi all,

I am using Couchbase Version: 4.5.1-2844 Enterprise Edition (build-2844)
Cluster State ID: 038-170-F17

I am doing 12 n1ql queries, using the java Couchbase client v2.4.2, from 12 different document types for items containing ids, start and end times and combining them into a list ordered by accending start time.

The queries are very basic:

"select * from write where docType = ‘$1’ and id = $2;’

I sort the results using java rather than adding the ordering to the n1ql queries.
The items have other fields which are all string values. The maximun number of fields in each item <= 20

Each query would return a maximum number of items < 100.

If I want to increase the performance of the queries:

Should I combine them into one query with the ordering by start time?
Should I leave the queries as is and create indexes for each of the individual queries?

I have noticed that if there are a lot of items between the start and end time, the first time the queries are run it can be slow(40 seconds) but the next time it is considerably faster(2 seconds) and
all the times after the initial time are equally fast.

However, if I wait for 15 minutes and run the same query again, it is as slow as the first time. The next time it is considerably faster and
all the times after the initial time are equally fast.

I tried looking in the documentation about this issue and on this forum but I couldn’t find any information.
Is there some documentation or white paper available that may help me better understand or debug why this issue is occurring for me?

Any help would be greatly appreciated.

Regards,
Damian.

vsr1 · August 17, 2017, 4:44pm

What is your index definition. What is order query looks like. Do you need all the results or going to limit them.

Example:

CREATE INDEX ix1 ON default(docType,id,starttime);
SELECT * from default WHERE docType = $1 AND id = $2 AND starttime >= $3 ORDER BY docType,id, starttime LIMIT 200;

Above query and index combinations will perform better. Details available in https://dzone.com/articles/designing-index-for-query-in-couchbase-n1ql

First time it may need to establish connections and also If data is not in memory it needs to read etc.

damianChapman · August 17, 2017, 6:33pm

vsr1,

Thank you for your prompt response.

damianChapman · August 22, 2017, 3:44pm

Hi vsr1,

I created the select query that brings back multiple documents now:

select * from write
where docType in [‘guestExcursion’, ‘guestActivity’, ‘guestSpecialtyDining’, ‘guestTraditionalDining’, ‘guestSpa’, ‘guestTransfer’, ‘guestFlight’, ‘guestPrivateGround’, ‘guestTravelInfo’, ‘guestorder’, ‘guestDisembarkation’, ‘guestCompositeMemory’]
and id = '12345678’
and startDate >= 1493463600
or startTime >= 1493463600
or flightDepartureDate >= 1493463600
or departureDate >= 1493464000
and endDate <= 1493464000
or endTime <= '1493464000
or flightArrivalDate <= 1493464000
or arrivalDate <= 1493464000
order by docType, startDate, startTime, flightDepartureDate, departureDate

The query takes 12 -13 seconds to run. I have tried the query multiple times and it still takes around 13 seconds.

I tried adding an index but this has had no effect on the query response times:

CREATE INDEX ix1 ON write(docType, guestId, startDate, flightDepartureDate, departureDate, endDate, endTime, flightArrivalDate, arrivalDate)
where docType in [‘guestExcursion’, ‘guestActivity’, ‘guestSpecialtyDining’, ‘guestTraditionalDining’, ‘guestSpa’, ‘guestTransfer’, ‘guestFlight’, ‘guestPrivateGround’, ‘guestTravelInfo’, ‘guestorder’, ‘guestDisembarkation’, ‘guestCompositeMemory’] using GSI

Is there any other way that I can reduce the response time for the query?

Regards,
Damian.

vsr1 · August 22, 2017, 4:39pm

Your query has multiple OR’s And Ands could you please use parenthesis and post again so that know exact precedence. Also id and guestId are same? How many documents query qualifies.
Is each comparison uses same value 1493463600, 1493464000 in given query

When you run the query you get requestId, use that and plug-in query and provide phaseOperators, PhaseCounts info.

select * from system:completed_requests WHERE requestId = “…”;

damianChapman · August 23, 2017, 2:23pm

vsr1,

My query has been greatly simplified now. I do not need to do the date checks anymore:

select * from write
where docType in [‘guestExcursion’, ‘guestActivity’, ‘guestSpecialtyDining’, ‘guestTraditionalDining’, ‘guestSpa’, ‘guestTransfer’, ‘guestFlight’, ‘guestPrivateGround’, ‘guestTravelInfo’, ‘guestorder’, ‘guestDisembarkation’, ‘guestCompositeMemory’]
and id = '12345678’

I takes about 4 seconds to run.

I was able to create the following index which got the query down to 170ms.

CREATE INDEX index_write_doctype_guestid ON write(docType, guestId)
where docType in [‘guestExcursion’, ‘guestActivity’, ‘guestSpecialtyDining’, ‘guestTraditionalDining’, ‘guestSpa’, ‘guestTransfer’, ‘guestFlight’, ‘guestPrivateGround’, ‘guestTravelInfo’, ‘guestorder’, ‘guestDisembarkation’, ‘guestCompositeMemory’] using GSI

Thanks for all of your help.

Regards,
Damian.

vsr1 · August 23, 2017, 4:41pm

Your index has guestId and query predicate has id = '12345678’. Please include id in the index as leading key or in the middle