N1QL | Query Performance Issue with multiple where clause

Hello CouchBase Community,

The N1QL query is taking more response time when we add a specific where clause.

For your understanding, we have two sets of documents (pages and instances). There is one - many relationship between a page and instances. The pageId will be common for a given page and its instances.

The WHERE clause mostly revolves around the request time in milliseconds. For each instance document, there is an effective time and expiry time as shown below

page
{
id: 1
pageId: β€œ/page1”
}

inst
{
id: 12,
pageId: β€œ/page1”,
effectiveTime:β€œ1617778597591”,
expiryTime:β€œ1617778597596”
}

inst
{
id: 34,
pageId: β€œ/page1”,
effectiveTime:β€œ1617778597561”,
expiryTime:β€œ1617778597569”
}

When a specific page ID is requested, the where clause will make sure to return the instance document falling with the specified request time. If there is no document falling within the range, it will be 404

WHERE instance.pageId = β€œ/page1” AND
(instance.effectiveTime <= 1617778597593 AND 1617778597593 <= instance.expiryTime)

Until this point, we have a very good response time.

However, we have a new requirement where for a particular set of pages, if β€œallowPastInstance = true” is set at the page document level, then we need to fetch the latest expired instance

page
{
id: 1
pageId: β€œ/page1”
allowPastInstance: true
}

inst
{
id: 12,
pageId: β€œ/page1”,
effectiveTime:β€œ1617778597591”,
expiryTime:β€œ1617778597596”
}

inst
{
id: 34,
pageId: β€œ/page1”,
effectiveTime:β€œ1617778597561”,
expiryTime:β€œ1617778597569”
}

We tried to add an extra where clause along with the existing one as below

WHERE instance.pageId = β€œ/page1” AND
((instance.effectiveTime <= 1617778597599 AND 1617778597599 <= instance.expiryTime) OR (page.allowPastInstance = true AND instance.effectiveTime <= 1617778597599))

Adding the new where clause increased the response time drastically, we also observed that, if this query is executed against a huge number of documents, the response time is going high.

Any help or suggestions to improve the response time would be appreciated.

Note: We tried to create the indexes as per query workbench advice, but it didnt help much.

Thanks,
Kannan S M

Please post the complete query

Hi vsr1,

Here is the query.

WITH pages AS (
SELECT page.pageId,
*
FROM bucket_name AS page
WHERE page.type = β€œpage”
AND page.cid= β€œ12qw12qwop90op90”
AND page.pageId IN ["/content/page1"])
SELECT MAX([ instance.expiryTime, { instance , β€œpage”: p.page , β€œ_ID”: META(instance).id, β€œ_CAS”:META(instance).cas }])[1].*
FROM bucket_name AS instance
JOIN pages AS p ON instance.pageId = p.pageId
AND instance.type = β€œpagedef”
AND instance.cid= β€œ12qw12qwop90op90”
AND (instance.effectiveTime <= 1617278040000
AND (1617278040000 <= instance.expiryTime
OR instance.expiryTime IS NULL)
OR (p.allowPastInstance = TRUE AND instance.expiryTime < 1617278040000))
GROUP BY instance.pageId

CREATE INDEX ix1 ON bucket_name(cid, pageId, allowPastInstance, page) WHERE type = "page";
CREATE INDEX ix2 ON bucket_name(cid, pageId, effectiveTime, expiryTime) WHERE type = "pagedef";

WITH pages AS ( SELECT pg.pageId, pg.page, pg.allowPastInstance
                FROM bucket_name AS pg
                WHERE pg.type = "page"
                      AND pg.cid= "12qw12qwop90op90"
                      AND pg.pageId IN ["/content/page1"])
SELECT MAX([ instance.expiryTime, { instance , "page": p.page , "_ID": META(instance).id, "_CAS":META(instance).cas }])[1].*
FROM bucket_name AS instance
JOIN pages AS p ON instance.pageId = p.pageId
                   AND instance.type = "pagedef"
                   AND instance.cid= "12qw12qwop90op90"
                   AND (instance.effectiveTime <= 1617278040000
                   AND (1617278040000 <= instance.expiryTime
                   OR instance.expiryTime IS NULL)
                   OR (p.allowPastInstance = TRUE AND instance.expiryTime < 1617278040000))
GROUP BY instance.pageId;

OR

– Make sure instance is covered with above indexes.

SELECT d.page, META(d2).id AS _ID, META(d2).cas AS _CAS, d2 AS instance
FROM ( WITH pages AS ( SELECT pg.pageId, pg.page, pg.allowPastInstance
                       FROM bucket_name AS pg
                       WHERE pg.type = "page"
                             AND pg.cid= "12qw12qwop90op90"
                             AND pg.pageId IN ["/content/page1"])
       SELECT MAX([ instance.expiryTime, {p.page,  "id": META(instance).id])[1].*
       FROM bucket_name AS instance
       JOIN pages AS p ON instance.pageId = p.pageId
                          AND instance.type = "pagedef"
                          AND instance.cid= "12qw12qwop90op90"
                          AND (instance.effectiveTime <= 1617278040000
                          AND (1617278040000 <= instance.expiryTime
                          OR instance.expiryTime IS NULL)
                          OR (p.allowPastInstance = TRUE AND instance.expiryTime < 1617278040000))
       GROUP BY instance.pageId) AS d
JOIN bucket_name AS d2 ON d.id = META(d2).id;
1 Like

Thank you very much @vasr1. This query improved the performance. We have also added the allowPastInstance property to all other documents and set the value as false by default

@vsr1 - The second query is only returning the inst document, we also want the page document as an output. Kindly help here.

Project what you want in WITH clause and change accordingly

SELECT d.page, META(d2).id AS _ID, META(d2).cas AS _CAS, d2 AS instance
FROM ( WITH pages AS ( SELECT pg.pageId, pg AS page, pg.allowPastInstance
                       FROM bucket_name AS pg
                       WHERE pg.type = "page"
                             AND pg.cid= "12qw12qwop90op90"
                             AND pg.pageId IN ["/content/page1"])
       SELECT MAX([ instance.expiryTime, {p.page,  "id": META(instance).id])[1].*
       FROM bucket_name AS instance
       JOIN pages AS p ON instance.pageId = p.pageId
                          AND instance.type = "pagedef"
                          AND instance.cid= "12qw12qwop90op90"
                          AND (instance.effectiveTime <= 1617278040000
                          AND (1617278040000 <= instance.expiryTime
                          OR instance.expiryTime IS NULL)
                          OR (p.allowPastInstance = TRUE AND instance.expiryTime < 1617278040000))
       GROUP BY instance.pageId) AS d
JOIN bucket_name AS d2 ON d.id = META(d2).id;