Optimizing query performance for IN statements

Hi There,

I’ve a query in which I’m trying to get count of some documents based on some id. Here’s the query:

SELECT xid, COUNT(DISTINCT yid) AS count
FROM fooboo
WHERE _class = "com.foo.bar"
    AND appId = "app1"
    AND MILLIS_TO_STR(MILLIS(createdAt), "2006-01-02") > "2017-02-22"
    AND xid IN ["fba0a67e-aa20-4333-8404-4fb382f20c9c","beae20e8-4aac-4b1e-9520-e15e619c9b97"]
GROUP BY xid

Earlier, the code was calling this query for each individual xid without the AND xid IN ... part. But when the number of xid's grow, the query performance degraded. Therefore I changed the query like above. I also added the following index:

CREATE INDEX `pid_appId_xid_yid_createdAt_idx`
ON `fooboo`(`appId`,`xid`,`yid`,`createdAt`)
WHERE (`_class` = "com.foo.bar")

When the xid array includes 100 elements, the query performs around 2.67 seconds. Is there a way to improve this further?

can you post explain?

How about create index by

CREATE INDEX `pid_appId_i_xid_yid_createdAt_idx`ON `default`(`appId`, DISTINCT ARRAY i FOR i IN `xid` END,`yid`,`createdAt`) WHERE (`_class` = "com.foo.bar")

and rewrite N1QL by

SELECT xid, COUNT(DISTINCT yid) AS count
FROM fooboo
WHERE _class = "com.foo.bar"
    AND appId = "app1"
    AND MILLIS_TO_STR(MILLIS(createdAt), "2006-01-02") > "2017-02-22"
    AND ANY i IN xid SATISFIES i IN ["fba0a67e-aa20-4333-8404-4fb382f20c9c","beae20e8-4aac-4b1e-9520-e15e619c9b97"] END
GROUP BY xid

let’s see explain and time.

If createdAt is in ISO-8601 started it is string comparable. And change query comparison accordingly. If not created index on SUBSTR(createdAt,0,10) and use in query.

CREATE INDEX `pid_appId_xid_yid_createdAt_idx`
ON `fooboo`(`appId`,`xid`,`createdAt`,`yid`)
WHERE (`_class` = "com.foo.bar");

SELECT xid, COUNT(DISTINCT yid) AS count
FROM fooboo
WHERE _class = "com.foo.bar"
    AND appId = "app1"
    AND createdAt > "2017-02-22"
    AND xid IN ["fba0a67e-aa20-4333-8404-4fb382f20c9c","beae20e8-4aac-4b1e-9520-e15e619c9b97"]
GROUP BY xid