Can you tell me what is the best index for the following query?
SELECT
distinct order.userId,
ARRAY {p.name,p.language} for p in product END
FROM bucket order
JOIN bucket product
ON product.userId = order.userId AND product.type='Product'
WHERE order.type='Order' AND order.clientId=$clientId
CREATE INDEX ix1 ON bucket( clientId, userId ) WHERE type='Order';
CREATE INDEX ix2 ON bucket( userId , name, language) WHERE type='Product';
SELECT order.userId,
ARRAY_AGG( {product.name,product.language}) AS products
FROM bucket AS order
JOIN bucket AS product
ON product.userId = order.userId AND product.type='Product'
WHERE order.type='Order' AND order.clientId=$clientId
GROUP BY order.userId;
CREATE INDEX ix1 ON bucket( clientId, userId ) WHERE type='Order';
CREATE INDEX ix2 ON bucket( userId , name, language, createDate) WHERE type='Product';
SELECT d.userId , (SELECT RAW p FROM d.products AS p ORDER BY p.createDate DESC LIMIT 100) AS products
FROM (SELECT order.userId,
ARRAY_AGG( {product.name,product.language, product.createDate}) AS products
FROM bucket AS order
JOIN bucket AS product
ON product.userId = order.userId AND product.type='Product'
WHERE order.type='Order' AND order.clientId=$clientId
GROUP BY order.userId) AS d;
only for covering. you can’t use for sorting. In case of JOIN order by can be exposed on left most bucket only. all others order will be changed and we need explicit sort
Both are static you can use both. Any thing c = $1 or c=$C or not static constants.
if your queries doesn’t go include different type then type is good candidate in index WHERE clause.
In addition to above c values limited and use all values then you can create that many index and include index WHERE clause. If c values are many number of indexs become too much move to index keys.
I create index and run the query
When order.clientId=$clientId contains some documents (i.e. less than 10K) query runs very fast , But when it includes millions document it take long time, event If I add LIMIT 100
SELECT d.userId , (SELECT RAW p FROM d.products AS p ORDER BY p.createDate DESC LIMIT 100) AS products
FROM (SELECT order.userId,
ARRAY_AGG( {product.name,product.language, product.createDate}) AS products
FROM bucket AS order
JOIN bucket AS product
ON product.userId = order.userId AND product.type='Product'
WHERE order.type='Order' AND order.clientId=$clientId
GROUP BY order.userId
LIMIT 100) AS d;
As it join and then limits , How can I first limit then join? to get faster result
I want to paginate it , product.userId >lastPageUserId AND order.clientId=$clientId LIMIT $perPage
@vsr1 I have another very similar query, for push notification , I want to join two document types, User and FcmTokens
Each User can have 1:n FcmTokens, for a broadcast notification, I need to fetch all users with their FcmTokens, I want to fetch tokens as fast as possible, currently it take about 20 minutes for fetching with covered index
Do you know how can I speed up it?(less than a minute)
It is exactly same as your suggested query ,just renamed
CREATE INDEX ix1 ON default( clientId, userId ) WHERE type='User';
CREATE INDEX ix2 ON default( userId , token, language, updateDate) WHERE type='UserFcmToken';
SELECT d.userId , (SELECT RAW t FROM d.tokens AS t ORDER BY t.updateDate DESC LIMIT 100) AS tokens
FROM (SELECT user.userId,
ARRAY_AGG( {token.token,token.language, token.updateDate}) AS tokens
FROM default AS user
JOIN default AS token
ON token.userId = user.userId AND token.type='UserFcmToken'
WHERE user.type='User' AND user.clientId=$clientId
GROUP BY user.userId) AS d;
I talk about 5 million results, the query take 20 minutes to calculate 5 million results
I want to send notification to all users as fast as possible , fetching tokens from database must not take 20 minutes
GROUP BY, ORDER BY must produce all qualified documents first before finalizing and streaming.
If you need streaming you can do following which is almost streaming
CREATE INDEX ix1 ON default( clientId, userId ) WHERE type='User';
SELECT DISTINCT userId FROM default WHERE clientId=$clientId AND type='User';
For Each User repeat the following query
CREATE INDEX ix2 ON default( userId , updateDate DESC, token, language) WHERE type='UserFcmToken';
SELECT token, language, updateDate, userId
FROM default
WHERE type='UserFcmToken' AND userId = $userId
ORDER BY updateDate DESC
LIMIT 100;
It is Per Query node. In CE you have all services (Data, index, query) on same node. If you have 8 cores
query maximum uses 4 cores (GOMAXPROCS https://golang.org/pkg/runtime/), other services may be using all 8 cores.