GROUP CLAUSE AFTER JOIN Does not used index but uses primary key


#1

I have a single bucket with two types of objects in them namely trade and risk. The structure is as follows:-
TRADE:-
{
“id”: “000005f6df4b46e088e6330d4323e0ef”,
“strategyName”: “JPY”,
“portfolioName”: “DAMFDA”,
“tradeId”: “66692033”,
“name”: “FX_FORWARD_Sell_DY_USD/JPY”,
“version”: 0,
“instrumentId”: 10371255,
“instrumentVersion”: 0,
“notional”: -160000,
“fund”: “DADCFDA”,
“portfolioManager”: “DY”,
“counterParty”: “XTX”,
“executionBroker”: “XTX”,
“side”: “Sell”,
“status”: “Active”,
“tags”: “JPY”,
“sourceTradeId”: “SBO0042-3”,
“sourceTradeStatus”: “2”,
“portfolioId”: 0,
“strategyId”: 0,
“inception”: “2018-02-15T00:00:00”,
“asOfDate”: “2018-02-16T00:00:00”,
“originalUpdate”: “2018-02-15T19:25:33+08:00”,
“created”: “2018-02-15T19:25:33.4252154+08:00”,
“lastUpdated”: “2018-02-15T19:25:33.4252154+08:00”,
“initialValue”: 0,
“sourceSystem”: “flex”,
“executionUser”: “DY”,
“commission”: 0,
“fee”: 0,
“tradeDate”: “2018-02-15T00:00:00”,
“custodian”: “CITI-FX”,
“valueDate”: “2018-02-15T00:00:00”,
“settlementDate”: “2018-02-20T00:00:00”,
“price”: 106.614,
“orderQuantity”: -5000000,
“orderCreationTime”: “1970-01-01T00:00:00”,
“instrument”: {
“domesticCurrency”: “JPY”,
“foreignCurrency”: “USD”,
“deliveryPrice”: 106.614,
“metaInfo”: {},
“id”: 10371255,
“name”: “USD/JPY”,
“denominationCurrency”: “JPY”,
“type”: “FX_FORWARD”,
“assetClass”: “FX”,
“assetCurrency1”: “USD”,
“assetCurrency2”: “JPY”,
“category”: “CUR”,
“maturity”: “2018-02-20T00:00:00”,
“settlementDate”: “2018-02-20T00:00:00”,
“version”: 1,
“source”: “flex”,
“sourceId”: “10371255”,
“inception”: “2018-02-15T00:00:00”,
“originalUpdate”: “2018-02-15T19:25:33+08:00”,
“created”: “2018-02-15T19:25:33+08:00”,
“lastUpdated”: “2018-02-15T19:25:33.2845904+08:00”
},
“portfolio”: null,
“strategy”: null,
“type”: “trade”
}

RISK:-

{
“id”: “1fe188662ca14182bbe0cba66ace1a30”,
“parentTradeId”: “6ee95783bec64ddd8af5a26339361acc”,
“portfolioManager”: “CY”,
“asOfDate”: “2018-01-01T00:00:00”,
“tradeId”: “65387140”,
“exposure”: “OIS_FWD_RATE”,
“underlier”: “AUD”,
“baseValue”: 0.0319021374446762,
“shift”: 0.0,
“firstOrderExposure”: 4.5439021717409023E-18,
“secondOrderExposure”: 0.0,
“instrumentExposure”: -2.0792084614903E-20,
“ccy1Amount”: 0.0,
“ccy2Amount”: 0.0,
“tenor”: “13Y”,
“skewType”: null,
“expiry”: “”,
“type”: “risk”
}

I have an GSI Index defines as follows:-

CREATE INDEX ix_nameTradeIdFirstOrderExposure ON Dart(name,tradeId,firstOrderExposure)

I have a N1QL as shown below:-

SELECT t.name, t.tradeId, SUM(r.firstOrderExposure)
FROM Dart r
JOIN Dart t
ON KEYS r.parentTradeId
WHERE t.name IS NOT NULL AND t.tradeId IS NOT NULL AND r.firstOrderExposure IS NOT NULL
GROUP BY t.name, t.tradeId
LIMIT 100

I noticed the query is using the primary index rather than the index ix_nameTradeIdFirstOrderExposure as shown by the explain:-

{
“plan”: {
#operator”: “Sequence”,
“~children”: [
{
#operator”: “Sequence”,
“~children”: [
{
#operator”: “PrimaryScan3”,
“as”: “r”,
“index”: “#primary”,
“index_projection”: {
“primary_key”: true
},
“keyspace”: “Dart”,
“namespace”: “default”,
“using”: “gsi”
},
{
#operator”: “Fetch”,
“as”: “r”,
“keyspace”: “Dart”,
“namespace”: “default”
},
{
#operator”: “Join”,
“as”: “t”,
“keyspace”: “Dart”,
“namespace”: “default”,
“on_keys”: “(r.parentTradeId)”
},
{
#operator”: “Parallel”,
“~child”: {
#operator”: “Sequence”,
“~children”: [
{
#operator”: “Filter”,
“condition”: “((((t.name) is not null) and ((t.tradeId) is not null)) and ((r.firstOrderExposure) is not null))”
},
{
#operator”: “InitialGroup”,
“aggregates”: [
“sum((r.firstOrderExposure))”
],
“group_keys”: [
“(t.name)”,
“(t.tradeId)”,
“(r.firstOrderExposure)”
]
}
]
}
},
{
#operator”: “IntermediateGroup”,
“aggregates”: [
“sum((r.firstOrderExposure))”
],
“group_keys”: [
“(t.name)”,
“(t.tradeId)”,
“(r.firstOrderExposure)”
]
},
{
#operator”: “FinalGroup”,
“aggregates”: [
“sum((r.firstOrderExposure))”
],
“group_keys”: [
“(t.name)”,
“(t.tradeId)”,
“(r.firstOrderExposure)”
]
},
{
#operator”: “Parallel”,
“~child”: {
#operator”: “Sequence”,
“~children”: [
{
#operator”: “InitialProject”,
“result_terms”: [
{
“expr”: “(t.name)”
},
{
“expr”: “(t.tradeId)”
},
{
“expr”: “sum((r.firstOrderExposure))”
}
]
},
{
#operator”: “FinalProject”
}
]
}
}
]
},
{
#operator”: “Limit”,
“expr”: “100”
}
]
},
“text”: “SELECT t.name, t.tradeId, SUM(r.firstOrderExposure)\nFROM Dart r \nJOIN Dart t \nON KEYS r.parentTradeId\nWHERE t.name IS NOT NULL AND t.tradeId IS NOT NULL AND r.firstOrderExposure IS NOT NULL\nGROUP BY t.name, t.tradeId, r.firstOrderExposure\nLIMIT 100”
}

How do I get this JOIN to use the ix_nameTradeIdFirstOrderExposure index to speed up the grouping?


#2

JOINS are LEFT to RIGHT and u need to have predicate on leading key of LEFT keyspace to pick the index.

Based on the query Index JOIN is right one, Following are query and indexes

SELECT t.name, t.tradeId, SUM(r.firstOrderExposure)
FROM Dart t
JOIN Dart r
ON KEY r.parentTradeId FOR t
WHERE t.type = "trade" AND t.name IS NOT NULL AND t.tradeId IS NOT NULL  AND  r.type = "risk"
GROUP BY t.name, t.tradeId
LIMIT 100;

CREATE INDEX ix1 ON Dart(name,tradeId) WHERE type = "trade";
CREATE INDEX ix2 ON Dart(parentTradeId,firstOrderExposure) WHERE type = "risk";

#3

Thank you for your solution. It worked using indexes and the query now works well.