Joining alot of smaller docs vs one big doc

dipen_patel · October 31, 2018, 7:28pm

Existing Solution
We have about 20 main type docs, and each main type has about 30-40 subtypes
Single Doc

{
"id":"AA",
 "description": {"en":"blah sub type"},
"type":"main-type"
}

{
 "id":"123",
 "description": {"en":"blah sub type"},
 "type":"sub-type"
 "parent":"AA"	
}

We have 1M of these types of docs
{
“id”: “CCCC”,
“mainType”:“AA”,
“subType”:“123”,
“type”:“asset”
}

Select meta(b).id
From bucket b
LEFT JOIN bucket m ON KEYS b.mainType
LEFT JOIN bucket s ON KEYS b.subType
Where type == “asset”
Order by m.description.en, s.description.en

Testing:
Combine the main-type and sub-types all into 1 doc its about 250K. Its doc we know the id of.

Combine doc
{
“id”: “combine-doc”,
“maintypes”: {
“AA”: {
“description”: {“en”:“blah sub type”},
“subTypes”: {
“CCCC”:{
“description”: {“en”:“blah sub type”},
}
}
}

  }

}

Select meta(b).id From bucket b
JOIN bucket m ON KEYS ‘combine-doc’
Where type == “asset”
Order by m.maintypes[b.mainType].description.en, m.maintypes[b.mainType].subTypes[b.subType].description.en

Metrics
Records selected: 13000
Current Solution: 3 seconds
Testing Solution: 20-30 seconds

It seems join a big doc is MUCH slower then pulling a-lot of smaller docs. Can we provide the performance?

vsr1 · October 31, 2018, 8:37pm

Are you sure your query returns results in right order?
maintypes is object not ARRAY.
So m.maintypes[b.mainType].description.en is it giving right value or giving MISSING?

Also ‘combine-doc’ is bigger and you are fetching for every document again.

CREATE INDEX ix1 ON bucket(mainType, subType) WHERE type ="asset";
SELECT RAW META(b).id
FROM bucket AS b
LET m = (SELECT RAW d FROM bucket AS d USE KEYS ["combine-doc"])[0]
WHERE b.type = "asset" AND b.mainType IS NOT NULL
ORDER BY m.maintypes.[b.mainType].description.en, m.subTypes.[b.subType].description.en;

dipen_patel · November 1, 2018, 1:24pm

Thanks the Let worked.