N1ql query to retrieve data from 2 buckets

@vsr1
I have 2 buckets history and cms. Data looks like this in our documents.

history

[
{
“history”: {
“id”: “historyID1”,
“instanceRef”: “ID1”,
“type”: “history”
}
},
{
“history”: {
“id”: “historyID2”,
“instanceRef”: “ID2”,
“type”: “history”
}
},
{
“history”: {
“id”: “historyID3”,
“instanceRef”: “ID3”,
“type”: “history”
}
},



]

cms
[
{
“cms”: {
“id”: “ID1”,
“history”: [
“historyID1”,
“historyID4”,
“historyID5”
],
}
},
{
“cms”: {
“id”: “ID2”,
“history”: [
“historyID2”,
],
}
},



]

notes: history in cms document is array. I am trying to flatten after retrieval.
CMS ID === History InstanceRef

requirement: I am trying to get documents with the below points :
0. I need to find cms ids that are not in history and retrieve ids and history ids.

  1. how can I get data as array [{id1, historyid1}, {id1, historyid2}, …]
  2. I am trying by joining both buckets
CREATE INDEX ix1 ON history(instanceRef ) WHERE type = "history";
CRETE INDEX ix2 ON cms(id) WHERE ARRAY_LENGTH(history) > 0 ;

SELECT d.id, historyid
FROM (SELECT c.id, c.history
                 FROM cms AS c
                 LEFT JOIN history AS h ON c.id == h.instanceRef AND h.type = "history"
                  WHERE  c.id IS NOT NULL  AND ARRAY_LENGTH(c.history) > 0 AND h IS MISSING ) AS d
UNNEST d.history AS historyid;
1 Like

Thank you so much!
Added as solution

@vsr1 - in the query - h is missing
what does it mean? any other way we can say h.instanceRef is missing

It is LEFT JOIN. When c.id == h.instanceRef AND h.type = “history” condition is not satisfied it will produce h as MISSING.
That is your requirement “I need to find cms ids that are not in history”

thank you vsr1 got it!