Left outer join + UNNEST + count including empty cases


#1

Hi, I have this problem:

I have this documents for storages and toys
Storages:
{ “id”: “storage::267”,
“main”: {

“type”: “STORAGE”
“status”: “OPEN”
“toys”: [
{ ref: “toy::65”,
type: “ACTION”
},
{ ref: “toy::67”,
type: “SPORTS”
},
],

}
}

Toys:
{ id: “toy::65”,
main: {
“category”: “ACTION”,
“name” : “ActionMan”,
“type”: “TOY”,
“status”: “AVAILABLE”
}
}

I need to list all ACTION AVAILABLE TOYS with the count of all OPEN storages which have the toy, including toys in 0 Storages.
I know i have to LEFT JOIN TOYS (LEFT SIDE) with STORAGE (RIGHT SIDE) and additional UNNEST my STORAGE documents to get all toys in the array. But i dont know how to represent this two ideas in a N1QL query and his indexes.

Please help


#2

Current relation ship you have it needs to use INDEX JOIN but there is no relation ship directly inside right side document (it has inside array). So it is not possible LEFT index JOIN on right side UNNEST.
If your data is small you can do ARRAY joins.


#3

Well, I´m working with millions of toys, but can you give an example with ARRAY joins??


#4

Couple of 100’s may be okay. such large dataset will perform poorly.

SELECT  .....  SUM(ARRAY_COUNT(ARRAY 1 FOR v IN st WHEN v.ref = toys.id END))) 
FROM default toys
LET st = (SELECT  t.ref FROM default AS s  UNNEST s.toys AS t WHERE ...)
WHERE ....

#5

oh right. Now, let´s include all toys with al least 1 storage. I did this query:

SELECT toy.id, count(storage.id) as count, toy.main
FROM CONTENT storage
UNNEST storage.main.toys AS toys
LEFT JOIN CONTENT toy ON KEYS toys.ref
WHERE toy.main.type = ‘TOY’
AND toy.main.category= ‘ACTION’
AND toy.main.status= “AVAILABLE”
AND storage.main.type = ‘STORAGE’
AND storage.main.status = “OPEN”
GROUP BY toy.id, toy.main

Can you suggest me an index for this query? or maybe an improvement of my query?


#6

Use following query and Index. You don’t need UNNEST

   CREATE INDEX ix20 ON CONTENT(main.status, toys[*].ref,id) WHERE main.type = "STORAGE";

    SELECT toy.id, count(storage.id) as count, toy.main
    FROM CONTENT storage
    LEFT JOIN CONTENT toy ON KEYS storage.toys[*].ref
    WHERE toy.main.type = "TOY"
    AND toy.main.category= "ACTION"
    AND toy.main.status= "AVAILABLE"
    AND storage.main.type = "STORAGE"
    AND storage.main.status = "OPEN"
    GROUP BY toy.id, toy.main;

#7

is there a way to define an index join for this query? Right now your solution works, but is very slow (>4 min)with only 100k documents and I gonna be working with millions


#8

Index Join is not possible