Using subselect as joined


#1

When performing a JOIN operation, i have noticed that a subselect can be used on the left side of the join, but when trying it on the right side i could’nt make the query get compiled due to syntax errors.
Is there any way i can do something like:
SELECT * FROM bucket1 JOIN (SELECT * FROM bucket2 where…)


#2

Hi, right now we only support Left outer joins. @geraldss can tell you more about this, he is our N1QL master.


#3

you can do this using the JOIN syntax today.
SELECT c.state FROM
(SELECT b.brewery_id FROM beer-sample b WHERE type=“beer”) as a
JOIN beer-sample c ON KEYS a.brewery_id
WHERE c.type=“brewery”;

would this help?


#4

Hi, thanks for the reply.
Yes, i noticed that this syntax works, but the syntax i was asking about should have the subselect on the right side of the JOIN.

for example, if B is my bucket and S is a select expression, i am aware of being able to do SELECT S JOIN B but what about SELECT B JOIN S


#5

Hi @eyalrubichi,

No, that syntax is not currently supported. The reason is that N1QL performs JOINs by doing a key-value lookup of the right-hand side. This is not possible for generalized subqueries.

If you post or email your complete query, it may be rewritable using JOIN syntax. The right-hand subquery can usually be inlined into the outer query.

Thanks,
Gerald


#6

Hi,
thanks, here is the query that i want to write in an efficient way.
The data exists in one bucket with two ‘schemas’, user and user_device, while each user_device document has a user_id reference (so the relation is many-to-one and not one-to-many).
The ouput of the query should be all the users which the last activity of the recent device they used is between the given timestamps.

Currently i could only do this by using EXCEPT. While the first part runs quickly (small subset of the documents i need but this subset contains also documents that i don’t want), the second part kills that couchbase and calculation takes forever.

If i could use the first part and join this data (by user_id) it would run much faster, but i use CB4.0 and probably have no option of joining the data in the way that i want.

SELECT
meta(usr).id user_id,
usr.email as user_email,
FROM users dvc JOIN users usr ON KEYS dvc.user_id
WHERE
dvc.last_run BETWEEN 1452067057 AND 1452097057
GROUP BY meta(usr).id
EXCEPT
SELECT
meta(usr).id user_id,
usr.email as user_email,
FROM users dvc JOIN users usr ON KEYS dvc.user_id
WHERE
dvc.last_run > 1452097057
GROUP BY meta(usr).id;


#7

Can you try the following query, and make sure you have a covering index on dvc(last_run, user_id).

SELECT 	meta(usr).id user_id,
	MIN(usr.email) as user_email
FROM users dvc JOIN users usr ON KEYS dvc.user_id
WHERE	dvc.last_run >= 1452067057
GROUP BY meta(usr).id
HAVING MAX(dvc.last_run) <= 1452097057;