How to use chained JOIN in N1QL


#1

Hi,
We are working on the Couchbase connector in StrongLoop application.
My work is to translate the StrongLoop syntax into N1QL syntax and submit the N1QL statement to Couchbase server. I have couple of questions regarding how to use N1QL “JOIN” statement.

Here is my scenario:

there are three types of documents, “TicketUser”, “UserAddress” and “Ticket”. Each TicketUser could have multiple UserAddress, and also have multiple Ticket.

The testing data has two TicketUser documents:

ticket_user_0000001
{
“doctype”: “TicketUser”,
“name”: “Eric Chou”,
“occupation”: “Software Engineer”
}

ticket_user_0000002
{
“doctype”: “TicketUser”,
“name”: “Nolan Dubeau”,
“occupation”: “Lead Software Developer”
}
And there are three UserAddress documents: two UserAddress is for the TicketUser “ticket_user_0000001”, and one UserAddress is for the TicketUser “ticket_user_0000002”

user_address_0000001
{
“doctype”: “UserAddress”,
“address”: “1234 Maud street”,
“city”: “Toronto”,
“user_id”: “ticket_user_0000002”
}
user_address_0000002
{
“doctype”: “UserAddress”,
“address”: “91 Crayford Drive”,
“city”: “Scarborough”,
“user_id”: “ticket_user_0000001”
}
user_address_0000003
{
“doctype”: “UserAddress”,
“address”: “125 Bamburough Circle”,
“city”: “Scarborough”,
“user_id”: “ticket_user_0000001”
}

You can see that the “user_id” field in the UserAddress documents is the foreign key to the TicketUser.
Then, we also have 5 Ticket documents, two of them belong to “ticket_user_0000001” and three of them belong to “ticket_user_0000002”

ticket_0000001
{
“doctype”: “Ticket”,
“style”: “Alert”,
“description”: “Send panic alert to all the others”,
“user_id”: “ticket_user_0000001”
}
ticket_0000002
{
“doctype”: “Ticket”,
“style”: “Regular”,
“description”: “Send regular messages to the friends”,
“user_id”: “ticket_user_0000001”
}
ticket_0000003
{
“doctype”: “Ticket”,
“style”: “Regular”,
“description”: “Send regular messages to the friends”,
“user_id”: “ticket_user_0000002”
}
ticket_0000004
{
“doctype”: “Ticket”,
“style”: “Alert”,
“description”: “Send panic alert to all the others”,
“user_id”: “ticket_user_0000002”
}
ticket_0000005
{
“doctype”: “Ticket”,
“style”: “Alert”,
“description”: “Send panic alert to all the others”,
“user_id”: “ticket_user_0000002”
}

Now, as a baby step try, I try to obtain all the TicketUser along with their UserAddress documents, the N1QL statement is the following:

SELECT * FROM default useraddress JOIN default ticketuser KEYS [useraddress.user_id] WHERE useraddress.doctype=‘UserAddress’ AND ticketuser.doctype=‘TicketUser’

This statement works as I expected: three records were returned.

So, question one: I found that if I swap the two sides of the JOIN, I will get nothing returned, look at this statement:

SELECT * FROM default ticketuser JOIN default useraddress KEYS [useraddress.user_id] WHERE useraddress.doctype=‘UserAddress’ AND ticketuser.doctype=‘TicketUser’

--looks like I have to keep the "useraddress" to be the left side of JOIN. Is that true?

And then, the second question:
I want to collect not only TicketUser and UserAddress documents, but also Ticket documents, I wrote the statement like this but got nothing:

SELECT

  • FROM default useraddress JOIN default ticketuser KEYS
    [useraddress.user_id] LEFT JOIN default ticket KEYS [ticket.user_id]
    WHERE useraddress.doctype=‘UserAddress’ AND ticketuser.doctype=‘TicketUser’ AND ticket.docType=‘Ticket’

–could you let me know how to write the correct syntax for the two JOINs in the same statement?

Thanks in advance


#2

Hi Eric,

In N1QL, a query like:

SELECT * FROM k1 JOIN k2 ON keys

means we are joining k1 to k2 using the keys which should be attributes of k1; these keys will be joined (compared) to the keys of k2. So the query:

SELECT *
FROM default ticketuser JOIN default useraddress KEYS [useraddress.user_id]

is saying “join ticketuser with useraddress and use the field useraddress.user_id of ticketuser to join to useraddress”. Because ticketuser likely has no field called useraddress, the KEYS clause is resolving to empty and the query (correctly) returns no matches.

I think you have a similar issue with the LEFT JOIN part of the second query; ticketuser has no field called ticket, so the keys tickets.user_id will be empty and not match any tickets. Try re-writing the query so that TicketUser is right-most:

SELECT *
FROM default ticket JOIN default useraddress KEYS [ticket.user_id] JOIN default ticketUser KEYS [useraddress.user_id]


#3

Thanks colm, your explanation gives me good idea for understanding the way N1CL works.
However, I tried the following select statement and still got empty result:

SELECT * FROM default ticket JOIN default useraddress KEYS [ticket.user_id] JOIN default ticketuser KEYS [useraddress.user_id] WHERE useraddress.docType=‘UserAddress’ AND ticketuser.docType=‘TicketUser’ AND ticket.docType=‘Ticket’

I believe the problem happens with the first JOIN which is between “ticket” and “useraddress”, since the “ticket.user_id” contains the Document ID of “ticketUser” instead of “userAddress”, the first JOIN gets no result and therefore the whole two JOINs returns nothing.
How to solve this problem? Any idea?

Thanks
Eric


#4

Hi Eric,

I think you said it! “ticket” and “useraddress” do not reference to each other so the first JOIN will return no results. Looking at your data more closely, I see that “useraddress”.user_id refers to “ticketUser” and “ticket”.user_id also refers to “ticketUser”, so we can join “useraddress” with “ticketUser” and also join “ticket” with “ticketUser”. But I don’t see a way to join the three entities; with a three-way join, there’s usually a reference from A to B to C, or an entity A that references the other two - for example, purchase, product, and customer, with purchase referencing both product and customer so we construct a query like SELECT * FROM purchases JOIN product ON KEYS [purchases.product_id] JOIN customer ON KEYS [purchases.customer_id]


#5

Hi,

I know that this is a really old topic but I was wondering if there is a work around to this issue. I have the exact same structure, where I have two types of documents referencing a third type. Since a three-way join is not possible in this case, what can be done to get all of the needed information out of the different types of documents?

Cheers

Michal


#6

Hi Michal,

We are working on bi-directional joins right now, and it will be available soon (perhaps in a pre-release version). Until then, this requires client-side assembly, unfortunately.


#7

Thank you very much for the answer. Looking forward to the new version.