Index join using an array index


#1

Hi !

I am trying to mimic a right to left join using the method explained in the help (https://developer.couchbase.com/documentation/server/current/n1ql/n1ql-language-reference/from.html#concept_rnt_zfk_np__index-join). It is explained that it shall to be performed using an index join.

my documents are:
bucket users:
{"id":"1", "name":"foo", "owned_lists": "1"}
bucket lists:
{"id":"1", "name":"bar"}

When I execute the following n1ql with an index created on a field of my document, it works:
SELECT s.* FROM lists s JOIN users u ON KEY u.owned_lists FOR s
this requires to create an index on the field u.owned_lists with this:
CREATE INDEX test ON users(owned_lists)

However if I replace the field by an array of values in the bucket users like this:
bucket users:
{"id":"1", "name":"foo", "owned_lists": ["1"]}

The query does not complete with the following error:
msg": "No index available for join term u",

I tried to create an array index with this:
CREATE INDEX test ON users(DISTINCT owned_lists)

but it does not work.

Any ideas of what I am getting wrong ?


#2
ON KEY <onexpr> 
<onexpr> needs to be exactly matched with leading index key. 

The array index will not work for index join. Also Index has duplicates for array key and result in extra matches. So array index will not be used in Index Join even though ON KEY is scalar field.

If you know which element of array to match you can do that by directly referencing. Not matching any elements
Example: CREATE INDEX test ON users(owned_lists[5])
SELECT s.* FROM lists s JOIN users u ON KEY u.owned_lists[5] FOR s


#3

ANSI joins are available in Couchbase 5.5 (Developer preview later this month).


#4

Hi

Thank you for the informations !
Seems that ANSI join are exactly what I am looking for. I’ll wait for the
developer preview to implement my feature correctly. The IN clause shall do
the trick by looking in the array.


#5

Hi @aacebedo,

5.5 DP available to download. Please let us know any feedback.


#6

can you clarify, what you mean by duplicates for array key. And, why is it a problem?
Is it a problem if the array owned_lists[] doesn’t have duplicate elements?

And , why doesn’t it work if I UNNEST the array, such as:
SELECT s.* FROM lists s UNNEST users.owned_lists uo_item
JOIN users ON KEY users.uo_item FOR s

and, does ansi joins in 5.5 work with array-index and unnest?

thanks.


#7

Index join will not work with Index Join because Join on document key, In case of ARRAY Index it repeats same document for each array value that result in duplicates.

If you UNNEST and then JOIN it works, UNNEST nothing but JOIN.
Example . Left has 1 document with array of 10 items, UNNEST on that array produces 10 items,
Then left side of Index Join becomes 10 items and Index Join on that produces more items.
If that is what you are expecting you can use UNNEST.

ANSI Join works check the link above which has examples.


#8

btw, I verified that ANSI joins work with array index. However, that seems pretty inefficient when compared to the capability of index-join working with array-index (if it can), when the two keyspaces are related with document-key/meta().id (ANSI-join is good when joining based on non-key field). Because, with ansi join(NLJ), the right side keyspace is blindly iterated for all the docs in it. However, the index-join is based on key, it is just an index lookup and it will avoid the inner-loop of NLJ. Isn’t??


#9

ANSI JOIN can also do with document key check last example of the ANSI JOIN link above.

If you have right index on right side it will not iterate through all the documents it does indexscan like same as Index JOIN. The following blog gives all the details https://blog.couchbase.com/ansi-join-support-n1ql/