Join not working as expected - when key does not exist for all joined docs


#1

Ok, I thought I understood this - but apparently there is something missing…

I have this data:

{
  "assockey": "1",
  "date": "2019-02-16T12:00:00+0100",
  "key": "62B05776D04CA5E1C12583A50038F10C",
  "specieskey": "13",
  "type": "Catch"
}

and a related “sub-doc.”:

{
  "key": "62B05776D04CA5E1C12583A50038F10C",
  "personname": "John Dalsgaard",
  "type": "Private"
}

The first has the key: Catch:62B05776D04CA5E1C12583A50038F10C and the second has the key: Catch:Private:62B05776D04CA5E1C12583A50038F10C

Now the situation is that only some of the cathces have a “sub doc.”. However, I seem to only get those docs that have a private doc connected to it:

SELECT t2.*, t1.* FROM data AS t1 LEFT JOIN data AS t2 ON KEYS 'Catch:Private:'||t1.`key` WHERE t1.type='Catch' AND t2.type='Private'

This query returns 2 docs. If I do a “normal” query:

SELECT * FROM data WHERE type='Catch'

… I get 10.000+ docs

As I understand it from the documentation then the LEFT should only require the left side of the join to be valued (i.e. t1).

What have I misunderstood here?


#2

t2.type=‘Private’ predicate causes to eliminate this.

Option 1)

SELECT t2.*, t1.* 
FROM data AS t1 
LET t2 = (SELECT RAW t3 FROM data AS t3 USE KEYS 'Catch:Private:'||t1.`key`  WHERE t3.type='Private')[0]
WHERE t1.type='Catch' ;

Option 2)

SELECT t2.*, t1.* 
FROM data AS t1 
LEFT JOIN data AS t2 
ON KEYS 'Catch:Private:'||t1.`key` 
WHERE t1.type='Catch' AND (t2 IS MISSING OR t2.type='Private')

Option 3) USE ANSI JOIN

SELECT t2.*, t1.* 
FROM data AS t1 
LEFT JOIN data AS t2 
ON  'Catch:Private:'||t1.`key`  = META(t2).id AND t2.type='Private'
WHERE t1.type='Catch' ;

#3

I cannot get option 1 to work. There is a syntax error and adding “… AND t3.type…” isn’t valid either.

The two other options I can get to work :+1:

I don’t really see any significant difference in response time from the two options. So I guess from that point it doesn’t really matter which one I choose. Are there any differences in terms of options to tune the queries? I would like to optimise the query indifferently of which one I select (current response time is about 1.30 secs. for 10K docs.)

So basically “reordering” a little of the conditions on t2.type in my own query makes it work like the ANSI query.

Thanks!


#4
SELECT t2.*, t1.* 
FROM data AS t1 
LET t2 = (SELECT RAW t3 FROM data AS t3 USE KEYS 'Catch:Private:'||t1.`key`  WHERE t3.type='Private')[0]
WHERE t1.type='Catch' ;

#5

Ah, yes! That worked :slight_smile:

A quick test of response times show a little more than 2 secs. for the same data set…