Join with non primary key

Is it possible to perform join operation with non primary key attribute with both buckets
If I need to perform the following query
The two buckets are :
Conditions
• CODE
• DESCRIPTION
• ENCOUNTER
• PATIENT
• STARTDATE
• STOP

And encounters
• BASE_ENCOUNTER_COST (number)
• CODE (number)
• DESCRIPTION (string)
• ENCOUNTERCLASS (string)
• Id (string)
• ORGANIZATION (string)
• PATIENT
• PAYER (string)
• PAYER_COVERAGE (number)
• PROVIDER (string)
• REASONCODE (number)
• REASONDESCRIPTION
• STARTDATE (string)
• STOP (string)
• TOTAL_CLAIM_COST

Select COVID-19 conditions in the simulation for every patient
SELECT ARRAY_AGG( DESCRIPTION),
conditions. PATIENT
FROM conditions
JOIN encounters
ON conditions. PATIENT = encounters. PATIENT
AND encounters.REASONDESCRIPTION =“COVID-19”
GROUP BY conditions. PATIENT ;
The same for another query
All observation for patient didn’t diagnosed
The buckets are observations

CODE
DATE
DESCRIPTION
ENCOUNTER
OBSERVATIONVALUE
PATIENT

SELECT encounters. PATIENT
observations. PATIENT,
observations. DESCRIPTION,
observations. OBSERVATIONVALUE
FROM encounters
JOIN observations ON observations. PATIENT = encounters. PATIENT
WHERE encounters. REASONDESCRIPTION IS NOT MISSING ;

It is possible. Did you tried it. Did u see any issue.
The following link explain 17 Use cases ANSI JOIN Support in N1QL | The Couchbase Blog

it give me
“msg”: “Ambiguous reference to field DESCRIPTION.”,

i dont know which attribute is right to use in on condition to join ( patient or encounter )
ON conditions. ENCOUNTER = encounters. Id
or ON conditions. PATIENT = encounters. PATIENT

You need to make decision what is your join condition. You should fully qualify with left of JOIN alias or right of JOIN alias

1 Like

for the second query it take very long time and give me the execution plan

SELECT observations. PATIENT,

   observations. DESCRIPTION,

   observations. OBSERVATIONVALUE

FROM observations

JOIN encounters ON observations. PATIENT = encounters. PATIENT

WHERE encounters. REASONDESCRIPTION IS MISSING ;

i think the idea in both query to retrieve data from one bucket based on acondition from another bucket

CREATE INDEX ix1 ON encounters(PATIENT) WHERE REASONDESCRIPTION IS MISSING;
CREATE INDEX ix2 ON observations(PATIENT, DESCRIPTION, OBSERVATIONVALUE);
SELECT o.PATIENT, o.DESCRIPTION, o.OBSERVATIONVALUE
FROM encounters AS e
JOIN observations AS o ON o.PATIENT = e.PATIENT
WHERE e.REASONDESCRIPTION IS MISSING ;

it give me the same result (run for more than five minutes and finally give me the query paln )

when i tried another condition value it run . is it could be due to large result ?

if i need to group the result like that

SELECT o. ARRAY_AGG (PATIENT), o.DESCRIPTION, o.OBSERVATIONVALUE
FROM encounters AS e
JOIN observations AS o ON o.PATIENT = e.PATIENT
WHERE e.REASONDESCRIPTION =“COVID-19”
GROUP by o.PATIENT ;

why it give me an error
“msg”: “Expression (o.DESCRIPTION) must depend only on group keys or aggregates.”,
thanks in advance

The error clearly says what it is you can’t project o.DESCRIPTION, o.OBSERVATIONVALUE those are not part of group by.

SELECT o.PATIENT, o.DESCRIPTION, o.OBSERVATIONVALUE
FROM ( SELECT DISTINCT e.PATIENT 
                  FROM encounters AS e
                  WHERE e.REASONDESCRIPTION =“COVID-19”) AS d
JOIN  observations AS o ON o.PATIENT = d.PATIENT;

OR

SELECT   o.PATIENT, ARRAY_AGG ({ o.DESCRIPTION, o.OBSERVATIONVALUE}) AS reason
FROM encounters AS e
JOIN observations AS o ON o.PATIENT = e.PATIENT
WHERE e.REASONDESCRIPTION =“COVID-19”
GROUP by o.PATIENT ;
1 Like