Dataset creation using multiple buckets with JOINs

Hi,
I am new to Couchbase and would like to create a dataset in the Analytics service. The query below uses 3 different buckets/tables, say A, B, and C.
Could you please help me create a dataset with a JOIN? Thanks in advance.

CREATE DATASET ABC_DATASET ON A a
LEFT OUTER JOIN B b ON b.id = a.id
LEFT OUTER JOIN C c ON c.number = b.number

But I am getting a syntax error. Could you please help me create this dataset in Couchbase 6.6?

(Please tag your post with “analytics” too to be sure all followers are notified.)

According to the specification:
https://docs.couchbase.com/server/current/analytics/5_ddl.html#dataset-specification
you can’t create a dataset on a join.

Would you not be able to simply use a SELECT query at consumption time to get the result-set you’re looking for?

SELECT * FROM a LEFT OUTER JOIN b ON a.id = b.id LEFT OUTER JOIN c ON b.`number` = c.`number`;

HTH.

Thank you for the quick reply. Is there any way I can use 2 or more buckets to collect data and create a dataset with that data?

Hi @krishna123 ,

the datasets that can be created on the Analytics service

  1. always contain data from a single bucket (or a subset of a bucket) and are
  2. kept current with respect to all modifications that are applied to the bucket.

To ensure that data is kept current, it is currently necessary to store data from different buckets in separate datasets.
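
To illustrate the "subset of a bucket" case from point 1: a dataset can carry a filter so that it shadows only matching documents. This is a minimal sketch, not something taken from your setup. The `type` field, the value "order", and the dataset name A_ORDERS are hypothetical.

```sql
-- Hypothetical: assumes documents in bucket A carry a `type` field.
-- The dataset shadows only the documents that match the filter.
CREATE DATASET A_ORDERS ON A WHERE `type` = "order";
```

Such a dataset still draws from a single bucket, so the restriction described above applies to it as well.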

To analyze data from multiple buckets you can create datasets on your buckets A, B, and C

CREATE DATASET A_DATASET ON A;
CREATE DATASET B_DATASET ON B;
CREATE DATASET C_DATASET ON C;

and then query the data

SELECT *
FROM A_DATASET a
LEFT OUTER JOIN B_DATASET b ON a.id = b.id
LEFT OUTER JOIN C_DATASET c ON b.`number` = c.`number`;

using the MPP query processor of Couchbase Analytics.
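
One detail that may matter on 6.6: as far as I know, newly created datasets only start shadowing their buckets once the link is connected, so a join run straight after the CREATE DATASET statements can come back empty. A minimal sketch, assuming the default Local link and the dataset names A_DATASET, B_DATASET, and C_DATASET:

```sql
-- Assumption: the datasets were created on the default Local link.
CONNECT LINK Local;

-- Sanity check that documents are arriving before running the join.
SELECT COUNT(*) AS doc_count FROM A_DATASET;
```

If the count stays at zero, check that the bucket actually holds documents and that the link shows as connected.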

Hi There,

But the link below says, under the Analytics Reader role, that “multiple buckets may be combined into a single shadow dataset”.
https://docs.couchbase.com/server/current/learn/security/roles.html

How can this be done?

Best Regards,
Shiva

Hi @shiva ,

unfortunately, this wording is an error in the documentation.
It is not possible at this time to combine data from multiple buckets into a single dataset.
The documentation will be corrected soon.

Thank you for pointing out this error,
Till

Thank you for confirming this.