Joining on the same bucket, not using a primary key

Hello,

I came across similar questions but none of them yielded the answer I was looking for. Here’s my problem. I have the following records:

{“key1”:“123”, key2:“789”}
{“key1”:“124”, key2:“990”}
{“key1”:“123”, key2:“991”}
{“key1”:“124”, key2:“992”}

Is there any way to return data like

{
[{
“key1”: “123”,
“values”: [
“789”, “990”
]
}, {
“key2”: “124”,
“values”: [
“991”, “992”
]
}, ]
}

I’ve tried a few things, but keep getting the empty result set.

Thanks,
Alex

Hi, should that be the following (see the modified data values):

If so, you can do:

SELECT key1, ARRAY_AGG(key2) AS `values`
FROM mybucket
GROUP BY key1;

Thank you! That works. Follow-up question:

How can I do something like this?

SOURCE DATA:

{“key1”:“123”, key2:“A”, key3: “X”}
{“key1”:“123”, key2:“B”}
{“key1”:“124”, key2:“A”, key3: “Y”}
{“key1”:“124”, key2:“B”}
{“key1”:“125”, key2:“A”, key3: “X”}
{“key1”:“125”, key2:“B”}
{“key1”:“126”, key2:“A”, key3: “X”}
{“key1”:“127”, key2:“B”}

How do I return something like

{
[
{“key3”: “X”, “key1”: 2},
{“key3”: “Y”, “key1”: 1}
]
}

What I am looking for is a co-occurrence of key2=A and key2=B against the same key1 value. This happens twice for key3 of “X”, and once for the key3 of “Y”. In this reduction key1 represents the number of distinct key1 values from the original source set satisfying the rule.

Can this be accomplished?

Here you go. These are standard SQL aggregations.

SELECT key3, COUNT(key1) AS key1
FROM mybucket
GROUP BY key3;

And if you don’t want to count duplicate pairs:

SELECT key3, COUNT(DISTINCT key1) AS key1
FROM mybucket
GROUP BY key3;