Select non-group by fields using sql query

karthikv · May 7, 2018, 10:03pm

I am new to couchbase and I have been going through couchbase documents and other online resources for a while but I could’t get my query working. Below is the data structure and my query:

Table1:
{ “jobId” : “101”, “jobName” : “abcd”, “jobGroup” : “groupa”, “created” : " “2018-05-06T19:13:43.318Z”, “region” : “dev” },
{ “jobId” : “102”, “jobName” : “abcd2”, “jobGroup” : “groupa”, “created” : " “2018-05-06T22:13:43.318Z”, “region” : “dev” },
{ “jobId” : “103”, “jobName” : “abcd3”, “jobGroup” : “groupb”, “created” : " “2018-05-05T19:11:43.318Z”, “region” : “test” }

I need to get the jobId which has the latest job information (max on created timestamp) for a given jobGroup and region (group by jobGroup and region).

My sql query doesn’t help me using self-join on jobId.
Query:

/* Idea is to pull out the job which was executed latest for all possible groups and region and print the details of that particular job*/
select * from (select max(DATE_FORMAT_STR(j.created,‘1111-11-11T00:00:00+00:00’)) as latest, j.jobGroup, j.region from table1 j group by jobGroup, region) as viewtable join table t on keys meta(t).id where viewtable.latest in t.created and t.jobGroup = viewtable.jobGroup and viewtable.region = t.region
Error Result: No result displayed
Desired result :
{
“jobId” : “102”,
“jobName”:“abcd2”,
“jobGroup”:“groupa”,
“latest” :“2018-05-06T22:13:43.318Z”,
“region”:“dev”
},
{
“jobId” : “103”,
“jobName” : “abcd3”,
“jobGroup” : “groupb”,
“created” : " “2018-05-05T19:11:43.318Z”,
“region” : “test”
}

vsr1 · May 7, 2018, 10:15pm

SELECT MAX([created,jobId])[1] AS jobId 
FROM default WHERE ..  GROUP BY jobGroup, region;

eben · May 9, 2018, 2:00pm

To expand on @vsr1’s answer, here is the reply I posted on StackOverflow:

This can be answered using ‘group by’ and no join. I tried entering your sample data and the following query gives the correct result:

select max([created,d])[1] max_for_group_region 
 from default d
 group by jobGroup, region;

How does it work? It uses ‘group by’ to group documents by jobGroup and region, then creates a two-element array holding, for every document in the group:

the ‘created’ timestamp field
the document where the timestamp came from

It then applies the max function on the set of 2-element arrays. The max of a set of arrays looks for the maximum value in the first array position, and if there’s a tie look at the second position, and so on. In this case we are getting the two-element array with the max timestamp.

Now we have an array [ timestamp, document ], so we apply [1] to extract just the document.

karthikv · May 9, 2018, 4:37pm

@vsr1 @eben :
Thank you very much for the query and explanation, it worked

vsr1 · May 9, 2018, 4:41pm

If you want select whole document Check out JOIN query in above link for better performance avoid unnecessary fetch (i.e make covered query for document that qualified and fetch those documents).