Views and count distinct


#1

I have searched a lot but could not find a solution. I have documents like:

{
   "client":"a",
   "user":"b",
   ...
}

{
   "client":"a",
   "user":"b",
   ...
}

{
   "client":"a",
   "user":"c",
   ...
}

I want a view that shows how many DISTINCT users exist for client "a".


#2

Please reply …


#3

Any ideas?


#4

:sob: help please …


#5

After 5 days!! UP …:disappointed_relieved:


#6

@matthew.groves @brett19 @vsr1 @ingenthr I really need your help.


#7

@socketman2016, Just got alerted on this one. Started looking at it. We shall respond soon.


#8

Hi @socketman2016,

Does it need to be a map/reduce view, or is N1QL on the table? (I notice you tagged @vsr1.)


#9

In N1QL it is easy.
I want to know how to do it in a map/reduce view. The reason is explained here: Question : Cached count

I know one solution would be this: Feature request : _approx_count_distinct


#10

Hi @socketman2016,

You could use N1QL embedded in an Eventing handler and store the result in a bucket. You can then refresh the aggregate periodically using timers.

Please see:
https://blog.couchbase.com/using-n1ql-with-couchbase-eventing-functions/
https://docs.couchbase.com/server/5.5/eventing/eventing-examples.html
https://blog.couchbase.com/timers-couchbase-functions/

Best Regards,
Siri


#11

Hi @socketman2016,

You could create a map function that emits the client and user as a compound key, with _count as the reduce function.
Then call the view with group_level=2, a start key of [client, null] and an end key of [client, "\uffff"], and count the number of rows returned. Each row is one distinct [client, user] pair, so the row count is the number of distinct users for the client given in startkey[0].

Map:
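function (doc, meta) {
  // Emit one row per document, keyed by [client, user]; skip documents missing either field.
  if (doc.client && doc.user) {
    emit([doc.client, doc.user], null);
  }
}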

reduce: _count

Query:
Querying for the distinct users of client "a":
stale=false&inclusive_end=true&full_set=true&group_level=2&startkey=%5B"a"%2C%20null%5D&endkey=%5B"a"%2C%20"%5Cuffff"%5D
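The query string above is easy to get wrong by hand, so here is a minimal sketch of building it in JavaScript. The helper name distinctUsersQuery is hypothetical (it is not part of any Couchbase SDK), and it only mirrors the parameters listed in this post. It relies on the fact that view keys are JSON values, so each bound is serialized with JSON.stringify and then URL-encoded. Note that JSON.stringify writes the U+FFFF character literally rather than as the \uffff escape; both forms are the same JSON string.

```javascript
// Hypothetical helper: builds the view query string for one client.
// Parameter names are the ones used in the post above.
function distinctUsersQuery(client) {
  const params = {
    stale: "false",
    inclusive_end: "true",
    full_set: "true",
    group_level: "2",
    // Keys are JSON arrays: [client, null] sorts before every user,
    // [client, "\uffff"] sorts after (almost) every user string.
    startkey: JSON.stringify([client, null]),
    endkey: JSON.stringify([client, "\uffff"]),
  };
  return Object.entries(params)
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`)
    .join("&");
}

console.log(distinctUsersQuery("a"));
```

The resulting string can be appended to the view URL after a "?".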

Result:
{"rows":[
{"key":["a","Value1"],"value":100},
{"key":["a","Value2"],"value":100}
]
}
Counting the number of rows gives the number of distinct users for client "a". In this example there are two rows, so two distinct users.
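To see why the row count equals the distinct-user count, the map and _count steps can be simulated in plain JavaScript against the three sample documents from the question. This is only a sketch: simulateView and countDistinctUsers are hypothetical stand-ins for what the view engine does, not Couchbase APIs, and a real view would also return the rows sorted by key.

```javascript
// Sample documents from the question (other fields omitted).
const docs = [
  { client: "a", user: "b" },
  { client: "a", user: "b" },
  { client: "a", user: "c" },
];

// Stand-in for the view engine: runs the map function over every document,
// then applies _count grouped on the full [client, user] key (group_level=2).
function simulateView(documents) {
  const emitted = [];
  const emit = (key, value) => emitted.push({ key, value });
  const map = (doc) => emit([doc.client, doc.user], null);
  documents.forEach(map);

  const groups = new Map();
  for (const { key } of emitted) {
    const id = JSON.stringify(key);
    groups.set(id, (groups.get(id) || 0) + 1);
  }
  return [...groups].map(([id, count]) => ({ key: JSON.parse(id), value: count }));
}

// Keep only the rows for one client; the number of rows is the distinct-user count.
function countDistinctUsers(rows, client) {
  return rows.filter((row) => row.key[0] === client).length;
}

const rows = simulateView(docs);
console.log(rows);
console.log(countDistinctUsers(rows, "a")); // 2 (users "b" and "c")
```

The value of each row is how many documents share that [client, user] pair; only the number of rows matters for the distinct count.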


#12

@AnkitPrabhu your approach does not scale.
What about 100 million distinct users per client?
N1QL works better than your approach.