Full text search and distinct


#1

How can I set a DISTINCT clause for FTS? I want to search geo location data (the geo field) and fetch only distinct place IDs.


#2

Help Please …


#3

Hi,

FTS doesn't have any DISTINCT semantics.
Could you share more detail about the duplicate docID issue with your geo queries, such as a sample query with an example?

thanks,


#4

I have a nearby-search service. Users can share their locations, and each user can have more than one location.
In the search results I want to filter out duplicate users, so that the nearby results show only distinct nearby users.


#5

Would you share a sample document to help us understand how you intend to do the filtering?


#6

| type | User |
|------|------|
| id   | 1    |
| name | Jack |

| type | User    |
|------|---------|
| id   | 2       |
| name | William |


| type   | Location              |
|--------|-----------------------|
| id     | 100                   |
| geo    | {lat:50.11,lon:80.28} |
| userId | 1                     |


| type   | Location              |
|--------|-----------------------|
| id     | 101                   |
| geo    | {lat:50.10,lon:80.27} |
| userId | 1                     |


| type   | Location              |
|--------|-----------------------|
| id     | 102                   |
| geo    | {lat:50.12,lon:80.28} |
| userId | 2                     |

Assume my nearby search matches all of the Location documents above. The FTS result (radius-based geo search) that I expect should contain only one of userId 1's locations, not both. I want to remove duplicate results per userId.


#7

Thanks for explaining that.
So like @sreeks mentioned above, there isn’t a DISTINCT clause with FTS.

The scoring, however, will differ between the docs that share a userId, since it is based on relevance or, in this case, proximity to the geo coordinates.

The application should be able to decide which documents to keep as unique, and it can do so by selecting the one with the highest score.
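
A minimal sketch of that application-side selection, in Python. It assumes each hit has been turned into a dict with a `score` and a `fields` map containing `userId` (which in turn assumes `userId` is stored in the index and requested in the search). The function name and the sample hits are hypothetical, not part of any Couchbase SDK.

```python
from typing import Any


def dedupe_by_user(hits: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the highest-scoring hit for each userId."""
    best: dict[Any, dict[str, Any]] = {}
    for hit in hits:
        user_id = hit["fields"]["userId"]
        current = best.get(user_id)
        # Replace the stored hit only when this one scores strictly higher.
        if current is None or hit["score"] > current["score"]:
            best[user_id] = hit
    # Return the surviving hits ordered by descending score.
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)


# Hypothetical hits for Location documents 100, 101 and 102 (scores are made up).
sample_hits = [
    {"id": "100", "score": 0.9, "fields": {"userId": 1}},
    {"id": "101", "score": 0.7, "fields": {"userId": 1}},
    {"id": "102", "score": 0.8, "fields": {"userId": 2}},
]
unique_hits = dedupe_by_user(sample_hits)
print([h["id"] for h in unique_hits])
```

Because duplicates are dropped after the search returns, the application may need to request more hits than it plans to display.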


#8

By default, radius-based geo search results are not ranked by distance from the location given in the query. However, you can achieve that by sorting the results on geo_distance, which gives the ranking/scoring that @abhinav mentioned here.

ref: https://docs.couchbase.com/server/6.0/fts/fts-geospatial-queries.html
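
A rough sketch of what such a request could look like, built as a Python dict. The body follows the geo distance query and `geo_distance` sort syntax as I understand it from the page linked above, so please check it against the docs for your server version. The centre point, the `10km` radius and the `km` unit are made-up values, `geo` and `userId` come from the sample documents, and `nearest_per_user` is a hypothetical helper that assumes `userId` is returned in each hit's `fields`.

```python
import json

# Hypothetical search centre; replace with the user's actual position.
center = {"lon": 80.28, "lat": 50.11}

search_request = {
    "from": 0,
    "size": 10,
    # Radius-based geo query on the "geo" field.
    "query": {"location": center, "distance": "10km", "field": "geo"},
    # Assumes userId is stored in the index so it can be returned with each hit.
    "fields": ["userId"],
    # Order hits by distance from the centre, nearest first.
    "sort": [
        {"by": "geo_distance", "field": "geo", "unit": "km", "location": center}
    ],
}


def nearest_per_user(hits):
    """Keep the first hit per userId; with the sort above that is the nearest one."""
    seen = set()
    unique = []
    for hit in hits:
        user_id = hit["fields"]["userId"]
        if user_id not in seen:
            seen.add(user_id)
            unique.append(hit)
    return unique


print(json.dumps(search_request, indent=2))

# Hypothetical hits, already ordered nearest first.
sorted_hits = [
    {"id": "100", "fields": {"userId": 1}},
    {"id": "101", "fields": {"userId": 1}},
    {"id": "102", "fields": {"userId": 2}},
]
print([h["id"] for h in nearest_per_user(sorted_hits)])
```

With the results already sorted by distance, keeping the first hit per `userId` gives each user's nearest location without relying on the relevance score.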