Finding documents with duplicate field content

Hi,

Couchbase server 5.0.

Is there a way to use a N1QL query to find documents containing fields with duplicate content (same field value as another document)? I could probably use a view to do this, but it would be much simpler if I could use N1QL.

Example document:

{
“id”: “[UUID]”
}

Find all documents which share value of the “id” field.

Thanks

For each id it gives array of document keys that has same value of id

SELECT d.id, ARRAY_AGG(META(d).id)  AS dockeys
FROM default AS d
WHERE .....
GROUP BY d.id;

Thanks!

That certainly gives me a starting point to work from. Just a quick follow-up then… this seems to give me result entries for every document, reporting the document itself as the duplicate key, which makes it hard to work with given a large document set.

My reason for this query is to find certain document inconsistencies in a bucket, and would be something to use manually in case where this is noticed.

Would there be a quick way to adapt this to only return results where there are other documents with duplicate field content?

Thanks again!

SELECT d.id, dockeys
FROM default AS d
WHERE .....
GROUP BY d.id
LETTING dockeys = ARRAY_AGG(META(d).id)
HAVING ARRAY_LENGTH(dockeys) > 1;

Thanks!

That work perfectly. :slight_smile: