Couchbase deletion of documents


#1

I am using Couchbase 4.1. When I delete a document, a new revision of the document is created with the attribute deleted=True. When compaction runs, older revisions are deleted, but I am curious to know when the latest revision (which holds only metadata) is going to be deleted from Couchbase permanently.
Is there any setting I am missing?


#2

I have set auto-compaction with purge, but tombstone documents are not being deleted from Couchbase. Any help on this, please?


#3

As you have correctly identified, due to its append-only nature Couchbase Server does not actually remove a document when it receives a delete operation. Instead it will create a new revision of the document which indicates that it has been ‘deleted’.


I am curious to know when the latest revision (which holds only metadata) is going to be deleted from Couchbase permanently.

As explained in our compaction documentation, this metadata is retained for three days by default after the document has been deleted, after which compaction will purge this metadata.


The reason that we retain this metadata for so long after the document has been deleted is to ensure data consistency between two clusters connected via XDCR.
Imagine a situation where you have two clusters, cluster A and cluster B. Cluster A is linked bidirectionally to B via XDCR, so all documents are replicated from A->B and from B->A.
In this situation a document is deleted on cluster B, but then auto-compaction kicks in immediately.

If Couchbase Server immediately removed the metadata of deleted documents, then no matter what, cluster A would recreate this document on cluster B based on its own data. This is unexpected behaviour and breaks the semantics of XDCR, creating an ‘inconsistency’.

However, as we retain the metadata of deleted documents for a prolonged period of time, the clusters can perform conflict resolution based on revision number against the ‘deleted’ document on cluster B, so the document is deleted on cluster A as well.

This is an extreme example; however, there are times when XDCR can take a very long time to replicate between two clusters, so it is important that we keep the metadata around long enough for this not to be a risk, even in very high latency environments.
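To make the scenario above concrete, here is a minimal Python sketch of revision-based conflict resolution. It is a deliberately simplified model, not Couchbase's actual implementation: `DocMeta` and `resolve` are hypothetical names, and the only rule modelled is "highest revision number wins, and a missing entry always loses".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocMeta:
    rev: int       # revision counter, incremented on every mutation (assumed model)
    deleted: bool  # True when this revision is a tombstone


def resolve(local: Optional[DocMeta], incoming: Optional[DocMeta]) -> Optional[DocMeta]:
    """Pick a winner by highest revision number; a missing entry always loses."""
    if local is None:
        return incoming
    if incoming is None:
        return local
    return incoming if incoming.rev > local.rev else local


# Cluster A still holds the live document at revision 3 and replicates it to B.
from_a = DocMeta(rev=3, deleted=False)

# Case 1: B kept the tombstone (revision 4), so the delete wins on B.
kept = resolve(local=DocMeta(rev=4, deleted=True), incoming=from_a)

# Case 2: B already purged the tombstone, so it has nothing to compare against
# and accepts A's copy, which resurrects the document.
purged = resolve(local=None, incoming=from_a)

print("tombstone kept   -> deleted:", kept.deleted)
print("tombstone purged -> deleted:", purged.deleted)
```

In the first case the delete survives and can then be replicated back to A; in the second the document reappears on B, which is the inconsistency described above.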


Apologies for going into slightly more detail than you asked for, but I thought it was important to understand why Couchbase Server does not immediately remove the metadata of deleted documents, so that you can make an informed decision as to how to handle this issue.

If in your environment you are not using any XDCR and you wish to purge this metadata more often to conserve disk space, then I suggest altering the Metadata Purge Interval in the auto-compaction settings to a more suitable value for your use-case.
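As a rough sketch of changing this setting outside the web console, the snippet below builds (but does not send) a request to the cluster-wide auto-compaction REST endpoint, `POST /controller/setAutoCompaction`, with the `purgeInterval` parameter expressed in days. The helper name `build_purge_interval_request` and the host are hypothetical, authentication is omitted, and I am assuming that `parallelDBAndViewCompaction` has to be sent along with it. It is worth checking the REST API documentation for your version to see whether thresholds you leave out of the request are reset.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_purge_interval_request(base_url: str, purge_interval_days: float,
                                 parallel_compaction: bool = True) -> Request:
    """Build (without sending) a POST that sets the metadata purge interval."""
    body = urlencode({
        # Assumed to be required by the endpoint alongside purgeInterval.
        "parallelDBAndViewCompaction": "true" if parallel_compaction else "false",
        # The interval is expressed in days; 0.04 is roughly one hour.
        "purgeInterval": purge_interval_days,
    }).encode("ascii")
    return Request(f"{base_url}/controller/setAutoCompaction",
                   data=body, method="POST")


# Hypothetical single-node cluster on the default admin port.
req = build_purge_interval_request("http://localhost:8091", 0.04)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

To actually apply it you would add credentials and pass the request to `urllib.request.urlopen`, ideally against a test cluster first.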


#4

Thanks Matt for the detailed explanation.
I have set the metadata purge interval to 0.04 (i.e. about 1 hour) and auto-compaction is also running every day, but no tombstones are being deleted from Couchbase.
I have deleted X documents, and even after 2-3 days the tombstones have still not been deleted permanently.


#5

That’s interesting, how are you determining that these tombstones are not being deleted?

Additionally, could you verify that your bucket-level auto-compaction settings have been set correctly?
By default, the bucket-level auto-compaction settings do not override the cluster-wide settings, but that may be what is happening here.

You can verify these settings by hitting the REST endpoint documented in the REST API documentation.
Could you please paste your auto-compaction settings for the cluster as well as for the bucket in question?

Perhaps this will shed more light on the issue!


#6

I have only set compaction at the cluster level. Below are my cluster settings:

{"autoCompactionSettings":{"parallelDBAndViewCompaction":true,"allowedTimePeriod":{"fromHour":0,"toHour":23,"fromMinute":0,"toMinute":0,"abortOutside":true},"databaseFragmentationThreshold":{"percentage":30,"size":"undefined"},"viewFragmentationThreshold":{"percentage":30,"size":"undefined"},"indexFragmentationThreshold":{"percentage":30}},"purgeInterval":0.04}

At the bucket level I have not overridden the compaction settings, hence I get "autoCompactionSettings":false in the REST API response.

I think the metadata purge interval is not working here.
I can see that older revisions are being deleted from the server, but the latest revision, which has the attribute _deleted=true, is not being deleted.
I have written a view to fetch documents with _deleted=true and can still see many tombstones that are more than 15 days old.
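As a cross-check of the settings pasted above, this small Python sketch works out which purge interval applies to a bucket and converts it from days to hours. The function name `effective_purge_interval_hours` is hypothetical, the cluster sample is trimmed to the relevant keys, and I am assuming that a bucket with an override reports a top-level `purgeInterval` next to a non-false `autoCompactionSettings`. Only the `"autoCompactionSettings":false` case comes from this thread.

```python
import json
from typing import Any, Dict


def effective_purge_interval_hours(cluster: Dict[str, Any],
                                   bucket: Dict[str, Any]) -> float:
    """Return the metadata purge interval that applies to the bucket, in hours."""
    # A bucket without its own override reports autoCompactionSettings as false,
    # in which case the cluster-wide value applies.
    if bucket.get("autoCompactionSettings") and "purgeInterval" in bucket:
        days = bucket["purgeInterval"]  # assumed shape of a bucket-level override
    else:
        days = cluster["purgeInterval"]
    return days * 24.0


# Cluster settings trimmed to the keys that matter for this check.
cluster_settings = json.loads(
    '{"autoCompactionSettings": {"parallelDBAndViewCompaction": true},'
    ' "purgeInterval": 0.04}'
)
bucket_settings = json.loads('{"autoCompactionSettings": false}')

hours = effective_purge_interval_hours(cluster_settings, bucket_settings)
print(f"effective purge interval: {hours:.2f} hours")
```

With these values the interval is a little under one hour, so the cluster-level setting on its own does not explain tombstones that are more than 15 days old.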


#7

Any help on the above, please?


#8

We’re having a similar problem and any pointers on best ways to resolve would be great. Here’s our problem. We’re building an iOS app + Admin website with Couchbase Mobile 1.2 and Couchbase Server 4.1. The app is a basic employee directory app. The problem we have is that when an employee leaves, we delete the employee document on the server but the employee keeps popping up on the app. We get into the tombstone process and the employee record never really gets deleted or never goes away from the mobile app database. We’ve tried purging the sync gateway cache, compaction, etc. but no clean database/sync level solution as yet. What’s the best way to do this? How do we delete a record on the server and have that “delete” propagate to all devices quickly?


#9

Apologies for not getting back to this sooner.

I realise now that you are probably discussing documents created via Sync Gateway. Is that the case?

If so, the purge feature that was added as part of Sync Gateway 1.2 will actually delete all documents which have been marked as deleted=true.

Let me know if this helps!
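For reference, here is a hedged Python sketch of what a purge call might look like. As far as I understand the Sync Gateway admin REST API, purge is a `POST` to `/{db}/_purge` on the admin port, with a JSON body mapping each document ID to `["*"]`, so the IDs of the deleted documents have to be supplied explicitly. The helper name `build_purge_request`, the database name `directory`, and the document ID are all made up for illustration. The request is only built here, not sent.

```python
import json
from typing import Iterable
from urllib.request import Request


def build_purge_request(admin_url: str, db: str, doc_ids: Iterable[str]) -> Request:
    """Build (without sending) a Sync Gateway admin _purge request."""
    # Each document ID maps to ["*"], meaning all of its revisions.
    body = json.dumps({doc_id: ["*"] for doc_id in doc_ids}).encode("utf-8")
    return Request(f"{admin_url}/{db}/_purge", data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


# Hypothetical admin endpoint, database name, and document ID.
req = build_purge_request("http://localhost:4985", "directory", ["employee::42"])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Please verify the endpoint and body format against the documentation for your Sync Gateway version before running it against real data.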


#10

Yes Matt, the documents are created and deleted through Sync Gateway. Purge does what is expected. I am just curious to know: what is the significance of the Metadata Purge Interval in the auto-compaction settings?


#11

I’m glad that the purge feature of sync_gateway worked for you!

The Metadata Purge Interval is the amount of time that metadata of a deleted document (deleted from Couchbase Server, not sync gateway) is retained.

I explained why this exists in a lot of detail in my original response to your question.


#12

Matt, so the Metadata Purge Interval applies to documents deleted from Couchbase via the SDKs,
and it does not apply to documents deleted through Sync Gateway?


#13

Yes that is exactly the case!

Documents deleted via Sync Gateway normally (i.e. without a purge) simply get a new revision with a field called deleted=true, so that Sync Gateway knows not to access the document and to treat it as ‘deleted’.
I believe, although you may want to check this, that the Sync Gateway purge also issues the same delete command as an SDK would, so it is probably expected that tombstones will exist for the purged documents until their metadata has also been removed as part of the compaction process.


#14

Thanks Matt. In that case, can we use the Couchbase SDKs to permanently delete from Couchbase the documents that were deleted via Sync Gateway normally (i.e. without a purge, so they carry the field deleted=true), instead of using the purge through Sync Gateway (just to avoid the load on Sync Gateway)?


#15

I cannot see a reason why not, and I will test this locally when I get a chance. Personally I do not have a lot of experience with Sync Gateway, so I am not sure of the exact behaviour.

Perhaps you could test this in a pre-production environment to see if this works for you?
It is possible that Sync Gateway also has internal caches that the purge command removes the document from, so it is worth verifying Sync Gateway's behaviour when you do this.


#16

Thanks Matt for the quick reply. I will test and verify.