Bulk Purge of Tombstones


#1

I have a very large number of documents that have accumulated over time that are tombstoned and are definitely no longer needed by the client. I believe simply using a N1QL query in the workbench would cause the sync gateway to be out of sync with the bucket, and the _purge endpoint of the SG Admin API seems cumbersome to use unless I can somehow get all of the UUIDs into the post body (again, there are a lot of these).

It would be easiest to just bulk delete the tombstones via the workbench and then perhaps kick the SG but I don’t know if that would be enough to stop those now missing tombstones from being attempted to sync.
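
For the _purge route mentioned above, the "all of the UUIDs in the post body" problem can be split into smaller requests. Below is a minimal sketch, assuming the list of tombstoned document IDs has already been collected somewhere, and assuming the _purge body maps each document ID to `["*"]` to cover all of its revisions. The helper name `purge_batches`, the batch size, and the `part::N` IDs are made up for illustration; the sketch only builds the JSON bodies and does not send anything.

```python
import json
from typing import Dict, Iterable, Iterator, List


def purge_batches(
    doc_ids: Iterable[str], batch_size: int = 1000
) -> Iterator[Dict[str, List[str]]]:
    """Yield _purge request bodies, each covering at most batch_size documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: Dict[str, List[str]] = {}
    for doc_id in doc_ids:
        # Assumption: ["*"] requests that all revisions of the document be purged.
        batch[doc_id] = ["*"]
        if len(batch) == batch_size:
            yield batch
            batch = {}
    if batch:
        # Emit the final, partially filled batch.
        yield batch


if __name__ == "__main__":
    ids = [f"part::{n}" for n in range(5)]  # hypothetical document IDs
    for body in purge_batches(ids, batch_size=2):
        # Each body would be POSTed to the database's _purge admin endpoint.
        print(json.dumps(body))
```

Keeping each request small means a failed call only needs that one batch retried, which is worth testing on a development server first.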


#2

Any ideas here? I tried just doing a straight N1QL delete on a development server and it got rid of the tombstones, but if I hit the _changes endpoint on the sync gateway I still see the tombstones there. So I essentially would need to take the bucket offline, resync, and bring it back online. Completely infeasible for production.
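
To check what the sync gateway still reports after a delete like this, the _changes output can be filtered for tombstones. This is a rough sketch that assumes the response has already been fetched and parsed into a dict, and that tombstoned entries carry `"deleted": true` as in the usual _changes format. The function name `tombstoned_ids` and the sample data are hand-written for illustration, not real server output.

```python
from typing import Any, Dict, List


def tombstoned_ids(changes_response: Dict[str, Any]) -> List[str]:
    """Return IDs of entries flagged as deleted in a parsed _changes response."""
    return [
        row["id"]
        for row in changes_response.get("results", [])
        if row.get("deleted") is True
    ]


if __name__ == "__main__":
    sample = {  # hand-written sample, not real server output
        "results": [
            {"seq": 1, "id": "part::1", "changes": [{"rev": "1-abc"}]},
            {"seq": 2, "id": "part::2", "deleted": True,
             "changes": [{"rev": "2-def"}]},
        ],
        "last_seq": "2",
    }
    print(tombstoned_ids(sample))
```

Running this before and after the N1QL delete would show whether the sync gateway's view of the tombstones actually changed.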


#3

Hate to say it, but I’m experiencing the same issue.

We have 100,000 legitimate documents, but those documents need to be refreshed weekly, so an admin deletes all 100,000 and another 100,000 are uploaded.

The problem we are facing is that those documents are just tombstoned and pushed, and then Couchbase Server stores all of our tombstoned revisions. Those are getting pulled into our clients as well, so we’ll quickly end up with 500,000+ documents, mostly tombstones.
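
The growth described here can be put into numbers. This is a back-of-the-envelope sketch, assuming every weekly refresh uploads documents under new IDs (so each deleted document leaves exactly one tombstone behind) and that no tombstones are ever purged. The function name `total_documents` is made up for illustration.

```python
def total_documents(live_docs: int, weekly_refreshes: int) -> int:
    """Live documents plus one tombstone per document deleted in each refresh."""
    # Assumption: new IDs on every upload, and tombstones are never purged.
    tombstones = live_docs * weekly_refreshes
    return live_docs + tombstones


if __name__ == "__main__":
    for weeks in range(5):
        print(weeks, total_documents(100_000, weeks))
```

Under those assumptions, four refreshes of 100,000 documents already give 500,000 documents in total, of which 400,000 are tombstones.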

We are currently looking into updating documents rather than deleting them, but in the meantime, it’d be nice if we could:

  1. Purge these tombstones? Though the problem would be if a client has a document that has been purged out of the server, then it would be pushed up? So probably not a good idea to purge tombstones from server
  2. Stop clients from pulling tombstone revisions. Ideally, a client has documents, pull replication starts, and it notices the server has tombstone revisions. If we could then delete and purge only these documents, that’d be great. The main issue is that now the clients are pulling all of these tombstone revisions that they don’t even care about

It’s possible I’m misunderstanding something, or missing something, or that this is just a limitation of Couchbase, but nevertheless this is something we need a solution for


#4

As you have already pointed out, deletions are synced by design so that clients are notified of the deletion and don’t attempt to push the document.

Which version of CBL / SGW are you on? Options vary depending on which version you are on …


#5

I’ll chime back in here and say that, in a vacuum and then on our production environments, running Couchbase Server 5.1.0 against CBL/SG 2.0 and doing just a simple “delete from bucket_name where _deleted is true and whatever_other_criteria is true” worked to purge the tombstones I wanted, and then triggered a redraw of the sync design views.

Perhaps spin up a dev environment on your machine to test this out before “doing it live”?


#6

Hi, thanks for the response!

We just upgraded everything to CBL 2.1.1, and SGW 2.1.0
CBServer is 5.5.1

I haven’t re-tested yet, and I wanted to play with the SGW config to try to optimize our replication, but again here is the main question:

Can we stop clients from pulling tombstone revisions, for documents they don’t have locally?
The main issue is that now the clients are pulling all of these tombstone revisions for documents they never even had.

But going backwards, the only reason this is a concern is because these tombstone revisions take a while to sync. Our entire replication process, for about 500K documents (which ostensibly includes tombstones), takes ~10 minutes. But with the users constantly deleting documents and reuploading new versions, we could easily end up with millions of documents/tombstones.
But we only have maybe 200K documents at any given time that aren’t tombstones.

And it’s not clear how long it takes to push either. Because our users could update 100K documents at once, we ideally want that push replication to happen as fast as possible. If the user quits the application while the push is still happening, we currently don’t have any support for handling that scenario.

EDIT:
So actually maybe our SGW config could be optimized as well? For our MAX 300K documents at any given time, if that could sync within 30 seconds, that would be our goal.

Here is an example of a config for one of our databases:

"live-partDB": {
    "server": "http://db:8091",
    "username": "Administrator",
    "password": "password",
    "bucket": "live-partBucket",
    "users": {
        "GUEST": { "disabled": false, "admin_channels": ["*"] }
    },
    "allow_conflicts": false,
    "replications":[
        {
            "changes_feed_limit": "10000"
        }
    ],
    "num_index_replicas": 0
},

Our Bucket settings:
100MB Memory,
Bucket Type: Couchbase,
Replicas Enabled, 1
Replicate View Indexes: enabled
Compression Mode: Passive
Ejection Method: Value Only
Bucket Priority: High (we set them all to high)

Couchbase Server Settings:
Data 2GB
Index 2GB
Search 500MB
Analytics 1GB
Eventing 500MB

I think everything else is default

We just want to sync everything as fast as possible, but it’s unclear why it seems to take so long to replicate right now. Our server seems strong enough, the thought is maybe our configuration isn’t ideal. We’ll do whatever it takes to replicate 300K documents within 30 seconds
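
To put the goal next to the current numbers, here is a quick calculation using only the figures quoted in this post (about 500K documents in roughly 10 minutes today, 300K documents in 30 seconds as the target). The function name `docs_per_second` is made up for illustration, and the current figure is an approximation since the 10 minutes was itself an estimate.

```python
def docs_per_second(doc_count: int, seconds: float) -> float:
    """Average replication throughput in documents per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return doc_count / seconds


if __name__ == "__main__":
    current = docs_per_second(500_000, 10 * 60)  # ~10 minutes today
    target = docs_per_second(300_000, 30)        # 30-second goal
    print(f"current: about {current:.0f} docs/s")
    print(f"target:  {target:.0f} docs/s")
    print(f"speedup needed: about {target / current:.0f}x")
```

That works out to roughly 833 documents per second today against a target of 10,000 per second, so the goal implies about a 12x improvement in throughput.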


#7

There are many reasons why replication could potentially be slow in your deployment, and it’s a tough one to answer.
But first, you can do some things to optimize the volume of data that gets replicated. You mention that you have “users” in your system but you are clearly not separating documents by user. You have a single GUEST account (which, BTW, is typically only used in dev/testing and is not recommended in prod).

Is the expectation that all documents have to get to all users? If not, I’d recommend the use of channels to logically split up documents and to deliver only the documents relevant to the user. (A user can be granted access to one or more channels and, by extension, they get access to documents in those channels.)

So I’d suggest that you begin with leveraging channels to filter the documents being replicated to each client.
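
The channel idea above can be pictured with a toy model. This is purely conceptual and is not Sync Gateway's API: in a real deployment, documents are assigned to channels in the sync function and users are granted channels through their user configuration. The function name `visible_docs`, the channel names, and the document IDs are all invented for illustration.

```python
from typing import Dict, List, Set


def visible_docs(docs: Dict[str, Set[str]], granted: Set[str]) -> List[str]:
    """Return IDs of documents that share at least one channel with the user."""
    return sorted(
        doc_id for doc_id, channels in docs.items() if channels & granted
    )


if __name__ == "__main__":
    # Hypothetical documents mapped to the channels they were assigned to.
    docs = {
        "part::1": {"warehouse-a"},
        "part::2": {"warehouse-b"},
        "part::3": {"warehouse-a", "warehouse-b"},
    }
    # A user granted only "warehouse-a" would replicate a subset of documents.
    print(visible_docs(docs, {"warehouse-a"}))
```

The smaller the set of channels a client is granted, the fewer documents (and tombstones) it has to pull.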

If the replication is still slow, then look into your sync gateway node

  • CPU utilization
  • memory utilization

Check network latency

There is some monitoring of the sync gateway node that is possible and that will give you insights into what may be going on.

Also, unrelated to above, I’d recommend that you upgrade to latest CBL 2.1 version of Android.


#8

Correct, there is no concept of a user for us, because everybody will share the same data.

Our “users” document is simply a username we pull to track app usage.

Also we are on iOS CBL 2.1.1

For the SG Node, if I want to see CPU/Memory utilization, do I see that in the Couchbase Server portal? Or the SG Admin portal?