Sync Gateway and change of import_filter

I have changed the import_filter function to include some more documents that I previously excluded.

After the change I did these steps:

  1. Restart the sync gateway task
  2. Take it offline
  3. Run resync
  4. Bring it back online

However, after the next sync with the app, the “new” documents still don’t get synced to the app. What further steps do I need to carry out to make the change effective?

Environment:
Couchbase Sync Gateway/2.7.1(5;a08bf70) CE
Community Edition 6.0.0 build 1693
.NET NuGet 2.7.0

Import processing and import filters are separate from the resync operation (which only re-applies the Sync Function to already mobile-aware documents).

Sync Gateway uses a DCP feed to import and cache documents. This feed is a stream of mutation events, and it is checkpointed per vbucket so that, on startup, we avoid reprocessing data we’ve already seen.

One way you could cause a re-import is to delete the DCP checkpoint docs in the bucket. This causes Sync Gateway to reprocess all data in the bucket, so your new import filter rules are applied. Documents that have already been imported will be skipped.

There’s one rather large caveat with this: a change to the import filter that excludes documents compared to the previous filter won’t have any effect. This approach only works for including new documents, given that already imported documents are skipped. If you wish to make a change that excludes documents, you’ll have to manually remove the _sync xattr from the affected documents in Couchbase Server.

To remove Sync Gateway’s DCP checkpoints:

  1. Stop all Sync Gateway nodes.
  2. There will be up to 1024 documents whose keys start with _sync:dcp_ck: (e.g. _sync:dcp_ck:561). These are the per-vbucket checkpoints.
    Delete them from the bucket.
  3. Also remove the _sync:dcp_backfill document.
  4. Start Sync Gateway back up.
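
As a rough illustration of steps 2 and 3, the keys to delete could be generated like this. It is only a sketch: it assumes vbuckets are numbered 0 to 1023 (the "up to 1024" above), and `removeDoc` is a hypothetical callback that you would back with whatever delete-by-key call your SDK or tooling provides. It is not part of Sync Gateway.

```javascript
// Key names taken from the steps above.
const CHECKPOINT_PREFIX = "_sync:dcp_ck:";
const BACKFILL_KEY = "_sync:dcp_backfill";

// Build the list of per-vbucket checkpoint keys plus the backfill key.
// Assumption: vbuckets are numbered 0..vbucketCount-1.
function checkpointKeys(vbucketCount = 1024) {
  const keys = [];
  for (let vb = 0; vb < vbucketCount; vb++) {
    keys.push(CHECKPOINT_PREFIX + vb);
  }
  keys.push(BACKFILL_KEY);
  return keys;
}

// removeDoc(key) is hypothetical: it should resolve to true if the
// document was deleted and false if it did not exist. Some checkpoint
// docs may be missing, since there are only "up to" 1024 of them.
async function deleteCheckpoints(removeDoc, vbucketCount = 1024) {
  let removed = 0;
  for (const key of checkpointKeys(vbucketCount)) {
    if (await removeDoc(key)) {
      removed++;
    }
  }
  return removed;
}
```

Run something like this only while all Sync Gateway nodes are stopped (step 1), and start them again afterwards (step 4).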

Thanks for your reply.

So if I understand you correctly, removing the DCP checkpoints doesn’t really affect the existing data (“…For documents already imported, these will be skipped.”). Is that right? So I need to remove the _sync xattr from the affected documents in addition to removing the DCP checkpoints?

If that is correct, then how can I remove the xattr from the affected documents? What exactly does “manually” mean? Can I use a N1QL query to remove them? I don’t seem to be able to edit and remove them manually, and that wouldn’t be practical anyway as it affects 5,000-6,000 docs.

Looking at some data I guess it is the "xattrs:{_sync:{...." (the other way round) attributes? Like they show up in the meta data:

  "xattrs": {
    "_sync": {
      "rev": "2-6f9ef809e4ff939d27afcd309ff418f2",

I think, given your original statement, there’s no need to remove existing xattrs. I was just including the caveat in my post for completeness, in case you were to update your import filter to exclude documents.

In your case, as long as you’re just adding new documents to import, you can simply remove the checkpoints and restart Sync Gateway.

Sounds good.

So just to be sure. This is the code I removed from my import filter:

if ((doc.type == 'FishingTrip' || doc.type == 'Catch') && typeof doc.userkey === 'undefined') {
    return false;
}

I removed it because I want all of the docs of type ‘FishingTrip’ and ‘Catch’ to be available in the app. So if I understand you correctly, this will import more documents and as such be on the “safe path”.
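
To double-check that the change only includes more documents, the old and new filters can be compared on a few sample docs. This is just a sketch: it assumes the rest of the import filter simply returns true, and the sample documents (including the 'Species' type) are made up for illustration.

```javascript
// Old filter: skips FishingTrip/Catch docs that have no userkey.
// Assumption: everything else is imported.
function oldImportFilter(doc) {
  if ((doc.type == 'FishingTrip' || doc.type == 'Catch') && typeof doc.userkey === 'undefined') {
    return false;
  }
  return true;
}

// New filter: the check above has been removed.
function newImportFilter(doc) {
  return true;
}

// Hypothetical sample documents.
const sampleDocs = [
  { id: 'trip1', type: 'FishingTrip', userkey: 'u1' },
  { id: 'trip2', type: 'FishingTrip' },
  { id: 'catch1', type: 'Catch' },
  { id: 'other1', type: 'Species' },
];

// Docs the old filter skipped but the new one imports.
const newlyImported = sampleDocs
  .filter((d) => !oldImportFilter(d) && newImportFilter(d))
  .map((d) => d.id);

// Docs the old filter imported but the new one would skip.
// This must stay empty for the checkpoint-removal approach to be enough.
const newlyExcluded = sampleDocs
  .filter((d) => oldImportFilter(d) && !newImportFilter(d))
  .map((d) => d.id);

console.log('newly imported:', newlyImported);
console.log('newly excluded:', newlyExcluded);
```

If the second list were ever non-empty, the caveat about excluding documents would apply.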

Though not relevant for my situation, I’m still a little curious as to how one would be able to remove the attributes…? A quick search did not give any good answers :innocent:

Yep that seems fine to me.

Honestly, I’m not sure off the top of my head. I’d have to search the same documentation you are.

In general, removing previously mobile-aware documents is not a use case we really support, because mobile clients can just as easily push their documents back up to Sync Gateway again as soon as they see Sync Gateway does not have them. You’d need a way of purging the documents from clients too, or include a new rule to reject them in the sync function in order for it to work well. It’s clunky.
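
For what it’s worth, a reject rule in the sync function might look roughly like the sketch below. It reuses the userkey condition from the import filter earlier in this thread as an example. The `channel` stub and the channel naming are stand-ins so the snippet runs outside Sync Gateway, not a recommendation for how to lay out channels.

```javascript
// Stand-in for Sync Gateway's built-in channel() function, so the
// sketch can run on its own. Inside Sync Gateway you would not define this.
const assignedChannels = [];
function channel(name) {
  assignedChannels.push(name);
}

// Sketch of a sync function that rejects FishingTrip/Catch docs pushed
// without a userkey.
function syncFn(doc, oldDoc) {
  // Let deletions through; tombstones may not carry the body fields.
  if (doc._deleted) {
    return;
  }
  if ((doc.type === 'FishingTrip' || doc.type === 'Catch') && typeof doc.userkey === 'undefined') {
    // Throwing an object with a "forbidden" property rejects the write.
    throw { forbidden: 'userkey is required' };
  }
  // Hypothetical channel naming, for illustration only.
  channel('user.' + doc.userkey);
}
```

With a rule like this, a client pushing one of the excluded documents back up would have the write rejected rather than silently re-creating it.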


Ok, makes sense.

My use case is slightly different at the moment as I haven’t gone live with the sync based solution yet. So I can just wipe the data on the mobile during the tests :innocent:

And should something like this happen in production, the data would never travel back, as in my case the documents would all be required to have a userkey. And I can see that I could just take them out of the channel that the clients read…

Still trying to get my head around all the aspects of the sync capabilities… :innocent:

Thanks for the enlightenment!