Sync Gateway and change of import_filter

I have changed the import_filter function to include some more documents that I previously excluded.

After the change I did these steps:

  1. Restart the sync gateway task
  2. Take it offline
  3. Run resync
  4. Bring it back online

However, after the next sync with the app the “new” documents still don’t get sync’ed to the app… What further steps do I need to carry out to make the change effective?

Environment:
Couchbase Sync Gateway/2.7.1(5;a08bf70) CE
Community Edition 6.0.0 build 1693
.NET NuGet 2.7.0

Import processing and import filters are separate from the resync operation (which only re-applies the Sync Function to already mobile-aware documents).

Sync Gateway uses a DCP feed to import and cache documents. This feed is a stream of mutation events, and is checkpointed per vbucket to avoid reprocessing data we’ve already seen on startup.

One way you could cause a re-import is to delete the DCP checkpoint docs in the bucket, which will cause Sync Gateway to reprocess all data in the bucket, and your new import filter rules will be applied. For documents already imported, these will be skipped.

There’s one rather large caveat with this: a change to the import filter that excludes documents compared to the previous filter won’t have any effect. This will only work for including new documents, given that imported documents are skipped. If you wish to make a change to exclude documents, you’ll have to manually remove the _sync xattr from the affected documents in Couchbase Server.

To remove Sync Gateway’s DCP checkpoints:

  1. Stop all Sync Gateway nodes.
  2. There will be up to 1024 documents whose keys start with _sync:dcp_ck:. These are the per-vbucket checkpoints (e.g. _sync:dcp_ck:561).
    Delete them from the bucket.
  3. Also remove the _sync:dcp_backfill document.
  4. Start Sync Gateway back up.
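
As a rough sketch of steps 2 and 3, the checkpoint keys can be enumerated up front and then deleted with whatever tool you prefer. The snippet only builds the list of keys and does not talk to Couchbase Server. It assumes vbuckets are numbered from 0 and that the bucket has the full 1024 (the "up to 1024" above); the helper name checkpointKeys is made up for illustration.

```javascript
// Sketch only: builds the list of DCP checkpoint document keys described above.
// Assumption: 1024 vbuckets, numbered 0..1023 (fewer on some platforms).
const VBUCKET_COUNT = 1024;

// Hypothetical helper, not part of Sync Gateway or any SDK.
function checkpointKeys(vbucketCount = VBUCKET_COUNT) {
  const keys = [];
  for (let vb = 0; vb < vbucketCount; vb++) {
    keys.push('_sync:dcp_ck:' + vb); // per-vbucket checkpoint, e.g. _sync:dcp_ck:561
  }
  keys.push('_sync:dcp_backfill');   // the additional backfill document from step 3
  return keys;
}

const keys = checkpointKeys();
console.log(keys.length + ' keys, first: ' + keys[0] + ', last: ' + keys[keys.length - 1]);
```

Some of these keys may not exist in a given bucket, so whatever performs the delete should tolerate "not found" results.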

Thanks for your reply.

So if I understand you correctly then removing the DCP checkpoints doesn’t really affect the existing data (“…For documents already imported, these will be skipped.”). Is that right? So I need to remove the _sync xattr from the affected documents in addition to removing the DCP checkpoints?

If that is correct then how can I remove the xattr from the affected documents? What exactly does “manually” mean? Can I use a N1QL query to remove them? I don’t seem to be able to edit and remove them manually, and that wouldn’t be practical anyway as it affects 5,000-6,000 docs.

Looking at some data I guess it is the "xattrs:{_sync:{...." (the other way round) attributes? Like they show up in the meta data:

  "xattrs": {
    "_sync": {
      "rev": "2-6f9ef809e4ff939d27afcd309ff418f2",

I think, given your original statement, there’s no need to remove existing xattrs. I was just including the caveat in my post for completeness, in case you were to update your import filter to exclude documents.

In your case, as long as you’re just adding new documents to import, you can simply remove the checkpoints and restart Sync Gateway.

Sounds good.

So just to be sure. This is the code I removed from my import filter:

if ((doc.type == 'FishingTrip' || doc.type == 'Catch') && typeof doc.userkey === 'undefined') {
    return false;
}

As I want all of the docs of type ‘FishingTrip’ and ‘Catch’ to be available in the app. So if I understand you correctly, this will import more documents and as such be on the “safe path”.
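
To make the "safe path" reasoning concrete, here is a minimal sketch that compares the filter before and after the change on a few made-up documents. Only the removed check comes from this thread; the rest of both filters is unknown, so the sketch simply assumes everything else returns true, and the sample documents and function names are hypothetical.

```javascript
// Previous filter: the removed check, plus an ASSUMED "import everything else".
function oldImportFilter(doc) {
  if ((doc.type == 'FishingTrip' || doc.type == 'Catch') && typeof doc.userkey === 'undefined') {
    return false;
  }
  return true; // assumption: the rest of the real filter is not shown in the thread
}

// New filter: the same ASSUMED remainder, with the check removed.
function newImportFilter(doc) {
  return true;
}

// Hypothetical sample documents.
const samples = [
  { type: 'FishingTrip', key: 'a', userkey: 'u1' },
  { type: 'FishingTrip', key: 'b' }, // externally reported, no userkey
  { type: 'Catch', key: 'c' },       // externally reported, no userkey
  { type: 'User', key: 'u1' },
];

// A change is "include only" when nothing that used to pass is now rejected.
const newlyImported = samples.filter((d) => !oldImportFilter(d) && newImportFilter(d));
const newlyExcluded = samples.filter((d) => oldImportFilter(d) && !newImportFilter(d));

console.log('newly imported:', newlyImported.map((d) => d.key));
console.log('newly excluded:', newlyExcluded.map((d) => d.key));
```

If newlyExcluded stays empty for representative data, the change only adds documents, which is the case where removing the checkpoints is enough.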

Though not relevant for my situation, I’m still a little curious as to how one would be able to remove the attributes…? A quick search did not give any good answers :innocent:

Yep that seems fine to me.

Honestly, I’m not sure off the top of my head; I’d have to search the same documentation you are searching.

In general, removing previously mobile-aware documents is not a use case we really support, because mobile clients can just as easily push their documents back up to Sync Gateway again as soon as they see Sync Gateway does not have it. You’d need a way of purging the documents from clients too, or include a new rule to reject them in the sync function in order for it to work well. It’s clunky.

Ok, makes sense.

My use case is slightly different at the moment as I haven’t gone live with the sync based solution yet. So I can just wipe the data on the mobile during the tests :innocent:

And should something like this happen in production then the data would never travel back as in my case they would all be required to have a userkey. And I can see that I could just take them out of the channel that the clients read…

Still trying to get my head around all the aspects of the sync. capabilities… :innocent:

Thanks for the enlightenment!

I think I need a little more adjustment to my sync function to get these documents out on the mobile.

Perhaps I just need to stop processing in the sync function earlier…?

I have this function:

			function (doc, oldDoc) {
                function _log(t) {
                    // Write to sg_info.log
                    console.log('' + t);
                }
                function _getUserKey(d) {
                    var key = null;
                    if (d) {
                        if (d.type == 'User') {
                            key = d.key;
                        } else {
                            key = d.userkey;
                        }
                    }
                    return key;
                }
            
                if (doc && doc._deleted) {
                    // Doc. deleted -> if public then require
                    if (oldDoc) {
                        var userkey = _getUserKey(oldDoc);
                        _log('delete doc id: ' + doc._id + ', userkey=' + userkey);
                        if (userkey != null) {
                            requireUser(userkey);
                        } else {
                            requireAdmin();
                        }
                    }
                    _log('doc deleted, id: ' + (doc._id || 'no id!') + ', ' + (oldDoc ? ('old key=' + oldDoc.key + ', userkey=' + userkey) : 'no oldDoc'));
                    return;
                }
                _log('doc id: ' + (doc._id || 'no id!') + ', ispublic: ' + doc.ispublic + ', userkey=' + doc.userkey + ', ' + (oldDoc ? (oldDoc._deleted ? 'oldDoc is deleted' : ('old key=' + oldDoc.key + ', oldDoc.userkey=' + oldDoc.userkey + ' update')) : ' creation'));
                // Document type is mandatory
                if (typeof doc.type === 'undefined') {
                    _log('Document type missing: ' + JSON.stringify(doc));
                    throw ({ forbidden: "Document type is required. id=" + doc._id });
                }
                // Document key is mandatory
                if (typeof doc.key === 'undefined') {
                    _log('Document key missing: ' + JSON.stringify(doc));
                    throw ({ forbidden: "Document key is required. id=" + doc._id });
                }
                // Update: Cannot allow change of type or key
                if (oldDoc != null && !oldDoc._deleted) {
                    // Update
                    if (oldDoc.type != doc.type) {
                        throw ({ forbidden: "Can't change doc type" });
                    }
                    if (oldDoc.key != doc.key) {
                        throw ({ forbidden: "Can't change doc key" });
                    }
                }
                // Document sync is disabled (used for type Image - but generic implementation)
                if (doc.issyncdisabled) {
                    throw ({ forbidden: "Sync. disabled for id=" + doc._id });
                }
                // All public docs are available in the app
                if (doc.ispublic) {
                    _log('public, id: ' + (doc._id || 'no id!'));
                    channel('!');
                }
                // All fishing trips and catches from users AND external reporting are available (for stats or quotas)
                if (doc.type == 'FishingTrip' || doc.type == 'Catch') {
                    if (typeof doc.userkey !== 'undefined') {
                        _log('Trip/catch for user: ' + doc.userkey + ', id: ' + (doc._id || 'no id!'));
                    } else {
                        _log('Externally reported trip/catch, id: ' + (doc._id || 'no id!'));
                    }
                    channel('!');
                }
                // All users are available (for stats)
                if (doc.type == 'User' && doc.deleted != true) {
                    _log('User doc, id: ' + (doc._id || 'no id!'));
                    channel('!');
                }

                // Allow anyone to create a Feedback or Observation on the server
                if (oldDoc == null && doc.userkey == null && (doc.type == 'Feedback' || doc.type == 'Observation')) {
                    _log('Created ' + doc.type + ': ' + (doc._id || 'no id!') + ', key: ' + doc.key + ' as anonymous user ');
                    return;
                }

                // Only non-public docs "owned" by user can be created/updated (and replicated)
                var userkey = _getUserKey(doc);
                if (userkey != null) {
                    if (oldDoc != null && ! oldDoc._deleted) {
                        // Update
                        if (oldDoc.userkey && oldDoc.userkey != doc.userkey) {
                            throw ({ forbidden: "Can't change user key" });
                        }
                    }
                    _log('User owned, id: ' + (doc._id || 'no id!') + ', type: ' + doc.type + ', user: ' + userkey);
                    if (doc.type != 'Image') {    // Do not send images TO mobile
                        channel('channel.' + userkey);
                    }
                    access(userkey, 'channel.' + userkey);
                    requireUser(userkey);
                } else if (doc.ispublic) {
                    requireAdmin();
                } else {
                    // Creation/update without user
                    _log('Document type cannot be created without user key: ' + (doc.type === 'Image' ? doc._id : JSON.stringify(doc)));
                    throw ({ forbidden: "This document type cannot be created without user key. id=" + doc._id });
                }
             }

And for a specific document that I want to send out to the mobile I have these loggings:

2020-03-31T08:41:32.222+02:00 [INF] Javascript: Sync doc id: FishingTrip:1A725F5366D5B833C125853C0024C2BA, ispublic: undefined, userkey=undefined,  creation
2020-03-31T08:41:32.222+02:00 [INF] Javascript: Sync Externally reported trip/catch, id: FishingTrip:1A725F5366D5B833C125853C0024C2BA
2020-03-31T08:41:32.222+02:00 [INF] Javascript: Sync Document type cannot be created without user key: {"_deleted":false,"_id":"FishingTrip:1A725F5366D5B833C125853C0024C2BA","_rev":"1-ec336542807417badc34d9164bbd5457","anonymousemail":"ck@xyz.dk","anonymousname":"Christian XYZ","assockey":"3","assoczonekey":"2","catchkeys":["7E495104F4AA0BE4C125853C0024C2BD"],"cloudcover":"0","clubonlykey":"203","clubplacekey":"2C16DE9C74FAA24CC125853C002493DF","date":"2020-03-31T12:00:00+0200","humidity":"48","key":"1A725F5366D5B833C125853C0024C2BA","live":false,"locationlevel1":"9","locationlevel2":"9","locationlevel4":"203","locationtype":"2","month":3,"pressure":"1031","revisioninfo":{"created":"2020-03-31T08:41:32+0200","createdby":"Christian ZXyz/EBCED20047B25F42C1257CB2003360D3/Fangst","modifiedcount":1,"updates":[{"modified":"2020-03-31T08:41:32+0200","modifiedby":"Christian ZXyz/EBCED20047B25F42C1257CB2003360D3/Fangst"}]},"secret":false,"showsocioquestions":false,"statslocation":"9","statspublic":true,"temperature":"6","type":"FishingTrip","weather":"Skyfrit","windbearing":"256","winddirection":"247","windspeed":"4","year":2020,"zerotrip":false}
2020-03-31T08:41:32.222+02:00 [INF] Sync fn rejected doc "FishingTrip:1A725F5366D5B833C125853C0024C2BA" / "" --> 403 This document type cannot be created without user key. id=FishingTrip:1A725F5366D5B833C125853C0024C2BA
2020-03-31T08:41:32.222+02:00 [INF] Import: Error importing doc "FishingTrip:1A725F5366D5B833C125853C0024C2BA": 403 This document type cannot be created without user key. id=FishingTrip:1A725F5366D5B833C125853C0024C2BA

I want to make sure that I never receive a document of this type from the app - which is why I test for the userkey field.

I think what I find difficult is understanding which parts of my sync function are used when documents are being sent out to the app and which parts are used when receiving data from the app.

So far my understanding is that anything that is added to a channel will be available for the app.
And anything I want to update must have a requireUser(…) or requireAdmin() applied to it.

So how do I best obtain this?

  1. Should I stop processing in the sync function by issuing a return; once I have added the document to the channel("!")?
  2. Or should I call requireAdmin() to allow it to update on the mobile?

Thanks in advance for any clarification :slight_smile:

I think this has been clarified before in previous posts, but to reiterate: the sync function does not care about the “direction” in which a document is being sent; it’s simply applied to all mutations of documents originating from anywhere, that is: clients, Sync Gateway, or Couchbase Server (via SG import).
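
A small illustration of that point, as a sketch rather than a recommendation: the same function runs for every write, and what looks like "direction" falls out of two separate decisions, namely who may write (requireUser/requireAdmin) and who may read (channel). In a real config channel(), requireUser() and requireAdmin() are built-ins; here they are passed in as a stub object called api so the function can be run locally. The harness dryRun, the 'Stats' type, the user name and the error messages are all made up, and the stubs assume that writes without a user context (Admin API or import) pass the require checks.

```javascript
// Illustrative sync function; `api` stands in for Sync Gateway's built-ins.
function syncFn(api, doc, oldDoc) {
  if (doc.type == 'Stats') {   // hypothetical server-owned type
    api.requireAdmin();        // write side: only writes without a user context
    api.channel('!');          // read side: routed to the public channel
    return;
  }
  if (doc.type == 'Feedback') {
    // No require*() call: any client may write it.
    // No channel() call: it is stored but not routed to any channel.
    return;
  }
  throw { forbidden: 'Unknown document type' };
}

// Hypothetical local harness. user === null models the Admin API or import.
function dryRun(fn, doc, oldDoc, user) {
  const out = { channels: [], rejected: null };
  const api = {
    channel: (name) => { out.channels.push(name); },
    requireAdmin: () => { if (user !== null) throw { forbidden: 'admin only' }; },
    requireUser: (name) => { if (user !== null && user !== name) throw { forbidden: 'wrong user' }; },
  };
  try {
    fn(api, doc, oldDoc);
  } catch (e) {
    out.rejected = e.forbidden || String(e);
    out.channels = []; // a rejected write is not routed anywhere
  }
  return out;
}

console.log(dryRun(syncFn, { type: 'Stats' }, null, null));       // write without user context
console.log(dryRun(syncFn, { type: 'Stats' }, null, 'alice'));    // same doc pushed by a client
console.log(dryRun(syncFn, { type: 'Feedback' }, null, 'alice')); // client-created, not routed
```

Running candidate documents through a harness like this before deploying is one way to avoid testing a sync function by trial and error against a live gateway.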

This sync function logic is really specific to your app, and your data model, so I can’t really provide any specifics on what you should do. I think you’re in a better position than me to determine that. Getting a solid grasp of how the sync function is running will help you to determine what needs to be changed to give you the desired routing behaviour.

I’m fully aware of the requirement for understanding the application needs.

However, I did ask two specific questions of a more general nature… I don’t want you to solve my programming needs - but I do want to know what to do to obtain what I need. This is why I enclosed my sync. function…

I understand that you say that the function is applied to documents originating from anywhere - but I (and probably many others) often have a need to control where documents end up - i.e. some logic related to direction

A couple of examples:

  1. Allow documents to sync to the app - but not allow updates from the app. Updates on the server should be updated on the app (I use this for meta data and statistics calculated on the app).
  2. Allow creation of new documents on the app that are sent to the server - but docs of this type should not be sent to the app if created elsewhere (examples could be “feedback” or some general “observations” that we have)

It is really difficult to build something like this when the only answer is that it applies to “everywhere”… I need some examples on how to control this and implement something like the above. So in the real world “direction” is definitely a parameter in the equation…

Sorry if I’m pedantic - but I have read and re-read your documentation and blogs about this several times and I (obviously) still have not been able to implement this correctly - and it is a little painful to have to “guess and pray” - I don’t like that kind of programming :innocent:

If I have overlooked any examples related to the above requirements then please point me in the direction of them. I know that I have asked some of these questions before, so there may now be articles etc. that better describe this?

In answer to your specific questions/points above:

On whether anything added to a channel is available to the app: it depends what roles/channels the app user has access to, which is going to be specific to your setup. You can have an app user that is only allowed to see documents in channel foo, for example.

On whether anything you want to update needs requireUser(…) or requireAdmin(): again, this isn’t a general requirement, but if you only want authorised users to update a doc in your app, then yes.

Coming back to your specific app’s data rules… Do you need type=FishingTrip and type=Catch publicly accessible to all users? Should it have the ispublic flag set?
Do you care about applying other sync function rules for these doc types? If you don’t then returning early seems fine to me, and avoids it being rejected by the userkey rules you have defined further down in your sync function.

Only adding requireAdmin() isn’t going to work, because you’re still going to be hitting the userkey validation rules (because the document doesn’t have ispublic=true set). Maybe this is the root of the problem?
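
One possible reading of the "return early" option, sketched under assumptions: handle FishingTrip/Catch documents without a userkey in their own branch, call requireAdmin() there so a client cannot push such a document, add the public channel, and return before the userkey rules run. The stubs below stand in for Sync Gateway's built-ins and assume that a write without a user context (import or Admin API) passes requireAdmin(); runAs, the user names and the stub error message are hypothetical, and the last two statements of sync() are only a stand-in for the rest of the real function.

```javascript
// Stubs standing in for Sync Gateway's built-ins (assumed behaviour).
let currentUser = null; // null models import or the Admin API
const assigned = [];
function channel(name) { assigned.push(name); }
function requireAdmin() { if (currentUser !== null) throw { forbidden: 'admin only' }; }

function sync(doc, oldDoc) {
  // Externally reported trips/catches: public, server-originated only.
  if ((doc.type == 'FishingTrip' || doc.type == 'Catch') && typeof doc.userkey === 'undefined') {
    requireAdmin(); // rejects the same document when a client pushes it
    channel('!');
    return;         // skip the userkey rules further down
  }
  // Stand-in for the remaining rules of the real function.
  if (doc.userkey == null) {
    throw { forbidden: 'This document type cannot be created without user key. id=' + doc._id };
  }
  channel('channel.' + doc.userkey);
}

// Hypothetical helper: run sync() as a given user and report the outcome.
function runAs(user, doc) {
  currentUser = user;
  assigned.length = 0;
  try {
    sync(doc, null);
    return { channels: assigned.slice(), rejected: null };
  } catch (e) {
    return { channels: [], rejected: e.forbidden || String(e) };
  }
}

// Trimmed version of the rejected document from the log above.
const trip = { _id: 'FishingTrip:1A725F5366D5B833C125853C0024C2BA', type: 'FishingTrip', key: '1A725F5366D5B833C125853C0024C2BA' };
console.log(runAs(null, trip));       // imported: accepted and routed to '!'
console.log(runAs('someuser', trip)); // pushed by a client: rejected
```

Whether the early branch should also skip the other validation rules is a decision about your data model, as noted above.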


Please remember, this forum doesn’t have any support, or architecture review obligations for community users. I’m happy to find some time to answer small product questions, triage bugs, or help guide people, but my spare time is generally limited, and so getting into app/domain-specific discussions tends to swallow up a lot of that time.

Our enterprise customers have access to dedicated technical support and solutions engineers which can also do a far better job than I can to answer these types of questions. If you have further questions specific to your app, or need any in-depth help with your architecture, I would strongly advise you consider looking into Couchbase Professional Services.

Thanks for giving answers to the specific questions :+1:

As I mentioned, I fully understand that you cannot spend much time on replying here - I really didn’t mean to abuse your time (and thought I tried to explain that before).

I agree that a solution in my case could be to set ispublic on those documents. They weren’t public before, as they were not supposed to be sent to the app earlier on. But now they need to be used for statistics and so I changed the import_filter and sync function to try and reflect that.

I think I got replies to better understand what is going on in the sync function - so I’ll go testing :slight_smile:

Thanks, and good luck with your app!
