Sync Gateway _changes feed does not return all documents, sometimes


#1

Hi everyone, we are very new to Couchbase and Sync Gateway. We are using this technology in a project because we think it’s exactly what we need. But we are having some trouble, and we don’t know if this is the normal behaviour or if we are doing something wrong.

First of all, our testing environment:

We don’t have many resources at the moment, so we are doing development on a Vagrant virtual machine that runs both Couchbase and Sync Gateway. It has 4 GB of RAM and 4 cores shared between the two, not each. This could be an issue for big datasets, but we are working with fewer than 3000 items in the bucket right now.

Some users create delta objects (with a custom type T) that we use just once, but we keep them in the bucket because they grant a user and a role access to certain channels. The sync function also makes a channel call on these objects to route them to another channel (let’s say A) that is not present in the delta object’s channels attribute on creation. All these calls use port 4984.
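To make the setup concrete, here is a minimal sketch of a sync function along the lines described above. It is not our actual code: the field names `user`, `role` and `grant_channels` are hypothetical, and "T" and "A" are just the placeholder names used in this post. `channel()` and `access()` are the helpers that Sync Gateway makes available inside the sync function.

```javascript
// Sketch only: doc.user, doc.role and doc.grant_channels are hypothetical fields.
function sync(doc, oldDoc) {
  // Route the document to the channels it declares on creation.
  channel(doc.channels);

  if (doc.type === "T") {
    // Also route every delta to channel "A", even though "A"
    // is not listed in doc.channels.
    channel("A");

    // Grant the user and the role access to the channels named in the delta.
    if (doc.user) access(doc.user, doc.grant_channels);
    if (doc.role) access("role:" + doc.role, doc.grant_channels);
  }
}
```

In a real configuration this would be registered as the database's sync function; it is given a name here only so it can be read on its own.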

We use a Node.js process built on the “follow” project from GitHub. This listener watches for changes on the “A” channel, so every time a delta object with type T is created, the listener should retrieve it. The listener uses port 4985 on localhost without authentication. That is not happening: sometimes some of these deltas are not returned by the _changes feed, which leads to errors, because we use these deltas as events.
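For reference, the request the listener ends up making looks roughly like the URL built below. This is a sketch, not the listener itself: the database name `db`, the default feed type and the helper name `changesUrl` are assumptions, while `filter=sync_gateway/bychannel` with a `channels` parameter is the usual way to restrict a Sync Gateway changes feed to specific channels.

```javascript
// Build a _changes URL for the admin port, restricted to one or more channels.
// Host, port, database name "db" and channel "A" are assumptions for illustration.
function changesUrl({
  base = "http://localhost:4985",
  db = "db",
  channels = ["A"],
  since = 0,
  feed = "longpoll",
} = {}) {
  const url = new URL(`/${encodeURIComponent(db)}/_changes`, base);
  url.searchParams.set("filter", "sync_gateway/bychannel");
  url.searchParams.set("channels", channels.join(","));
  url.searchParams.set("since", String(since));
  url.searchParams.set("feed", feed);
  return url.toString();
}

console.log(changesUrl({ since: 120 }));
```

Requesting the same URL by hand (for example with curl) at the moment a delta goes missing would show whether the gap is in the feed itself or in the listener.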

If a new delta object is created (D2) after the missing one (D1), then both are pulled by the replicator.

We have also tested this by giving every user access to the “A” channel, to check whether the SDK pull replicator gets these changes from the feed (for all the other users), but it does not. We did this to find out whether the problem was the “follow” listener, but the listener seems to be working fine.

We think this may be related to the channel view cache: maybe that object (D1) does not trigger the view cache refresh, so the channel A view is not updated correctly. Then, when D2 is created, it triggers this “refresh” and both are pulled.

I should add that D1’s sequence number is greater than the last change pulled by the follow process. On its first run this process pulls changes since “now”, and afterwards it keeps the last sequence number returned by the _changes feed. This number is logged to the console and stored in a file (to be used if the server or the Node.js process fails and restarts), and it is lower than D1’s sequence number.
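The checkpoint handling described above can be sketched as follows. The helper names `loadSince` and `saveSince` and the file layout are hypothetical, not taken from our code. The sequence is kept as a string on the assumption that it may not always be a plain integer.

```javascript
const fs = require("fs");

// Read the last stored sequence, or fall back to a default on the first run.
function loadSince(file, fallback = "now") {
  try {
    const text = fs.readFileSync(file, "utf8").trim();
    return text === "" ? fallback : text;
  } catch (err) {
    if (err.code === "ENOENT") return fallback; // no checkpoint yet
    throw err;
  }
}

// Persist the sequence by writing a temp file and renaming it over the
// checkpoint, so a crash mid-write does not leave a truncated file.
function saveSince(file, seq) {
  const tmp = file + ".tmp";
  fs.writeFileSync(tmp, String(seq));
  fs.renameSync(tmp, file);
}
```

With this shape, `saveSince` would be called only after a change has been fully processed, so that a restart resumes from a sequence at or below the last handled event rather than skipping one.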

We don’t know where the problem is, or whether it is a Sync Gateway or a Couchbase problem, so please forgive me if this is not the right tag.


"_changes" feed not 100% reliable
#2

A few questions to help clarify the scenario:

  1. From Sync Gateway’s perspective, I’m assuming that a ‘delta object’ is just a document. Let me know if the term ‘delta object’ is actually signifying something more.
  2. Are you able to share the Sync Gateway logs (in a gist) covering the time range where D1 is written (and isn’t replicated), and subsequently when D2 is written (when D1 and D2 get replicated)?

#3

Hi, thanks for the fast response.

  1. Yes, a delta object is just a document that will never be updated. For us it is used as an event and has meaning only on creation. We use it in the sync function to give access to channels, and in the Node.js process to do other things that the sync function can’t do by itself.
  2. Yes, but if you don’t mind, could I PM you with that information?

One more question, please: I’ve seen that these objects have a new “access” attribute in the “_sync” metadata after they are created and pass through the sync function, which has an “access” call in it. Am I right to think that Sync Gateway uses this new attribute as an ACL? If so, these objects should not be removed, because if they are, we will lose that user/role to channel relation.

We were planning to remove these delta objects in the future, giving them a one-month lifetime. But if this ACL metadata is required for the users and roles to get access to the channels they need, then we can’t remove them.


#4

PM is perfect - thanks.

For the second question, you’re correct: access grants made by a document are only valid for that revision of the document. When a new revision is created (or the document is tombstoned), the access grants made by the previous revision no longer apply.


#5

Hi, is there any news?

Were you able to check the private message?

Thanks :)


#6

Hello again,

Has there been any progress in assessing the situation? Do I need to send more files? What kind of tests should I do?

It’s been a long time and we still do not know what might be going on.

I would greatly appreciate some indication.


#7

I still haven’t been able to create a minimum viable demonstration of the phenomenon, but I wonder if this is similar to the issue I initially posted about here: Changes feed peculiarity/differences?


#8

Sounds similar, but not 100% the same. In your case it seems your client A on 85 gets the document, but client B does not, when client C writes the document. I don’t know if your document is being replicated through the same channel that client B should be using.

In our case none of the users subscribed to channel A get document D1 until D2 is written, which seems to be some kind of issue with permissions or the cache. This happens sometimes, not every time.

Also, in our case the D1 and D2 documents are assigned to channel A at the sync function level, using the “channel” function.

We think this is related to the channel cache, because after a sync_gateway service restart the missing documents are sent to the channel correctly (we tested this when D1 was missing from _changes and no D2 had been written after it).
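One way to capture evidence of the gap before restarting the service is to compare the IDs we know were written against a one-shot _changes response for channel A. The helper below is only a sketch with a hypothetical name; it assumes the usual response shape, a `results` array whose rows carry an `id` field.

```javascript
// Return the IDs from expectedIds that do not appear in a _changes response body.
function missingFromChanges(expectedIds, changesBody) {
  const seen = new Set((changesBody.results || []).map((row) => row.id));
  return expectedIds.filter((id) => !seen.has(id));
}

// Example with a hand-written response in which D1 is absent.
const body = {
  results: [{ seq: 5, id: "D2", changes: [{ rev: "1-abc" }] }],
  last_seq: "5",
};
console.log(missingFromChanges(["D1", "D2"], body)); // [ 'D1' ]
```

Running the same comparison before and after the restart would show whether the missing documents appear in the feed only once the service has been restarted.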