Can big attachments slow document replication?


#1

Greetings,

We are working on an Android application based on Couchbase Lite and Sync Gateway. We work with three document types (sampleset, sample and file). The application lets users take pictures, which are stored as attachments in documents of type file (one attachment per document).

Our client has asked us to increase the picture quality, which will increase the attachment size as well. This leads us to the following question: could big attachments slow document replication?

We want to give the replication of regular documents (sampleset and sample) higher priority than the replication of file documents. File documents are not shared across users and are just pushed to the server. We’ve accomplished this by assigning file documents to a channel that only the user who took the picture can access. Regular documents, on the other hand, are shared by all users, and it is critical that they are replicated to everyone who is online as soon as they are created.

We are afraid that increasing the attachment size could affect the replication of regular documents, which could lead to users seeing inconsistent data. For instance, imagine that an offline user creates several regular documents and also file documents. Once the user regains a connection, all the documents are pushed to the server (in a random order?). If the attachments are replicated first, the regular documents are delayed, which affects what other users see.

That said, is there any kind of implicit prioritization when replicating documents/attachments? If not, is there a way of assigning more priority to some documents than others (for example, having separate Couchbase Lite instances for each document type)?

Thanks in advance.


#2

There isn’t any prioritization currently. Documents are transferred in order of increasing modification time (basically). After a document is transferred, its attachments are transferred one at a time if the destination side doesn’t already have them. But lots of docs are being transferred in parallel; the socket is multiplexed.

The way you can control this is by assigning channels, and replicating first one set of channels and then another. If you want a document to have higher priority than its attachments, you’d have to split the attachments out of the document by creating a secondary document with just the blob properties in it, which would be in the lower-priority channel. Then the main document has the docID of the secondary document, to tie them together.
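A minimal sketch of that split, in plain Java with maps standing in for documents. It does not call Couchbase Lite. The channel names, the `channels` and `type` properties and the `fileDocId` property are all assumptions for illustration; how a document actually lands in a channel depends on your Sync Gateway sync function.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class SplitDocSketch {
    // Hypothetical channel names; use whatever your sync function expects.
    static final String HIGH = "channel-high-priority";
    static final String LOW = "channel-low-priority";

    /** Secondary document: exists only to carry the attachment (blob). */
    static Map<String, Object> fileDoc(String id) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("_id", id);
        doc.put("type", "file");
        doc.put("channels", Collections.singletonList(LOW));
        // The picture itself would be attached through the Couchbase Lite API.
        return doc;
    }

    /** Main document: small, and points at the secondary document by ID. */
    static Map<String, Object> sampleDoc(String id, String fileDocId) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("_id", id);
        doc.put("type", "sample");
        doc.put("channels", Collections.singletonList(HIGH));
        doc.put("fileDocId", fileDocId); // hypothetical property name
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> file = fileDoc("file-1");
        Map<String, Object> sample = sampleDoc("sample-1", (String) file.get("_id"));
        System.out.println(sample);
        System.out.println(file);
    }
}
```

The main document stays small because it holds only a reference, so it can be replicated without waiting for the attachment.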


#3

Thanks for your response.

Actually, the file document is exactly what you suggest: it is just a document that contains the attachment, and it should be in the low-priority channel.

The documents that we want to replicate first (high priority) are the sample and sampleset documents. So, as you say, the two groups should be placed in separate channels, as follows:

channel-high-priority: sampleset, sample
channel-low-priority: file

Now, how do we determine which channel should replicate first? Is it something that needs to be set in Sync Gateway (through the sync function), or is it something in Couchbase Lite? (FYI: we are using continuous replication.)

Thanks again for your feedback!


#4

Start a one-shot replication that’s filtered to the first channel.
Wait for it to stop.
Then start another replication filtered to the other channel(s).
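The three steps above can be sketched as a chain in which each replication starts only after the previous one has stopped. `OneShotReplication` is a hypothetical interface, not a Couchbase Lite type; it stands in for whatever starts a filtered one-shot replication in your version of the library and signals when it has stopped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class SequentialReplicationSketch {
    /** Hypothetical stand-in: runs a one-shot replication filtered to the
     *  given channels; the future completes when the replication stops. */
    interface OneShotReplication {
        CompletableFuture<Void> run(List<String> channels);
    }

    /** Runs one replication per channel group, strictly in the given order. */
    static CompletableFuture<Void> replicateInOrder(
            OneShotReplication replication, List<List<String>> channelGroups) {
        CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
        for (List<String> group : channelGroups) {
            // thenCompose defers the next start until the previous future completes.
            chain = chain.thenCompose(ignored -> replication.run(group));
        }
        return chain;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Fake replication that only records the order in which it was started.
        OneShotReplication fake = channels -> {
            log.add("replicated " + channels);
            return CompletableFuture.completedFuture(null);
        };
        replicateInOrder(fake, Arrays.asList(
                Collections.singletonList("channel-high-priority"),
                Collections.singletonList("channel-low-priority"))).join();
        System.out.println(log);
    }
}
```

In a real application the implementation of `run` would start the replication and complete the future from the library's change or status callback once the replication reports that it has stopped.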


#5

We resolved this by implementing two replicators, one for the file documents (bigger documents, low priority) and another one for the rest of the documents (smaller documents, high priority), and set both of them up as continuous replications. We ran several tests, and in all of them we could see that the smaller (high-priority) documents are mostly replicated before the bigger (low-priority) ones, which is exactly what we were trying to accomplish.
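For anyone setting up something similar, here is a rough sketch of how the documents can be divided between two replicators so that each document is handled by exactly one of them. It is plain Java and does not call Couchbase Lite: the predicates are hypothetical stand-ins for whatever filter each replicator is configured with, and the `type` property is assumed to hold the document type.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Predicate;

final class TwoReplicatorFilters {
    // Assumes every document carries a "type" property.
    static final Predicate<Map<String, Object>> LOW_PRIORITY =
            doc -> "file".equals(doc.get("type"));

    // Complement of the filter above: everything that is not a file document.
    static final Predicate<Map<String, Object>> HIGH_PRIORITY = LOW_PRIORITY.negate();

    public static void main(String[] args) {
        Map<String, Object> sample = Collections.singletonMap("type", "sample");
        Map<String, Object> file = Collections.singletonMap("type", "file");
        System.out.println("sample is high priority: " + HIGH_PRIORITY.test(sample));
        System.out.println("file is low priority: " + LOW_PRIORITY.test(file));
    }
}
```

Defining one filter as the negation of the other avoids a document being skipped by both replicators or pushed by both.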

I think we can close this now.

Thank you so much for the help.