Hi team, first of all, thanks and congratulations for the amazing work that Couchbase is. I wonder if it’s possible to configure a single-threaded pusher (the Java pusher is single-threaded, as I saw when I checked the code). The reason is that our users have very poor internet connections, and submitting a lot of documents at once takes so long that it becomes impossible to sync when there are many documents with images (binaries). I checked the docs, especially for Xamarin, but couldn’t find anything. If you have any ideas, please let me know.
Do you mean single thread or single socket?
How would either of those speed up the replication?
Let me explain what’s happening. I have an app for auditing. It has an auditing model with around 50 audit items. Every item has its own JSON document, and the user may take, let’s say, 2 pictures per item. Now I have 50 docs with 100 pics, around 100 MB of data. No problem so far, because every item is a distinct doc, so it’s 50 docs of about 2 MB each. The user finishes auditing and then tries to sync all the data over 3G (yes, I’m in Brazil; internet sucks in most places). The device will now try to sync, sending, let’s say, 10 items at once: around 20 MB at a speed of 128 KB/s, which means every doc syncs at about 13 KB/s and takes around 160 seconds. The problem is that the network/firewall drops the connection after around 100 seconds, so a lot of uploads are lost almost at the end.
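For the record, the arithmetic above can be checked with a quick sketch (all figures are the rough estimates from this post, not measurements from a real device):

```java
// Quick check of the numbers above; these are the post's rough estimates,
// not measurements from a real device.
public class SyncTiming {
    // Effective throughput per document when `concurrentDocs` uploads
    // share one link of `linkSpeedKBps`.
    static double perDocSpeedKBps(double linkSpeedKBps, int concurrentDocs) {
        return linkSpeedKBps / concurrentDocs;
    }

    // Seconds to upload one document of `docSizeKB` at `speedKBps`.
    static double uploadSeconds(double docSizeKB, double speedKBps) {
        return docSizeKB / speedKBps;
    }

    public static void main(String[] args) {
        double perDoc = perDocSpeedKBps(128.0, 10);      // ~12.8 KB/s
        double secs = uploadSeconds(2 * 1024.0, perDoc); // ~160 s per doc
        System.out.printf("%.1f KB/s per doc, ~%.0f s per doc%n", perDoc, secs);
    }
}
```

So each 2 MB document spends roughly 160 seconds in flight, which is longer than the 100-second window before the connection drops.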
If I could define how many concurrent uploads are performed, we could bypass the network problem (unfortunately I cannot change my client’s network config).
Maybe I expressed myself incorrectly the first time. What I’d really like is to control the number of documents uploaded concurrently. I even went as far as placing a Thread.sleep in the CBL sync filter. It works, and I can block other items from being uploaded while syncing is in progress, but this is ugly as hell.
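For reference, a less ugly way to cap concurrency in plain Java would be a counting semaphore. This is not a CBL API; `uploadDoc` is a hypothetical stand-in for whatever actually pushes one document:

```java
import java.util.concurrent.Semaphore;

// Generic sketch of capping concurrent uploads with a Semaphore instead of
// sleeping inside a sync filter. This is NOT a CBL API: `uploadDoc` is a
// hypothetical stand-in for whatever actually pushes one document.
public class ThrottledPusher {
    private final Semaphore slots;

    ThrottledPusher(int maxConcurrentUploads) {
        slots = new Semaphore(maxConcurrentUploads);
    }

    int freeSlots() {
        return slots.availablePermits();
    }

    void push(String docId) throws InterruptedException {
        slots.acquire();          // block until an upload slot is free
        try {
            uploadDoc(docId);     // at most maxConcurrentUploads run at once
        } finally {
            slots.release();      // always give the slot back
        }
    }

    void uploadDoc(String docId) {
        // placeholder for the real network upload
    }

    public static void main(String[] args) throws InterruptedException {
        ThrottledPusher p = new ThrottledPusher(2);
        p.push("doc-1");
        System.out.println("free slots after push: " + p.freeSlots());
    }
}
```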
Thanks for your time and sorry for the long text.
Do you know where this network problem is? Being disconnected every 100 seconds is pretty bad.
There’s no configuration that will control this. You’d have to modify the code.
It’s not a “network problem”; it’s a configuration in place on their network. I’ll check the code for CBL on Xamarin and see if I can come up with a pull request (maybe this would be interesting as a feature). Anyway, I just wanted to know if there’s a way to configure it.
We probably wouldn’t take a PR at this point, since we’re developing 2.0, which is an almost entirely new codebase. Only high-priority fixes are going into 1.x.
(Of course you’re still welcome to work on it yourself!)
Weighing in on this: actually, in .NET there is a way to do this that differs from the other platforms. I don’t remember why I put this in, but it is here. It controls the number of concurrent HTTP requests that are allowed, and it defaults to 8.
Thanks Jens, thanks borrden, I’ll try and let you know.
Hi borrrden, thank you very much for your comments. It works perfectly now. Please try not to abandon this feature in future releases; it’s a life saver when you have a bad internet connection.
Thanks for the support jens.
CBL 2.0 only opens a single socket, but it multiplexes a lot of requests on that socket; that’s important for performance. (We’re not using HTTP anymore, rather a protocol based on WebSockets.)
I’m not clear on your description of the problem. You’ve said both that users have “too bad internet connection” and that “internet sucks at most places”, but then later you said it’s “a configuration in place on their network”. (Also, who is “they”?)
I’m not trying to argue, I just want to understand the problem. Closing active sockets after 100 seconds is a terrible thing to do, especially if throughput is as bad as you say.
It’s a mix of both things.
1 - Bad internet connection makes it slow to upload a lot of documents altogether.
2 - This specific client drops any active connection after 100 seconds. That isn’t enough time to upload all the docs, so the sync starts again. But it starts again with a lot of documents, and the connection is dropped once more. And it continues in this loop forever.
Now, by defining the number of concurrent uploads, we make sure documents are uploaded one by one in a “fast” way; we can even configure 2 concurrent uploads and there will be enough time. You’re right that after 100 seconds the connection will be dropped, but in this case it doesn’t matter, because the documents that started uploading have finished, and only one or two documents will have to retry (because they were in flight when the connection was dropped).
I think it should not be “defining the number of active connections” but rather defining the number of concurrent uploads.
Let me know what you think.
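To make point 2 above concrete, here is a toy model (plain Java, using my rough numbers from this thread, not CBL internals) of why a small concurrency limit survives the 100-second drops while a large one loops forever:

```java
// Toy model of the loop described in point 2. Bandwidth is split evenly
// among the concurrent uploads, documents are pushed in batches, and any
// document still in flight when the connection drops restarts from zero.
// All numbers are the rough figures from this thread, not CBL internals.
public class DropSimulator {
    // Seconds to push `docs` documents of `docKB` each over a link of
    // `linkKBps`, with `n` concurrent uploads and the connection dropped
    // every `dropEvery` seconds. Returns -1 if a single document can never
    // finish inside one window (the forever-loop case).
    static int secondsToFinish(int docs, double docKB, double linkKBps,
                               int n, int dropEvery) {
        double perDocKBps = linkKBps / n;
        int perDocSecs = (int) Math.ceil(docKB / perDocKBps);
        if (perDocSecs > dropEvery) return -1;
        int batchesPerWindow = dropEvery / perDocSecs;
        int docsPerWindow = batchesPerWindow * n;
        int windows = (docs + docsPerWindow - 1) / docsPerWindow;
        return windows * dropEvery;  // pessimistic: counts full windows
    }

    public static void main(String[] args) {
        // 50 docs of 2 MB, 128 KB/s link, connection dropped every 100 s
        System.out.println(secondsToFinish(50, 2048, 128, 10, 100)); // -1
        System.out.println(secondsToFinish(50, 2048, 128, 2, 100));  // 900
    }
}
```

With 10 concurrent uploads no document ever fits inside a 100-second window, so nothing completes; with 2 concurrent uploads each document takes about 32 seconds, and only the docs in flight at each drop are retried.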
What client do you mean? The application? I want to understand who is making the decision to close a socket after 100 seconds. Thanks.
we can even configure 2 concurrent uploads and it will work with enough time.
Not if there’s a document or attachment that’s large enough to take more than 100 seconds to upload. This would never succeed and would keep being retried over and over.
The problem with limiting the number of concurrent transfers to a small N is that if you ever have N very large resources being transferred at the same time, they act as a bottleneck, choking everything else until they complete. We’ve seen this problem in real scenarios. An analogy with Web browsing would be a page where the text and images on the page get stuck halfway through loading, while the 30MB video downloads. (The original NCSA Mosaic browser had this problem; all browsers since Netscape 1.0 have used between 4 and 8 parallel downloads to work around it.)
(We do still have a limit N, but it’s significantly larger than 2.)
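That head-of-line effect can be sketched with a crude model (my own approximation: bandwidth split evenly among active transfers, slots filled FIFO; not replicator code):

```java
// Crude model of the head-of-line problem: bandwidth is split evenly among
// the N active transfers, slots are filled FIFO, and the queue starts with
// `bigCount` large items in front of the small ones. My own approximation,
// not replicator code.
public class HeadOfLine {
    // Seconds until the first small item completes.
    static double firstSmallDoneSecs(int n, int bigCount, double bigKB,
                                     double smallKB, double linkKBps) {
        double share = linkKBps / n;  // KB/s each active transfer gets
        if (bigCount >= n) {
            // every slot is busy with a large item, so the first small
            // item waits for a large one to finish before it even starts
            return bigKB / share + smallKB / share;
        }
        return smallKB / share;       // a free slot is available right away
    }

    public static void main(String[] args) {
        // two 30 MB "videos" queued ahead of 100 KB items, 1 MB/s link
        System.out.printf("N=2: %.1f s%n",
                firstSmallDoneSecs(2, 2, 30 * 1024, 100, 1024));
        System.out.printf("N=8: %.1f s%n",
                firstSmallDoneSecs(8, 2, 30 * 1024, 100, 1024));
    }
}
```

With N=2 the two large items occupy every slot and the small ones wait about a minute; with N=8 they start immediately in a free slot, which is why browsers settled on a handful of parallel downloads.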
By “client” I mean the person buying our app. He’s going to use our software to audit a huge company, and this company has a firewall rule dropping connections every 100 seconds. It’s weird, I know, but there’s no talking them out of it: either we change the way the app works, or they’ll buy from another vendor who will change the way their app works and adapt to their network’s crazy rule.
About the documents: we’ve designed with this in mind. No document will be larger than 1 MB, because the user can only take a predefined number of pics. We’re safe in this regard; we just need to control the number of concurrent uploads so we don’t hit the 100-second network deadline.
How many concurrent uploads do you have in mind for the next version?
OK, I think I understand the situation now. Thanks for explaining.
In the new replication protocol, the peer that’s sending revisions is in control of how fast/parallel they get sent. That means that on a pull replication the server’s in charge, on push the client’s in charge. I just checked the client code and its parallelism is 5. I don’t know what the server uses, but I’m sure it’s also hardcoded.
Your best option is to file an issue against couchbase-lite-core asking for the ability to configure the parallelism of sending revisions. That way we’ll have a reminder of your request.