Best practice for replication start/stop in Couchbase Lite?

Hi

I have a Xamarin Forms application with 100K+ docs replicating between the app on the mobile device and the server. There are not many changes once the replicator has started and is ready, but I find that the first replication takes quite a while to complete.

So far I am checking the events for sleep/resume of the app. If the app goes to sleep I stop the replicator and start it again once it resumes. But then I hit the long startup again…

I’m afraid that it will not have time enough to complete the first replication (or even the subsequent ones after resume) - so my question is if this is the right way to do it?

  1. Is it normal that it takes a while to start the replication? If not, how could I troubleshoot?
  2. Could/should I let the replication keep running when going to sleep? On resume should I then just check if it is still running - and only start it if not? I’m afraid of memory leaks here…
  3. Any other approaches that I should look into?

The app is not in production yet - but running against production servers. Planning on going live today or tomorrow - but had these observations during the final tests - so thought I would ask (did search - but didn’t really find anything conclusive…) :innocent:

Thanks for any thoughts and insight!

/John

Environment:
Community Edition 6.0.0 build 1693
Couchbase Sync Gateway/2.7.2(2;583d2dc) CE
Couchbase Lite (2.7.1)

Just a quick note on the observation. In my replicator I log the events while debugging:

[DbDataStore] PushAndPull Replicator: 107525/107800, activity = Busy
Thread started:  #886
Thread finished:  #886
Thread finished:  #882
Thread finished:  #883
[DbDataStore]   Doc ID: Catch:1542241844731780063F236
[DbDataStore]   Doc ID: Catch:1542241952581780063F236
Thread started:  #889
[DbDataStore]   Doc ID: Catch:1542241766914780063F236
Thread started:  #890
Thread finished:  #889
Thread finished:  #890
Thread started:  #892
Thread finished:  #892
[DbDataStore]   Doc ID: Catch:1542241878729780063F236
Thread started:  #893
[DbDataStore]   Doc ID: Catch:1542241336498780063F236
Thread finished:  #879
Thread started:  #894
Thread finished:  #893
Thread started:  #896
Thread finished:  #894
Thread started:  #897
Thread finished:  #896
Thread started:  #901
Thread finished:  #897
Thread started:  #902
Thread finished:  #901
Thread started:  #904
Thread finished:  #902
Thread started:  #906
Thread finished:  #904
Thread started:  #907
Thread finished:  #906
Thread finished:  #907
Thread started:  #1056
Thread started:  #1058
Thread finished:  #1056
Thread finished:  #1058
Thread started:  #1060
Thread finished:  #1060
Thread started:  #1062
Thread started:  #1064
Thread finished:  #1062
Thread finished:  #1064
Thread started:  #1065
Thread finished:  #1065
Thread started:  #1066
Thread finished:  #1066
Thread started:  #1069
Thread started:  #1070
Thread finished:  #1069
Thread started:  #1071
Thread finished:  #1070
Thread started:  #1072
Thread finished:  #1071
Thread started:  #1073
Thread finished:  #1072
Thread started:  #1074
Thread finished:  #1073
Thread started:  #1076
Thread finished:  #1074
Thread started:  #1079
Thread finished:  #1076
Thread started:  #1080
Thread finished:  #1079
Thread started:  #1081
Thread finished:  #1080
Thread started:  #1082
Thread finished:  #1081
Thread started:  #1087
Thread finished:  #1082
: 
:

and after some time there are just numerous Thread started: and Thread finished: logs (that CB Lite writes - not me) - and the replicator stays “Busy”… is this some kind of indicator of a problem?

When the replicator turns “Idle” I publish some events for my code to check if some data needs to be reloaded. Obviously, it takes a very long time until this happens…

Replicating 100K+ docs does take some time. Did you try enabling the delta sync feature? You mentioned there are not many changes, so delta sync should help you.

https://docs.couchbase.com/sync-gateway/2.5/index.html

Delta sync is an Enterprise Edition-only feature. So no, I haven’t, as this installation is running on the Community Edition.

I actually have a built-in database that is put in place at installation time - so I’ll never have to replicate everything on the device - only when I myself build a new copy of the built-in database :slight_smile:

But still with this setup it seems to take a loooong time the first time (and after resume - there is a new “first time”)…

Out of curiosity, approximately how much data is this? And what device are you running on?

In ideal circumstances — over a LAN, with no other connections to SG, and with CBL running on laptop-class hardware (e.g. a MacBook Pro) — CBL can pull several thousand smallish documents a second. But a lot of factors will slow that down; the primary ones seem to be high server load and slow client storage.

I’m afraid that it will not have time enough to complete the first replication

Replication is incremental. So when you stop the replicator and then restart it, it isn’t starting from scratch. The replicator maintains a persistent “checkpoint” of its current progress, and starts from there.

Could/should I let the replication keep running when going to sleep?

That depends on the OS. On iOS you can’t; the OS will only give you a limited amount of time in the background, on the order of a minute, before killing the process. Android is different, more lenient about background processes, but I don’t know much about that platform.

To put some perspective on things: I recently set up a pretty underpowered Sync Gateway, loaded it with 700k documents, and used an iPhone 6S+ to replicate them (while no other users were connected to Sync Gateway). It took between 15 and 20 minutes to get them all.
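As a rough back-of-the-envelope check on those figures (plain arithmetic, not a benchmark), 700k documents in 15 to 20 minutes works out to roughly 580 to 780 documents per second. The extrapolation to 100K documents at the end is purely hypothetical: it assumes the same hardware, server load, and document sizes, which will not hold in general.

```python
# Back-of-the-envelope throughput from the figures quoted above:
# 700k documents pulled in 15 to 20 minutes.

def docs_per_second(docs: int, minutes: float) -> float:
    """Average pull rate for `docs` documents replicated in `minutes`."""
    return docs / (minutes * 60)

def minutes_needed(docs: int, rate: float) -> float:
    """Minutes to pull `docs` documents at `rate` documents per second."""
    return docs / rate / 60

slow = docs_per_second(700_000, 20)
fast = docs_per_second(700_000, 15)
print(f"{slow:.0f} to {fast:.0f} docs/s")

# Hypothetical extrapolation to a 100K-document initial pull at the same rates
best = minutes_needed(100_000, fast)
worst = minutes_needed(100_000, slow)
print(f"{best:.1f} to {worst:.1f} minutes")
```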

Side note: It’s not CBLite writing those thread started messages either, it’s the Xamarin framework itself.

The fact that you say it “stays busy” is concerning. Does it stay busy indefinitely? It should be always making progress in some shape or form unless there is something preventing it.

The entire database is 116MB (which seems a lot for the number of documents…). That is the size of the entire cblite2 directory. At present I only have one index (on the “type” property). I’ve not really found the best practices for determining whether further indexes are required, but having just one means that indexes are not taking up a lot of the space (I guess?)

When we deliver the app we have a built-in database with all of the data as per the time of building that release. So really only a few changes would replicate (when the release is new). So replication should only run for a relatively short period.

The app can run without having a user logged in and most of the data is available for all users. So when the app starts we do an anonymous replication (only the “!” channel). The built in database is a full anonymous replication. When the user logs in (and some do that “auto” when launching the app - thus skipping the anonymous replication) then we stop the anonymous replication (if started) and restart the replicator with the user logged in. The first user replication is their “channel.userxyz…” channel only - to speed up reception of the user’s private data. Once that has completed (status “Idle”) we stop and start the replicator with the “!” channel and their user channel.

But following your questions I actually made a design change so that I capture the time it takes from going Busy to going Idle - and surface that on a page in the app :slight_smile:

The full replication cycle (idle -> busy -> idle) seems to take 30-45 seconds, which I suppose is Ok? The “user only” replication is very fast (<1 sec), so way faster than what I observed the other day… I suppose that could be due to some external factors, as the production server is at a remote site…
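For anyone wanting to do the same kind of measurement, here is a minimal sketch of the timing logic, written in Python as a language-neutral model rather than as actual Xamarin code. `CycleTimer` and `on_activity` are hypothetical names; the activity strings are the ones that appear in the log output in this thread, and the assumption is that a change listener feeds them in as they arrive.

```python
import time
from typing import Callable, List, Optional

class CycleTimer:
    """Measures how long a replicator stays Busy before it returns to Idle.

    Hypothetical helper: feed it the activity names reported by a
    replicator change listener ("Connecting", "Busy", "Idle", ...).
    """

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._started: Optional[float] = None
        self.durations: List[float] = []

    def on_activity(self, activity: str) -> Optional[float]:
        """Return the cycle duration in seconds when a Busy cycle ends at Idle."""
        now = self._clock()
        if activity == "Busy":
            if self._started is None:       # first Busy event starts the clock
                self._started = now
        elif activity == "Idle":
            if self._started is not None:   # Busy -> Idle completes a cycle
                elapsed = now - self._started
                self._started = None
                self.durations.append(elapsed)
                return elapsed
        elif activity in ("Offline", "Stopped"):
            self._started = None            # abandon an unfinished cycle
        return None

# Usage with a fake clock so the example is deterministic
ticks = iter([10.0, 12.0, 51.0])
timer = CycleTimer(clock=lambda: next(ticks))
timer.on_activity("Busy")           # clock starts at 10.0
timer.on_activity("Busy")           # progress update, same cycle
print(timer.on_activity("Idle"))    # 41.0
```

Repeated Busy events are treated as progress updates within one cycle, so only the first one starts the clock.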

Here is a sample output from replication. First anonymous (and before logging in there was a small network problem), and then user logged in. Sorry, it is a bit lengthy…

  222,1: [DbDataStore] Start replicator: AppUtils.IsLoggedIn =False, firstUserRepl=True
  223,1: [DbDataStore] Start PushAndPull Replicator: Will connect to: wss://fangstjournalen.dtu.dk/_sync/data - user: 'anonymous'
  227,1: [DbDataStore] PushAndPull Replicator: 0/0, activity = Connecting
  244,1: [DbDataStore] PushAndPull Replicator: 0/0, activity = Busy
  281,1: [DbDataStore] PushAndPull Replicator: 0/16, activity = Busy
  427,1: [DbDataStore] PushAndPull Replicator: 0/19, activity = Busy
  916,1: [DbDataStore] PushAndPull Replicator: 0/20, activity = Busy
  7077,1: [DbDataStore] PushAndPull Replicator: 0/21, activity = Busy
  8300,1: [DbDataStore] PushAndPull Replicator: 0/27, activity = Busy
  8367,1: [DbDataStore] PushAndPull Replicator: 0/28, activity = Busy
  8416,1: [DbDataStore] PushAndPull Replicator: 0/30, activity = Busy
  8457,1: [DbDataStore] PushAndPull Replicator: 0/32, activity = Busy
  8504,1: [DbDataStore] PushAndPull Replicator: 0/33, activity = Busy
  8571,1: [DbDataStore] PushAndPull Replicator: 1/36, activity = Busy
  8573,1: [DbDataStore] Pull replication finished receiving 1 documents
  8576,1: [DbDataStore] PushAndPull Replicator: 1/36, activity = Busy
  8578,1: [DbDataStore]   Doc ID: Photo:5e9ffe02d63618173404a7a2 (access removed)
  8617,1: [DbDataStore] PushAndPull Replicator: 1/39, activity = Busy
  8716,1: [DbDataStore] PushAndPull Replicator: 1/41, activity = Busy
  8763,1: [DbDataStore] PushAndPull Replicator: 1/48, activity = Busy
  8823,1: [DbDataStore] PushAndPull Replicator: 1/76, activity = Busy
  8961,1: [DbDataStore] PushAndPull Replicator: 1/82, activity = Busy
  9023,1: [DbDataStore] PushAndPull Replicator: 1/83, activity = Busy
  9344,1: [DbDataStore] PushAndPull Replicator: 1/83, activity = Idle
  9345,1: [DbDataStore] Repl. completed on: 12-06-2020 12:08:56. Duration=00:00:41.4475970, pulls=1, pushes=0
  9422,1: [DbDataStore] PushAndPull Replicator: 1/83, error CouchbaseLiteException (POSIXDomain / 52): Network is unreachable., activity = Busy
  9423,1: [DbDataStore] Error :: Couchbase.Lite.CouchbasePosixException: CouchbaseLiteException (POSIXDomain / 52): Network is unreachable.
  9433,1: [DbDataStore] PushAndPull Replicator: 1/83, error CouchbaseLiteException (POSIXDomain / 52): Network is unreachable., activity = Offline
  9434,1: [DbDataStore] Error :: Couchbase.Lite.CouchbasePosixException: CouchbaseLiteException (POSIXDomain / 52): Network is unreachable.
  9439,1: [DbDataStore] Stopping replicator (going offline/to sleep)
  9440,1: [DbDataStore] Stop Replicator
  9442,1: [DbDataStore] PushAndPull Replicator: 0/0, activity = Stopped
  9525,1: [DbDataStore] Restart replicator after user 'john@dalsgaard-data.dk' (2124DEFEC111BA8FC1257Exxxxxxxx) logged in
  9526,1: [DbDataStore] Start replicator: AppUtils.IsLoggedIn =True, firstUserRepl=True
  9527,1: [DbDataStore] Start PushAndPull Replicator: Will connect to: wss://fangstjournalen.dtu.dk/_sync/data - user: '2124DEFEC111BA8FC1257Exxxxxxxx'
  9529,1: [DbDataStore] PushAndPull Replicator: 0/0, activity = Connecting
  9546,1: [DbDataStore] PushAndPull Replicator: 0/0, activity = Busy
  9576,1: [DbDataStore] PushAndPull Replicator: 0/6, activity = Busy
  9633,1: [DbDataStore] Pull replication finished receiving 6 documents
  9635,1: [DbDataStore] PushAndPull Replicator: 6/6, activity = Busy
  9638,1: [DbDataStore]   Doc ID: Photo:599eb4f105187e137b0e55f7
  9639,1: [DbDataStore]   Doc ID: Observation:6e30bfeffeb646019469fad0c6100c78
  9640,1: [DbDataStore]   Doc ID: Observation:0d696317b5ee40e787d9766f2df5ff9b
  9641,1: [DbDataStore]   Doc ID: Observation:1980227623d648688392896c301221e9 (deletion)
  9642,1: [DbDataStore]   Doc ID: Observation:90458f4003ae492a8b95fcaffeb3ad4b
  9643,1: [DbDataStore]   Doc ID: User:Private:2124DEFEC111BA8FC1257Exxxxxxxx
  9663,1: [DbDataStore] PushAndPull Replicator: 6/6, activity = Idle
  9664,1: [DbDataStore] Repl. completed on: 12-06-2020 12:52:43. Duration=00:00:06.0051690, pulls=1, pushes=0
  9665,1: [DbDataStore] Replication of only user data completed - restart for all data...
  9674,1: [DbDataStore] PushAndPull Replicator: 6/516, activity = Busy
  9730,1: [DbDataStore] Push replication finished sending 2 documents
  9731,1: [DbDataStore]   Doc ID: User:Private:2124DEFEC111BA8FC1257Exxxxxxxx
  9734,1: [DbDataStore] PushAndPull Replicator: 900/900, activity = Busy
  9736,1: [DbDataStore]   Doc ID: ActivityLog:9e4fd02e17b44fce95f453e8b44f43ce
  9737,1: [DbDataStore]   Purged Doc ID: ActivityLog:9e4fd02e17b44fce95f453e8b44f43ce from local database
  9742,1: [DbDataStore] PushAndPull Replicator: 900/900, activity = Idle
  9743,1: [DbDataStore] Repl. completed on: 12-06-2020 12:52:44. Duration=00:00:00.2973090, pulls=0, pushes=2
  9746,1: [DbDataStore] Restart replicator after user 'john@dalsgaard-data.dk' (2124DEFEC111BA8FC1257Exxxxxxxx) logged in
  9747,1: [DbDataStore] Stop Replicator
  9748,1: [DbDataStore] Start replicator: AppUtils.IsLoggedIn =True, firstUserRepl=False
  9751,1: [DbDataStore] Start PushAndPull Replicator: Will connect to: wss://fangstjournalen.dtu.dk/_sync/data - user: '2124DEFEC111BA8FC1257Exxxxxxxx'
  9754,1: [DbDataStore] PushAndPull Replicator: 0/0, activity = Connecting
  9780,1: [DbDataStore] PushAndPull Replicator: 0/0, activity = Busy
  9819,1: [DbDataStore] PushAndPull Replicator: 0/1, activity = Busy
  9883,1: [DbDataStore] Pull replication finished receiving 1 documents
  9886,1: [DbDataStore] PushAndPull Replicator: 1/1, activity = Busy
  9888,1: [DbDataStore]   Doc ID: User:13DBD1DDA37DA6EDC1257FB60062C6B3 (access removed)
  9953,1: [DbDataStore] Pull replication finished receiving 1 documents
  9956,1: [DbDataStore] PushAndPull Replicator: 2/2, activity = Busy
  9957,1: [DbDataStore]   Doc ID: User:19F96F2A44F3A09AC12580520033C1E1 (access removed)
  10135,1: [DbDataStore] PushAndPull Replicator: 2/3, activity = Busy
  10193,1: [DbDataStore] Pull replication finished receiving 1 documents
  10196,1: [DbDataStore] PushAndPull Replicator: 3/4, activity = Busy
  10198,1: [DbDataStore]   Doc ID: User:27107F725E4D537AC125842C006A33FD (access removed)
  10256,1: [DbDataStore] Pull replication finished receiving 1 documents
  10259,1: [DbDataStore] PushAndPull Replicator: 4/4, activity = Busy
  10262,1: [DbDataStore]   Doc ID: User:E4C8BDDC5DC1D844C1257F34004629E0 (access removed)
  10327,1: [DbDataStore] Pull replication finished receiving 1 documents
  10329,1: [DbDataStore] PushAndPull Replicator: 5/5, activity = Busy
  10332,1: [DbDataStore]   Doc ID: User:5588340C9EF20DB0C1258051004946C4 (access removed)
  10406,1: [DbDataStore] Pull replication finished receiving 1 documents
  10411,1: [DbDataStore] PushAndPull Replicator: 6/6, activity = Busy
  10413,1: [DbDataStore]   Doc ID: User:B8A0ED0E45D55427C125829E006C0001 (access removed)
  10483,1: [DbDataStore] Pull replication finished receiving 2 documents
  10486,1: [DbDataStore] PushAndPull Replicator: 8/8, activity = Busy
  10488,1: [DbDataStore]   Doc ID: User:1A8699C7C60CFD12C1258141006C12AD (access removed)
  10491,1: [DbDataStore]   Doc ID: User:BFF4CE231CB09556C125832E00193CE1 (access removed)
  10553,1: [DbDataStore] Pull replication finished receiving 1 documents
  10556,1: [DbDataStore] PushAndPull Replicator: 9/9, activity = Busy
  10558,1: [DbDataStore]   Doc ID: User:A0B70B94D81E8B1BC1257F370031521A (access removed)
  10693,1: [DbDataStore] PushAndPull Replicator: 9/10, activity = Busy
  10759,1: [DbDataStore] Pull replication finished receiving 1 documents
  10762,1: [DbDataStore] PushAndPull Replicator: 10/10, activity = Busy
  10764,1: [DbDataStore]   Doc ID: User:F5D51F8E6AA72157C1257F670066865B (access removed)
  10832,1: [DbDataStore] Pull replication finished receiving 2 documents
  10837,1: [DbDataStore] PushAndPull Replicator: 12/13, activity = Busy
  10838,1: [DbDataStore]   Doc ID: User:9D5AEA94A9291965C1257F1000744793 (access removed)
  10840,1: [DbDataStore]   Doc ID: User:C0929848FE4AEC22C12580DF0059493F (access removed)
  10896,1: [DbDataStore] Pull replication finished receiving 3 documents
  10898,1: [DbDataStore] PushAndPull Replicator: 15/15, activity = Busy
  10902,1: [DbDataStore]   Doc ID: User:50AD4B38C46638BFC1258346005D5886 (access removed)
  10908,1: [DbDataStore]   Doc ID: User:41AA0661C241C70AC12580A70061E668 (access removed)
  10910,1: [DbDataStore]   Doc ID: User:EC0CBDCE63CB1BF1C1257F5F0038A264 (access removed)
  11016,1: [DbDataStore] PushAndPull Replicator: 15/16, activity = Busy
  11078,1: [DbDataStore] Pull replication finished receiving 1 documents
  11081,1: [DbDataStore] PushAndPull Replicator: 16/16, activity = Busy
  11082,1: [DbDataStore]   Doc ID: User:0DEE3D32F242AC30C125848800666090 (access removed)
  11147,1: [DbDataStore] Pull replication finished receiving 1 documents
  11149,1: [DbDataStore]   Doc ID: User:8318C7E8D51E6BA0C125808F0042999E (access removed)
  11152,1: [DbDataStore] PushAndPull Replicator: 17/17, activity = Busy
  11232,1: [DbDataStore] Pull replication finished receiving 1 documents
  11235,1: [DbDataStore] PushAndPull Replicator: 18/18, activity = Busy
  11236,1: [DbDataStore]   Doc ID: User:4E0E05AA9D7BC38EC1257F5E005F6C33 (access removed)
  11418,1: [DbDataStore] PushAndPull Replicator: 18/19, activity = Busy
  11500,1: [DbDataStore] Pull replication finished receiving 3 documents
  11503,1: [DbDataStore] PushAndPull Replicator: 21/21, activity = Busy
  11504,1: [DbDataStore]   Doc ID: User:3060D4FCEBE20DBEC12581CC00453CA4 (access removed)
  11506,1: [DbDataStore]   Doc ID: Photo:5d52a468a8864015c305950d (access removed)
  11507,1: [DbDataStore]   Doc ID: User:22D916804490D371C12584EF0034C437 (access removed)
  11630,1: [DbDataStore] PushAndPull Replicator: 21/22, activity = Busy
  11706,1: [DbDataStore] Pull replication finished receiving 1 documents
  11709,1: [DbDataStore] PushAndPull Replicator: 22/22, activity = Busy
  11711,1: [DbDataStore]   Doc ID: User:6FF42B909049FDEFC12580D00052D80A (access removed)
  11860,1: [DbDataStore] PushAndPull Replicator: 22/40, activity = Busy
  11929,1: [DbDataStore] PushAndPull Replicator: 22/41, activity = Busy
  12420,1: [DbDataStore] PushAndPull Replicator: 22/42, activity = Busy
  18626,1: [DbDataStore] PushAndPull Replicator: 22/43, activity = Busy
  19791,1: [DbDataStore] PushAndPull Replicator: 22/48, activity = Busy
  19894,1: [DbDataStore] PushAndPull Replicator: 22/52, activity = Busy
  19975,1: [DbDataStore] PushAndPull Replicator: 22/54, activity = Busy
  20063,1: [DbDataStore] Pull replication finished receiving 1 documents
  20065,1: [DbDataStore]   Doc ID: Photo:5e9ffe02d63618173404a7a2 (access removed)
  20067,1: [DbDataStore] PushAndPull Replicator: 23/58, activity = Busy
  20164,1: [DbDataStore] PushAndPull Replicator: 23/61, activity = Busy
  20232,1: [DbDataStore] PushAndPull Replicator: 23/98, activity = Busy
  20462,1: [DbDataStore] PushAndPull Replicator: 23/104, activity = Busy
  20548,1: [DbDataStore] PushAndPull Replicator: 23/105, activity = Busy
  20803,1: [DbDataStore] PushAndPull Replicator: 23/105, activity = Idle
  20804,1: [DbDataStore] Repl. completed on: 12-06-2020 12:53:28. Duration=00:00:43.3215690, pulls=17, pushes=0

To be able to follow what is going on (and capture time/count/etc.) I have added a ChangeListener and a DocumentReplicationListener (writing the “Doc ID: …” messages).

Out of curiosity: What could cause the “…(access removed)” messages AFTER a login? The same docs are in the “!” channel for “read only”… I must be doing something that I did not intend to do :grinning::innocent: - but not sure where to look for it.

If I replicate everything from scratch it took around 30 minutes last time (if I remember correctly). But that is not my normal use case. Following the follow-up questions from @jens I did some more research and logging (as you can see above) - and that seems to be in line with what you have if my documents are larger than your test ones.

It is interesting that CBLite is not writing those thread messages, as they seem to follow along with replication… I wonder what it is then? Admittedly, it was just an observation of things happening when I saw the delay.

I didn’t mean that it didn’t change state - just that it took a long time. But it didn’t today (well 30-45 secs. - but nothing like what I saw the other days). So I guess there might have been a “bump” on the way to the hosting center?

I am almost sure it is the threadpool implementation for Tasks. I’ve often questioned why it feels the need to do that but have not arrived at a specific answer…

Thanks again for your feedback and valuable information about how the replicator works, @jens and @borrrden.

I did a little further testing, following the knowledge that the replicator should restart quickly (say after a sleep). With the extra info I added, I can see that restarting the replication on resume took 31 seconds to run (and received 1 doc).

I guess I’m just slightly afraid of this when we go live and get more users. The app is in review right now so that could be any time soon :smiley:

The replicator is very pessimistic in ways, and will only pick up where it left off if both sides of the picture agree about where they left off. If either side disagrees then it must start over but even that is not as serious as replicating to an empty database. There are basically three situations that you might find yourself in:

  1. Replicate to an empty database (slowest)
  2. Checkpoints mismatch (fairly rare situation) and the replication must restart (medium)
  3. Checkpoints match and the replication resumes from the middle (fastest)

Situation 2 is faster than situation 1 because the protocol works in several steps. The relevant ones here are that first the replicator asks for a bunch of documents starting from X (the checkpoint), and the other side sends it batches of lists for review (just the metadata). It then looks locally to see which ones it already has and decides which ones to pursue further. The ones it does not have, it asks for and receives. In situation 2, a whole lot are already present, so it can skip the longest step, which is receiving the actual body data.
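The three situations can be sketched with a toy model. This is not the actual replication protocol implementation: `Change`, `plan_pull`, and the single-revision `local_revs` map are simplifications invented for illustration (a real database keeps a revision history per document). It only shows how the checkpoint limits the metadata that is reviewed, and how local revisions limit the bodies that are fetched.

```python
from typing import Dict, List, NamedTuple

class Change(NamedTuple):
    seq: int       # server sequence number of the change
    doc_id: str
    rev_id: str

def plan_pull(changes: List[Change], checkpoint: int,
              local_revs: Dict[str, str]) -> Dict[str, List[Change]]:
    """Split a change feed into metadata to review and bodies to fetch."""
    # Step 1: the server proposes every change after the checkpoint (metadata only).
    proposed = [c for c in changes if c.seq > checkpoint]
    # Step 2: the client requests bodies only for revisions it does not already have.
    needed = [c for c in proposed if local_revs.get(c.doc_id) != c.rev_id]
    return {"proposed": proposed, "needed": needed}

feed = [Change(1, "a", "1-x"), Change(2, "b", "1-y"), Change(3, "c", "2-z")]
local = {"a": "1-x", "b": "1-y", "c": "1-w"}   # "c" is out of date locally

scenarios = {
    "1. empty database":      plan_pull(feed, 0, {}),
    "2. checkpoint mismatch": plan_pull(feed, 0, local),
    "3. checkpoint match":    plan_pull(feed, 2, local),
}
for name, plan in scenarios.items():
    print(name, len(plan["proposed"]), "proposed,", len(plan["needed"]), "bodies")
```

In this model, situation 2 still reviews the metadata for all three changes but fetches only one body, while situation 3 reviews and fetches just the one change after the checkpoint.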

Sizing your cloud facing components appropriately is also an important task, so be sure to monitor if your servers are having trouble and expand appropriately. Internally for some testing I just haphazardly picked an amazon medium instance only to find out weeks later that it was single core and 3.75GB RAM, which started to look pretty pathetic when any sort of load was placed on it.

Ok. That would probably make sense, as I do as much reading/writing to the database as possible in the background (to try and make Android usable).

Thanks!

Makes sense!

I had a look at the Sync Gateway server when I experienced the slow replication at the beginning of the week, and it wasn’t really doing anything.

I guess I should look at CPU and memory usage for any bottlenecks? Are there other metrics I should consider (assuming those first two are Ok)? I intend to follow that in the near future.

The two Couchbase servers in the cluster have run for a while now. Is there anything on them (other than CPU and memory) that I should look out for when load on the Sync Gateway server increases?

I am not qualified to answer that question unfortunately. Those sorts of questions start to get into the realm of paid support territory as well but a few general things I can say are:

  • The more RAM you have the more things you are able to keep cached to avoid disk fetches
  • Make sure that if you have any load balancers or proxies in between Lite and Sync Gateway that they don’t mess with or otherwise time out web socket connections
  • I guess keep an eye on network usage as well. If you start getting a lot of concurrent users you might be sending out lots of simultaneous traffic.

Good points!

And yes, I also have a proxy in front of the solution. I have verified the recommendations on settings for that - but you never know :wink:

Thanks for the points to keep an eye on.

I appreciate the pointer to paid support. As this customer is a university, they are short of resources for extra ongoing costs and would need to apply externally for any extra funding, since they don’t have any budget available for something like this (it will take a new budget year etc. for them to get there). But we’ll see where they end up if they are happy enough with the solution :slight_smile:

The 30-45 secs. was on iOS…

Just tried on an Android device… first replication took just over 3 minutes… :frowning:

… and on sleep/resume it took half the time (1:30 min.)…

An update received while the replicator was started took just 1 sec., so the overhead of restarting the replicator is much higher than just receiving replication events while started. This observation was what led me to the initial question.

One of our customers raised a similar point a few months ago, and gave us their database to look at, and I found some issues in CBL that were preventing it from reclaiming free space. The fixes will be in 2.8.

If this is a DB you ship inside your app, you can work around this by opening it with the sqlite3 tool and entering VACUUM; at the prompt. Also, it helps if you create this database from scratch every time, instead of using an existing one and replicating new docs to it — that reduces the size of the document revision histories.
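If you would rather script that step as part of building the prebuilt database, the same VACUUM can be issued from Python’s standard `sqlite3` module. This is only a sketch: the `vacuum` helper is a made-up name, and the path in the usage comment assumes the SQLite file is named `db.sqlite3` inside the `.cblite2` directory, so check your own bundle. Run it on a copy, with no app holding the database open.

```python
import os
import sqlite3
from typing import Tuple

def vacuum(path: str) -> Tuple[int, int]:
    """Run VACUUM on a SQLite file and return (bytes_before, bytes_after)."""
    before = os.path.getsize(path)
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
    return before, os.path.getsize(path)

# Usage (assumed path, adjust to your own prebuilt database copy):
#   before, after = vacuum("prebuilt.cblite2/db.sqlite3")
#   print(f"{before} -> {after} bytes")
```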


This does happen when the replicator’s filters (channels or set of docIDs) are different than before. Each filter setting has its own checkpoint. So if you haven’t pulled the “foo” channel before, the server will first send the doc/rev ID of every document in that channel, regardless of when it was modified. That could take a while. But a subsequent replication with that channel will be quick.
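A toy model of that behaviour, to make it concrete. How Couchbase Lite actually derives its checkpoint identifier is not shown in this thread and may well differ; `checkpoint_key` and `CheckpointStore` are hypothetical names, `"user-channel"` stands in for the user’s private channel, and the sequence number is made up. The point is only that each distinct channel set looks up its own checkpoint, so a set that has never been used starts with none.

```python
import hashlib
import json
from typing import Dict, Iterable, Optional

def checkpoint_key(remote_url: str, channels: Iterable[str]) -> str:
    """Derive a stable ID from the replication target and its channel filter."""
    payload = json.dumps({"url": remote_url, "channels": sorted(channels)})
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()

class CheckpointStore:
    """Keeps one saved sequence per (target, filter) combination."""

    def __init__(self) -> None:
        self._seqs: Dict[str, int] = {}

    def load(self, url: str, channels: Iterable[str]) -> Optional[int]:
        return self._seqs.get(checkpoint_key(url, channels))

    def save(self, url: str, channels: Iterable[str], seq: int) -> None:
        self._seqs[checkpoint_key(url, channels)] = seq

URL = "wss://fangstjournalen.dtu.dk/_sync/data"
store = CheckpointStore()
store.save(URL, ["!"], 5000)                      # anonymous replication finished

print(store.load(URL, ["!"]))                     # 5000: resumes from the checkpoint
print(store.load(URL, ["user-channel"]))          # None: never pulled with this filter
print(store.load(URL, ["!", "user-channel"]))     # None: a third, separate checkpoint
```

Under this model, the anonymous, user-only, and combined replications described earlier in the thread would each keep a separate checkpoint, and each would be slow only the first time it runs with that channel set.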
