Sync Gateway internal persistence

Hi All,
I’ve successfully deployed a cluster and a sync gateway with docker compose using the latest CE (6.6 for the server and 2.7 for the gateway). The sync gateway has just a few volumes, which map the configuration file, the shell and the import js function. The server, instead, has a volume mapped for the data, since otherwise removing the container would cause server data loss.

I’m wondering how and when the gateway stores the documents coming from the server:

  1. Does it pull the documents at every startup?
  2. Does it persist them locally on disk?
  3. What happens with a huge bucket mapped to a gateway db if I reboot the gateway every day?

Thanks,
Regards

Dario

Hi Dario,

Sync Gateway will act sort of like a cache in front of Couchbase Server for the documents that get either read or modified by Sync Gateway. For documents not already in Sync Gateway’s memory, it will fetch on-demand from Couchbase Server.

All persistent data written by Sync Gateway is actually written back to the Couchbase Server bucket, so that’s the only data volume you need to worry about in your deployment. Everything else in Sync Gateway (aside from the config and log files) is ephemeral and in-memory only.

To answer your specific questions:

  1. No, Sync Gateway listens to a continuous stream of document mutations from Couchbase Server via DCP, and only needs to look for changes since the last change it processed. Its position in the stream is tracked by DCP checkpoints, which avoid re-processing data.
  2. The only things Sync Gateway writes to disk are log files. Everything else is in-memory, and can be fetched from the Couchbase Server bucket on-demand.
  3. Nothing adverse will happen. Sync Gateway will continue to monitor document changes to the bucket since the last time Sync Gateway was running in order to run any processing, but it will not read the entire bucket’s worth of documents.
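
As a rough illustration of the checkpoint idea in the answers above, here is a toy Python model. It is only a sketch: MutationFeed and Gateway are hypothetical names, this is not the real DCP API, and it assumes the checkpoint is stored somewhere that survives the restart (per the above, persistent data lives in the Couchbase Server bucket).

```python
from dataclasses import dataclass, field


@dataclass
class MutationFeed:
    """Hypothetical stand-in for the server's mutation stream (not real DCP)."""
    mutations: list = field(default_factory=list)  # (seq, doc_id) pairs

    def write(self, doc_id):
        # Every write gets the next sequence number.
        seq = len(self.mutations) + 1
        self.mutations.append((seq, doc_id))
        return seq

    def since(self, checkpoint):
        # Only mutations newer than the checkpoint are streamed.
        return [m for m in self.mutations if m[0] > checkpoint]


@dataclass
class Gateway:
    """Toy consumer that remembers the last sequence it processed."""
    checkpoint: int = 0
    processed: list = field(default_factory=list)

    def catch_up(self, feed):
        for seq, doc_id in feed.since(self.checkpoint):
            self.processed.append(doc_id)
            self.checkpoint = seq


feed = MutationFeed()
for i in range(5):
    feed.write(f"doc{i}")

gw = Gateway()
gw.catch_up(feed)
saved = gw.checkpoint  # assumed to be persisted outside the gateway process

feed.write("doc5")  # a change that arrives while the gateway is down

restarted = Gateway(checkpoint=saved)
restarted.catch_up(feed)
print(restarted.processed)  # ['doc5']
```

After the simulated restart only the one mutation written during the downtime is processed, not the five older ones.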

Hope that helps,
Ben

Hi Ben,
thanks for the details, the gateway’s role is clearer now. My last considerations are the following.

Imagine a bucket with 150,000 docs mapped to the gateway db with no specific channel rules. The gateway has just been rebooted.

A new mobile app with CBL listening at the gateway (ReplicatorType = ReplicatorType.PushAndPull and Continuous = true) has just been installed, so the mobile db is empty. At app launch the gateway receives the first document request and its cache is empty. So the whole bucket will be requested from the server and pushed down to the app, is that correct?

Moreover, an already installed mobile app already has most of the documents from previous synchronization. What kind of interaction will happen between this second app and the gateway? Does the gateway try to send all the documents to the app, which then recognizes and stores just the new ones?

Thanks,
Regards

Dario

In this scenario, Couchbase Lite is going to pull documents from Sync Gateway in batches of 200. Sync Gateway will run queries to retrieve these batches of documents from Couchbase Server, rather than retrieving items one at a time.
In the process, SG will insert the most recent documents into its caches in case any other mobile devices request the same documents.
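
To put a number on that, a small Python sketch of the batching arithmetic for the 150,000-document case. The batches helper and the doc IDs are made up for illustration; only the batch size of 200 comes from the description above.

```python
def batches(doc_ids, batch_size=200):
    """Yield consecutive slices of at most batch_size document IDs."""
    for start in range(0, len(doc_ids), batch_size):
        yield doc_ids[start:start + batch_size]


# Hypothetical document IDs standing in for the bucket contents.
doc_ids = [f"doc{i}" for i in range(150_000)]

all_batches = list(batches(doc_ids))
print(len(all_batches))     # 750
print(len(all_batches[0]))  # 200
```

So a full initial pull would be on the order of 750 batched fetches instead of 150,000 single-document reads.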

The caches in Sync Gateway are configurable by size and age, but once full, Sync Gateway will automatically evict old documents in the cache to fit new ones.
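
A minimal Python sketch of a cache bounded by both size and age, just to make the eviction behaviour concrete. DocCache, max_size and max_age are hypothetical names, not Sync Gateway config keys, and the real eviction policy may well differ from this oldest-first one.

```python
from collections import OrderedDict


class DocCache:
    """Toy cache: evicts the oldest entry when full, expires entries by age."""

    def __init__(self, max_size, max_age):
        self.max_size = max_size
        self.max_age = max_age
        self._items = OrderedDict()  # doc_id -> (inserted_at, body)

    def put(self, doc_id, body, now):
        # Re-inserting moves the document to the "newest" end.
        self._items.pop(doc_id, None)
        self._items[doc_id] = (now, body)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # drop the oldest entry

    def get(self, doc_id, now):
        entry = self._items.get(doc_id)
        if entry is None:
            return None  # miss: the caller would fetch from the server
        inserted_at, body = entry
        if now - inserted_at > self.max_age:
            del self._items[doc_id]  # too old: treat as a miss
            return None
        return body


cache = DocCache(max_size=2, max_age=10)
cache.put("a", "A", now=0)
cache.put("b", "B", now=1)
cache.put("c", "C", now=2)      # cache is full, so "a" is evicted

print(cache.get("a", now=2))    # None
print(cache.get("c", now=5))    # C
print(cache.get("b", now=20))   # None
```

A miss here just means the document would be fetched again from Couchbase Server on demand, so nothing is lost when an entry is evicted.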

All 150,000 documents will be replicated through Sync Gateway to the mobile device, but only a small subset of those remain in Sync Gateway’s cache.

The app pulls documents; they’re not pushed by Sync Gateway. This has the benefit that an app which has already pulled documents will only ask for documents that have changed since the last time it replicated, which avoids unnecessary overhead.
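
A simplified Python model of the two apps in the question, assuming each app remembers the last sequence it pulled. The changes_since helper and the sequence numbers are hypothetical, and the real replication protocol has more steps than this.

```python
def changes_since(changes_feed, last_seq):
    """Return (seq, doc_id) entries newer than the app's last pulled sequence."""
    return [(seq, doc_id) for seq, doc_id in changes_feed if seq > last_seq]


# Hypothetical feed: one change per document, sequences 1..150000.
changes_feed = [(seq, f"doc{seq}") for seq in range(1, 150_001)]

# Freshly installed app: nothing pulled yet, so it asks for everything.
new_app = changes_since(changes_feed, last_seq=0)

# Already installed app that last replicated up to sequence 149990.
existing_app = changes_since(changes_feed, last_seq=149_990)

print(len(new_app), len(existing_app))  # 150000 10
```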

Thanks Ben, now the picture is clear. Sorry for all those doubts, but the internal aspects of the gateway are not explicitly detailed in the available documentation.

My best regards
Dario