Proper way to backup / restore sync gateway?


#1

We have a sync gateway set up and, unfortunately, we’re using bucket shadowing. What is the correct way to take a backup of this setup and then restore the backup?

Currently we dump both the sync gateway bucket and the shadow bucket using cbbackup and then restore them using cbrestore. We restore the shadow bucket first, then the sync gateway bucket, and then bring SG back up.

This doesn’t work very well. Whenever we do this, we end up with tons of documents ‘missing’ from the sync gateway bucket. We don’t actually lose any data, but lots of documents no longer sync down. However, if we go and do an update on a ‘missing’ document, the sync gateway will once again become aware of it and it will begin to sync down normally.

Any insight into the appropriate method for backup/restore would be appreciated.


#2

I think using cbrestore is the right approach.
Can you try running the resync command on the admin port:

curl -vX POST http://localhost:4985/{db}/_resync

This should make SG aware of all the documents again and put them back in the right channels. Giving users access through a resync operation might be a bit more complicated (see note below). Are you using access in your Sync Function?

Note: When running a resync operation, the context in the Sync Function is the admin user. For that reason, calls to requireUser, requireAccess and requireRole will always succeed. You are very likely using those functions in production to govern write operations, but in a resync operation all the documents are already written to the database. It is therefore recommended to use resync only for changing the assignment of documents to channels (i.e. reads). Keep in mind that it's perfectly fine if the Sync Function in a resync operation does not resemble the Sync Function you expect to use in production. The former is only an intermediary function used for the resync operation; the latter processes reads and writes in a production environment.
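
To illustrate the note above, here is a minimal sketch of what an intermediary, resync-only Sync Function might look like. It only reassigns channels and contains no write validation. The function name resyncOnly and the doc.channels field are assumptions made for this example and are not taken from your setup; channel() is the helper Sync Gateway provides inside a Sync Function. Access grants are deliberately left out.

```javascript
// Sketch of a resync-only Sync Function. The `channels` field is an assumed
// document property, and the name `resyncOnly` is only for illustration.
function resyncOnly(doc, oldDoc) {
  // Deleted documents (tombstones) do not need to be routed anywhere.
  if (doc._deleted) {
    return;
  }
  // Route the document by its own channel list. There are no requireUser,
  // requireAccess or requireRole calls here: during a resync they would
  // always succeed, so they would add nothing.
  if (Array.isArray(doc.channels) && doc.channels.length > 0) {
    channel(doc.channels);
  }
}
```

In a Sync Gateway config the body would go in the database's "sync" property as an anonymous function(doc, oldDoc). Once the resync has finished, you would swap the production Sync Function back in.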

James