I plan to automate the backup and restore processes for our Couchbase servers using the cbbackupmgr tool.
To do so, I want a rough measure of the sizes of both full and incremental backups relative to the actual source data, so that I can plan storage accordingly on both persistent volumes and the cloud.
Sizing really depends on the workload on the cluster.
For the full backup, you can look at the size of the data in Couchbase Server to get a rough idea of how large the backup will be.
For incremental sizing, it depends on how often the incremental backup is performed and how many documents have been updated. Again, the cluster stats can be used: look at the number of updates between backup periods and multiply that by the average document size.
For example:
If the cluster is storing 100GB of data, then the full backup will be roughly the same size.
For the incremental, say 2000 new documents are added to the cluster, 5000 documents are updated and 200 documents are deleted. The average document size is 5KB. The incremental size would be around:
(2000 + 5000) * 5KB = 35,000 KB
The backup also has to store the deletes, but they’re generally a fixed size plus the length of the key. Say an average key length of 10 bytes in this case:
200 * (56 + 10) = 13,200 Bytes
There will be some overhead for metadata, but the total should be around 34MB.
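If you want to script this estimate, here is a minimal Python sketch of the same arithmetic. The function name and parameters are made up for illustration and are not part of cbbackupmgr; the 56-byte fixed cost per delete is the figure used above, and 1KB is assumed to be 1024 bytes. Metadata overhead is not included, so treat the result as a lower bound.

```python
def estimate_incremental_backup_bytes(new_docs, updated_docs, deleted_docs,
                                      avg_doc_bytes, avg_key_bytes,
                                      delete_overhead_bytes=56):
    """Rough incremental backup size in bytes (hypothetical helper).

    Assumes every new or updated document is stored at the average
    document size, and every delete costs a fixed overhead plus the key.
    """
    # New and updated documents are written in full.
    mutation_bytes = (new_docs + updated_docs) * avg_doc_bytes
    # Deletes store a fixed-size record plus the document key.
    delete_bytes = deleted_docs * (delete_overhead_bytes + avg_key_bytes)
    return mutation_bytes + delete_bytes


# Numbers from the example: 5KB average document, 10-byte average key.
estimate = estimate_incremental_backup_bytes(
    new_docs=2000,
    updated_docs=5000,
    deleted_docs=200,
    avg_doc_bytes=5 * 1024,
    avg_key_bytes=10,
)
print(f"Estimated incremental backup: {estimate / 1024**2:.1f} MB")
```

You would feed in the counts from the cluster stats for your own backup interval, then add some headroom for metadata.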
This is a pretty rough guideline, but for most cases it should be good enough when using Couchbase Server 6.5 or later.