Hi,
I just ran into a case where I lost around 10% of the data in a bucket.
I tried to run cbbackup and found that the progress stopped at around 90%, no matter how many times I tried. When I used cbrestore to load the backup onto a new server, around 10% of the keys were missing.
After that I tried to stop the service and start it again, but the couchbase-server process simply failed to start.
From the log (erl_crash.dump.xxxxx), I found something like this:
Slogan: Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{k
Then I rebooted the whole server and the service started again. However, 10% of the data was lost, and the lost data was exactly the data that cbbackup had not backed up.
The environment details are as follows:
Operating system: CentOS 6
Number of CPU cores: 2
RAM: 12GB
Number of nodes in cluster: 3
Couchbase version: 3.0.1
Total amount of keys in bucket: 16195
Bucket policy: full eviction
One more finding: I created a development view on the bucket and tried to list all the keys. The result contains only 15939 “total_rows”, and the missing keys are the same keys that were missing from the cbbackup output and lost after the restart.
Are there any extra operations I can perform to prevent this from happening (e.g. a command that forces the keys in RAM to be written to disk)? Or is there any way to detect that this has happened, so that I can avoid restarting the service?
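To make the second question concrete, here is a rough sketch of the kind of check I have in mind before restarting. It assumes that cbstats prints one "name: value" pair per line, and that ep_queue_size and ep_flusher_todo are the counters for items not yet written to disk; I have not confirmed either assumption on 3.0.1, and the sample text below is made up for illustration.

```python
def parse_cbstats(text):
    """Parse 'name: value' lines (the assumed cbstats output format) into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip blank lines and lines without a colon
            stats[name.strip()] = value.strip()
    return stats


def unpersisted_items(stats):
    """Sum the counters assumed to track items still waiting to reach disk."""
    return int(stats.get("ep_queue_size", 0)) + int(stats.get("ep_flusher_todo", 0))


# Made-up sample output, not taken from my cluster.
sample = """
 ep_flusher_todo:   3
 ep_queue_size:     12
 curr_items:        16195
"""

pending = unpersisted_items(parse_cbstats(sample))
if pending > 0:
    print("Not safe to restart yet, items still queued:", pending)
else:
    print("Disk write queue looks empty")
```

The idea would be to feed it the real cbstats output for the bucket on each node and only restart once the result is zero, if those counters really mean what I assume.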
I believe everyone would agree that data loss is a serious issue for a database. I look forward to a reply.
Thanks.