Out of memory: Kill process 388 (goxdcr) score 792 or sacrifice child

We are replicating data from one bucket to an Elasticsearch cluster.
We are currently getting OOM errors on the goxdcr process.
We have tried different approaches to fix the issue, but we are still having the problem.
Even on a Couchbase cluster of 5 nodes with 60 GB of RAM each, the goxdcr process takes all the available memory and eventually gets killed, usually at 45-50 GB of usage for that single process. That is on all 5 nodes! This means goxdcr is taking up to about 250 GB of RAM in total!

[442308.371110] [ 388] 988 388 14730245 12440745 28330 2021862 0 goxdcr
[442308.376386] Out of memory: Kill process 388 (goxdcr) score 792 or sacrifice child
[442308.381313] Killed process 388 (goxdcr) total-vm:58920980kB, anon-rss:49762980kB, file-rss:0kB
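
For anyone reading the kernel output above, here is a rough sketch of how the task-table row lines up with the "Killed process" line. It assumes 4 KiB pages and the column order that older kernels print in the OOM dump header (pid, uid, tgid, total_vm, rss, nr_ptes, swapents, oom_score_adj, name). The header itself is not in the log above, so that order is an assumption, and `decode_oom_row` is just an illustrative helper name.

```python
import re

PAGE_KB = 4  # assumption: 4 KiB pages, the usual x86-64 default

# Assumed column order (header line is not shown in the log above):
# [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+"
    r"(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>\S+)"
)


def decode_oom_row(line):
    """Convert the page counts in one OOM task-table row to kB."""
    m = ROW.search(line)
    if m is None:
        raise ValueError("not an OOM task-table row")
    return {
        "pid": int(m["pid"]),
        "name": m["name"],
        "total_vm_kb": int(m["total_vm"]) * PAGE_KB,
        "rss_kb": int(m["rss"]) * PAGE_KB,
        "swap_kb": int(m["swapents"]) * PAGE_KB,
    }


row = decode_oom_row(
    "[442308.371110] [ 388] 988 388 14730245 12440745 28330 2021862 0 goxdcr"
)
for key, value in row.items():
    print(key, value)
```

Under those assumptions, total_vm and rss multiplied by 4 kB give exactly the total-vm and anon-rss values in the "Killed process" line, which suggests the column reading is right for this kernel.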

Does anyone have advice concerning Couchbase replication and Elasticsearch?

Thanks,

Steeve

It’s been a while, so I’m not sure if this problem is still being seen or if this has been resolved. I’m also thinking by now the logs have been lost.

In any case, if XDCR is using too much memory on the source nodes, I’d recommend checking that both “Source Nozzle” and “Target Nozzle” per node are set to no more than 2. Keeping the batch size at no more than 2048KB is also ideal.

The above (default values) should ensure that XDCR processes don’t buffer too much data during replication.
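
As a rough sketch of how one might check those values on a node, the snippet below reads the global XDCR settings over the REST API and flags anything above the limits mentioned above. It assumes the `/settings/replications` endpoint on port 8091 and the key names `sourceNozzlePerNode`, `targetNozzlePerNode` and `docBatchSizeKb`; please verify these against the REST documentation for your Couchbase version. The host and credentials in the usage comment are placeholders.

```python
import base64
import json
import urllib.request

# Limits taken from the advice above; key names are assumptions
# based on the Couchbase XDCR advanced settings REST API.
LIMITS = {
    "sourceNozzlePerNode": 2,
    "targetNozzlePerNode": 2,
    "docBatchSizeKb": 2048,
}


def fetch_xdcr_settings(base_url, user, password):
    """GET the global XDCR settings as a dict (needs a reachable node)."""
    req = urllib.request.Request(base_url.rstrip("/") + "/settings/replications")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def settings_over_limit(settings, limits=LIMITS):
    """Return only the settings whose value exceeds the suggested limit."""
    return {
        key: settings[key]
        for key, limit in limits.items()
        if key in settings and settings[key] > limit
    }


# Usage (placeholder host and credentials):
# settings = fetch_xdcr_settings("http://localhost:8091", "Administrator", "password")
# print(settings_over_limit(settings))
```

An empty result would mean the node is already at or below the values suggested above, in which case the memory growth likely has another cause.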

Follow the steps in “How to Check Memory Usage Per Process on Linux” to find which process is consuming the most RAM.
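
If you would rather script that than read a guide, here is a minimal sketch that ranks processes by resident memory. It assumes a Linux host with `/proc` mounted and reads the `VmRSS` field from each `/proc/<pid>/status` file; the function names are just illustrative.

```python
import os


def rss_kb_from_status(text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status."""
    for line in text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0  # kernel threads have no VmRSS line


def name_from_status(text):
    """Return the process name from the text of /proc/<pid>/status."""
    for line in text.splitlines():
        if line.startswith("Name:"):
            return line.split(None, 1)[1].strip()
    return "?"


def top_rss(limit=10, proc="/proc"):
    """List (rss_kb, pid, name) for the processes using the most RSS."""
    rows = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "status")) as f:
                text = f.read()
        except OSError:
            continue  # process exited or is not readable
        rows.append((rss_kb_from_status(text), int(entry), name_from_status(text)))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for rss_kb, pid, name in top_rss():
        print(f"{rss_kb:>12} kB  {pid:>7}  {name}")
```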

Run pmap on the ID of the process that is consuming the most memory, and check whether there are any [ anon ] memory blocks where memory is stuck and the kernel is not releasing it back to the OS. If they are not required, clear them from the pod.
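
To make the pmap step a bit easier to read, here is a small sketch that pulls out the [ anon ] mappings and sorts them by size. It assumes the plain `pmap <pid>` output format from procps (address, size with a trailing K, permissions, mapping name); `anon_blocks` and `pmap_output` are illustrative names, and the PID in the usage comment is only an example.

```python
import subprocess


def anon_blocks(pmap_text, min_kb=0):
    """Return (size_kb, address) for each [ anon ] mapping, largest first."""
    blocks = []
    for line in pmap_text.splitlines():
        if not line.rstrip().endswith("[ anon ]"):
            continue
        fields = line.split()
        address, size = fields[0], fields[1]
        size_kb = int(size.rstrip("K"))
        if size_kb >= min_kb:
            blocks.append((size_kb, address))
    blocks.sort(reverse=True)
    return blocks


def pmap_output(pid):
    """Run pmap for one process and return its text output."""
    result = subprocess.run(
        ["pmap", str(pid)], capture_output=True, text=True, check=True
    )
    return result.stdout


# Usage (example PID; needs permission to inspect the process):
# blocks = anon_blocks(pmap_output(388), min_kb=1024)
# print(sum(kb for kb, _ in blocks), "kB in large anonymous mappings")
```

The `min_kb` filter is there only to hide the many small anonymous mappings so the large ones stand out.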