Hard out of memory in full eviction bucket

#1

Hello experts.
We have a full eviction bucket with the minimum RAM configured.
We added a node to the cluster and got a “hard out of memory error” during the rebalance.
What does this message mean, why does it happen, and how can we avoid it?

(We use Couchbase Server 3.0.1.)

Thanks.

#2

You may see this error under high load. Did your rebalance operation die with the error? Are there any other errors in the error log?

#3

The load wasn’t any higher than usual.
The operation didn’t die, but it seems that the views were slowing it down.
Once we deleted the views, the operation finished within 20 minutes.
It also seems we lost a lot of documents during the rebalance.

#4

@cihangirb

The error and the loss of documents keep happening each time we add nodes and rebalance.

The full log message is:

Hard Out Of Memory Error. Bucket “XXXXXXX” on node XXX.XXX.XXX.XXX is full. All memory allocated to this bucket is used for metadata

I don’t understand the message. We allocated the minimum memory (300 MB, because there are 3 nodes). Why does it need any memory if it’s a full eviction bucket, and how much does it need?

Will disabling the indexAwareRebalance setting that is documented in this link solve the problem?

Thanks

#5

Apologies for the delayed response.
There are a number of cases where you need memory even under full eviction. For example, when you add or modify a value, we need to keep it in memory until it is flushed to disk. If changes arrive faster than we can persist them to disk, we run out of memory and cannot accept more changes. Rebalance and compaction also need to bring keys and values into memory, so they will suffer too if there is not enough free memory available.
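To make the "changes happen faster than we can persist" case concrete, here is a minimal back-of-the-envelope sketch. It is not a Couchbase API or an official formula. The function name and all the numbers (write rate, persist rate, item size) are made-up assumptions, and it ignores metadata and any other memory overhead. It only shows how quickly an unpersisted backlog can fill a small per-node quota.

```python
def seconds_until_full(quota_bytes, write_rate, persist_rate, avg_item_bytes):
    """Rough time until unpersisted items fill the memory quota.

    write_rate and persist_rate are in items per second (hypothetical values).
    Returns None when persistence keeps up, so no backlog builds.
    """
    backlog_growth = (write_rate - persist_rate) * avg_item_bytes  # bytes/second
    if backlog_growth <= 0:
        return None
    return quota_bytes / backlog_growth


# Assumed example: 100 MB per node (300 MB spread over 3 nodes),
# 5000 writes/s arriving, 4000 items/s persisted, 2 KB per item.
per_node_quota = 100 * 1024 * 1024
print(seconds_until_full(per_node_quota, 5000, 4000, 2048))
```

With these assumed numbers the quota fills in under a minute, which is why a small quota leaves very little headroom during a rebalance.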
It is hard to say what the right value is, but we may be able to give you a rough number if you can answer a few questions:
average key size, average value size, total number of keys, total number of buckets, total number of indexes, whether XDCR is enabled, working set (the active portion of your keys at any given time), hardware per node, and number of nodes.
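As an illustration of how some of these inputs combine, here is a small sketch. It is plain arithmetic and not Couchbase's sizing formula. The class and field names are hypothetical, the example values are invented, and it deliberately leaves out replicas, metadata overhead, indexes, and XDCR.

```python
from dataclasses import dataclass


@dataclass
class SizingInputs:
    # All fields are illustrative inputs; none come from a real cluster.
    avg_key_bytes: int
    avg_value_bytes: int
    total_keys: int
    working_set_fraction: float  # active share of keys, between 0.0 and 1.0
    nodes: int

    def raw_data_bytes(self) -> int:
        """Size of all keys plus values, with no overhead counted."""
        return self.total_keys * (self.avg_key_bytes + self.avg_value_bytes)

    def working_set_bytes_per_node(self) -> float:
        """Active data each node would hold if it were spread evenly."""
        return self.raw_data_bytes() * self.working_set_fraction / self.nodes


inputs = SizingInputs(avg_key_bytes=40, avg_value_bytes=1000,
                      total_keys=10_000_000, working_set_fraction=0.10, nodes=3)
quota_per_node = 100 * 1024 * 1024  # 100 MB per node, as in the question
print(inputs.working_set_bytes_per_node() > quota_per_node)
```

Comparing the per-node working set with the per-node quota gives a first hint of whether the bucket has any room left for incoming writes and rebalance traffic.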
thanks
-cihan

#6

Hello, I am seeing a similar issue in our systems (using Couchbase Server 4.0.0).
However, I noticed that our disk is not being fully utilized: it supports about 3 times the number of I/O operations that Couchbase is using, and significantly more bandwidth than is being used.

What could the bottleneck for write throughput be (this bucket has a write-only workload at the moment) if the disk is not being fully utilized?

Thanks.