Storing large numbers of documents with huge disk space and low RAM

Hi,

We’re trying to understand whether Couchbase is suitable for storing huge amounts of hourly aggregated data, to be pulled out later for report generation.

The most cost-effective setup in such a case would be servers with moderate RAM and large SSDs.

The problem is that we are not sure whether Couchbase allows inserting new data when it runs out of RAM.

Is it possible to rely mostly on disk reads and low RAM with Couchbase?

@alex1

Yes, you can. What is your data access pattern?
For example, is the most recent data (i.e. the last 24 hours or 7 days) accessed most frequently?

Can you eventually move old data to a cheaper storage solution?

Hi Alex,

Sorry for the late reply - old data should be removed automatically by the CB server because of the TTL that is applied when each document is stored.

I’m not sure how to let CB know that we want to rely mostly on disk space - what happened in the past is that when CB ran out of the RAM allocated to a certain bucket, it stopped storing data in that bucket!

Any idea how to overcome this problem?

I would recommend creating a bucket with Full Eviction - that will allow Couchbase to not hold any details of your infrequently accessed documents in RAM (the default, Value Eviction, keeps all document keys and metadata in RAM, regardless of how often they are accessed).
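For anyone finding this thread later, here is a minimal sketch of how the eviction policy could be chosen when creating a bucket through the REST API (a form POST to `/pools/default/buckets` on the admin port). The field names below are as I recall them from the REST documentation, and the bucket name and quota are made up for illustration, so please check them against the docs for your server version. The snippet only builds the request body and does not contact a server.

```python
from urllib.parse import urlencode


def full_eviction_bucket_params(name, ram_quota_mb, replicas=1):
    """Build the form fields for a bucket-creation request.

    Assumption: field names follow the Couchbase REST API as recalled
    for 4.x; verify them for your version before relying on this.
    """
    return {
        "name": name,
        "bucketType": "couchbase",
        "authType": "sasl",          # assumption: expected by 4.x servers
        "ramQuotaMB": ram_quota_mb,  # per-node RAM quota for the bucket
        "replicaNumber": replicas,
        # "valueOnly" is the default; "fullEviction" lets keys and
        # metadata be ejected from RAM as well as values.
        "evictionPolicy": "fullEviction",
    }


# Hypothetical bucket name and quota, for illustration only.
params = full_eviction_bucket_params("hourly_aggregates", 2048)
body = urlencode(params)
print(body)
```

The same choice is available in the web console when creating the bucket, which is probably the simpler route for a one-off setup.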

Thank you!

We will definitely try that.

Alex Brightov (on mobile)

Wait a second: if we use Full Ejection on a certain bucket - will this bucket still support automatic removal of documents with TTL (document expiration)?

Yes, Full Eviction supports TTL.
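One detail worth knowing when setting TTLs on long-lived aggregates: as far as I know, Couchbase treats an expiry of up to 30 days as a relative number of seconds, and anything larger as an absolute Unix timestamp. The helper below is a small sketch of that rule; `expiry_field` is a hypothetical name and not part of any SDK, and your SDK may already handle this conversion for you.

```python
import time

THIRTY_DAYS = 30 * 24 * 60 * 60  # seconds


def expiry_field(ttl_seconds, now=None):
    """Return the expiry value to send for a desired lifetime in seconds.

    Hypothetical helper: lifetimes up to 30 days are sent as-is
    (relative), longer ones are converted to an absolute Unix time.
    """
    if ttl_seconds <= 0:
        return 0  # 0 means the document never expires
    if ttl_seconds <= THIRTY_DAYS:
        return ttl_seconds
    now = int(time.time()) if now is None else now
    return now + ttl_seconds


print(expiry_field(7 * 24 * 60 * 60))  # one week, sent as a relative value
```

Passing a "90 days in seconds" value unchanged would be read as a timestamp far in the past, so the document would expire immediately.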

Thanks a lot Dave! Really appreciate the help!

Alex

We set up Full Ejection on the bucket that is supposed to rely mostly on disk space.

Couchbase was working OK until today. We got a “[23:43:49] - Hard Out Of Memory Error. Bucket “Request_IDs” on node 172.31.9.196 is full. All memory allocated to this bucket is used for metadata.” alert, and since then Couchbase hasn’t stored any new objects in this bucket. It holds 8+ million objects and only 8% of the disk is used.
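To put the "used for metadata" message in context, here is a rough back-of-the-envelope estimate of how much RAM the keys and metadata of 8 million documents would need if all of it were resident. The 56 bytes of metadata per document is the figure I remember from the Couchbase sizing guidelines, and the 36-byte average key length is purely an assumption since the real key format isn't given here. With Full Ejection this should be an upper bound, because metadata can be ejected too.

```python
# Assumption: per-document metadata overhead from the sizing guidelines.
METADATA_BYTES_PER_DOC = 56


def metadata_ram_bytes(doc_count, avg_key_bytes, copies=1):
    """Estimate RAM needed to keep every key plus its metadata resident.

    `copies` is 1 + number of replicas held across the cluster.
    """
    return doc_count * (METADATA_BYTES_PER_DOC + avg_key_bytes) * copies


# 8 million documents, hypothetical 36-byte keys, no replicas.
estimate = metadata_ram_bytes(8_000_000, avg_key_bytes=36)
print(f"{estimate / 1024**2:.0f} MiB")
```

If the bucket quota is well above this figure and the alert still fires, that points more towards a memory-accounting problem than towards genuinely running out of room.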

This is a huge problem for us, and it looks like we will have to find a different solution for storing large numbers of JSON documents, since Couchbase is completely unpredictable. We followed all available recommendations on bucket configuration and RAM allocation, and provided it with virtually unlimited disk space.

It’s a shame!

Could you collect and upload logs somewhere? See Couchbase SDKs for details on collecting logs.

Hi Dave,

I restarted the master node, and after the restart the node released about 50% of the RAM used by the bucket while keeping the same number of documents. At the moment the node uses about 50% of its RAM and it stores new documents.

Do you think that collecting logs now would still help anyone understand what happened? We run CB Server 4.1 Community Edition.

Probably less information than before the restart :)

Given you’re on CE, any reason you’re not running 4.5.1? There’s a large number of issues fixed since 4.1 (see Couchbase SDKs). I can’t recall off the top of my head exactly when, but I think we fixed some issues with quota tracking in full-eviction buckets around then.

Thank you! I was planning to switch to 4.5.1 this week. Will switch to it ASAP!

CB 4.5 behaves much better. At the moment we don’t see any problems with RAM utilization. The Full Ejection bucket also behaves as expected.

Thanks a lot for all the assistance!