Understanding Bucket Health Metrics


#1

We have a rather simple bucket setup: a Couchbase bucket with 1 replica on a 4-node cluster. It stores simple key-value pairs (no views or indexes). We ran this setup for ~3 years on version 1.8, and for the last few months on version 2.5 of the Enterprise Edition.

For these last ~3 years we have had a process that runs a stored procedure in the database, takes the ~16 result sets, hydrates an object graph, and then stores that object in Couchbase. The process runs every 15 minutes: it asks the DB what’s changed, pulls the new data, and updates the objects in Couchbase.

Historically, our bucket would report 0% cache miss / 100% data in RAM (we have 60GB allocated to the bucket and our data is less than 30GB).

At some point, maybe around the time we upgraded both the server and the C# client from 1.8 to 2.5, we started seeing a lot of our data being pushed to disk and our cache miss ratio going up.

My understanding of Couchbase, which may always have been wrong or may have become outdated, was that the server would hold all keys and data in RAM unless the bucket ran out of space, at which point it would begin to shuffle the least-used data to disk.

Has this changed? Am I missing something? How can I get my cache miss % and % in RAM numbers back to 0% / 100%?


#2

I think the most important thing to look at now is your bucket quota. If you can give us more info on your allocated bucket size and how much of it is used, that would probably explain whether this is expected or not. And yes, Couchbase Server only starts to eject data if your RAM gets too crowded. That said, I think metadata and some other properties changed between 1.8 and 2.5, so you could be running into something else, in which case some re-evaluation of your cluster sizing may be in order.
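For gathering those numbers, here is a minimal Python sketch that reads the bucket stats endpoint (`/pools/default/buckets/<bucket>/stats`, the same URL format that appears in the config later in this thread) and picks out the most recent sample of a few memory-related stats. The stat names in `STATS_OF_INTEREST`, the `op.samples` response layout, and the helper names `latest_samples` and `fetch_bucket_stats` are assumptions for illustration; check them against what your server version actually returns.

```python
import json
import urllib.request

# Stat names assumed to be present in the stats response; verify them
# against your own server's output before relying on them.
STATS_OF_INTEREST = (
    "ep_cache_miss_rate",
    "vb_active_resident_items_ratio",
    "vb_replica_resident_items_ratio",
    "mem_used",
    "ep_mem_high_wat",
)


def latest_samples(payload, names=STATS_OF_INTEREST):
    """Return the most recent sample for each requested stat.

    Assumes the payload looks like {"op": {"samples": {name: [values...]}}}.
    Stats that are missing or have no samples are skipped.
    """
    samples = payload.get("op", {}).get("samples", {})
    return {name: samples[name][-1] for name in names if samples.get(name)}


def fetch_bucket_stats(host, bucket, user, password, port=8091):
    """Fetch the raw stats JSON for one bucket using HTTP basic auth."""
    url = f"http://{host}:{port}/pools/default/buckets/{bucket}/stats"
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, url, user, password)
    opener = urllib.request.build_opener(urllib.request.HTTPBasicAuthHandler(mgr))
    with opener.open(url, timeout=10) as resp:
        return json.load(resp)
```

Something like `latest_samples(fetch_bucket_stats("node1", "BucketName", user, password))` would then give a small dictionary to paste into a reply; `node1` is a placeholder host name.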


#3

Thanks for the reply! I think you’ll find we have gross overkill for this one bucket, but I need to understand how/why this bucket is acting differently/oddly before we start moving the rest of our cached data to this cluster. I can’t come up with any good reason why this data is being pushed off to disk.

Cluster Hardware: 4 physical servers, running Ubuntu (nothing else on these servers)
Total RAM in Cluster: 722 GB
Total Storage in Cluster: 2.05 TB

Cluster Configuration:

<CouchbaseServer type="Group">
	<Username>[REDACTED]</Username>
	<Password>[REDACTED]</Password>
	<WebAdminPort>8091</WebAdminPort>
	<DataPort>11211</DataPort>
	<TotalRamSizeMB>185000</TotalRamSizeMB> <!-- per server -->
	<InternalClusterPort>11210</InternalClusterPort>
	<ErlangPortMapper>4369</ErlangPortMapper>
	<NodeDataExchangePortRange>21100-21199</NodeDataExchangePortRange>
	<!-- 0 = ServerHostName, 1 = WebAdminPort, 2 = BucketName -->
	<ServerNodeUrlFormat>http://{0}:{1}/pools</ServerNodeUrlFormat>
	<ClusterStatusUrlFormat>http://{0}:{1}/pools/default/buckets/{2}/stats</ClusterStatusUrlFormat>
	<Buckets type="List">
		<!-- Name|SizeInMb|Type|ReplicaCount|Password -->
		<ListItem>BucketName|15360|couchbase|1|</ListItem> <!-- 15360 MB * 4 servers = 60 GB -->
	</Buckets>
</CouchbaseServer>
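
To make the sizing arithmetic explicit, here is a rough Python sketch using the numbers from this thread (15360 MB per node, 4 nodes, 1 replica, and 30 GB as an upper bound on the data size). It assumes that replica copies count against the same bucket quota as active data and it ignores metadata overhead; both are simplifications, and the function name `bucket_memory_summary` is made up for this example.

```python
def bucket_memory_summary(per_node_quota_mb, nodes, active_data_gb, replicas):
    """Compare the cluster-wide bucket quota with an estimated data footprint.

    Assumption: each replica is a full copy held under the same bucket quota.
    Metadata and per-item overhead are not modelled.
    """
    quota_gb = per_node_quota_mb * nodes / 1024
    footprint_gb = active_data_gb * (1 + replicas)
    return {
        "quota_gb": quota_gb,
        "footprint_gb": footprint_gb,
        "headroom_gb": quota_gb - footprint_gb,
    }


# Numbers from the posts above; 30 GB is an upper bound ("less than 30GB").
summary = bucket_memory_summary(15360, 4, 30, 1)
print(summary)
```

Under those assumptions the quota works out to 60 GB and the upper-bound footprint with one replica is also 60 GB, so the real headroom depends on how far below 30 GB the active data actually is.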

#4

And do all the buckets themselves have enough RAM left to hold the whole working set?

Btw, since you are using the Enterprise Edition, please engage your Couchbase support team to get this resolved quickly. They can also assist you with cluster sizing!