Inconsistent throughput

Hi There,

We have a 12-node Couchbase cluster (6 nodes per data center) running version 3.0.3-1716 Enterprise Edition. In each data center the bucket has a 585 GB RAM quota, the ejection policy is value eviction, and auto-compaction is enabled.

We are seeing inconsistent throughput: it fluctuates over time. At its best we reach 100k ops/sec, but in the worst case it drops below 10k, which is not acceptable for our use case.

Here are the details:

The Couchbase bucket holds about 1.65 billion records with a total size of 780 GB. The data is upserted by an online process at around 1.5k ops/sec in one data center. Each record has a TTL based on a field in the record.

On the read side, the app needs at least 50k "get" operations per second to meet its SLA. Reads currently hit the other data center, to which the data is replicated via XDCR. When the system is in good shape, read throughput is very good; but when it degrades to 10k, the app misses the SLA. I checked the cache miss ratio, and it is around 1%.

I suspect auto-compaction is impacting the throughput.

Does anyone have any suggestions?

Thanks in advance for your responses.


You can disable auto-compaction temporarily to see whether it has an effect on your workload.
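For example, you can check and change the cluster-wide auto-compaction settings through the admin REST interface on port 8091. A rough sketch, assuming the standard 3.x endpoints; the hostname and credentials below are placeholders for your environment, and clearing the fragmentation thresholds is what effectively disables automatic compaction:

```shell
# Inspect the current auto-compaction settings (placeholder host/credentials)
curl -u Administrator:password \
  http://node1:8091/settings/autoCompaction

# Submit new settings with no fragmentation thresholds, which clears the
# triggers for automatic compaction; parallelDBAndViewCompaction is required
curl -u Administrator:password -X POST \
  http://node1:8091/controller/setAutoCompaction \
  -d parallelDBAndViewCompaction=false
```

The same change can be made in the Web UI under Settings → Auto-Compaction, which may be safer if you want to restore the previous thresholds afterwards.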

Given the size of your EE setup I suggest you contact support / pre-sales rep and get them to assist you.

You probably want to benchmark the cluster in isolation from the app - for example using a workload generator like pillowfight (part of the Couchbase C SDK tools). Then you can determine whether it's the application itself that isn't requesting fast enough, or actually an issue with the cluster.
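A rough starting point for a read-dominated run that approximates the 50k-gets workload; the connection string and bucket name are placeholders, and exact flag names can vary between libcouchbase versions, so check `cbc-pillowfight --help` on your install:

```shell
# Read-heavy benchmark: ~97% gets / 3% sets over 100k keys
# (host and bucket are placeholders for your environment)
cbc-pillowfight -U couchbase://node1/mybucket \
  --num-threads 4 \
  --num-items 100000 \
  --set-pct 3 \
  --batch-size 100
```

Running this against the XDCR destination cluster (the one serving reads) while watching ops/sec in the UI should tell you whether the throughput dips reproduce independently of your application.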