Issue with data in cache

Hi,

I have a performance issue on my Couchbase cluster.

I don’t know why, but the memcached process crashes. The result is:

  • High disk read, 8k per node
  • High CPU usage, > 80%
  • Cache miss ratio, 100%

When I restart the server, it works correctly for ~20 min.

Other numbers:

  • we have RAM available
  • 75 operations / second

Hi, sorry to hear that. What operating system are you on and what version of CB are you running?

I run Couchbase 3.0.1 Community Edition on Ubuntu 14.04 (Amazon EC2 instance).

@matthew I have run cbcollect but I can’t understand the results.

Can I send them to you?

I have detected an issue with the items in cache on one node:
Node 1:
active docs resident: < 1%!
replica docs resident: 20%

Node 2:
active docs resident: 16%
replica docs resident: < 1%

Node 3:
active docs resident: 20%
replica docs resident: 14%

For RAM usage:
Total Allocated: 37.5 GB
Total in Cluster: 37.5 GB
In Use: 19 GB
Unused: 18.5 GB
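
A quick sanity check on the figures above, as a minimal Python sketch. The numbers are copied from this post; the variable names are only illustrative, and "< 1%" is recorded as 1.0 (an upper bound, which is an assumption of the sketch). It shows that only about half of the allocated RAM is in use while active residency is very low, and lists the nodes that keep more replica than active data in memory.

```python
# Figures copied from the post above; names are illustrative only.
total_gb = 37.5
in_use_gb = 19.0
unused_gb = 18.5

# "< 1%" is stored as 1.0, i.e. an upper bound (assumption of this sketch).
active_resident_pct = {"Node 1": 1.0, "Node 2": 16.0, "Node 3": 20.0}
replica_resident_pct = {"Node 1": 20.0, "Node 2": 1.0, "Node 3": 14.0}

# Share of the allocated RAM that is actually in use.
used_fraction = in_use_gb / total_gb

# Node with the lowest share of active documents held in memory.
lowest_active_node = min(active_resident_pct, key=active_resident_pct.get)

# Nodes where replica data is more resident than active data.
replica_heavy_nodes = [
    node
    for node in active_resident_pct
    if replica_resident_pct[node] > active_resident_pct[node]
]

print(f"RAM in use: {used_fraction:.1%} of the allocated total")
print(f"Lowest active residency: {lowest_active_node}")
print(f"More replica than active data in RAM: {replica_heavy_nodes}")
```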

Hi fx_algrain,
Can you share the cbcollect logs with me? Preferably via a link, as that would allow me to easily share them with the support team.

@fx_algrain

@martinesmann passed me the logs from your cluster. Here are the problems I’m seeing on your servers:

  • Each node has just 2 cores; the minimum recommended is 4.
  • Transparent HugePages is enabled; the recommendation is to disable it.
  • The bucket resident ratio is close to 0%, which can lead to loads of other problems.
  • Background fetches from disk peak at around 17K ops/sec per node, which is pretty high! This is expected because of the low residency. If your disk IO capacity isn’t sufficient, you will see timeout errors.
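
To check the Transparent HugePages setting on a node yourself: the Linux kernel reports the current mode in /sys/kernel/mm/transparent_hugepage/enabled, with the active value in square brackets (for example `always madvise [never]`). Here is a minimal Python sketch, assuming a Linux host that exposes that sysfs path; the function names are mine and not part of any Couchbase tool.

```python
import re
from pathlib import Path

# Standard sysfs location on mainline Linux kernels (assumed here).
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) mode from the sysfs file contents."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def current_thp_mode(path=THP_ENABLED):
    """Read the active THP mode, or None if the file cannot be read."""
    try:
        return parse_thp_mode(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
```

Anything other than `never` means Transparent HugePages is still active in some form on that node.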

For info, we added a new node: it solved the issue, but only for a few days.

I know about background fetches, and they are the issue. I want to get rid of them. I don’t understand why the server keeps more replica docs in memory than active ones.
Also, the servers are identical, but we have more docs in RAM on the last server…

How do I disable HugePages?

Hi @fx_algrain,
This doc describes how to disable Transparent HugePages:
http://docs.couchbase.com/admin/admin/Install/rhel-installing.html
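
In short, the usual approach is to write `never` into the kernel's THP control files as root. A minimal Python sketch of that step, assuming a Linux host with the /sys/kernel/mm/transparent_hugepage directory (the directory can differ on some distributions, so check the doc above); `disable_thp` is a hypothetical helper name, not a Couchbase command.

```python
from pathlib import Path

# Standard sysfs directory on mainline Linux kernels (assumed here).
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def disable_thp(thp_dir=THP_DIR):
    """Write 'never' to the 'enabled' and 'defrag' files.

    Returns the names of the files that were updated. Must run as root
    when pointed at the real sysfs directory.
    """
    written = []
    for name in ("enabled", "defrag"):
        target = Path(thp_dir) / name
        if target.exists():
            target.write_text("never")
            written.append(name)
    return written
```

Calling `disable_thp()` as root changes the setting for the running system only; it does not survive a reboot, so it also needs to go into a startup script to be permanent.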

Hi everybody,

We continue to get the issue after upgrading the servers (4 CPUs + 30 GB of RAM).

We disabled HugePages too.

But the problem continues, even though we have a higher doc resident percentage.

I just tried with the new Couchbase 4 and the issue still exists.

I got a great screenshot of the issue:

No operations, yet there are disk fetches all the while. I think it’s a bug in Couchbase.

It does appear to be. I’d recommend filing an issue and attaching a cbcollect_info. Also, if you have an enterprise subscription, you may want to raise awareness with Couchbase support since that’ll escalate getting the issue looked at.