Suggested OS System Tweaks for High Performance

rudijs · January 28, 2015, 4:34am

Hi,

I’m new to Couchbase and am wondering whether there are any OS tweaks that should be applied by default for a production cluster.

Coming from SQL, for example, PostgreSQL needs a couple of kernel setting tweaks, like kernel.shmmax and kernel.shmall.

Those settings raise the shared memory available to PostgreSQL, as the default is quite low.

Anything similar for Couchbase?

I’ve just migrated an application to new servers which required an upgrade from Couchbase v1 to v3.

The previous sysadmin is not around, so I’m nutting it out by myself.

It’s up and running now but the 2 node Couchbase cluster is running with a load avg of:

load average: 5.24, 5.25, 5.23

This is higher than I’m used to, so I’m wondering where to start looking (as a novice Couchbase administrator).

Does that load avg seem high to you as well?

I pretty much have all the settings at default.
Alerts are enabled and the test works but I haven’t received any Couchbase alerts as yet.

The two servers are quite beefy:

Base machine: E5-1620v2 2x500GB SATA, 16GB RAM

Specification:

1 x E5-1620v2 CPU
16GB RAM (fastest available, please)
2 x 500GB SATA as 500GB software (mdadm) RAID1. Partitions:
- 100MB EXT4 boot partition, mount point ‘/boot’.
- 32GB swap partition.
- remaining capacity as EXT4 root (’/’) partition, Ubuntu 14.04.1 installed here.

I’m running version 3.0.0 Community Edition (build-1209) on Ubuntu 14.04.1.

One default couchbase bucket and four memcached buckets (only two in use currently).

The bucket details are:

Couchbase Buckets (hit counter)

Bucket Name: default
Nodes: 2
Item Count: 177390
Ops/sec: 20
Disk Fetches/sec: 0
RAM/Quota Usage: 121MB/4GB
Data/Disk Usage: 145MB/152MB
Access Control: None
Replicas: 1 replica copy
Compaction: Not active
Cache Metadata: Value Eviction
Disk I/O priority: Low

Memcached Buckets (sql query cache)
Bucket Name: query-lt
Nodes: 2
Item Count: 427451
Ops/sec: 116
Hit Ratio: 86.4%
RAM/Quota Usage: 1.28GB/4GB

Bucket Name: query-xs
Nodes: 2
Item Count: 510876
Ops/sec: 63
Hit Ratio: 68.8%
RAM/Quota Usage: 1.68GB/4GB

Any tips, suggestions or advice is much appreciated.

Thanks!
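
One way to judge whether a load average like 5.24 is high is to divide it by the number of logical CPUs. The sketch below does that for the figure quoted above. The count of 8 logical CPUs is an assumption (an E5-1620 v2 with hyper-threading enabled), and the function name is made up for illustration. Note that on Linux the load average also counts tasks blocked in uninterruptible I/O wait, so a high value does not necessarily mean the CPUs are busy.

```python
import os


def load_per_cpu(load_avg, cpu_count):
    """Return the load average divided by the number of logical CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_avg / cpu_count


if __name__ == "__main__":
    # Figure from the post: 1-minute load of 5.24.
    # 8 logical CPUs is an assumption, not stated in the post.
    print(round(load_per_cpu(5.24, 8), 3))

    # On a live Unix host, use the real values instead.
    one_minute, _, _ = os.getloadavg()
    print(round(load_per_cpu(one_minute, os.cpu_count() or 1), 3))
```

A value well below 1.0 per CPU suggests the processors are not saturated, which points the investigation toward I/O wait rather than compute.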

rudijs · January 28, 2015, 6:54am

Hrm … in my case it looks like disk i/o is causing high load avg.

jbd2/md2-8 … so mdadm (RAID) is working hard.

The load spiked from <1 to 5+ once we added more load to the default couchbase bucket and query-lt.

I’m digging further into this now …

daschl · January 28, 2015, 7:44am

The only thing that comes to my mind is this one: http://blog.couchbase.com/often-overlooked-linux-os-tweaks

Let us know if you find something specific to couchbase!
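
Two Linux settings that often come up in this kind of tuning discussion are vm.swappiness and Transparent Huge Pages (THP). That these are the ones the linked post has in mind is an assumption. The sketch below only reads the current values so they can be reviewed; it changes nothing. The two paths are the standard Linux locations, and the helper names are hypothetical.

```python
from pathlib import Path

# Standard Linux locations for the two settings.
SWAPPINESS = Path("/proc/sys/vm/swappiness")
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the active THP mode, e.g. 'never' from 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_text_or_none(path):
    """Read a small text file, returning None if it is missing or unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    swappiness = read_text_or_none(SWAPPINESS)
    thp = read_text_or_none(THP_ENABLED)
    print("vm.swappiness:", swappiness)
    print("THP mode:", parse_thp_mode(thp) if thp else None)
```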

rudijs · January 28, 2015, 7:45am

Hi,

Why would memcached be writing to disk so much?

None of the four memcached buckets is using all the allocated RAM.

TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>   COMMAND
  254 be/3 root        0.00 B/s   47.23 K/s  0.00 % 99.99 % [jbd2/md2-8]
 1822 be/4 couchbas    0.00 B/s   19.68 K/s  0.00 %  4.44 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json
 1821 be/4 couchbas    0.00 B/s   19.68 K/s  0.00 %  4.42 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json
 1820 be/4 couchbas    0.00 B/s   15.74 K/s  0.00 %  0.33 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json
 1823 be/4 couchbas    0.00 B/s   15.74 K/s  0.00 %  0.27 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json

Edit: Ah, I don’t think this is the reason why jbd2 is using 100% of the i/o.

Strange though, this machine is a dedicated Couchbase server.
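
When comparing several iotop captures it can help to pull out the busiest thread programmatically. The sketch below parses lines in the layout shown above and returns the row with the highest IO percentage. The sample is copied from the output in this post; the function names and the fixed column positions are assumptions based on that layout. For context, jbd2 is the kernel journaling thread for ext4, so `[jbd2/md2-8]` at the top often means other processes are forcing frequent journal commits on the md2 device.

```python
IOTOP_SAMPLE = """\
  254 be/3 root        0.00 B/s   47.23 K/s  0.00 % 99.99 % [jbd2/md2-8]
 1822 be/4 couchbas    0.00 B/s   19.68 K/s  0.00 %  4.44 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json
 1820 be/4 couchbas    0.00 B/s   15.74 K/s  0.00 %  0.33 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json
"""


def parse_iotop_line(line):
    """Parse one iotop row into a dict, or return None for non-data lines."""
    # Columns: TID PRIO USER read unit write unit swapin % io % COMMAND...
    parts = line.split(None, 11)
    if len(parts) < 12 or not parts[0].isdigit():
        return None
    return {
        "tid": int(parts[0]),
        "user": parts[2],
        "disk_write": f"{parts[5]} {parts[6]}",
        "io_pct": float(parts[9]),
        "command": parts[11].strip(),
    }


def busiest(text):
    """Return the parsed row with the highest IO percentage."""
    rows = [row for row in map(parse_iotop_line, text.splitlines()) if row]
    return max(rows, key=lambda row: row["io_pct"])


if __name__ == "__main__":
    top = busiest(IOTOP_SAMPLE)
    print(top["tid"], top["io_pct"], top["command"])
```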

rudijs · January 28, 2015, 7:58am

Since adding traffic, something is causing 100% disk i/o.

Couchbase is the only service on these servers, and the two nodes report the same 5+ load avg.

One thing I note is there is only one / root partition. /opt/ is not separate.

I’m surprised if this is the issue though, as RAM is available for all buckets.

asingh · January 28, 2015, 8:21am

@rudijs could you share the cbcollect_info log from one node in the cluster? See the cbcollect_info documentation.

rudijs · January 28, 2015, 8:32am

@asingh Hi, sure I can share that. There’s a lot of data; which part in particular?

Just the output from the command, like:

# /opt/couchbase/bin/cbcollect_info /tmp/output_file.zip
uname (uname -a) - OK
time and TZ (date; date -u) - OK
raw /etc/sysconfig/clock (cat /etc/sysconfig/clock) - Exit code 1
raw /etc/timezone (cat /etc/timezone) - OK
System Hardware (lshw -json || lshw) - OK
Process list snapshot (export TERM=''; top -Hb -n1 || top -H n1) - OK
Process list  (ps -AwwL -o user,pid,lwp,ppid,nlwp,pcpu,maj_flt,min_flt,pri,nice,vsize,rss,tty,stat,wchan:12,start,bsdtime,command) - OK
Raw /proc/vmstat (cat /proc/vmstat) - OK
Raw /proc/mounts (cat /proc/mounts) - OK
Raw /proc/partitions (cat /proc/partitions) - OK
Raw /proc/diskstats (cat /proc/diskstats) - OK
Raw /proc/interrupts (cat /proc/interrupts) - OK
Swap configuration (free -t) - OK
Swap configuration (swapon -s) - OK
Kernel modules (lsmod) - OK

Or data from the zip file?

 Archive:  output_file.zip
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/couchbase.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.xdcr.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.couchdb.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/stats.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ini.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.error.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.ssl_proxy.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.views.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.info.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.xdcr_errors.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.mapreduce_errors.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/diag.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.xdcr_trace.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.http_access.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/syslog.tar.gz  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.debug.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ddocs.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.reports.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/memcached.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.babysitter.log  
  inflating: cbcollect_info_ns_1@10.30.27.2_20150128-082436/ns_server.stats.log

daschl · January 28, 2015, 8:37am

Btw here is a guide on RHEL perf tuning, maybe you can adapt some stuff to ubuntu/debian: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/Performance_Tuning_Guide/index.html

asingh · January 28, 2015, 8:51am

@rudijs will need the generated zip file.

rudijs · January 28, 2015, 10:53am

@asingh It’s a 45MB file, should I attach it here? Link to a URL is better I guess?

It’s uploading to dropbox now, I’ll PM you the url shortly.

Many thanks!

asingh · January 28, 2015, 12:27pm

@rudijs the jump in disk ops is related to the compaction operation (required because we use an append-only file format). The attached graph captures disk queue items:

Relevant log snippets:

[ns_server:debug,2015-01-28T2:06:39.287,ns_1@10.30.27.2:<0.553.0>:mc_connection:adjust_couch_db_version:88]No db open in couchdb for this vbucket. Delete(/opt/couchbase/var/lib/couchbase/data/default/36.couch.6) = ok
[ns_server:info,2015-01-28T2:06:39.288,ns_1@10.30.27.2:<0.13150.38>:compaction_new_daemon:maybe_compact_vbucket:745]Compaction of <<"default/36">> has finished with ok

[ns_server:info,2015-01-28T3:50:44.451,ns_1@10.30.27.2:<0.21552.39>:compaction_new_daemon:maybe_compact_vbucket:743]Compacting `default/98' ({1422157827,0,false})
[ns_server:debug,2015-01-28T3:50:44.854,ns_1@10.30.27.2:<0.553.0>:mc_connection:adjust_couch_db_version:88]No db open in couchdb for this vbucket. Delete(/opt/couchbase/var/lib/couchbase/data/default/91.couch.6) = ok
[ns_server:info,2015-01-28T3:50:44.856,ns_1@10.30.27.2:<0.21523.39>:compaction_new_daemon:maybe_compact_vbucket:745]Compaction of <<"default/91">> has finished with ok

Typically we don’t expect a huge jump in CPU usage during this period; you might want to check the RAID controllers.

rudijs · January 28, 2015, 1:10pm

@asingh OK thanks. I’m following up further with our hosting provider.

We’re using bare metal in a datacenter in Amsterdam for this cluster.

I’ll be sure to report back my findings for posterity.

rudijs · January 30, 2015, 3:22am

Hi,

Still having problems with this, I think continuing to detail it here will help me out.

The problem has now narrowed down to disk i/o, not performance in general.

The new two node Couchbase cluster is unable to handle half the traffic the old cluster was doing.

Old = Ubuntu 11.04 with Couchbase Version: 1.8.1 community edition (build-937)

New = Ubuntu 14.04.1 LTS with Version: 3.0.0 Community Edition (build-1209)

Perhaps using only the supported Ubuntu 12.04 might be a wise move?

Anyway, I’ve tried to set up the new cluster the same as the old.

I note the use of the default bucket, which is not best practice, but there’s already application code wrapped around it that I don’t want to mess with at this time.

The problem is high load avg (5+) as a result of 100% disk i/o.

couchbase-server is the only custom service running on the box; stopping it makes the high load and disk i/o drop to nothing.

Looking at the two clusters, here are some current stats.

About half the load has been transferred from the old cluster to the new but the transfer process needs to stop until the disk i/o issue is sorted.

All the machines have similar disk layout, using software raid (mdadm) with a single / partition.

Here’s some tech details:

Old Cluster)

Version: 1.8.1 community edition (build-937)

iotop output:

 309 be/3 root        0.00 B/s   11.76 K/s  0.00 % 33.52 % [jbd2/md2-8]
1704 be/4 couchbas    0.00 B/s   70.58 K/s  0.00 %  0.37 % memcached -X /opt/couchbase/lib/memcached/stdin_term_handler.so -l 0~ 10000 -e admin=_admin;default_bucket_name=default;auto_create=false
1396 be/4 couchbas    0.00 B/s   78.42 K/s  0.00 %  0.00 % beam.smp -A 16 -- -root /opt/couchbase/lib/erlang -progname erl -- -~ookiefile "/opt/couchbase/var/lib/couchbase/couchbase-server.cookie"
   1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init

couchbase

Bucket Name   Ops/sec   Disk Fetches/sec   RAM Used   Item Count
default       133       0                  59.2%      383784

memcache

Bucket Name   Ops/sec   Hit ratio   RAM Used   Item Count
query-mlt     40        94.7%       44.9%      137599
query-mxs     103       100%        66.6%      218946

New Cluster)

Version: 3.0.0 Community Edition (build-1209)

iotop output:

 254 be/3 root        0.00 B/s   81.98 K/s  0.00 % 99.99 % [jbd2/md2-8]  
5752 be/4 couchbas    0.00 B/s   19.52 K/s  0.00 %  0.48 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json
5753 be/4 couchbas    0.00 B/s   19.52 K/s  0.00 %  0.45 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json
5755 be/4 couchbas    0.00 B/s   19.52 K/s  0.00 %  0.39 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json
5754 be/4 couchbas    0.00 B/s   19.52 K/s  0.00 %  0.37 % memcached -C /opt/couchbase/var/lib/couchbase/config/memcached.json
5632 be/4 couchbas    0.00 B/s    0.00 B/s  0.00 %  0.00 % beam.smp -A 16 -- -root /opt/couchbase/lib/erlang -progname erl -- -~/couchbase-server.cookie-ns-server" -ns_server enable_mlockall false
    1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init

couchbase

Bucket Name   Item Count   Ops/sec   Disk Fetches/sec   RAM/Quota Usage   Data/Disk Usage
default       207717       25        0                  126MB/4GB         187MB/247MB

memcache

Bucket Name   Item Count   Ops/sec   Hit Ratio   RAM/Quota Usage
query-lt      269318       70        56%         1.04GB/4GB
query-xs      350134       53        91.3%       1.34GB/4GB

The only thing that jumps out, apart from the 99.99% disk i/o, is the four memcached processes.

I guess this must be a general difference between v1.8 and v3.

Any community input, suggestions or help would be greatly appreciated.

Thanks.

rudijs · January 30, 2015, 3:32am

Those spikes appear 2 hours apart though; disk i/o is pretty much a constant 99.99%.

It’s not so simple to track down where exactly the disk i/o is - ie. which file.

/opt/couchbase is not growing greatly in size.

asingh · January 30, 2015, 3:50am

Compaction is triggered depending on the settings you configured in the Admin console; typically it’s set to compact when fragmentation reaches ~30%. So compaction will trigger when you reach that threshold, which may take minutes, hours or more depending on your workload.

“/opt/couchbase is not growing greatly in size.”

It’s the job of the compaction process to keep overall disk usage under control (obviously it depends on the settings you have configured). You might want to read up on how compaction actually works: see “Compacting data files” in the documentation.
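
To make the ~30% threshold concrete, here is a minimal sketch that estimates fragmentation from the Data/Disk Usage figures reported earlier in the thread. The formula (disk size minus data size, divided by disk size) is an assumption about how fragmentation is derived, and the function names are hypothetical. The real check may run per vBucket file rather than per bucket, so treat the result as a rough indication only.

```python
def fragmentation_pct(data_mb, disk_mb):
    """Estimate fragmentation as the share of the file not holding live data."""
    if disk_mb <= 0:
        raise ValueError("disk_mb must be positive")
    return (disk_mb - data_mb) / disk_mb * 100.0


def should_compact(data_mb, disk_mb, threshold_pct=30.0):
    """Return True once estimated fragmentation reaches the threshold."""
    return fragmentation_pct(data_mb, disk_mb) >= threshold_pct


if __name__ == "__main__":
    # Data/Disk Usage figures for the default bucket quoted in the thread.
    for data_mb, disk_mb in [(145, 152), (187, 247)]:
        pct = fragmentation_pct(data_mb, disk_mb)
        print(f"{data_mb}MB/{disk_mb}MB -> {pct:.1f}% "
              f"compact={should_compact(data_mb, disk_mb)}")
```

Under this assumption, both snapshots sit below a 30% threshold, which fits with compaction showing up as periodic spikes rather than running continuously.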

rudijs · January 30, 2015, 4:11am

I’ve read up on compaction, so I understand what’s going on.

I’ve tinkered with those settings, going as high as 90%, to see if that might (temporarily) lower disk i/o.

The v1.8 couchbase cluster I’m trying to migrate away from has over 8GB in data and is very performant.

rudijs · January 30, 2015, 4:15am

In the scenario of a ‘hit counter’, is it possible to set the flush interval from RAM to disk?

rudijs · January 30, 2015, 4:36am

@asingh The v1.8 cluster has a default couchbase data size of 4.1GB, which consists of 15 files.

Looks like they rotate at 1.1GB.

/opt/couchbase/var/lib/couchbase/data# du -sh default-data/*
60K default-data/default
1.1G default-data/default-0.mb
32K default-data/default-0.mb-shm
1.8M default-data/default-0.mb-wal
1.1G default-data/default-1.mb
32K default-data/default-1.mb-shm
1.8M default-data/default-1.mb-wal
1.1G default-data/default-2.mb
32K default-data/default-2.mb-shm
1.8M default-data/default-2.mb-wal
1.1G default-data/default-3.mb
32K default-data/default-3.mb-shm
1.9M default-data/default-3.mb-wal
32K default-data/default-shm
1.1M default-data/default-wal

The v3.0 default database is currently only 120MB, but it consists of 1,036 files.

Example:

/opt/couchbase/var/lib/couchbase/data# du -sh default/*
124K default/1000.couch.50
64K default/1001.couch.51
64K default/1002.couch.39
52K default/1003.couch.47
112K default/1004.couch.36
116K default/1005.couch.39
64K default/1006.couch.56
60K default/1007.couch.49
120K default/1008.couch.41
…
…
…
68K default/9.couch.35
3.6M default/access.log.0
3.5M default/access.log.0.old
3.6M default/access.log.1
3.5M default/access.log.1.old
3.6M default/access.log.2
3.5M default/access.log.2.old
3.5M default/access.log.3
3.5M default/access.log.3.old
4.0K default/flushseq
4.0K default/master.couch.1
1.2M default/stats.json
1.2M default/stats.json.old

I am very new to Couchbase, but it appears the way the data is handled differs between versions.

v3.0 uses lots of smaller files and compacts more often.

Under high load, this would lead to much more disk i/o.

If that’s all correct, I’m fine with that; we can put /opt/couchbase on its own SSD disks (currently it’s all on SATA ‘/’, same as v1.8).

What are your thoughts on that?
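
The file count above can be broken down by name pattern. The sketch below classifies file names from a bucket directory into per-vBucket data files (`N.couch.REV`), access logs, and everything else. The sample names are taken from the listing above; the category labels and function names are made up for illustration. If each node holds 1,024 vBucket files (an assumption: 512 active plus 512 replica with two nodes and one replica), the 12 other files visible in the listing would account for the 1,036 total.

```python
import re
from collections import Counter

# Per-vBucket data files look like "<vbucket id>.couch.<revision>".
VBUCKET_FILE = re.compile(r"^(\d+)\.couch\.(\d+)$")


def classify(name):
    """Classify a file name from a bucket data directory."""
    if VBUCKET_FILE.match(name):
        return "vbucket"
    if name.startswith("access.log"):
        return "access_log"
    return "other"


def summarize(names):
    """Count files per category."""
    return Counter(classify(name) for name in names)


if __name__ == "__main__":
    # A few names copied from the listing in the post.
    sample = [
        "1000.couch.50", "9.couch.35",
        "access.log.0", "access.log.0.old",
        "flushseq", "master.couch.1", "stats.json",
    ]
    print(summarize(sample))
```

On a live node, passing `os.listdir()` of the bucket directory to `summarize` would give the real breakdown.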

rudijs · January 30, 2015, 3:48pm

Moving the Couchbase data directory to a dedicated SSD sorted this problem.

It was pretty easy: hot-swap in the new disk, format it and add it.

Remove a node from the cluster, re-add it using the dedicated SSD for data, then rebalance.

Repeat the process for the 2nd node.

All good now.
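
After a move like this, a quick sanity check is to confirm that the data directory really sits on a different device than the root filesystem. The sketch below compares device IDs from `os.stat`. The data path shown is the one that appears earlier in the thread and is only an assumption about where the SSD ended up mounted; the function name is hypothetical.

```python
import os


def on_separate_device(path_a, path_b):
    """Return True if the two paths live on different block devices."""
    return os.stat(path_a).st_dev != os.stat(path_b).st_dev


if __name__ == "__main__":
    # Assumed location; adjust to wherever the SSD-backed data path is.
    data_dir = "/opt/couchbase/var/lib/couchbase/data"
    if os.path.isdir(data_dir):
        print("separate device:", on_separate_device(data_dir, "/"))
    else:
        print("data directory not found:", data_dir)
```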

kirk · January 19, 2016, 9:26pm

Might I also suggest watching this from Couchbase Connect 2015 http://connect15.couchbase.com/agenda/tuning-couchbase-server-os-network-maximum-performance/