Couchbase 4.0.0: load average periodically high but CPU percentage low


#1

Hi all,
OS : CentOS release 6.7 (Final)
Version: couchbase-server-community-4.0.0-4051.x86_64
Cluster: 3 x c3.xlarge (14 ECU, 4 vCPU, 2.8 GHz, 7.5 GiB Memory)
Using Views (including spatial): Yes
Using Indexes for N1QL: No

Couchbase is using the default settings, for example:
Data RAM Quota: 4714 MB (min 4714 MB)
Index RAM Quota: 256 MB (min 256 MB)
Indexer Threads: 4

Here is the Zabbix monitor result:

Here is the log from when the load was high. I logged in to one of the instances and checked:

$ uptime
03:38:09 up 3 days, 18:26, 1 user, load average: 5.48, 2.47, 1.49

$ mpstat -P ALL 1
Linux 2.6.32-642.1.1.el6.x86_64 (xxxx23) 06/15/2016 _x86_64_ (4 CPU)

03:38:20 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:38:21 AM all 7.07 0.00 1.52 0.25 0.00 0.00 0.00 0.00 91.16
03:38:21 AM 0 7.07 0.00 2.02 0.00 0.00 0.00 0.00 0.00 90.91
03:38:21 AM 1 4.04 0.00 1.01 0.00 0.00 0.00 0.00 0.00 94.95
03:38:21 AM 2 5.10 0.00 2.04 0.00 0.00 0.00 0.00 0.00 92.86
03:38:21 AM 3 12.12 0.00 1.01 0.00 0.00 0.00 0.00 0.00 86.87

03:38:21 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:38:22 AM all 4.03 0.00 1.76 0.00 0.00 0.25 0.00 0.00 93.95
03:38:22 AM 0 6.06 0.00 2.02 0.00 0.00 0.00 0.00 0.00 91.92
03:38:22 AM 1 2.06 0.00 0.00 0.00 0.00 0.00 0.00 0.00 97.94
03:38:22 AM 2 5.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 91.00
03:38:22 AM 3 3.03 0.00 1.01 0.00 0.00 0.00 0.00 0.00 95.96

03:38:22 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:38:23 AM all 2.51 0.00 0.50 0.00 0.00 0.00 0.00 0.00 96.98
03:38:23 AM 0 3.96 0.00 0.99 0.99 0.00 0.00 0.00 0.00 94.06
03:38:23 AM 1 1.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 98.99
03:38:23 AM 2 2.97 0.00 0.99 0.00 0.00 0.00 0.00 0.00 96.04
03:38:23 AM 3 2.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 97.98

03:38:23 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:38:24 AM all 4.04 0.00 1.26 0.00 0.00 0.00 0.00 0.00 94.70
03:38:24 AM 0 7.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 91.00
03:38:24 AM 1 2.00 0.00 1.00 0.00 0.00 1.00 0.00 0.00 96.00
03:38:24 AM 2 5.15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 94.85
03:38:24 AM 3 3.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 96.00

03:38:24 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:38:25 AM all 2.78 0.00 0.25 0.00 0.00 0.00 0.00 0.00 96.97
03:38:25 AM 0 3.03 0.00 1.01 0.00 0.00 0.00 0.00 0.00 95.96
03:38:25 AM 1 2.02 0.00 1.01 0.00 0.00 0.00 0.00 0.00 96.97
03:38:25 AM 2 4.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 95.00
03:38:25 AM 3 2.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 97.98

03:38:25 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:38:26 AM all 3.53 0.00 1.01 0.25 0.00 0.00 0.25 0.00 94.96
03:38:26 AM 0 6.06 0.00 2.02 0.00 0.00 0.00 0.00 0.00 91.92
03:38:26 AM 1 2.04 0.00 0.00 0.00 0.00 0.00 0.00 0.00 97.96
03:38:26 AM 2 3.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 96.00
03:38:26 AM 3 3.06 0.00 0.00 0.00 0.00 0.00 0.00 0.00 96.94

03:38:26 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:38:27 AM all 2.53 0.00 0.76 0.00 0.00 0.00 0.00 0.00 96.71
03:38:27 AM 0 2.06 0.00 1.03 0.00 0.00 0.00 0.00 0.00 96.91
03:38:27 AM 1 3.06 0.00 0.00 0.00 0.00 0.00 0.00 0.00 96.94
03:38:27 AM 2 3.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 96.00
03:38:27 AM 3 2.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 97.00
^C

$ pidstat 1
Linux 2.6.32-642.1.1.el6.x86_64 (xxxx23) 06/15/2016 _x86_64_ (4 CPU)

03:38:30 AM PID %usr %system %guest %CPU CPU Command
03:38:31 AM 1536 71.29 0.99 0.00 72.28 2 beam.smp
03:38:31 AM 1615 9.90 0.00 0.00 9.90 0 memcached
03:38:31 AM 1791 0.99 0.00 0.00 0.99 0 goport
03:38:31 AM 27946 0.00 0.99 0.00 0.99 2 pidstat

03:38:31 AM PID %usr %system %guest %CPU CPU Command
03:38:32 AM 1536 8.00 2.00 0.00 10.00 2 beam.smp
03:38:32 AM 1615 3.00 0.00 0.00 3.00 0 memcached
03:38:32 AM 1684 4.00 0.00 0.00 4.00 3 beam.smp
03:38:32 AM 1729 1.00 0.00 0.00 1.00 0 goxdcr

03:38:32 AM PID %usr %system %guest %CPU CPU Command
03:38:33 AM 1536 8.00 2.00 0.00 10.00 2 beam.smp
03:38:33 AM 1615 2.00 1.00 0.00 3.00 0 memcached
03:38:33 AM 1727 0.00 1.00 0.00 1.00 3 goport
03:38:33 AM 27946 0.00 1.00 0.00 1.00 2 pidstat

03:38:33 AM PID %usr %system %guest %CPU CPU Command
03:38:34 AM 1536 12.00 4.00 0.00 16.00 2 beam.smp
03:38:34 AM 1615 1.00 0.00 0.00 1.00 0 memcached
03:38:34 AM 1684 1.00 1.00 0.00 2.00 3 beam.smp
03:38:34 AM 1716 0.00 1.00 0.00 1.00 0 godu
03:38:34 AM 1793 3.00 0.00 0.00 3.00 2 indexer
03:38:34 AM 1808 1.00 0.00 0.00 1.00 1 goport
03:38:34 AM 27946 1.00 0.00 0.00 1.00 2 pidstat

03:38:34 AM PID %usr %system %guest %CPU CPU Command
03:38:35 AM 77 0.00 1.00 0.00 1.00 0 khugepaged
03:38:35 AM 1536 6.00 2.00 0.00 8.00 2 beam.smp
03:38:35 AM 1615 4.00 1.00 0.00 5.00 0 memcached
03:38:35 AM 1684 10.00 2.00 0.00 12.00 3 beam.smp
03:38:35 AM 1716 1.00 0.00 0.00 1.00 0 godu
03:38:35 AM 27946 0.00 1.00 0.00 1.00 2 pidstat

03:38:35 AM PID %usr %system %guest %CPU CPU Command
03:38:36 AM 13 0.00 1.00 0.00 1.00 2 ksoftirqd/2
03:38:36 AM 1536 11.00 5.00 0.00 16.00 2 beam.smp
03:38:36 AM 1615 3.00 0.00 0.00 3.00 0 memcached
03:38:36 AM 1684 4.00 1.00 0.00 5.00 3 beam.smp

03:38:36 AM PID %usr %system %guest %CPU CPU Command
03:38:37 AM 1536 8.00 2.00 0.00 10.00 2 beam.smp
03:38:37 AM 1615 2.00 0.00 0.00 2.00 0 memcached
03:38:37 AM 1684 2.00 1.00 0.00 3.00 3 beam.smp
03:38:37 AM 1729 1.00 0.00 0.00 1.00 3 goxdcr
03:38:37 AM 1810 1.00 0.00 0.00 1.00 2 cbq-engine
03:38:37 AM 27946 0.00 1.00 0.00 1.00 2 pidstat

03:38:37 AM PID %usr %system %guest %CPU CPU Command
03:38:38 AM 1536 8.00 2.00 0.00 10.00 2 beam.smp
03:38:38 AM 1615 4.00 1.00 0.00 5.00 0 memcached
03:38:38 AM 1684 5.00 2.00 0.00 7.00 3 beam.smp
03:38:38 AM 27946 1.00 0.00 0.00 1.00 0 pidstat
^C

$ uptime
03:38:41 up 3 days, 18:26, 1 user, load average: 4.68, 2.56, 1.55

$ sar -n DEV 1
Linux 2.6.32-642.1.1.el6.x86_64 (xxxx23) 06/15/2016 _x86_64_ (4 CPU)

03:38:49 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
03:38:50 AM lo 455.10 455.10 556.61 556.61 0.00 0.00 0.00
03:38:50 AM eth0 414.29 350.00 76.89 163.30 0.00 0.00 0.00

03:38:50 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
03:38:51 AM lo 357.14 357.14 413.89 413.89 0.00 0.00 0.00
03:38:51 AM eth0 309.18 261.22 92.02 151.82 0.00 0.00 0.00

03:38:51 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
03:38:52 AM lo 338.38 338.38 329.63 329.63 0.00 0.00 0.00
03:38:52 AM eth0 377.78 313.13 53.96 176.99 0.00 0.00 0.00

03:38:52 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
03:38:53 AM lo 240.40 240.40 173.63 173.63 0.00 0.00 0.00
03:38:53 AM eth0 298.99 243.43 26.31 99.06 0.00 0.00 0.00

03:38:53 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
03:38:54 AM lo 317.35 317.35 217.77 217.77 0.00 0.00 0.00
03:38:54 AM eth0 238.78 206.12 25.06 108.86 0.00 0.00 0.00

03:38:54 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
03:38:55 AM lo 329.70 329.70 279.95 279.95 0.00 0.00 0.00
03:38:55 AM eth0 63.37 55.45 5.48 37.29 0.00 0.00 0.00
^C

$ sar -n TCP,ETCP 1
Linux 2.6.32-642.1.1.el6.x86_64 (xxxx23) 06/15/2016 _x86_64_ (4 CPU)

03:39:12 AM active/s passive/s iseg/s oseg/s
03:39:13 AM 4.04 16.16 345.45 322.22
03:39:12 AM atmptf/s estres/s retrans/s isegerr/s orsts/s
03:39:13 AM 0.00 0.00 0.00 0.00 0.00

03:39:13 AM active/s passive/s iseg/s oseg/s
03:39:14 AM 1.03 53.61 936.08 824.74
03:39:13 AM atmptf/s estres/s retrans/s isegerr/s orsts/s
03:39:14 AM 0.00 0.00 0.00 0.00 0.00

03:39:14 AM active/s passive/s iseg/s oseg/s
03:39:15 AM 7.22 20.62 587.63 555.67
03:39:14 AM atmptf/s estres/s retrans/s isegerr/s orsts/s
03:39:15 AM 0.00 2.06 0.00 0.00 1.03

03:39:15 AM active/s passive/s iseg/s oseg/s
03:39:16 AM 1.02 29.59 656.12 590.82
03:39:15 AM atmptf/s estres/s retrans/s isegerr/s orsts/s
03:39:16 AM 0.00 1.02 0.00 0.00 1.02

03:39:16 AM active/s passive/s iseg/s oseg/s
03:39:17 AM 2.02 12.12 817.17 797.98
03:39:16 AM atmptf/s estres/s retrans/s isegerr/s orsts/s
03:39:17 AM 0.00 0.00 0.00 0.00 0.00
^C

$ uptime
03:39:20 up 3 days, 18:27, 1 user, load average: 4.11, 2.69, 1.63

$ w
03:39:21 up 3 days, 18:27, 1 user, load average: 4.11, 2.69, 1.63
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
xxxuser pts/0 xxxip 03:37 0.00s 0.00s 0.00s w

$ iostat -xz 1
Linux 2.6.32-642.1.1.el6.x86_64 (xxxx23) 06/15/2016 _x86_64_ (4 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
5.55 0.00 1.36 0.16 0.04 92.89

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.01 0.13 0.03 0.12 1.11 1.99 21.95 0.00 2.26 0.59 2.62 0.36 0.01
xvdb 0.00 106.17 0.00 25.48 0.01 1053.17 41.33 0.07 2.60 0.34 2.60 0.50 1.27
dm-0 0.00 0.00 0.03 0.25 1.09 1.99 11.14 0.01 28.11 0.70 31.10 0.18 0.01
dm-1 0.00 0.00 0.00 0.00 0.01 0.00 8.00 0.00 0.34 0.34 0.00 0.27 0.00

avg-cpu: %user %nice %system %iowait %steal %idle
1.77 0.00 0.51 0.00 0.00 97.72

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 0.00 1.00 0.00 16.00 0.00 16.00 0.00 0.00 0.00 0.00 0.00 0.00
xvdb 0.00 23.00 0.00 3.00 0.00 208.00 69.33 0.00 0.67 0.00 0.67 0.67 0.20

avg-cpu: %user %nice %system %iowait %steal %idle
2.27 0.00 0.76 0.00 0.00 96.98

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util

avg-cpu: %user %nice %system %iowait %steal %idle
3.02 0.00 1.01 0.00 0.00 95.97

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdb 0.00 908.00 0.00 46.00 0.00 7632.00 165.91 0.38 8.17 0.00 8.17 0.48 2.20

avg-cpu: %user %nice %system %iowait %steal %idle
5.54 0.00 1.26 0.00 0.00 93.20

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util

avg-cpu: %user %nice %system %iowait %steal %idle
2.53 0.00 0.76 0.00 0.00 96.72

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util

avg-cpu: %user %nice %system %iowait %steal %idle
2.01 0.00 0.75 0.25 0.00 96.99

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvda 0.00 1.00 0.00 3.00 0.00 32.00 10.67 0.00 0.33 0.00 0.33 0.33 0.10
dm-0 0.00 0.00 0.00 4.00 0.00 32.00 8.00 0.00 0.25 0.00 0.25 0.25 0.10

avg-cpu: %user %nice %system %iowait %steal %idle
2.78 0.00 0.51 0.00 0.00 96.72

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util
^C

$ w
03:39:41 up 3 days, 18:27, 1 user, load average: 3.73, 2.69, 1.66
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
xxxuser pts/0 xxxip 03:37 0.00s 0.01s 0.00s w

$ free -m
total used free shared buffers cached
Mem: 7362 4291 3071 0 235 1419
-/+ buffers/cache: 2637 4725
Swap: 1983 0 1983

$ mpstat -P ALL 1
Linux 2.6.32-642.1.1.el6.x86_64 (xxxx23) 06/15/2016 _x86_64_ (4 CPU)

03:39:53 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:39:54 AM all 2.27 0.00 0.50 0.00 0.00 0.00 0.00 0.00 97.23
03:39:54 AM 0 3.03 0.00 1.01 0.00 0.00 0.00 0.00 0.00 95.96
03:39:54 AM 1 1.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 98.99
03:39:54 AM 2 2.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 97.98
03:39:54 AM 3 2.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 97.98

03:39:54 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:39:55 AM all 2.51 0.00 1.01 0.00 0.00 0.00 0.25 0.00 96.23
03:39:55 AM 0 2.97 0.00 0.99 0.00 0.00 0.00 0.00 0.00 96.04
03:39:55 AM 1 3.03 0.00 1.01 0.00 0.00 0.00 0.00 0.00 95.96
03:39:55 AM 2 3.03 0.00 1.01 0.00 0.00 0.00 0.00 0.00 95.96
03:39:55 AM 3 3.03 0.00 1.01 0.00 0.00 0.00 0.00 0.00 95.96

03:39:55 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:39:56 AM all 4.30 0.00 1.01 0.00 0.00 0.00 0.00 0.00 94.68
03:39:56 AM 0 3.06 0.00 1.02 0.00 0.00 0.00 0.00 0.00 95.92
03:39:56 AM 1 6.12 0.00 1.02 0.00 0.00 0.00 1.02 0.00 91.84
03:39:56 AM 2 4.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 95.00
03:39:56 AM 3 3.03 0.00 1.01 0.00 0.00 0.00 0.00 0.00 95.96
^C

$ w
03:39:59 up 3 days, 18:28, 1 user, load average: 3.56, 2.73, 1.69
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
xxxuser pts/0 xxxip 03:37 0.00s 0.01s 0.00s w
$

So, how can I fix this problem?


#2

Hi,

Thank you for using CB 4.0.
Can you please give us more details about what the problem is and where you need help?

Based on the stats, I can see the following:
The CPU spikes from time to time but remains constant and low most of the time.

Without knowing more of the background, here’s my guess:
The CPU spikes either during compaction or during view query execution or other periods of high workload.
Can you please share more information about the background workload here?

Do you have a constant high workload accessing the system?
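
One more check that may help here: on Linux the load average counts tasks that are runnable (state R) and tasks in uninterruptible sleep (state D), so it can rise while CPU usage stays low. Below is a minimal Python sketch, assuming a Linux /proc filesystem, that counts threads by state during a spike. The helper names `parse_state` and `count_task_states` are made up for this example and are not part of any Couchbase tooling.

```python
import os
from collections import Counter


def parse_state(stat_line):
    """Return the one-letter task state from a /proc/<pid>/task/<tid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' rather than on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_task_states(proc_root="/proc"):
    """Count threads per state (R, S, D, ...) under proc_root."""
    counts = Counter()
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        task_dir = os.path.join(proc_root, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # the process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    counts[parse_state(f.read())] += 1
            except (OSError, IndexError):
                continue  # the thread exited or the line was unreadable
    return counts
```

Running `count_task_states()` a few times while the load is high and looking at the `"R"` and `"D"` entries should show whether the load comes from many short-lived runnable threads or from threads blocked in the kernel.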

Thanks,
Qi


#3

Hi,
There is no constant high workload accessing the system. Even when no application is accessing the system, the same problem remains…

Here is another cluster. It only has data (replicated via XDCR) and views, with no application or client accessing it.

How do I know whether the system is doing compaction? How many processes or threads perform compaction, and can I configure that?

Thanks


#4

You can observe the compaction progress by looking at the UI. If any bucket is going through compaction, it will show the compaction progress on the UI.

Compaction intervals are configured in the Settings tab --> Auto-Compaction.
You can see documentation here: http://docs.couchbase.com/admin/admin/Tasks/tasks-autocompact-strategies.html
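
If you would rather poll for this than watch the UI, here is a rough Python sketch. It assumes the cluster exposes its running tasks as a JSON list at `/pools/default/tasks`, and that compaction entries carry a `type` of `bucket_compaction` or `view_compaction` plus `bucket` and `progress` fields. Please treat the endpoint, the type names, and the field names as assumptions to verify against the REST documentation for your version. The function names and the default credentials are placeholders.

```python
import base64
import json
import urllib.request

# Assumed task type names; check them against your server's actual output.
COMPACTION_TYPES = {"bucket_compaction", "view_compaction"}


def compaction_tasks(tasks):
    """Keep only the task entries whose type looks like a compaction."""
    return [t for t in tasks if t.get("type") in COMPACTION_TYPES]


def fetch_tasks(host="localhost", user="Administrator", password="password"):
    """GET the (assumed) tasks endpoint and return the parsed JSON list."""
    req = urllib.request.Request("http://%s:8091/pools/default/tasks" % host)
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    req.add_header("Authorization", "Basic " + token.decode("ascii"))
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Calling `compaction_tasks(fetch_tasks())` every few seconds during a load spike and printing each entry's `type`, `bucket`, and `progress` would show whether a compaction is running at that moment.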

The default number of compaction threads is 3. You can change it with this REST API call:

curl -X POST -u Administrator:password http://localhost:8091/diag/eval -d 'ns_config:set(compaction_number_of_kv_workers, targetNumber).'
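
For scripting the same call, here is a minimal Python sketch that only builds the request. It mirrors the curl command above (same URL, same credentials placeholder, same `ns_config:set` expression); the function name `build_compaction_workers_request` is invented for this example.

```python
import base64
import urllib.request


def build_compaction_workers_request(workers, host="localhost",
                                     user="Administrator", password="password"):
    """Build the POST to /diag/eval that sets compaction_number_of_kv_workers."""
    if not isinstance(workers, int) or workers < 1:
        raise ValueError("workers must be a positive integer")
    # Same Erlang expression as in the curl call, with the number filled in.
    body = "ns_config:set(compaction_number_of_kv_workers, %d)." % workers
    req = urllib.request.Request(
        "http://%s:8091/diag/eval" % host,
        data=body.encode("utf-8"),
        method="POST",
    )
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    req.add_header("Authorization", "Basic " + token.decode("ascii"))
    return req
```

To actually send it, pass the result to `urllib.request.urlopen(...)` on a machine that can reach the node. Validating the number up front avoids sending the literal placeholder `targetNumber` by mistake.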

Thanks,
Qi


#6

Thanks qicb.
Now I want to know: what is the process name of the compaction threads? I have changed the compaction interval in the Settings tab and also changed the number of compaction threads via the REST API call you gave me, but the problem is not resolved.

Thanks