couch_view_group_compactor crashes with core dump

#1

We are seeing compaction crash every time and generate core files.

Here is one core file opened in GDB:

gdb -c core.30293

GNU gdb (GDB) Amazon Linux (7.6.1-51.24.amzn1)
Copyright © 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-amazon-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
[New LWP 30293]
[New LWP 30294]
[New LWP 30350]
[New LWP 30353]
[New LWP 30351]
[New LWP 30352]
Missing separate debuginfo for the main executable file
Try: yum --enablerepo='debug' install /usr/lib/debug/.build-id/cf/8ad2e2f211f1221fec56fcdb60d1975166d167
Core was generated by `/opt/couchbase/bin/couch_view_group_compactor'.
Program terminated with signal 6, Aborted.
#0 0x00007f62eaaa4be9 in ?? ()

Here is the log from /var/log/messages:

Aug 4 07:07:52 ip-172-31-23-150 kernel: [3234847.720645] [29953] 220 29953 2121411 1143942 3857 808611 0 couch_view_grou
Aug 4 07:07:52 ip-172-31-23-150 kernel: [3234847.720656] Out of memory: Kill process 29953 (couch_view_grou) score 658 or sacrifice child
Aug 4 07:07:52 ip-172-31-23-150 kernel: [3234847.720668] Killed process 29953 (couch_view_grou) total-vm:8485644kB, anon-rss:4575768kB, file-rss:0kB

Is there any workaround for this?
Your help is appreciated.

#2

Hi @kbaswaraj,
I’m sorry you are having problems. To make it easier for us to help you, could you run cbcollect and share the logs with me?

http://docs.couchbase.com/admin/admin/CLI/cbcollect-cluster-wide-info.html

Thanks
Martin

#3

It looks like Linux has run out of RAM (OOM) and hence had to kill something to be able to continue operating.

This could simply be a sizing problem: how much RAM do your nodes have, and what have you set the Server Quota to? Could you also post the preceding lines of /var/log/messages, the ones listing the sizes of the various processes when the OOM killer was invoked?
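To read the sizes in the kernel lines quoted above, here is a minimal Python sketch. It assumes 4 KiB pages (the usual x86_64 default) and the standard OOM process-table column order (pid, uid, tgid, total_vm, rss, nr_ptes, swapents, oom_score_adj, name); both are assumptions about this kernel rather than something confirmed in the thread, and the function names are only illustrative.

```python
import re

PAGE_KB = 4  # assumption: 4 KiB pages

# Matches the "Killed process ..." summary line written by the OOM killer.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)

def kb_to_gib(kb):
    """Convert kilobytes (as printed by the kernel) to GiB."""
    return kb / (1024 ** 2)

def parse_killed(line):
    """Return pid, name and sizes in GiB from a 'Killed process' line, or None."""
    m = KILLED_RE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "name": m["name"],
        "total_vm_gib": kb_to_gib(int(m["total_vm"])),
        "anon_rss_gib": kb_to_gib(int(m["anon_rss"])),
    }

killed_line = (
    "Aug 4 07:07:52 ip-172-31-23-150 kernel: [3234847.720668] "
    "Killed process 29953 (couch_view_grou) "
    "total-vm:8485644kB, anon-rss:4575768kB, file-rss:0kB"
)
info = parse_killed(killed_line)
print(info)

# Process-table line: total_vm, rss and swapents are page counts
# (assumed column order, see above).
total_vm_pages, rss_pages, swap_pages = 2121411, 1143942, 808611
print("total_vm kB:", total_vm_pages * PAGE_KB)
print("rss kB:", rss_pages * PAGE_KB)
print("swapped GiB:", round(kb_to_gib(swap_pages * PAGE_KB), 2))
```

Under those assumptions the page counts in the table line multiply out to exactly the kB figures in the "Killed process" line, which suggests the compactor held roughly 4.4 GiB resident (and about 3 GiB in swap) when it was killed.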

#4

Thanks @drigby for the information.
We have 4 nodes, each with 8 GB RAM, of which 2 GB per node is allocated to the cluster quota (8 GB total).

The following are the only lines for this process in /var/log/messages:

Aug 4 07:07:52 ip-172-31-23-150 kernel: [3234847.720645] [29953] 220 29953 2121411 1143942 3857 808611 0 couch_view_grou
Aug 4 07:07:52 ip-172-31-23-150 kernel: [3234847.720656] Out of memory: Kill process 29953 (couch_view_grou) score 658 or sacrifice child
Aug 4 07:07:52 ip-172-31-23-150 kernel: [3234847.720668] Killed process 29953 (couch_view_grou) total-vm:8485644kB, anon-rss:4575768kB, file-rss:0kB

Also, if this is a sizing problem, are there sizing guidelines describing the correct parameters and configuration for a 4-node cluster? We use views very heavily: I would say we fetch data through views all the time, and the views are a few hundred times larger than the actual data. What would be the ideal RAM and disk for such a use case so that Couchbase functions smoothly?

#5

Thanks @martinesmann.
Could you please let me know which logs you need?
cbcollect_info generated a 140 MB zip file, which I am not able to upload.
Please let me know which files you need and I will upload only those.
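
In case it helps with picking files, here is a rough Python sketch that lists the largest members of a zip archive by compressed size, so the big ones can be left out or sent separately. The archive name in the usage comment is hypothetical, and nothing here assumes particular file names inside the cbcollect_info output.

```python
import zipfile

def largest_members(zip_path, top=10):
    """Return (filename, compressed_size_bytes) for the largest files in a zip."""
    with zipfile.ZipFile(zip_path) as zf:
        # Skip directory entries; they carry no data.
        infos = [i for i in zf.infolist() if not i.is_dir()]
    infos.sort(key=lambda i: i.compress_size, reverse=True)
    return [(i.filename, i.compress_size) for i in infos[:top]]

# Usage (hypothetical archive name):
# for name, size in largest_members("collectinfo.zip"):
#     print(f"{size / 1024 ** 2:8.1f} MiB  {name}")
```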