FTS restarting / failing (core dumped)


#1

I am seeing messages like this from time to time when we run FTS on our buckets. It has occurred on our larger buckets, those with 2m+ items.

It’s now happening every 20-40 minutes while building a new FTS index. We get to about 97%, then it restarts and drops back down to 65%. This seems to be in a loop.

Is there something I should be doing differently?

This is a sample of the logged event in the UI.

Service ‘fts’ exited with status 1. Restarting. Messages: 2016-08-11T13:09:22.570-05:00 [INFO] cbdatasource: server: xxx.xxx.xxx.xxx:11210, uprOpenName: fts:NIH-idx_74cf8b116862df43-160916fc, worker, looping beg, vbucketState: “running” (has 85 vbuckets), 320-340, 544-575, 864-895
2016-08-11T13:09:26.388-05:00 [INFO] cbdatasource: server: xxx.xxx.xxx.xxx:11210, uprOpenName: fts:NIH-idx_74cf8b116862df43-160916fc, worker, looping beg, vbucketState: “running” (has 76 vbuckets), 160-170, 341, 800-831, 992-1023
2016-08-11T13:09:30.381-05:00 [INFO] cbdatasource: server: xxx.xxx.xxx.xxx:11210, uprOpenName: fts:NIH-idx_74cf8b116862df43-160916fc, worker, looping beg, vbucketState: “running” (has 191 vbuckets), 0-31, 171-255, 342-383, 480-511
cbft: malloc.c:2905: __libc_malloc: Assertion `!victim || ((((mchunkptr)((char*)(victim) - 2*(sizeof(size_t)))))->size & 0x2) || ar_ptr == (((((mchunkptr)((char*)(victim) - 2*(sizeof(size_t)))))->size & 0x4) ? ((heap_info *) ((unsigned long) (((mchunkptr)((char*)(victim) - 2*(sizeof(size_t))))) & ~((2 * (4 * 1024 * 1024 * sizeof(long))) - 1)))->ar_ptr : &main_arena)' failed.
[goport] 2016/08/11 13:10:53 /opt/couchbase/bin/cbft terminated: signal: aborted (core dumped)


#2

Hey Brian -

The CBFT process appears to be crashing and restarting, and it’s probably not something you did wrong. Can you post your index definitions and the cbcollect_info output? If you don’t want to post them here, you can email them to me at will.gardella at couchbase.com.

Best,
-Will


#3

Hi Brian,

Can you grab a backtrace and email it to me?

Execute
gdb /opt/couchbase/bin/cbft /opt/couchbase/var/lib/couchbase/core

While in gdb, use the thread apply all bt command.

Save the results and send them to me by email.

It’s possible that there won’t be any output, just so you know.
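
If an interactive gdb session is inconvenient, the same backtrace can be captured in one step. This is only a sketch: collect_backtrace is a hypothetical helper name, the output file name is an assumption, and the paths are the ones given above. It relies on gdb's -batch and -ex options.

```shell
# Sketch: dump all thread backtraces from a core file without an interactive session.
# collect_backtrace is a hypothetical helper, not a Couchbase tool.
collect_backtrace() {
    bin="$1"    # path to the cbft binary
    core="$2"   # path to the core file
    out="$3"    # file to write the backtrace to

    if [ ! -r "$core" ]; then
        echo "core file not found or not readable: $core" >&2
        return 1
    fi

    # -batch makes gdb exit after running its commands;
    # -ex runs a single gdb command, here the one suggested above.
    gdb -batch -ex "thread apply all bt" "$bin" "$core" > "$out" 2>&1
}

# Example call (run as a user that can read the core file):
#   collect_backtrace /opt/couchbase/bin/cbft /opt/couchbase/var/lib/couchbase/core cbft_backtrace.txt
```

The resulting text file is what you would attach to the email. As noted, it may turn out to be empty.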

Other than that, you should set kernel swappiness to 0, as in the guidelines here: http://developer.couchbase.com/documentation/server/4.5/install/install-swap-space.html

That’s just something you want to do on Linux, not related to your particular issue.
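
For reference, a minimal sketch of checking and changing the setting. check_swappiness is a hypothetical helper (its optional file argument is there only so it can be tried against a test file), and the commented commands assume a standard Linux setup with sysctl and /etc/sysctl.conf. The linked guidelines remain the authoritative steps.

```shell
# Sketch: report the current swappiness value.
# check_swappiness is a hypothetical helper, not part of Couchbase.
check_swappiness() {
    file="${1:-/proc/sys/vm/swappiness}"   # kernel exposes the live value here
    value="$(cat "$file")"
    if [ "$value" -eq 0 ]; then
        echo "swappiness is 0"
    else
        echo "swappiness is $value, expected 0"
    fi
}

# To change it (requires root):
#   sudo sysctl -w vm.swappiness=0                            # takes effect immediately
#   echo "vm.swappiness = 0" | sudo tee -a /etc/sysctl.conf   # persists across reboots
```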
-Will


#4

Hi Brian, I have a theory about what this is, but I need to get some time from the engineering team to confirm. I’m hoping it’s a known issue that’s been fixed in 4.5.1.
Best, -Will


#5

I just installed 4.5.1. I’ll let you know how it goes.