Index build showing inconsistent percentages and crashing

n1ql

#1

On Couchbase 4.5.0, I have observed strange behavior when building a secondary index on a bucket with 85 million documents. The build reaches about 15% (after about 8 hours), but when I check back a few hours later the percentage is back down to 2 or 3%, and it keeps going back and forth like this. I’m viewing these percentages in the Couchbase admin console. I tried to let the index finish building, but the indexer then started crashing every 0-10 minutes, printing error logs (see below) and producing core dumps that filled up my server’s disk space.

Is this a known issue? Is it due to the large bucket size? Is there a recommendation on how to proceed, or an explanation of what could have caused this? (I didn’t see this issue in my other environments.) We had to delete core files from /opt/couchbase/var/lib/couchbase/ to recover disk space.
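For anyone who needs to see how much space the dumps are taking before deleting them, here is a minimal sketch. It assumes core files have names starting with "core" and crash dumps end in ".dmp" (as in the logs below); both patterns and the function names are my own assumptions, not anything Couchbase ships.

```python
import os

def dump_files(root, prefixes=("core",), suffixes=(".dmp",)):
    """Return (path, size_in_bytes) pairs for files under root that look
    like core files or crash dumps, largest first.
    The name patterns are assumptions; adjust them to your system."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.startswith(prefixes) or name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                try:
                    found.append((path, os.path.getsize(path)))
                except OSError:
                    # File vanished or is unreadable; skip it.
                    continue
    return sorted(found, key=lambda item: item[1], reverse=True)

def total_bytes(found):
    """Sum the sizes of the files returned by dump_files."""
    return sum(size for _path, size in found)
```

Calling `dump_files("/opt/couchbase/var/lib/couchbase")` and printing the result shows which files to review; the sketch only lists files and never deletes anything.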

There are three other indexes on this server, all against different buckets: two indexes against a bucket of 28 million docs (Data/Disk Usage: 27.6GB/36.5GB) and a third against a bucket of 4K docs (Data/Disk Usage: 241MB/253MB). The index in question is against a bucket of 85 million docs (Data/Disk Usage: 90.7GB/115GB). All secondary indexes are “Standard GSI”. Database fragmentation is set at 30%; the index fragmentation write mode is also set at 30%; view fragmentation is likewise 30%; the compaction mode appears to be auto-compaction. For completeness, the metadata purge interval is 3.

Here is the query:

CREATE INDEX `idDocTypeVenueIdBeaconUUIDMajorMinor` ON `write`(`docType`,`timestamp`,`beacon.proximityUUID`,`beacon.major`,`beacon.minor`)

Error logs:

Service ‘indexer’ exited with status 1. Restarting. Messages: runtime.throw(0x1214980, 0x2a)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.6/go/src/runtime/panic.go:530 +0x90 fp=0x7fffe95c6308 sp=0x7fffe95c62f0
runtime.sigpanic()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.6/go/src/runtime/sigpanic_unix.go:12 +0x5a fp=0x7fffe95c6358 sp=0x7fffe95c6308
[goport] 2016/09/16 20:09:06 /opt/couchbase/bin/indexer terminated: signal: aborted (core dumped)

<>

Service ‘indexer’ exited with status 1. Restarting. Messages: 2016-09-17T04:54:47.282+00:00 [Info] logWriterStat:: 7669882482093328323 FlushedCount 5470000 QueuedCount 50000
2016-09-17T04:54:47.303+00:00 [Info] logReaderStat:: INIT_STREAM MutationCount 5740000
Caught SIGSEGV in libforestdb at (null)
Breakpad caught a crash in forestdb. Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/74e9a5b5-972f-8da6-03259db2-51e216c0.dmp before terminating.
[goport] 2016/09/17 04:55:09 /opt/couchbase/bin/indexer terminated: signal: segmentation fault (core dumped)

<>

Service ‘indexer’ exited with status 1. Restarting. Messages: 2016-09-17T09:47:29.681+00:00 [Info] StorageMgr::handleCreateSnapshot Skip Snapshot For INIT_STREAM write SnapType NO_SNAP
2016-09-17T09:47:29.705+00:00 [Info] logReaderStat:: INIT_STREAM MutationCount 16670000
2016-09-17T09:47:29.727+00:00 [Info] logReaderStat:: INIT_STREAM MutationCount 16680000
2016-09-17T09:47:29.752+00:00 [Info] logReaderStat:: INIT_STREAM MutationCount 16690000
[goport] 2016/09/17 09:47:49 /opt/couchbase/bin/indexer terminated: signal: segmentation fault (core dumped)

<>

Service ‘indexer’ exited with status 1. Restarting. Messages: 2016-09-17T10:20:05.542+00:00 [Info] DATP[->dataport “:9103”] DATP -> Indexer 98.320085% blocked
2016-09-17T10:20:05.584+00:00 [Info] Indexer::ReadMemstats Time Taken 2.635283ms
Caught SIGSEGV in libforestdb at (null)
Breakpad caught a crash in forestdb. Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/0fc790e6-2877-323f-1afebe3e-38e27c7b.dmp before terminating.
[goport] 2016/09/17 10:20:27 /opt/couchbase/bin/indexer terminated: signal: segmentation fault (core dumped)

<>

Service ‘indexer’ exited with status 1. Restarting. Messages: 2016-09-17T10:46:12.541+00:00 [Info] logReaderStat:: INIT_STREAM MutationCount 5630000
2016-09-17T10:46:12.908+00:00 [Info] logReaderStat:: INIT_STREAM MutationCount 5640000
Caught SIGSEGV in libforestdb at (null)
Breakpad caught a crash in forestdb. Writing crash dump to /opt/couchbase/var/lib/couchbase/crash/1ae13168-487a-2fa9-69e84da4-0985b21a.dmp before terminating.
[goport] 2016/09/17 10:46:30 /opt/couchbase/bin/indexer terminated: signal: segmentation fault (core dumped)
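The back-and-forth progress can also be seen in the logs themselves. Below is a rough sketch that pulls the `INIT_STREAM MutationCount` values out of log lines and reports where the count drops. Treating this counter as a proxy for build progress is my assumption, and the helper names are made up for illustration.

```python
import re

# Matches lines such as:
#   ... [Info] logReaderStat:: INIT_STREAM MutationCount 5740000
MUTATION_RE = re.compile(r"INIT_STREAM MutationCount (\d+)")

def mutation_counts(lines):
    """Extract MutationCount values in the order they appear."""
    counts = []
    for line in lines:
        match = MUTATION_RE.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts

def find_resets(counts):
    """Return (previous, current) pairs where the count went down,
    which would suggest the build restarted from an earlier point."""
    return [(prev, cur) for prev, cur in zip(counts, counts[1:]) if cur < prev]

# Three lines copied from the excerpts above.
SAMPLE = [
    "2016-09-17T04:54:47.303+00:00 [Info] logReaderStat:: INIT_STREAM MutationCount 5740000",
    "2016-09-17T09:47:29.752+00:00 [Info] logReaderStat:: INIT_STREAM MutationCount 16690000",
    "2016-09-17T10:46:12.908+00:00 [Info] logReaderStat:: INIT_STREAM MutationCount 5640000",
]

if __name__ == "__main__":
    for prev, cur in find_resets(mutation_counts(SAMPLE)):
        print("count dropped from", prev, "to", cur)
```

On the three sample lines this flags the drop from 16690000 to 5640000. A restart that happens before the count climbs past its earlier value would not be flagged, so this is only a quick check, not a full picture.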


#2

It looks like the Indexer is crashing during the index build, which causes the build to reset midway. From the log messages posted here, the crash appears to be coming from the storage layer.

Can you please share your indexer.log file from the time of the crash? I am not sure if the storage team can fully investigate without the core dump file but indexer.log may have some clues about it.


#3

@deepkaran.salooja please check your personal messages. I have reproduced the problem and captured some indexer.log files as well as a core dump.


#4

Thanks for sharing the logs. I see it’s a storage engine crash while compaction is running. Can you please switch the compaction mode in the UI via “Settings” -> “Auto Compaction” -> “Index Fragmentation” -> “Circular Write Mode” and see if that helps? It would be better to drop the index idDocTypeVenueIdBeaconUUIDMajorMinor first, then change the compaction mode, and then recreate the index.
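As a sketch of that order of operations, the snippet below builds the two N1QL statements around the UI change. The `USING GSI` clause on the drop and the helper names are my assumptions; the create statement is meant to reproduce the one from the original post. Run the statements through whatever query tool you normally use.

```python
BUCKET = "write"
INDEX = "idDocTypeVenueIdBeaconUUIDMajorMinor"
KEYS = ["docType", "timestamp", "beacon.proximityUUID", "beacon.major", "beacon.minor"]

def drop_statement(bucket, index):
    """Step 1: drop the index that keeps crashing during build."""
    return "DROP INDEX `{}`.`{}` USING GSI".format(bucket, index)

def create_statement(bucket, index, keys):
    """Step 3: recreate the index with the same keys as before."""
    key_list = ",".join("`{}`".format(key) for key in keys)
    return "CREATE INDEX `{}` ON `{}`({})".format(index, bucket, key_list)

if __name__ == "__main__":
    print(drop_statement(BUCKET, INDEX))
    # Step 2 is manual: in the UI, switch Index Fragmentation to
    # "Circular Write Mode" before running the next statement.
    print(create_statement(BUCKET, INDEX, KEYS))
```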

Also, if you can send me the core file, that would be helpful for further investigation. I only got the log files in your messages.


#5

@mlblount45, thanks for sharing the core file. I’ll pass on the information to the storage team for investigation.

Btw, did you get a chance to try my suggestion to switch to “Circular Write Mode” for compaction? (Change the compaction mode in the UI via “Settings” -> “Auto Compaction” -> “Index Fragmentation” -> “Circular Write Mode”.)


#6

To update this thread in case anyone else comes across it: the suggestion by @deepkaran.salooja solved this issue. More details about index fragmentation can be read here: http://developer.couchbase.com/documentation/server/4.5/indexes/gsi-for-n1ql.html