5.1.1 EE: Service 'indexer' exited with status 134

n1ql
index

#1

I run two indexer nodes in a cluster on Ubuntu 16.04. The machine has 6 cores, 12 threads, and 64 GB RAM. I have around 4.2M documents and 3 indexes which index 400k documents.

The following error occurs:

Service ‘indexer’ exited with status 134. Restarting. Messages:
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/proc.go:259 +0x13a fp=0xc4275cec30 sp=0xc4275cec00
runtime.selectgoImpl(0xc4275cef28, 0x0, 0x18)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/select.go:423 +0x11d9 fp=0xc4275cee58 sp=0xc4275cec30
runtime.selectgo(0xc4275cef28)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/select.go:238 +0x1c fp=0xc4275cee80 sp=0xc4275cee58
github.com/couchbase/gometa/protocol.(*messageListener).start(0xc42ef8ca50)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/gometa/protocol/leader.go:403 +0x4d3 fp=0xc4275cefb8 sp=0xc4275cee80
runtime.goexit()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4275cefc0 sp=0xc4275cefb8
created by github.com/couchbase/gometa/protocol.(*Leader).AddWatcher
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/gometa/protocol/leader.go:256 +0x45d
[goport(/opt/couchbase/bin/indexer)] 2018/07/09 04:39:58 child process exited with status 134

Any hints how to solve this?

Edit:

The indexer log shows this at the time of the error:

2018-07-08T22:03:31.166+02:00 [Info] StorageMgr::handleCreateSnapshot Added New Snapshot Index: 12999013764222219116 PartitionId: 0 SliceId: 0 Crc64: 12308215862641183311 (SnapshotInfo: count:4199837 committed:false) SnapCreateDur 26.861µs SnapOpenDur 13.4µs
panic: runtime error: index out of range

goroutine 234882 [running]:
panic(0xd4a860, 0xc4200160c0)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:500 +0x1a1 fp=0xc425c0b798 sp=0xc425c0b708
runtime.panicindex()
	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:27 +0x6d fp=0xc425c0b7c8 sp=0xc425c0b798
github.com/couchbase/indexing/secondary/collatejson.(*Codec).ExplodeArray(0xc425c0b968, 0xc424f46000, 0x69, 0x3800, 0xc424f3e000, 0x0, 0x3800, 0xc42b10e0c0, 0x5, 0x0, ...)
	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/collatejson/array.go:23 +0x3ea fp=0xc425c0b8b8 sp=0xc425c0b7c8
github.com/couchbase/indexing/secondary/indexer.filterScanRow(0xc424f46000, 0x69, 0x3800, 0x1675ae0, 0xc429948a40, 0x1675ae0, 0xc429948a80, 0x3, 0xe76224, 0xb, ...)

#2

This issue happens when scanning an index with large index entries. Please see MB-30226.

What is the expected index entry size in your case?
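One way to gauge this (a sketch, not an official diagnostic: the bucket name `mybucket` and the indexed expression `myarray` are placeholders for your own index definition) is N1QL's `ENCODED_SIZE()` function, which returns the byte size of a value's JSON encoding:

```sql
-- Placeholder names; substitute the bucket and indexed expression
-- from your own CREATE INDEX statement.
SELECT MAX(ENCODED_SIZE(myarray)) AS max_entry_bytes,
       AVG(ENCODED_SIZE(myarray)) AS avg_entry_bytes
FROM mybucket;
```

This scans the bucket, so on a large dataset you may want to add a WHERE clause matching the index's filter.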


#3


https://issues.couchbase.com/browse/MB-30226

It says fixed in v5.1.2. Any info on when v5.1.2 will be released?


#4

5.1.2 is probably going to be released in 2 months.


#5

I am using 5.1 of CB, installed on a VirtualBox VM running 2 indexers. I believe (but am not sure) I got into this state (indexers are perpetually warming up) by rebooting my VM a few times. I am confident it has nothing to do with number of objects in CB as we currently only have a few dozen. Would like to resolve this issue. If you need to see log information, please let me know what would be helpful.


#6

@steve.cecutti, which index storage mode are you using? You can search for “panic” in indexer.log and share the stack below it.
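For reference, `grep -A` will print each panic line plus the stack frames that follow it. The sketch below runs against a small sample file so it is self-contained; on a real Linux node the log usually lives at `/opt/couchbase/var/lib/couchbase/logs/indexer.log` (path assumed, adjust for your install):

```shell
# Write a tiny sample log so the command can be demonstrated standalone;
# on a real node, point grep at the actual indexer.log instead.
cat > /tmp/indexer_sample.log <<'EOF'
2018-07-08T22:03:31.166+02:00 [Info] StorageMgr::handleCreateSnapshot ...
panic: runtime error: index out of range
goroutine 234882 [running]:
panic(0xd4a860, 0xc4200160c0)
EOF

# Print each panic line plus the 3 lines of stack after it.
grep -A 3 '^panic' /tmp/indexer_sample.log
```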


#7

Hello Deepkaran, our storage type is standard gsi and here is the most recent panic in indexer.log:

goroutine 78 [running]:
panic(0xd45380, 0xc4200160b0)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:500 +0x1a1 fp=0xc42004f698 sp=0xc42004f608
runtime.panicindex()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:27 +0x6d fp=0xc42004f6c8 sp=0xc42004f698
github.com/couchbase/plasma.(*Plasma).doRecovery.func1(0x1e5cf2e, 0xc424af0000, 0x0, 0x23ba4, 0x0, 0x0, 0x0)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/plasma.go:556 +0xd32 fp=0xc42004f7c0 sp=0xc42004f6c8
github.com/couchbase/plasma.(*lsStore).visitor(0xc42119a8c0, 0x1079a15, 0x1e8c984, 0xc424ab8000, 0xc424595fe0, 0xd8b760, 0xc41fdd3501)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/lss.go:350 +0x167 fp=0xc42004f830 sp=0xc42004f7c0
github.com/couchbase/plasma.(*lsStore).Visitor(0xc42119a8c0, 0xc424ab8000, 0xc424595fe0, 0xc424ab6000, 0x1000)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/lss.go:339 +0x9b fp=0xc42004f880 sp=0xc42004f830
github.com/couchbase/plasma.(*Plasma).doRecovery(0xc42452cb00, 0xc424595fc0, 0x21)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/plasma.go:633 +0x231 fp=0xc42004f948 sp=0xc42004f880
github.com/couchbase/plasma.New(0x1e, 0x12c, 0x5, 0x4, 0xf2a758, 0xf2a720, 0xf2a760, 0xc424580d20, 0xc424580d30, 0xf2a720, …)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/plasma.go:429 +0xf9a fp=0xc42004fdf8 sp=0xc42004f948
github.com/couchbase/indexing/secondary/indexer.(*plasmaSlice).initStores.func2(0xc424580cf0, 0xc42459a300, 0xc424580ce0, 0xc420075300, 0xc424580d10)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:256 +0x91 fp=0xc42004ff78 sp=0xc42004fdf8
runtime.goexit()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc42004ff80 sp=0xc42004ff78
created by github.com/couchbase/indexing/secondary/indexer.(*plasmaSlice).initStores
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:268 +0x1a48


#8

@steve.cecutti, the rebooting seems to have corrupted the indexes. You can read more about the issue here - https://issues.couchbase.com/browse/MB-25086?focusedCommentId=222525&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-222525

For now, you’ll have to fail over and rebalance out the node giving this error, then bring it back in and recreate the indexes.


#9

Hi Deepkaran, right now our CB “sandbox” is a single node of CB; if the indexes are corrupt on this single node, we obviously cannot failover/rebalance. How would we go about rebuilding the indexes? Dropping them and recreating?

In any event, in a multi-node scenario, would the failover/rebalance sequence be done via the dashboard or CLI? If CLI, what would be the commands to execute?

thank you


#10

@steve.cecutti, you may try to drop the index, but if the indexer is repeatedly failing due to a corrupt disk file, it may not be able to process the request. In that case, for a single-node setup, the only option is to do a fresh install.

For a multi-node scenario, failover/rebalance can be done via either the UI or the CLI. You can check out the CLI documentation here.
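A rough sketch of that CLI sequence with `couchbase-cli`; the hostnames, credentials, and the use of `--force` are placeholders/assumptions, and flag names can differ slightly between releases, so check the documentation for your version:

```shell
# Placeholder hosts and credentials - substitute your own.
# 1. Fail over the broken index node:
couchbase-cli failover -c cluster-host:8091 -u Administrator -p password \
  --server-failover bad-node:8091 --force

# 2. Rebalance to eject it from the cluster:
couchbase-cli rebalance -c cluster-host:8091 -u Administrator -p password

# 3. Add it back with the index service, then rebalance again:
couchbase-cli server-add -c cluster-host:8091 -u Administrator -p password \
  --server-add bad-node:8091 --server-add-username Administrator \
  --server-add-password password --services index
couchbase-cli rebalance -c cluster-host:8091 -u Administrator -p password
```

After the final rebalance, the indexes on the returned node would still need to be recreated, as noted above.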