5.1.1 EE: Service 'indexer' exited with status 134

index
n1ql

#1

I run two indexer nodes in a cluster. The OS is Ubuntu 16.04 nextcloud. The machine has 6 cores, 12 threads, and 64 GB of RAM. I have around 4.2M documents and 3 indexes, which index 400k documents.

The following error occurs:

Service ‘indexer’ exited with status 134. Restarting. Messages:
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/proc.go:259 +0x13a fp=0xc4275cec30 sp=0xc4275cec00
runtime.selectgoImpl(0xc4275cef28, 0x0, 0x18)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/select.go:423 +0x11d9 fp=0xc4275cee58 sp=0xc4275cec30
runtime.selectgo(0xc4275cef28)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/select.go:238 +0x1c fp=0xc4275cee80 sp=0xc4275cee58
github.com/couchbase/gometa/protocol.(*messageListener).start(0xc42ef8ca50)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/gometa/protocol/leader.go:403 +0x4d3 fp=0xc4275cefb8 sp=0xc4275cee80
runtime.goexit()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4275cefc0 sp=0xc4275cefb8
created by github.com/couchbase/gometa/protocol.(*Leader).AddWatcher
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/gometa/protocol/leader.go:256 +0x45d
[goport(/opt/couchbase/bin/indexer)] 2018/07/09 04:39:58 child process exited with status 134

Any hints how to solve this?

Edit:

The indexer log shows the following at the time of the error:

2018-07-08T22:03:31.166+02:00 [Info] StorageMgr::handleCreateSnapshot Added New Snapshot Index: 12999013764222219116 PartitionId: 0 SliceId: 0 Crc64: 12308215862641183311 (SnapshotInfo: count:4199837 committed:false) SnapCreateDur 26.861µs SnapOpenDur 13.4µs
panic: runtime error: index out of range

goroutine 234882 [running]:
panic(0xd4a860, 0xc4200160c0)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:500 +0x1a1 fp=0xc425c0b798 sp=0xc425c0b708
runtime.panicindex()
	/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:27 +0x6d fp=0xc425c0b7c8 sp=0xc425c0b798
github.com/couchbase/indexing/secondary/collatejson.(*Codec).ExplodeArray(0xc425c0b968, 0xc424f46000, 0x69, 0x3800, 0xc424f3e000, 0x0, 0x3800, 0xc42b10e0c0, 0x5, 0x0, ...)
	/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/collatejson/array.go:23 +0x3ea fp=0xc425c0b8b8 sp=0xc425c0b7c8
github.com/couchbase/indexing/secondary/indexer.filterScanRow(0xc424f46000, 0x69, 0x3800, 0x1675ae0, 0xc429948a40, 0x1675ae0, 0xc429948a80, 0x3, 0xe76224, 0xb, ...)

#2

This issue happens when scanning an index with large index entries. Please see MB-30226.

What is the expected index entry size in your case?


#3

This project isn’t available

https://issues.couchbase.com/browse/MB-30226

It says fixed in v5.1.2. Any info on when v5.1.2 will be released?


#4

5.1.2 is probably going to be released in 2 months.


#5

I am using 5.1 of CB, installed on a VirtualBox VM running 2 indexers. I believe (but am not sure) I got into this state (indexers are perpetually warming up) by rebooting my VM a few times. I am confident it has nothing to do with number of objects in CB as we currently only have a few dozen. Would like to resolve this issue. If you need to see log information, please let me know what would be helpful.


#6

@steve.cecutti, which index storage mode are you using? You can search for “panic” in indexer.log and share the stack below it.
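
For anyone following along, searching for “panic” and copying the stack below it can be scripted. The following is a minimal sketch, not an official Couchbase tool: the function name `extract_panics` is hypothetical, and it assumes that ordinary indexer.log entries start with an ISO-style timestamp (as in the log lines quoted in this thread), so a stack trace is taken to end at the next timestamped line.

```python
import re

# Assumption: regular log entries begin with a timestamp such as
# 2018-07-08T22:03:31.166+02:00, while stack trace lines do not.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def extract_panics(log_text, max_lines=200):
    """Return every 'panic:' line together with the stack lines after it.

    Collection stops at the next timestamped log line or after
    max_lines lines, whichever comes first.
    """
    lines = log_text.splitlines()
    blocks = []
    i = 0
    while i < len(lines):
        if lines[i].startswith("panic:"):
            j = i + 1
            while (j < len(lines) and j - i < max_lines
                   and not TIMESTAMP.match(lines[j])):
                j += 1
            blocks.append("\n".join(lines[i:j]).rstrip())
            i = j
        else:
            i += 1
    return blocks
```

To use it, read the log file into a string (on a default Linux install the file is usually under /opt/couchbase/var/lib/couchbase/logs/, but check your own installation) and print the last element of the returned list to get the most recent panic.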


#7

Hello Deepkaran, our storage type is standard gsi and here is the most recent panic in indexer.log:

goroutine 78 [running]:
panic(0xd45380, 0xc4200160b0)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:500 +0x1a1 fp=0xc42004f698 sp=0xc42004f608
runtime.panicindex()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/panic.go:27 +0x6d fp=0xc42004f6c8 sp=0xc42004f698
github.com/couchbase/plasma.(*Plasma).doRecovery.func1(0x1e5cf2e, 0xc424af0000, 0x0, 0x23ba4, 0x0, 0x0, 0x0)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/plasma.go:556 +0xd32 fp=0xc42004f7c0 sp=0xc42004f6c8
github.com/couchbase/plasma.(*lsStore).visitor(0xc42119a8c0, 0x1079a15, 0x1e8c984, 0xc424ab8000, 0xc424595fe0, 0xd8b760, 0xc41fdd3501)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/lss.go:350 +0x167 fp=0xc42004f830 sp=0xc42004f7c0
github.com/couchbase/plasma.(*lsStore).Visitor(0xc42119a8c0, 0xc424ab8000, 0xc424595fe0, 0xc424ab6000, 0x1000)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/lss.go:339 +0x9b fp=0xc42004f880 sp=0xc42004f830
github.com/couchbase/plasma.(*Plasma).doRecovery(0xc42452cb00, 0xc424595fc0, 0x21)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/plasma.go:633 +0x231 fp=0xc42004f948 sp=0xc42004f880
github.com/couchbase/plasma.New(0x1e, 0x12c, 0x5, 0x4, 0xf2a758, 0xf2a720, 0xf2a760, 0xc424580d20, 0xc424580d30, 0xf2a720, …)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/plasma/plasma.go:429 +0xf9a fp=0xc42004fdf8 sp=0xc42004f948
github.com/couchbase/indexing/secondary/indexer.(*plasmaSlice).initStores.func2(0xc424580cf0, 0xc42459a300, 0xc424580ce0, 0xc420075300, 0xc424580d10)
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:256 +0x91 fp=0xc42004ff78 sp=0xc42004fdf8
runtime.goexit()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc42004ff80 sp=0xc42004ff78
created by github.com/couchbase/indexing/secondary/indexer.(*plasmaSlice).initStores
/home/couchbase/jenkins/workspace/couchbase-server-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/plasma_slice.go:268 +0x1a48


#8

@steve.cecutti, the rebooting seems to have corrupted the indexes. You can read more about the issue here - https://issues.couchbase.com/browse/MB-25086?focusedCommentId=222525&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-222525

For now, you’ll have to failover and rebalance out the node giving this error, bring it back in and recreate the indexes.


#9

Hi Deepkaran, right now our CB “sandbox” is a single node of CB; if the indexes are corrupt on this single node, we obviously cannot failover/rebalance. How would we go about rebuilding the indexes? By dropping and recreating them?

In any event, in a multi-node scenario, would the failover/rebalance sequence be done via the dashboard or CLI? If CLI, what would be the commands to execute?

thank you


#10

@steve.cecutti, you may try to drop the index, but if the indexer is repeatedly failing due to a corrupt disk file, it may not be able to process the drop. In that case, for a single-node setup, the only option is to do a fresh install.

For a multi-node scenario, failover/rebalance can be done via either the UI or the CLI. You can check out the CLI documentation here.
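
As a rough illustration of the CLI route, the sketch below only builds the `couchbase-cli` command lines for the sequence described earlier in this thread (fail over the affected node, rebalance it out, add it back as an index node, rebalance again). It does not run anything. The helper name `recovery_commands`, the host names, and the credentials are placeholders, and the use of `--force` for a hard failover is an assumption for an index-only node; confirm the flags with `couchbase-cli <subcommand> --help` on your version before running them.

```python
import shlex


def recovery_commands(cluster, user, password, bad_node):
    """Build (but do not execute) the couchbase-cli calls for the
    failover / rebalance-out / add-back / rebalance sequence."""
    auth = ["-c", cluster, "-u", user, "-p", password]
    return [
        # 1. Hard failover of the node whose indexer keeps crashing.
        ["couchbase-cli", "failover", *auth,
         "--server-failover", bad_node, "--force"],
        # 2. Rebalance so the failed-over node leaves the cluster.
        ["couchbase-cli", "rebalance", *auth],
        # 3. Add the node back, running only the index service.
        ["couchbase-cli", "server-add", *auth,
         "--server-add", bad_node,
         "--server-add-username", user,
         "--server-add-password", password,
         "--services", "index"],
        # 4. Rebalance again to bring the node into service.
        ["couchbase-cli", "rebalance", *auth],
    ]


def as_shell_lines(commands):
    """Render the commands as copy-and-paste shell lines."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]
```

Calling `as_shell_lines(recovery_commands("10.0.0.1:8091", "Administrator", "password", "10.0.0.2:8091"))` returns four lines to review by hand. The indexes themselves still have to be recreated afterwards (for example with `CREATE INDEX` statements), since this sequence only gets the node back into the cluster.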


#11

Sorry for reviving an old thread but I am running into exactly the same problem and I am on the latest EC2 AMI build.

I have a single node, as I am evaluating Couchbase, and my indexes keep cycling back into the warming-up state. I don’t see any panic messages in the logs, and the dashboard reports 4.8 GB of RAM unallocated and 70 GB of free disk space. However, I do see one of the following two messages when this happens:

Service ‘indexer’ exited with status 137. Restarting. Messages:
2019-02-19T02:45:34.813+00:00 [Info] TK StreamBegin INIT_STREAM LOL 534 30429784906482 0
2019-02-19T02:45:34.834+00:00 [Info] StorageMgr::handleCreateSnapshot Skip Snapshot For INIT_STREAM LOL SnapType NO_SNAP
2019-02-19T02:45:35.197+00:00 [Info] TK StreamBegin INIT_STREAM LOL 535 259123049119078 0
2019-02-19T02:45:35.229+00:00 [Info] TK StreamBegin INIT_STREAM LOL 537 139340997435837 0
2019-02-19T02:45:35.286+00:00 [Info] StorageMgr::handleCreateSnapshot Skip Snapshot For INIT_STREAM LOL SnapType NO_SNAP
2019-02-19T02:45:35.431+00:00 [Info] TK StreamBegin MAINT_STREAM LOL 610 120897330018654 195
2019-02-19T02:45:35.561+00:00 [Info] TK StreamBegin INIT_STREAM LOL 536 45491851494691 0
2019-02-19T02:45:35.590+00:00 [Info] StorageMgr::handleCreateSnapshot Skip Snapshot For INIT_STREAM LOL SnapType NO_SNAP
2019-02-19T02:45:35.591+00:00 [Info] TK StreamBegin MAINT_STREAM LOL 612 132292772107855 268
2019-02-19T02:45:35.591+00:00 [Info] TK StreamBegin MAINT_STREAM LOL 611 101801998083661 319
[goport(/opt/couchbase/bin/indexer)] 2019/02/19 02:46:51 child process exited with status 137

Service ‘indexer’ exited with status 134. Restarting. Messages:
reflect.Value.call(0xc442ae42a0, 0xc420030f78, 0x13, 0xe69168, 0x4, 0xc423305ee0, 0x3, 0x3, 0xc438ed06f0, 0x40ba1b, …)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/reflect/value.go:434 +0x5c8 fp=0xc423305e10 sp=0xc423305ac0
reflect.Value.Call(0xc442ae42a0, 0xc420030f78, 0x13, 0xc423305ee0, 0x3, 0x3, 0xc5706fac00, 0xc438ed0748, 0xd94401)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/reflect/value.go:302 +0xa4 fp=0xc423305e78 sp=0xc423305e10
net/rpc.(*service).call(0xc4273218c0, 0xc427321880, 0xc427377178, 0xc423774780, 0xc442b759c0, 0xd78a80, 0xc45a693860, 0x199, 0xcca400, 0xc44c619400, …)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/net/rpc/server.go:383 +0x148 fp=0xc423305f38 sp=0xc423305e78
runtime.goexit()
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc423305f40 sp=0xc423305f38
created by net/rpc.(*Server).ServeCodec
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.3/go/src/net/rpc/server.go:477 +0x421
[goport(/opt/couchbase/bin/indexer)] 2019/02/19 02:34:01 child process exited with status 134
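
A note on reading the two status codes above: assuming they follow the usual Unix shell convention in which a process killed by signal N is reported as 128 + N, status 134 corresponds to signal 6 (SIGABRT, consistent with a Go runtime abort after a panic) and status 137 to signal 9 (SIGKILL). A SIGKILL comes from outside the process, and one common source on Linux is the kernel out-of-memory killer, which would also explain the absence of a panic in the indexer log, though nothing in this thread confirms that cause. The small sketch below decodes such a status; the function name `describe_exit_status` is made up for illustration.

```python
import signal


def describe_exit_status(status):
    """Interpret an exit status using the 128 + signal-number convention."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The signal number is not defined on this platform.
            name = "unknown signal"
        return "terminated by signal {} ({})".format(signum, name)
    return "exited with code {}".format(status)
```

If the 137 case recurs, checking the kernel log (for example with `dmesg`) around the time of the exit is one way to see whether the OOM killer was involved.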


#12

Hi Antek,

Can you please share the indexer log to look into this issue further? A cbcollect of the entire node would be even better.

Thanks,
Prathibha


#13

Unfortunately I lost the node as I shut it down and the AWS autoscaling group terminated it, building a new node now.


#14

@Antek, if you face any issue with the new node, please take a cbcollect of the logs and share it with us.


#15

So I have a new node up and running, but to actually do a cbcollect I need shell access to the system. Unfortunately, it seems that the AWS AMI provided by Couchbase disables this functionality, or at the very least renames the default user name associated with the private key that is installed on the system during the bootstrap process.


#16

@Antek - You can do a cbcollect from the Logs -> Collect Information tab in the UI. Choose the “upload to Couchbase” option and share the link with me if you are still seeing the indexer exited issue.


#17

@Antek if you still need shell access, log in as the default user that Couchbase creates; you can then use sudo to change the default SSH configuration that the Couchbase AMI ships with.


#18

You mean the default admin user & password?