Couchbase index stuck in warmup in CE 6.6

The index on one of our nodes went into warmup with a panic message. We created the index, and data is being loaded into the cluster via XDCR (about 248 million docs into one bucket). The index was keeping up with the XDCR load, but after 223 million docs were loaded the index went stale, and its status now shows warmup on one of the nodes.

We are using Community Edition 6.6 on a 6-node cluster, with indexes on 4 nodes; one of those nodes is throwing the error below. I restarted the Couchbase service to see if that would fix the issue, but it did not. I also checked disk space, and we have more than 70% free on the disks.

Can someone please help? Thanks in advance.

Below is the log message in the UI:

Service ‘indexer’ exited with status 2. Restarting. Messages:
*storageMgr).updateIndexSnapMapForIndex(0xc423e80700, 0x2b211001685116ef, 0x2b211001685116ef, 0xaeb671582be8ce5, 0xc423dc4810, 0xa, 0xc423dc47f8, 0x8, 0xc423dc4820, 0x6, …)
goproj/src/ +0x249
*storageMgr).handleUpdateIndexSnapMapForIndex(0xc423e80700, 0x199f2a0, 0xc4200b0a00)
goproj/src/ +0x1d9
*storageMgr).handleSupvervisorCommands(0xc423e80700, 0x199f2a0, 0xc4200b0a00)
goproj/src/ +0x19b
*storageMgr).run(0xc423e80700)
goproj/src/ +0xbb
created by
goproj/src/ +0x27a

And this is the log message from Indexer log:

2020-12-02T20:46:13.014-05:00 [Error] ForestDBSnapshot::Open
Unexpected Error Opening Main DB Snapshot (/indexes/couchbase/indexes/@2i/orders_idx_aux_08_3107782816093509359_0.index) SeqNum 47711727 FDB_RESULT_NO_DB_INSTANCE
panic: Unable to open snapshot -FDB_RESULT_NO_DB_INSTANCE

goroutine 243 [running]:
panic(0xeb12c0, 0xc4241348d0)
/home/couchbase/.cbdepscache/exploded/x86_64/go-1.7.6/go/src/runtime/panic.go:500 +0x1a1
*storageMgr).openSnapshot(0xc42373b180, 0x2b211001685116ef, 0x19ab660, 0xc4239abd40, 0x19b2040, 0xc423eb2990, 0xc423f6a3f0, 0xc4236f7140, 0x1, 0x4, …)
goproj/src/ +0x470
*storageMgr).updateIndexSnapMapForIndex(0xc42373b180, 0x2b211001685116ef, 0x2b211001685116ef, 0xaeb671582be8ce5, 0xc423ace950, 0xa, 0xc423ace938, 0x8, 0xc423ace960, 0x6, …)
goproj/src/ +0x249
*storageMgr).handleUpdateIndexSnapMapForIndex(0xc42373b180, 0x199f2a0, 0xc42405e000)
goproj/src/ +0x1d9
*storageMgr).handleSupvervisorCommands(0xc42373b180, 0x199f2a0, 0xc42405e000)
goproj/src/ +0x19b
*storageMgr).run(0xc42373b180)
goproj/src/ +0xbb
goproj/src/ +0xbb
created by
goproj/src/ +0x27a

Hi @simpledba,

Could you please share the complete indexer.log file?

Just to rule out any misconfiguration, please check the Index Service memory quota in the UI. The default index quota can be small.
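For reference, the same quota is also visible outside the UI via the cluster REST API: `GET /pools/default` on port 8091 returns an `indexMemoryQuota` field (in MB). A minimal Python sketch of reading that field from a saved response; the host, credentials, and sample values below are placeholders, not taken from this cluster:

```python
import json

# Fetch with e.g.: curl -u Administrator:password http://<node>:8091/pools/default
def index_quota_mb(pools_default_json: str) -> int:
    """Extract the Index Service memory quota (MB) from a /pools/default response."""
    return json.loads(pools_default_json)["indexMemoryQuota"]

# Illustrative response fragment; a real response has many more fields.
sample = '{"memoryQuota": 4096, "indexMemoryQuota": 512}'
print(index_quota_mb(sample))  # → 512
```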

CC @sduvuru for forestdb related question.

Hi @amit.kulkarni

I confirm we changed the index service memory quota; it is set to 26 GB. We had created the index before on a different node. There was a change to the index definition, so instead of creating the index on a bucket that already held 248 million docs, we flushed the bucket first, created the index on the empty bucket, and then started loading the data back via XDCR (the traditional way of creating an index on a full bucket takes hours, so we wanted to see if this would be faster). That is when we ran into the issue on one node, where the index got stuck in warmup.

The forum is not letting me upload the logs (I am a new user, so I cannot do it :slight_smile:), so I am sharing them on OneDrive: !AhRrNXeSnd8PgblDpW4sjISeqOXbWA?e=9UFLUp

Thank you.

The issue is that a snapshot needed for rollback is not present, and ForestDB fails with the error FDB_RESULT_NO_DB_INSTANCE. I am not able to determine the root cause from the log: the log shows the snapshot was created successfully, but for some reason the file handle does not seem to have the snapshot information.
A drop and recreate of the index is the best workaround I can think of. I will investigate further and see if I can find the reason for the issue in the code.
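If it helps anyone later, the drop-and-recreate could be scripted against the N1QL query service REST endpoint (port 8093). This is only a sketch: the bucket name `orders`, the field `aux_field`, the host, and the credentials are all placeholders; only the index name `orders_idx_aux_08` comes from the log path above. The `defer_build` option is not required, but it lets a large index be built in a single pass after creation.

```python
from base64 import b64encode
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder bucket/field names; only the index name is from the log above.
STATEMENTS = [
    'DROP INDEX `orders`.`orders_idx_aux_08`;',
    'CREATE INDEX `orders_idx_aux_08` ON `orders`(`aux_field`) '
    'WITH {"defer_build": true};',
    'BUILD INDEX ON `orders`(`orders_idx_aux_08`);',
]

def run_statement(stmt: str, host="localhost",
                  user="Administrator", pw="password") -> bytes:
    """POST one N1QL statement to the query service REST endpoint."""
    req = Request(f"http://{host}:8093/query/service",
                  data=urlencode({"statement": stmt}).encode())
    creds = b64encode(f"{user}:{pw}".encode()).decode()
    req.add_header("Authorization", f"Basic {creds}")
    return urlopen(req).read()

# Usage (against a live cluster): for s in STATEMENTS: run_statement(s)
```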
Thank you.

Thank you @sduvuru.

I ended up removing the node from the cluster, restarting the Couchbase service, and adding it back into the cluster as a new node.
After that, index creation took about 23 hours, which raises a bigger question: if this happens after we finish loading 750 million documents, it will take 3 to 4 days to recreate the index.

Are there any recommended practices for creating indexes on buckets with a large volume of documents?

Thank you.