Troubles communicating to the indexer process


#1

We are having constant problems creating GSIs on our Couchbase server.

Server version is: 4.1.0-5005 Enterprise Edition (build-5005).

The issue persists constantly while trying to create either a single GSI or multiple GSIs at once. Our cluster configuration consists of three servers running CentOS, each with 16 GB memory and 200 GB HDD storage.

Currently, there is a single bucket containing 20 million JSON documents with different schemas. We are trying to create 10 GSIs on each node. After running for some time, indexing stops and the following error message appears under the ‘Index’ tab in the console: Warning: We are having troubles communicating to the indexer process. The information might be stale.

Additionally, although we have the swappiness level set to 0, we often run out of RAM. For example, during the most recent crash, used RAM was at 57%, yet swap usage was over 99%. Something is clearly wrong with memory management here.
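To capture these numbers at the moment of a crash, something like the following could be run on each node. This is only a minimal sketch, assuming a Linux host that exposes `/proc/meminfo` and `/proc/sys/vm/swappiness`; the function names are illustrative and not part of any Couchbase tool.

```python
# Sketch: report RAM and swap usage from /proc/meminfo (Linux only).
from pathlib import Path


def parse_meminfo(text):
    """Return the numeric /proc/meminfo fields as a dict, e.g. {'MemTotal': 16000000}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_percent(total_kb, free_kb):
    """Percentage of total that is in use; 0.0 when total is zero."""
    if total_kb <= 0:
        return 0.0
    return 100.0 * (total_kb - free_kb) / total_kb


def memory_report(meminfo_text, swappiness_text=""):
    """Summarise RAM usage, swap usage, and the configured swappiness."""
    info = parse_meminfo(meminfo_text)
    # MemAvailable is missing on older kernels; fall back to MemFree,
    # which counts page cache as "used" and so overstates RAM usage.
    ram_free = info.get("MemAvailable", info.get("MemFree", 0))
    return {
        "ram_used_pct": used_percent(info.get("MemTotal", 0), ram_free),
        "swap_used_pct": used_percent(info.get("SwapTotal", 0), info.get("SwapFree", 0)),
        "swappiness": swappiness_text.strip() or None,
    }


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    swappiness = Path("/proc/sys/vm/swappiness")
    if meminfo.exists():
        text = swappiness.read_text() if swappiness.exists() else ""
        print(memory_report(meminfo.read_text(), text))
```

Running it periodically (for example from cron) and keeping the output alongside the crash timestamps would show whether swap fills up before or after the indexer exits.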

One of the log messages is:
`Service ‘indexer’ exited with status 1. Restarting. Messages:
github.com/couchbase/indexing/secondary/indexer.(*fdbSlice).handleCommandsWorker(0xc208056600, 0x0)
/home/couchbase/jenkins/workspace/sherlock-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/forestdb_slice_writer.go:274 +0x7d0
created by github.com/couchbase/indexing/secondary/indexer.NewForestDBSlice
/home/couchbase/jenkins/workspace/sherlock-unix/goproj/src/github.com/couchbase/indexing/secondary/indexer/forestdb_slice_writer.go:137 +0x13a1
[goport] 2016/03/18 09:37:33 /opt/couchbase/bin/indexer terminated: exit status 2`

So far we have found only two workarounds, neither of which is viable:

  1. Removing nodes from the cluster so that the indexes get dropped, then trying to recreate the indexes;
  2. Failing over nodes and waiting indefinitely, hoping the problem fixes itself at some point.

I would like to know how to restart the indexer process on its own when it crashes, and how to troubleshoot such problems.

Also, what additional information should we provide?


#2

Could you share the cbcollect_info output, please? I suspect you may be suffering from a config issue.
Thanks


#3

Hi,

I tried running cbcollect_info several times; however, it freezes at this step:

couch_dbinfo (['find', '/data/couchbase', '-type', 'f', '-name', '*.couch.*', '-exec', 'couch_dbinfo', '-i', '{}', '+']) -

I tried waiting several hours, but no luck.


#4

OK, thanks. Could you provide the indexer.log file? It is in /opt/couchbase/var/lib/couchbase/logs unless you are on Windows or Mac.
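If the whole file is too large to attach, the parts around each crash could be pulled out with something like the sketch below. It assumes the default Linux log path mentioned above. The marker strings are a guess based on the messages quoted earlier in this thread (`terminated`, `exited with status`) plus the usual Go `panic` prefix, so adjust them to whatever the log actually contains.

```python
# Sketch: print a few lines of context around each crash marker in indexer.log.
from collections import deque
from pathlib import Path

LOG_PATH = Path("/opt/couchbase/var/lib/couchbase/logs/indexer.log")
# Assumed markers, taken from the messages quoted in this thread.
MARKERS = ("panic", "terminated", "exited with status")


def crash_context(lines, markers=MARKERS, before=5):
    """Return (line_number, context_lines) for every line containing a marker.

    context_lines holds up to `before` preceding lines plus the matching line.
    """
    recent = deque(maxlen=before)
    hits = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if any(marker in line.lower() for marker in markers):
            hits.append((number, list(recent) + [line]))
        recent.append(line)
    return hits


if __name__ == "__main__":
    if LOG_PATH.exists():
        with LOG_PATH.open(errors="replace") as handle:
            for number, context in crash_context(handle):
                print(f"--- line {number} ---")
                print("\n".join(context))
```

The file is read line by line, so this should stay cheap even on a large log.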