Index Service and Auto Failover

Hi,

We have a 9 node cluster:
5 x data
2 x index/query
2 x search

Today we lost one index/query node because of a hardware failure. From that point on, 50% of the queries returned an error. I then saw “Auto-failover for index service is disabled.” in the log, triggered a manual failover, and everything was working again. Phew… And definitely my fault. It’s clearly documented that the index service is not automatically failed over unless a data service is running on the same host. I completely missed that, and it seems I never lost an index node in years of Couchbase use :)

Now the question is… how should I set up my cluster so that I have multi-dimensional scaling AND auto failover?

I think this would work for example:
6 x data/index
3 x query/search
That will fail over the index service, because there is also a data service on each of those nodes. But I lose the separation of data and index, which would be good for scaling and performance.

As I now know, my setup
5 x data
2 x index/query
2 x search
will not fail over automatically, because the index service can’t do that and there is no data service on the same host.

What about this:
5 x data
2 x index
2 x search/query
The index service will still not fail over automatically, but maybe the query service knows that one of the index nodes is not available and uses the other one. Will that work?

Thanks for any ideas on how to set this up correctly.

Thanks, Pascal

@gizmo74,

You can set up equivalent indexes on both indexer nodes so that your scans are not impacted even if one of the nodes fails. Query would automatically pick the index on the other node in that case. Regarding your question, your understanding is correct and the configuration below would work:

5 x data
2 x index
2 x search/query
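
To illustrate the equivalent-index suggestion, here is a minimal sketch that builds one CREATE INDEX statement per index node, pinning each copy to a node with the `nodes` option of the WITH clause. The helper name, the index and keyspace names, and the host names are made up for illustration; the helper only generates the statements and does not talk to a cluster.

```python
import json


def equivalent_index_statements(index_name, keyspace, fields, index_nodes):
    """Build one CREATE INDEX statement per index node.

    Every statement uses the same keys, so the indexes are equivalent;
    only the index name and the target node differ.
    All names passed in here are hypothetical examples.
    """
    field_list = ", ".join(f"`{field}`" for field in fields)
    statements = []
    for number, node in enumerate(index_nodes, start=1):
        # WITH {"nodes": [...]} places the index on the given index node.
        with_clause = json.dumps({"nodes": [node]})
        statements.append(
            f"CREATE INDEX `{index_name}_{number}` "
            f"ON `{keyspace}`({field_list}) WITH {with_clause};"
        )
    return statements


if __name__ == "__main__":
    for statement in equivalent_index_statements(
        "idx_type",
        "mybucket",
        ["type"],
        ["index1.example.com:8091", "index2.example.com:8091"],
    ):
        print(statement)
```

Each printed statement would then be run once through the query service, so that either index node can serve the scan if the other one is lost.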

Thanks,
Varun

My query is exactly about what is asked here and hence I am top-posting on an old post.

@varun.velamuri, the docs say that the Index service will not fail over. You mentioned

If Query automatically picks the index on the other node, then why do the docs say that the Index service will not fail over?

Also, if query will pick up the index on other node, then how does

vs

matter?

Thanks

The problem is not the query service itself; it will use the index on another node.

But a node running index/query will not fail over automatically. So the client library keeps contacting the dead query service on that node.
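
A toy sketch of why this can show up as "50% of the queries returned an error" in the original post: it assumes, purely for illustration, that a client spreads requests round-robin over every query node it still considers part of the cluster. The node names and the function are hypothetical, and real SDK dispatch may differ.

```python
from itertools import cycle


def failed_fraction(query_nodes, dead_nodes, requests=100):
    """Fraction of requests that land on a dead query node.

    Assumption: the client rotates over all nodes in `query_nodes`,
    including dead ones that have not been failed over yet.
    """
    rotation = cycle(query_nodes)
    failures = sum(1 for _ in range(requests) if next(rotation) in dead_nodes)
    return failures / requests


if __name__ == "__main__":
    # Two index/query nodes, one dead and not yet failed over.
    print(failed_fraction(["iq1", "iq2"], {"iq1"}))
    # After manual failover the dead node is no longer in the list.
    print(failed_fraction(["iq2"], {"iq1"}))
```

Under that assumption, half of the requests hit the dead node until it is failed over, and none do once it has been removed from the rotation.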

@gizmo74 You are correct that a lost Index node not co-located with the Data Service currently requires manual failover so that Query nodes know the node is gone, as Index nodes do not fail over automatically at this time. However, the soon-upcoming release 7.1.0 is slated to include Autofailover for the Index Service as a new feature. This will enable the Index Service to fail over automatically even when not co-located with Data.

Note that we do not recommend putting Data and Index on the same node as they are both high consumers of CPU and memory. Also, when co-located, Data’s decision whether to fail over or not is the only consideration for Autofailover (even with the new feature), so if Data is healthy but Index is dead, the node will not automatically fail over even with the new Autofailover feature in 7.1.0. (This is because failover is currently done at the node level, not the service level. This behavior may be revisited in the future.)
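
The node-level rule described above can be summarized in a small decision sketch. This is only a model of what this thread states, not Couchbase's actual implementation; the function name, the service labels, and the `index_autofailover` flag (standing in for "running a release with the new feature") are assumptions, and cases the thread does not cover return `None`.

```python
def auto_failover_expected(services, unhealthy, index_autofailover=False):
    """Model whether a node is expected to fail over automatically.

    services:            services running on the node, e.g. {"index", "query"}
    unhealthy:           services on that node that are currently dead
    index_autofailover:  True if the release supports Index autofailover
    Returns True/False, or None for cases not described in this thread.
    """
    services, unhealthy = set(services), set(unhealthy)
    if "data" in services:
        # With Data on the node, Data's health is the only consideration.
        return "data" in unhealthy
    if "index" in services and "index" in unhealthy:
        # Without Data, a dead Index node fails over only with the new feature.
        return index_autofailover
    return None


if __name__ == "__main__":
    # Lost index/query node, before the new feature: manual failover needed.
    print(auto_failover_expected({"index", "query"}, {"index", "query"}))
    # Same failure with the new feature enabled.
    print(auto_failover_expected({"index", "query"}, {"index", "query"}, True))
    # Data healthy, Index dead on a shared node: no automatic failover.
    print(auto_failover_expected({"data", "index"}, {"index"}, True))
```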

Disclaimer: Couchbase does not officially commit to the feature content of new releases prior to the public release announcement, so the above is not an official commitment. However, this is one of the most customer-requested features, and I have already coded it in the 7.1.0 code base, where it has been through heavy testing, so it would really surprise me if it were somehow pulled from the 7.1.0 release.