1. We have a Couchbase cluster of 5 servers, all running CE version 4.1.0. After one server failed over, the web consoles of the remaining 4 servers show no information after we log in.
Some URLs, such as http://10.49.56.202:8091/pools/default?uuid=5434c08006aca68113f479e53418008c&waitChange=3000&_=1560762055766, return status code 500
with the message ["Unexpected server error, request logged."].
2. So we added several servers to the cluster, and the newly added servers' web consoles work.
3. Then we rebalanced the cluster to remove the old 4 servers. This went wrong: the CPU load on the old 4 servers became too high, reaching about 40 to 50, the old 4 servers went into pending status, and the rebalance failed.
4. The logs say:
[user:info,2019-06-17T16:55:47.164+08:00,ns_1@10.49.56.202:<0.20265.1924>:ns_orchestrator:handle_info:493]Rebalance exited with reason {timeout,
{gen_server,call,
[ns_config,
{update_with_changes,
#Fun<ns_config.6.55748145>}]}}
and
{none,<<"Rebalance stopped by janitor.">>}]},
What is this janitor, and why did it stop our rebalance?
5. When the CPU load rose, I found many beam.smp processes shown with status Dsl in top. Could these stuck beam.smp processes be what raises the CPU load?
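For anyone who wants to check the same thing on their own node, here is a minimal sketch of how the beam.smp processes in state D could be counted. It assumes a Linux host with the procps `ps` command; the function names `count_d_state` and `read_ps` are hypothetical helpers for illustration and are not part of Couchbase.

```python
import subprocess


def count_d_state(ps_output, name="beam.smp"):
    """Count processes called `name` whose STAT field starts with 'D'.

    `ps_output` is expected to hold one process per line in the form
    "<stat> <command name>", with no header line.
    """
    count = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        stat, comm = parts
        if comm.strip() == name and stat.startswith("D"):
            count += 1
    return count


def read_ps():
    """Return state and command name for every process, without headers.

    Assumption: procps `ps` is installed. Separate -o options with empty
    header text ("stat=", "comm=") suppress the header line.
    """
    result = subprocess.run(
        ["ps", "-e", "-o", "stat=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `count_d_state(read_ps())` on an affected server gives the number of beam.smp processes currently in state D, which can be sampled while the load is climbing.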
Any help would be appreciated.