REST API: rebalance fails


#1

http://192.168.103.131:8091/controller/rebalance?ejectedNodes=&knownNodes=ns_1@192.168.103.131,ns_1@192.168.103.132,ns_1@192.168.103.133,ns_1@192.168.103.134

Not found.
Why?


#2

Are you doing an HTTP POST? Also, I think ejectedNodes should contain the failed-over node that you shut down (but I’d prefer that someone more familiar with the REST API syntax confirm this, maybe @dhaikney?)


#3

No, I just opened the link in the browser.


#4

Well, as stated in the documentation, this is a POST operation; it can’t work with a GET.
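For illustration, here is a minimal Python sketch of what "doing a POST" means for this endpoint, as opposed to pasting the URL into a browser (which sends a GET). It only builds the request and does not send it. The host, credentials and node names are placeholders, and `build_rebalance_request` is a hypothetical helper name, not part of any Couchbase SDK.

```python
import base64
import urllib.parse
import urllib.request


def build_rebalance_request(host, user, password, known_nodes, ejected_nodes=()):
    """Build (but do not send) a POST request to /controller/rebalance.

    Hypothetical helper: the parameters travel in the form-encoded body,
    not in the query string as in the browser URL.
    """
    body = urllib.parse.urlencode({
        "ejectedNodes": ",".join(ejected_nodes),
        "knownNodes": ",".join(known_nodes),
    }).encode("ascii")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"http://{host}:8091/controller/rebalance",
        data=body,
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


# Placeholder values, assumed for the example only.
req = build_rebalance_request(
    "192.168.103.131",
    "Administrator",
    "password",
    ["ns_1@192.168.103.131", "ns_1@192.168.103.132"],
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

The request is left unsent on purpose, so the sketch can be inspected without touching a live cluster.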


#5

I have tried it with the curl command, but it failed with the message 'not found'.


#6

I have tried it with curl, but it failed with error code 400.


#7

400 == BAD REQUEST
What curl command did you use exactly?

There must be a parameter in the request that is inconsistent with what the cluster expects…

Have you tried with ejectedNodes as I suggested the first time around? The documentation suggests you have to make a few other REST calls first to get the correct values for the rebalance parameters…
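A rough sketch of how the rebalance parameters could be derived from an earlier REST call, in Python. It assumes the node list comes from `GET /pools/default` and that each node entry carries an `otpNode` name and a `clusterMembership` field whose value is `inactiveFailed` for a failed-over node; check these field names against the actual response from your cluster. `rebalance_params` is a hypothetical helper and the sample data is made up.

```python
def rebalance_params(pool_info):
    """Derive knownNodes / ejectedNodes from a parsed /pools/default response.

    Assumption: knownNodes lists every node the cluster knows about
    (including the ones to eject), and ejectedNodes lists the
    failed-over nodes.
    """
    nodes = pool_info["nodes"]
    known = [n["otpNode"] for n in nodes]
    ejected = [
        n["otpNode"]
        for n in nodes
        if n.get("clusterMembership") == "inactiveFailed"
    ]
    return {
        "knownNodes": ",".join(known),
        "ejectedNodes": ",".join(ejected),
    }


# Made-up sample shaped like the assumed response, not real cluster output.
sample = {
    "nodes": [
        {"otpNode": "ns_1@192.168.103.131", "clusterMembership": "active"},
        {"otpNode": "ns_1@192.168.103.132", "clusterMembership": "active"},
        {"otpNode": "ns_1@192.168.103.134", "clusterMembership": "inactiveFailed"},
    ]
}
params = rebalance_params(sample)
print(params)
```

If the node names sent to the cluster do not match what it currently knows, a 400 response would be a plausible outcome, which is why reading the values back first may help.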


#8

I have tried with ejectedNodes, and the other REST API calls return their values successfully.

I don't know why it still fails.


#9

I’m afraid I’m out of ideas there, the REST API isn’t what I know best :confused:


#10

I used the curl command:

curl -v -u admi:pawd -X POST 'http://…/controller/rebalance' -d 'ejectedNodes=&knownNodes=…'


#11

After one of the nodes in the cluster goes down, the cluster is not in a clean state, even when auto-failover completes successfully. When I click the Rebalance button, the cluster becomes clean. Why? And does Couchbase rebalance automatically after a successful auto-failover?


#12

@xiger I encourage you to read the documentation a bit more in depth, this kind of information is definitely covered in there.

Short version:

  • Failover is the act of marking a node A as down and promoting one of its replicas, B, to be the new active node. As a consequence, depending on the replication factor, data from A may no longer be replicated (B was the only replica, but is now the master). It is manual and has to be triggered by an operator.

  • Auto-failover lets Couchbase watch the nodes, detect a down node and fail it over once. It is done automatically by Couchbase, but since it runs unattended, we argue that it should only run on a cluster with 3 nodes (replication factor 2) to avoid the problem discussed previously (an auto-failed-over node should always still have 1 replica). Auto-failover only handles the first node that goes down; further failures need to be addressed manually.

  • Rebalance is the act of putting the cluster back into a solid and balanced state. It redistributes the data between the nodes and makes sure that every one of them honors the replication factor (unless there aren’t enough nodes for it, which the console will warn you about). It is at this point that you can choose to remove a down node completely (e.g. the hardware crashed) and replace it with a new one, grow your cluster, etc. After a rebalance, the cluster is in a stable state again. This is manual and has to be triggered by an operator.
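To make the difference between failover and rebalance concrete, here is a toy Python model. It is only a sketch under simplifying assumptions: data is split into a handful of partitions, each with one active copy and a list of replicas, and "rebalance" simply recomputes an even layout rather than moving data incrementally. None of the function names correspond to a Couchbase API.

```python
def rebalance(partition_ids, nodes, num_replicas):
    """Toy rebalance: spread active copies and replicas evenly over `nodes`."""
    n = len(nodes)
    copies = min(num_replicas, n - 1)  # cannot have more replicas than other nodes
    layout = {}
    for i, pid in enumerate(partition_ids):
        layout[pid] = {
            "active": nodes[i % n],
            "replicas": [nodes[(i + k) % n] for k in range(1, copies + 1)],
        }
    return layout


def failover(layout, down):
    """Toy failover: drop `down` everywhere and promote a surviving replica."""
    for part in layout.values():
        part["replicas"] = [r for r in part["replicas"] if r != down]
        if part["active"] == down:
            # Promote the first remaining replica; None means the data is unavailable.
            part["active"] = part["replicas"].pop(0) if part["replicas"] else None


def under_replicated(layout, num_replicas):
    """Partitions that currently have fewer replicas than requested."""
    return [pid for pid, p in layout.items() if len(p["replicas"]) < num_replicas]


partitions = list(range(6))
layout = rebalance(partitions, ["A", "B", "C"], num_replicas=1)

failover(layout, "A")
after_failover = under_replicated(layout, num_replicas=1)

layout = rebalance(partitions, ["B", "C"], num_replicas=1)
after_rebalance = under_replicated(layout, num_replicas=1)

print(len(after_failover), len(after_rebalance))
```

In this model every partition still has an active copy after the failover, but several have lost their only replica; that is the "not clean" state, and it persists until a rebalance is triggered.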


#13

I have run into a performance problem:

loading 390000 items into Couchbase takes more than 12 minutes (unsecured), which is not what I want.

Another question: is the SDK sensitive to timeouts? And how should I set the getAndLock expiry in case the upsert fails?