Rebalancing fails on 2.5.1, Windows Azure


#1

After adding a new server to the cluster, rebalancing fails consistently.

Hi, until now I have only used one server in my cluster.
After adding a second server I get:
Rebalance exited with reason {unexpected_exit,
{'EXIT',<0.22254.387>,
{badmatch,
[{'EXIT',
{{badmatch,{error,etimedout}},
{gen_server,call,
[<13027.9133.0>,had_backfill,
infinity]}}}]}}}

I have placed the data and index folders on a different drive than the Couchbase installation on the newly added server.
Any ideas?


#2

Sorry for the hassle. One thing is that when rebalance fails with a timeout, that may mean it’s making progress for quite some time, and then timing out during one particular stage.

On the vbuckets, there’s a small arrow and if you view it by server, see if it’s making progress when you request a rebalance.

As for root cause, it could be one of the things fixed in 3.0.1/2 or it could be a lack of resources. My guess is that there’s likely a fix. For further analysis of what the overall system looked like when this timeout happened, we’d have to get a cbcollect_info.


#3

Does rebalance also fail on retries? What VM size are you using? I recommend VMs of size A3 or larger, or D3 or larger.
thanks
-cihan


#4

A3, meaning 4 cores and 8 GB RAM.


#5

“On the vbuckets, there’s a small arrow and if you view it by server, see if it’s making progress when you request a rebalance.”

  • No progress is reported.

Quite a few users seem to have reported similar behavior over the past years.
That sounds a bit alarming, since it’s core functionality. I only have very few documents, fewer than 100.

But maybe it’s an Azure-specific problem.
I will double-check the firewall settings.

Will upgrading to 3.0.2 cause any problems? Luckily I can afford to delete the old buckets and just reinstall Couchbase as 3.0.2.
The question is whether my mappings will keep working as they do in 2.5.1.

What do you recommend?

I’ve now upgraded to 3.0.2, but I still have problems setting up a 2-node cluster.

I keep getting errors.

I’ve collected the log info from the two servers.
I’ve enabled the ports mentioned in the documentation in the firewall, both inbound and outbound, for both servers. I can ping one server from the other using IP addresses.
In Azure I’ve opened ports 8091 and 8092 in order to use the admin web console from my developer PC.

BR Christian


#6

My apologies, I wasn’t very clear. When I said “making progress” what I was referring to is that you see vbuckets moving between the different nodes of the cluster. For example, if you start with one node you’ll have 1024 vbuckets on that one, and then when you add the second node and rebalance, you’ll eventually have 512 on each.
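As a rough illustration of the arithmetic above, here is a minimal Python sketch of how 1024 active vbuckets would be divided evenly across nodes. The function name `even_vbucket_split` is hypothetical, and this is not how Couchbase actually computes its vbucket map; it only shows the per-node counts you would expect to see once a rebalance has completed.

```python
def even_vbucket_split(num_nodes, total_vbuckets=1024):
    """Return the number of active vbuckets each node would hold if split evenly.

    Hypothetical helper for illustration only; not Couchbase's actual algorithm.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    base, extra = divmod(total_vbuckets, num_nodes)
    # The first `extra` nodes take one additional vbucket when the split is uneven.
    return [base + 1 if i < extra else base for i in range(num_nodes)]


print(even_vbucket_split(1))  # [1024]
print(even_vbucket_split(2))  # [512, 512]
```

If the per-node counts in the console never move away from 1024 and 0, the rebalance is not making progress in the sense described here.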

I suspect 3.0.2 would be a good upgrade. @cihangirb can probably confirm that. On mappings, if you mean configuration and code, yes, that’ll work fine with 3.0.


#7

If you have more info, can you file an issue and point to the cbcollect_info that has been uploaded? Of course, if you have support, go through that channel, but I can also ask someone to have a look, since this isn’t what we’d expect.


#8

OK, I have opened Couchbase Server issue MB-13067
and attached the collected info files for the two servers.
Cihan mentions host header files in http://blog.couchbase.com/step-step-production-deployment-couchbase-windows-azure-virtual-machines

But I’m wondering whether he put his machines behind one cloud service or used 3 different services.

I haven’t set up any host headers; I just use IP addresses.


#9

One thing you might want to verify is that the MaxUserPort registry setting is set to at least 60000 on each server. Rebalancing can fail if there are not enough ephemeral ports.

http://support.microsoft.com/kb/196271
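
The MaxUserPort check could be scripted roughly as follows. This is a minimal sketch using Python's standard `winreg` module. It assumes the value lives under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters`, which is my understanding of the linked KB article. The function names and the 60000 threshold constant are illustrative, taken from the suggestion in this post.

```python
import sys

# Assumed registry location of MaxUserPort (per the linked KB article).
TCPIP_PARAMS_KEY = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
MIN_MAX_USER_PORT = 60000  # threshold suggested in this thread


def read_max_user_port():
    """Return the MaxUserPort registry value, or None if unset or not on Windows."""
    if not sys.platform.startswith("win"):
        return None
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TCPIP_PARAMS_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "MaxUserPort")
            return int(value)
    except FileNotFoundError:
        # Key or value missing: the setting has not been configured.
        return None


def max_user_port_ok(value, minimum=MIN_MAX_USER_PORT):
    """True only if the value is set and at least `minimum`."""
    return value is not None and value >= minimum


if __name__ == "__main__":
    current = read_max_user_port()
    print("MaxUserPort:", current, "OK" if max_user_port_ok(current) else "needs attention")
```

Run it on each server in the cluster; a result of `None` means the value has not been set in the registry at all.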


#10

OK, thanks. I will check that out immediately.


#11

Changing MaxUserPort didn’t help. Rebalancing still fails.


#12

Hi Couchbase team. How do you suggest I move forward in trying to solve my Couchbase rebalancing issues?
There is no activity on MB-13067.

BR Christian


#13

Hi @cjjohansen, I may have missed this in the above posts but has rebalance ever worked in your setup or is this a brand new setup?
Can you look under the server nodes tab and send me the names/IPs you registered?

FYI: for the post you reference, I did use different services. Since then, I have moved to using a single service and it has been much easier to set up. I’ll refresh this post soon with details of Azure deployments.
thanks
-cihan


#14

Hi Cihan
Thanks for returning to my issue.
No, it has never worked for me. So it’s likely just a setup/configuration issue.

I look forward to reading your update to that post.
BR Christian


#15

Aleksey Kondratenko (Couchbase Issue Tracker) suggests that port 11209 is the problem. That could very well be the issue.

I have the following ports open:

8091, 8092, 11211, 11210, 4369, 21100-21199

http://docs.couchbase.com/admin/admin/Install/install-networkPorts.html

Should I just open all the ports mentioned in that link?

I guess at least all ports marked Yes in the Node to Node column should be opened.

I took the ports I have open from some other Couchbase article.

BR Christian
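
One way to test the node-to-node ports discussed above is a plain TCP connect check run from one server against the other. The following is a minimal Python sketch, not an official Couchbase tool. The port list is simply the set named in this thread (the ports already open plus 11209), and the function names are hypothetical.

```python
import socket

# Ports listed in this thread; 11209 is the one suggested as missing.
THREAD_PORTS = [4369, 8091, 8092, 11209, 11210, 11211] + list(range(21100, 21200))


def check_ports(host, ports, timeout=2.0):
    """Return {port: True/False} for whether a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[port] = False
    return results


def closed_ports(host, ports, timeout=2.0):
    """Return the sorted list of ports that could not be reached."""
    return sorted(p for p, ok in check_ports(host, ports, timeout).items() if not ok)
```

For example, `closed_ports(other_node_ip, THREAD_PORTS)` run on each node, with `other_node_ip` being the address registered for the other server, lists the ports that could not be reached. A port reported as closed may be blocked by a firewall or may simply have nothing listening on it at that moment, so the result is a hint rather than proof.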