Failure during rebalance due to badmatch?


#1

I have two virtual machines that differ slightly. They are in the same domain and can talk to each other.

ServerA: Windows Server Standard (2007) SP2. 32bit, 2GB RAM.
ServerB: Windows Server 2008 R2 SP1. 64bit, 4GB RAM.

I have only one bucket, “default”, which I created with a single view that returns all items. That’s it. (I deleted the sample buckets.)

I can make a cluster out of these two machines, and once the second node has joined, the console asks me to do a rebalance. Every time I try to rebalance, I get weird exceptions in the log. I have searched extensively and found entries in your issue tracker and on another wiki, but nothing helps. I tried uninstalling and reinstalling Couchbase on ServerB several times, to no avail. (I didn’t touch ServerA, since it seems to work on its own.)

This is the log:

Rebalance exited with reason {unexpected_exit,
{'EXIT',<0.13550.3>,
{badmatch,
[{'EXIT',
{downstream_closed,
{gen_server,call,
[<12830.21361.0>,had_backfill,
30000]}}}]}}}

ns_orchestrator002

ns_1@192.168.50.135

12:19:54 - Wed Jul 10, 2013

<0.13543.3> exited with {unexpected_exit,
{'EXIT',<0.13550.3>,
{badmatch,
[{'EXIT',
{downstream_closed,
{gen_server,call,
[<12830.21361.0>,had_backfill,30000]}}}]}}}

ns_vbucket_mover000

ns_1@192.168.50.135

12:19:54 - Wed Jul 10, 2013

Bucket “default” rebalance does not seem to be swap rebalance

ns_vbucket_mover000

ns_1@192.168.50.135

11:55:06 - Wed Jul 10, 2013

Bucket “default” loaded on node ‘ns_1@192.168.0.61’ in 0 seconds.

ns_memcached001

ns_1@192.168.0.61

11:55:05 - Wed Jul 10, 2013

Started rebalancing bucket default

ns_rebalancer000

ns_1@192.168.50.135

11:55:05 - Wed Jul 10, 2013

Starting rebalance, KeepNodes = ['ns_1@192.168.50.135','ns_1@192.168.0.61'], EjectNodes = []

Thanks in advance for any help.


#2

Update:

I completely uninstalled Couchbase and re-did the whole thing a couple of times, and the exact same thing happens.

So I decided to make the servers more alike: I took ServerA (32-bit) out of the equation and used the same installation package on both servers, couchbase-server-enterprise_x86_64_2.1.0.setup.exe.

Now the setup looks like:

ServerB: Windows Server 2008 R2 Std SP1. 64bit, 4GB RAM.
ServerC: Windows Server 2008 R2 Ent SP1. 64bit, 32GB RAM.

I did the same thing, but this time I created the cluster on ServerC and had ServerB join it. Exactly the same rebalance errors occur.
What should I do? I am completely stuck and unable to move forward.


#3

Hello,

Couchbase does not support clustering across different operating systems. Since you have an issue, do you mind running a test with the same OS on all the machines? (Just to be sure.)

A workaround is to create the cluster first (without any bucket), and then create the bucket once you have the 2 nodes up and installed.
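As a rough illustration of that order of operations, here is a minimal sketch that only builds the sequence of REST calls and sends nothing. It assumes the Couchbase admin REST endpoints /controller/addNode, /controller/rebalance and /pools/default/buckets on the admin port. The function name, credentials, and RAM quota are hypothetical placeholders, the rebalance step between joining and bucket creation is my assumption, and the exact set of required bucket parameters depends on the server version, so check it against the REST documentation for your release.

```python
from urllib.parse import urlencode


def cluster_first_plan(first_node, second_node, bucket="default", ram_quota_mb=100):
    """Return the ordered REST calls for the 'cluster first, bucket last' workaround.

    Nothing is sent over the network; each entry is (method, path, form_fields).
    Credentials and quota are placeholders, not values from this thread.
    """
    # Node names follow the ns_1@<address> pattern seen in the log above.
    known_nodes = ",".join(f"ns_1@{host}" for host in (first_node, second_node))
    return [
        # 1. Join the second node while the cluster still has no bucket.
        ("POST", "/controller/addNode",
         {"hostname": second_node, "user": "Administrator", "password": "password"}),
        # 2. Rebalance the empty cluster so the new node becomes active (assumed step).
        ("POST", "/controller/rebalance",
         {"knownNodes": known_nodes, "ejectedNodes": ""}),
        # 3. Only now create the bucket; other required fields vary by version.
        ("POST", "/pools/default/buckets",
         {"name": bucket, "ramQuotaMB": ram_quota_mb}),
    ]


if __name__ == "__main__":
    for method, path, form in cluster_first_plan("192.168.50.135", "192.168.0.61"):
        print(method, path, urlencode(form))
```

The calls would be issued against the first node's admin port with administrator credentials. The same three steps can of course be done from the web console instead.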

Finally, the next time the rebalance fails, can you run the cbcollect_info tool? Follow the steps described here:

Once you have uploaded the logs, let me know the location in a private message.

Regards
Tug
@tgrall


#4

Thank you for the answer. We already killed these servers and brought up servers that have exactly the same specs (4 CPU cores, 4 GB RAM, Windows Server 2008 R2 Ent), and the issues disappeared.

Thanks!