Couchbase down urgent help


#1

Hi, I've got a problem. I don't have a lot of experience with Couchbase, so please help me.
I am using Couchbase 2.1.0 in production.
We have 2 nodes on Amazon, which worked fine.
I wanted to increase the disk capacity, so I took down one node and changed the disk size, and it didn't come back up.
Then I made a big mistake by restarting the second server, and now it is not coming up either.
(The machine is running, but Couchbase doesn't start.)
A few mistakes that I made (lessons learned for next time):

  1. I didn't back up using cbbackup.
  2. No Elastic IP, so the IP changed after the restart (I think this is the main reason the server won't come back up. I did use our host names for the nodes, but it's not working).
  3. Only 2 nodes.
  4. Carrying on with only one node.
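
Regarding mistake 2, one quick thing to check is whether each node's host name now resolves to the address the instance actually has after the restart. Here is a minimal Python sketch of that check. It assumes nothing about Couchbase itself, and the function name and the sample host name and address are made up for illustration:

```python
import socket

def hostname_points_here(hostname, local_ips, resolve=socket.gethostbyname):
    """Return (ok, resolved_ip); ok is True if hostname resolves to one of local_ips."""
    try:
        resolved = resolve(hostname)
    except OSError:
        # Lookup failed (socket.gaierror is a subclass of OSError)
        return False, None
    return resolved in local_ips, resolved

# Hypothetical usage: pass the node's host name and its current private IP(s)
# hostname_points_here("cb-node1.example.internal", {"10.0.0.12"})
```

If this returns False with an old address, the DNS record (or /etc/hosts entry) is probably still pointing at the IP the instance had before the restart.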

When I try to run "couchbase-server start" I get the following error:

{error_logger,{{2014,2,5},{11,2,9}},"Protocol: ~p: register error: ~p~n",["inet_tcp",{{badmatch,{error,duplicate_name}},[{inet_tcp_dist,listen,1},{net_kernel,start_protos,4},{net_kernel,start_protos,3},{net_kernel,init_node,2},{net_kernel,init,1},{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}]}
{error_logger,{{2014,2,5},{11,2,9}},crash_report,[[{initial_call,{net_kernel,init,['Argument__1']}},{pid,<0.21.0>},{registered_name,[]},{error_info,{exit,{error,badarg},[{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},{ancestors,[net_sup,kernel_sup,<0.10.0>]},{messages,[]},{links,[#Port<0.64>,<0.18.0>]},{dictionary,[{longnames,true}]},{trap_exit,true},{status,running},{heap_size,987},{stack_size,24},{reductions,665}],[]]}
{error_logger,{{2014,2,5},{11,2,9}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfargs,{net_kernel,start_link,[['babysitter_of_ns_1@127.0.0.1',longnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
{error_logger,{{2014,2,5},{11,2,9}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,shutdown},{offender,[{pid,undefined},{name,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
{error_logger,{{2014,2,5},{11,2,9}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"}

Crash dump was written to: erl_crash.dump.02-05-2014-11:02:08.13152
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})
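
A note on reading the output above: in Erlang, {error,duplicate_name} during the inet_tcp_dist listen step generally means a node with that name is already registered with the local epmd, which often indicates that an earlier instance is still running. That is an interpretation of the log, not something confirmed in this thread. Here is a small Python sketch for pulling the relevant tokens out of such a log. The function name is hypothetical:

```python
import re

def summarize_erlang_startup_log(text):
    """Collect error atoms and node names mentioned in an Erlang startup log."""
    # Matches {error,some_atom}; does not match {error_logger,...}
    errors = re.findall(r"\{error,\s*([a-z_]+)\}", text)
    # Node names look like name@host, e.g. babysitter_of_ns_1@127.0.0.1
    nodes = re.findall(r"[A-Za-z0-9_]+@[A-Za-z0-9_.\-]+", text)
    # Deduplicate while keeping first-seen order
    return list(dict.fromkeys(errors)), list(dict.fromkeys(nodes))
```

Run against the log above, this should surface duplicate_name as the first error and the babysitter node name, which is a reasonable starting point when searching for the cause.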

Just to clarify: I have no instances working. How can I bring one of them back up (the data should still be there on the EBS volume)? Then I will back up and recreate the cluster with more nodes and everything.

Update (just discovered):
The servers respond on port 8092 with
{
couchdb: "Welcome",
version: "1.2.0a-1969a70-git",
couchbase: "2.1.1-764-rel-community"
}

But still nothing :(
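
Since port 8092 answers, it may be worth checking which other Couchbase ports are listening on the node. Here is a minimal Python sketch of that check. It assumes the default ports (8091 for the admin/REST interface, 8092 for views, 11210 for data), and the function name is made up:

```python
import socket

def open_ports(host, ports, timeout=2.0):
    """Return the subset of ports on host that accept a TCP connection."""
    reachable = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                reachable.append(port)
        except OSError:
            # Refused, timed out, or unreachable: treat as not listening
            pass
    return reachable

# Example (default Couchbase ports are an assumption here):
# open_ports("127.0.0.1", [8091, 8092, 11210])
```

If 8092 is open but 8091 is not, part of the server is running even though the admin console is unreachable.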


#2

Did you have replication on between the nodes? If so, node1 will have all your data: spin up a new AWS instance, add it to the cluster and perform a rebalance. Can you do that?


#3

A single node has all the data, but it doesn't come up, so I can't rebalance.


#4

Is it possible to detach the EBS volume and attach it to a new node? You’ve made some fundamental mistakes that are going to be hard to recover from. To start with, get a copy of that data and keep it safe somewhere else. If worst comes to worst, perhaps you’ll need to bring up a new node, replicate between the two and then use a script to load the missing data back into the cluster.


#5

Already made a snapshot of the EBS volume.

I am trying to use cbbackup on the files; then I will create a new cluster and restore to it.
But it's taking a lot of time.

Isn't there a simple way to just make the node run again after the restart?


#6

Well, couchbase-server start should work fine, but as you’ve said, your IP has changed.

http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-bestpractice-cloud-ip.html

Sadly, it sounds like you are pretty screwed. Go with the script attempt. Good luck!