Sync Gateway databases offline


#1

Below are my observations of Sync Gateway during a Couchbase reboot:

  1. Sync Gateway deletes the database when I bring it from offline to online while the underlying Couchbase Server is still unreachable.
  2. One of my two Sync Gateway databases goes offline when one Couchbase node (in a two-node cluster) is stopped. The database is then deleted when I send a POST /_online request to bring it back online, even though one Couchbase node is still healthy.
    This should not be expected behavior: if one node in the Couchbase cluster goes down, clients are affected because the Sync Gateway databases go offline even though some nodes are still available to serve requests.

Please let me know if this is how Sync Gateway is expected to behave, and whether there is any config change that would bring the database back online.
Is there any automatic recovery from offline to online in Sync Gateway once Couchbase is up and running again?
I got the error below:

11:34:37.243565 2016-04-18T11:34:37.243+05:30 WARNING: Lost TAP feed for bucket pttdata, with error: Get http://xx:8091/pools/default/bucketsStreaming/pttdata: dial tcp xx:8091: getsockopt: connection refused – rest.(*ServerContext)._getOrAddDatabaseFromConfig.func1() at server_context.go:646
11:34:37.243587 2016-04-18T11:34:37.243+05:30 CRUD: Taking Database : pttdata, offline
11:34:37.243596 2016-04-18T11:34:37.243+05:30 CRUD: Waiting for all active calls to complete on Database : pttdata
11:34:37.243604 2016-04-18T11:34:37.243+05:30 CRUD: Database : pttdata, is offline
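
On the auto-recovery question, one external workaround can be sketched as a small watchdog that only sends POST /_online after a Couchbase node is reachable again, since sending it too early is what led to the database being deleted above. This is a minimal sketch under assumptions, not Sync Gateway's own mechanism: the admin address (localhost:4985), the host names cb-node1 and cb-node2, and the "state" field with the value "Offline" in the GET /{db}/ response are all assumptions to verify against your installation.

```python
import json
import socket
import time
import urllib.request

SG_ADMIN = "http://localhost:4985"                    # assumption: default admin port
CB_NODES = [("cb-node1", 8091), ("cb-node2", 8091)]   # hypothetical host names
DATABASES = ["pttdata", "ptxdata"]                    # database names from the logs


def couchbase_reachable(nodes=CB_NODES, timeout=2.0):
    """True if at least one node accepts a TCP connection on its REST port."""
    for host, port in nodes:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            continue
    return False


def db_state(db):
    """Return the "state" field of GET /{db}/, or None if it cannot be read."""
    try:
        with urllib.request.urlopen(f"{SG_ADMIN}/{db}/", timeout=5) as resp:
            return json.load(resp).get("state")
    except (OSError, ValueError):
        return None


def should_bring_online(state, cb_up):
    """Only ask for _online when the database is offline and Couchbase is up."""
    return state == "Offline" and cb_up


def bring_online(db):
    """Send POST /{db}/_online to the admin API and return the HTTP status."""
    req = urllib.request.Request(
        f"{SG_ADMIN}/{db}/_online",
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


def watch(interval=10):
    """Poll forever; call this explicitly to start the watchdog."""
    while True:
        cb_up = couchbase_reachable()
        for db in DATABASES:
            if should_bring_online(db_state(db), cb_up):
                try:
                    bring_online(db)
                except OSError as err:
                    print(f"could not bring {db} online: {err}")
        time.sleep(interval)
```

Calling watch() starts the loop. A TCP connect to port 8091 only shows that the node is listening, not that the bucket is ready, so a stricter health check may be needed. A database that has already been deleted is skipped here (its state cannot be read) and would have to be recreated separately.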


#2

What versions of SG and CB are you running?


#3

I am using Couchbase 4.0 Community and Sync Gateway 1.2 Community.


#4

SG is completely stateless, so all non-cached operations go to CB Server.

Couchbase will not auto-failover on a two-node cluster, and you're losing half of your capacity. Try a three-node cluster.

Also, in your Sync Gateway config file, which IP address do you have hard-coded? The one that is down?


#5

Thanks househippo for the quick reply. I tried configuring both nodes, but the behavior is not consistent: sometimes all Sync Gateway databases stay up and sometimes one goes offline. I received the warning below:
WARNING: Bucket Updater for bucket ptxdata returned error: Get http://xx:8091/pools/default/bucketsStreaming/ptxdata: dial tcp 10.2.0.184:8091: getsockopt: connection refused – base.GetCouchbaseBucket.func1() at bucket.go:469
11:34:37.243467 2016-04-18T11:34:37.243+05:30 WARNING: Lost TAP feed for bucket ptxdata, with error: Get http://xx:8091/pools/default/bucketsStreaming/ptxdata: dial tcp xx:8091: getsockopt: connection refused – rest.(*ServerContext)._getOrAddDatabaseFromConfig.func1() at server_context.go:646
11:34:37.243482 2016-04-18T11:34:37.243+05:30 CRUD: Taking Database : ptxdata, offline
11:34:37.243492 2016-04-18T11:34:37.243+05:30 CRUD: Waiting for all active calls to complete on Database : ptxdata

Sync Gateway gets the IPs of all nodes once it connects to Couchbase, so when the node listed in the config goes down, it should connect to the healthy node and operate at half capacity.
Is there an option to configure more than one IP in the Sync Gateway config?
I will try with 3 nodes and verify.


#6

When SG first connects to CB it creates these three documents.

Keys are distributed via CRC32("key-name-here") % 1024 = some vBucket number, so the three documents can end up in different vBuckets, and therefore on different nodes:

seq is the globally incremented number for every change in the DB.
syncdata is the default sync function.
user is the reference doc for a user.

So what is probably happening is that one bucket still has 2 of the 3 documents available when failover happens, so it sort of still runs, while the other has only 1 of the 3 and shuts down, or some weird mix.
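
To see which vBucket each of those documents lands in, the formula can be tried out directly. This is a rough sketch, not an authoritative implementation: vbucket_simple is the simplified formula quoted above, while vbucket_shifted is the variant that, as far as I know, Couchbase client libraries use (they take the upper bits of the CRC32 before masking). The key names _sync:seq and _sync:syncdata are my assumption about how Sync Gateway names these documents, so check them against your bucket.

```python
import zlib

NUM_VBUCKETS = 1024  # default vBucket count on Linux/Windows clusters


def vbucket_simple(key, num_vbuckets=NUM_VBUCKETS):
    """Simplified formula from the post: CRC32(key) % 1024."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def vbucket_shifted(key, num_vbuckets=NUM_VBUCKETS):
    """Variant using the upper CRC32 bits, then masking to the vBucket count.

    Assumes num_vbuckets is a power of two.
    """
    crc = zlib.crc32(key.encode("utf-8"))
    return ((crc >> 16) & 0x7FFF) & (num_vbuckets - 1)


# Assumed Sync Gateway document keys; adjust to what you see in the bucket.
for key in ("_sync:seq", "_sync:syncdata"):
    print(key, vbucket_simple(key), vbucket_shifted(key))
```

Comparing the printed vBucket numbers with the cluster's vBucket map shows which node holds each document, and so which documents become unavailable when a given node stops.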

Remember to turn on auto failover so that when you lose a node, in about 100ms to 2 minutes the replica on the other machine will be promoted to active.
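
Auto failover can be switched on in the web console or through the cluster REST API. Below is a minimal sketch of the REST call, assuming the /settings/autoFailover endpoint on port 8091 with form fields enabled and timeout. The host name cb-node1, the credentials, and the 30-second timeout are placeholders, not values from this thread. The request is only built here, not sent.

```python
import base64
import urllib.parse
import urllib.request


def auto_failover_request(host, user, password, timeout_s=30):
    """Build (but do not send) the POST that enables auto failover."""
    body = urllib.parse.urlencode(
        {"enabled": "true", "timeout": timeout_s}
    ).encode()
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"http://{host}:8091/settings/autoFailover",
        data=body,
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


# Placeholder host and credentials; replace with your own.
req = auto_failover_request("cb-node1", "Administrator", "password")
print(req.get_method(), req.full_url, req.data)
# urllib.request.urlopen(req)  # uncomment to apply it to a real cluster
```

As noted earlier in the thread, auto failover will not kick in on a two-node cluster, so this only helps once a third node has been added.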


#7

househippo, I don't have a user defined in the Sync Gateway config, so only 2 documents are created when Sync Gateway starts. The database goes offline when the node goes down, before the node is failed over.
I suspect that in this case the bucket updater is not called against the bucket on the other, still-active node.


#8

Thanks for the detailed information on this thread, @arihant_rk. This sounds like a case that should be getting handled by Sync Gateway. I’ve filed an issue to follow up: https://github.com/couchbase/sync_gateway/issues/1709