I have a two-node cluster, and yesterday one of the nodes went down. After about 10 minutes it was back up and had rejoined the cluster, and after about 25 minutes it had fully warmed up and was in the Ready state. Last night we noticed random lags in Sync Gateway responses from the node that did NOT go down, so I checked the logs this morning and saw this line over and over again:
14:42:24.297499 2016-06-03T14:42:24.297Z WARNING: Skipped Sequence 3966241 didn't show up in MaxChannelLogMissingWaitTime, and isn't available from the * channel view. If it's a valid sequence, it won't be replicated until Sync Gateway is restarted. -- db.func·005() at change_cache.go:206
These messages completely fill the log (the sequence value is one higher in each message), and after a few hours Sync Gateway became unresponsive. Restarting Sync Gateway fixed the issue, and it is now working fine with none of these messages in the log.
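In case it helps anyone looking at the same symptom, here is a rough sketch of how the warnings could be tallied from a log file to see how many sequences were skipped and whether they form one contiguous run. It is not part of Sync Gateway; the function name `summarize_skipped` and the log path in the usage note are my own assumptions, and the pattern is based only on the warning line quoted above.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Matches the warning quoted above; only the sequence number is captured.
SKIPPED_RE = re.compile(r"WARNING: Skipped Sequence (\d+)")


class SkippedSummary(NamedTuple):
    count: int            # number of matching warning lines
    first: Optional[int]  # lowest skipped sequence seen
    last: Optional[int]   # highest skipped sequence seen
    contiguous: bool      # True if the distinct sequences have no gaps


def summarize_skipped(lines: Iterable[str]) -> SkippedSummary:
    """Summarize 'Skipped Sequence' warnings found in an iterable of log lines."""
    seqs = []
    for line in lines:
        match = SKIPPED_RE.search(line)
        if match:
            seqs.append(int(match.group(1)))
    if not seqs:
        return SkippedSummary(0, None, None, False)
    unique = sorted(set(seqs))
    # A gap-free run has exactly (last - first + 1) distinct values.
    contiguous = unique[-1] - unique[0] + 1 == len(unique)
    return SkippedSummary(len(seqs), unique[0], unique[-1], contiguous)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python skipped.py /path/to/sync_gateway.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        print(summarize_skipped(log_file))
```

A contiguous run that starts around the time the node went down would suggest the skipped sequences all come from the failover window rather than from ongoing writes.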
Is this related to the node failing and rejoining the cluster? Is there any way we can mitigate this issue in the future, or is restarting Sync Gateway after all nodes in the cluster are in the Ready state the only way to fix it?