Java client (1.4.1) takes a long time to receive configuration updates after failover



I am using Couchbase Server 2.2 and Java client 1.4.1.

I have 3 nodes in my cluster. I am doing failover testing by bringing down one of the nodes.

I use iptables to make the node unreachable from the other machines.

The script is below:

CB_PORTS=$(sudo netstat -lptun | egrep -i 'moxi|memcache' | awk '{print $4}' | awk -F: '{print $NF}' | tr '\n' ' ')
for p in $CB_PORTS; do
    sudo iptables -I INPUT -p tcp -m tcp --dport $p -j REJECT
done
sudo iptables -I INPUT -p tcp -m multiport --dports 21100:21199 -j REJECT
sudo iptables -I INPUT -p tcp -m multiport --sports 21100:21199 -j REJECT
sudo iptables -I INPUT -p tcp -m tcp --dport 8091 -j REJECT
sudo iptables -I INPUT -p tcp -m tcp --dport 8092 -j REJECT

I notice that it takes a while for the Java client to detect that the node is down and to receive configuration updates, especially when the load is low (say 1 op/sec).

During this period the client keeps throwing timeout exceptions.

Under high load (100 ops/sec) it recovers faster, because it reaches DEFAULT_MAX_TIMEOUTEXCEPTION_THRESHOLD (998) sooner and then requests a configuration update.

At 1 op/sec it takes around 15 minutes to recover, which may be when it hits the 998 limit.
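
As a rough check on that guess, the time needed to accumulate 998 timeouts can be estimated from the operation rate. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and it assumes every operation times out and counts toward the threshold. If only the operations routed to the failed node time out, the real time would be longer.

```java
public class ThresholdEstimate {
    // Value quoted above for DEFAULT_MAX_TIMEOUTEXCEPTION_THRESHOLD.
    static final int THRESHOLD = 998;

    /**
     * Seconds until `threshold` timeouts have accumulated.
     * Assumption: every operation times out, so timeouts arrive at opsPerSecond.
     */
    static double secondsToThreshold(int threshold, double opsPerSecond) {
        if (opsPerSecond <= 0) {
            throw new IllegalArgumentException("opsPerSecond must be positive");
        }
        return threshold / opsPerSecond;
    }

    public static void main(String[] args) {
        // The two load levels mentioned in this post.
        for (double rate : new double[] {1.0, 100.0}) {
            double seconds = secondsToThreshold(THRESHOLD, rate);
            System.out.printf("%.0f ops/sec: %.1f s (%.1f min)%n",
                    rate, seconds, seconds / 60.0);
        }
    }
}
```

Under these assumptions, 1 op/sec needs 998 seconds (about 16.6 minutes) and 100 ops/sec needs about 10 seconds, which is in line with the recovery times I observe.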

I also tried the same test with the new client (1.4.2) and see the same issue.

Is there a configuration setting I am missing? Can the request for configuration updates be made more deterministic, for example retrying 30 seconds after a failover instead of waiting for TIMEOUTEXCEPTION_THRESHOLD to be reached?