retryStrategy on openBucket


#1

I’m curious whether the RetryStrategy is applied when connecting to a bucket. I know we can extend the default timeout from 2.5 seconds to whatever we want, but I was thinking of retrying a second time if I’m unable to connect on CouchbaseCluster#openBucket.


#2

Hi @mlblount45,

By default we retry certain failure cases on bucket connect using best effort. It would be interesting to know what the specific failure is in your case.

Which version of the SDK are you using?
Can you provide the trace logs?


#3

Thanks for your reply @subhashni. I’m using Java SDK version 2.2.1 (but can upgrade if needed).

I don’t have an active failure; I just want to improve how my code handles the case where a bucket fails to connect.

I should probably explain the scenario that prompted this question. My application loops through a list, opening several different buckets and storing each AsyncBucket in a HashMap for later use. About 97% of the time everything works fine, but occasionally one of the CouchbaseCluster#openBucket calls fails with a TimeoutException. One simple approach would be to increase the connect timeout, something like .openBucket(couchbaseProperties.getWriteBucket(), 10, TimeUnit.SECONDS).async(), but I don’t really like this. While looking for alternatives I came across .retryStrategy, but I was confused as to whether it is applied when an openBucket request is made. It felt like it was only for N1QL queries, views, etc.
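To make the scenario concrete, here is a minimal sketch of that loop, written so that one bucket failing to open does not abort the whole pass. `BucketOpener`, `Result`, and `openAll` are hypothetical names for illustration, and the opener is passed in as a plain function so the sketch does not depend on the SDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of the Couchbase SDK.
final class BucketOpener {
    private BucketOpener() {}

    /** Outcome of one pass over the list: opened handles and failed names. */
    static final class Result<B> {
        final Map<String, B> opened = new LinkedHashMap<>();
        final List<String> failed = new ArrayList<>();
    }

    /**
     * Tries to open every named bucket with the supplied function.
     * A RuntimeException from one bucket is recorded and the loop continues,
     * so the caller can decide later what to do with the failed names.
     */
    static <B> Result<B> openAll(List<String> names, Function<String, B> open) {
        Result<B> result = new Result<>();
        for (String name : names) {
            try {
                result.opened.put(name, open.apply(name));
            } catch (RuntimeException e) {
                result.failed.add(name);
            }
        }
        return result;
    }
}
```

With the SDK this would be called along the lines of `BucketOpener.openAll(names, n -> cluster.openBucket(n, 10, TimeUnit.SECONDS).async())`, where `cluster` and `names` are assumed to exist in the application. The names left in `failed` can then be retried or reported separately.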

It’s also not clear how the connectTimeout for openBucket interacts with maxRequestLifetime. Does the connection attempt run for the default 5,000 ms and then, if it fails, keep retrying for the 75,000 ms of maxRequestLifetime? In other words, assuming BestEffortRetryStrategy is in use, is the total time spent attempting to connect really always 75,000 ms? If so, it seems maxRequestLifetime may also need to be increased in most cases.


#4

The timeout for the entire bucket connection is 5 seconds by default. There is also a 1 second timeout on the underlying socket itself, so if the node is not reachable because of a firewall or network issue, a timeout exception is thrown immediately. If there is a configuration fetch failure, we use best effort: the request to the server is retried with an exponential delay until the connect timeout is reached. From your description, I think increasing the connect timeout makes more sense in your case. @daschl may be able to suggest a better alternative.
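As a rough illustration of the general pattern described here (retry with an exponentially growing delay until an overall deadline), the following sketch shows the idea in plain Java. It is not the SDK's internal implementation; `BackoffRetry`, `retryUntilDeadline`, and all parameter names are made up for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper illustrating the pattern, not SDK code.
final class BackoffRetry {
    private BackoffRetry() {}

    /**
     * Calls the action until it succeeds or totalTimeoutMs has elapsed.
     * The pause between attempts starts at initialDelayMs and doubles after
     * each failure, capped at maxDelayMs and at the time left to the deadline.
     */
    static <T> T retryUntilDeadline(Callable<T> action, long initialDelayMs,
                                    long maxDelayMs, long totalTimeoutMs) throws Exception {
        if (initialDelayMs <= 0 || maxDelayMs < initialDelayMs) {
            throw new IllegalArgumentException("delays must be positive and max >= initial");
        }
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(totalTimeoutMs);
        long delayMs = initialDelayMs;
        while (true) {
            try {
                return action.call();
            } catch (Exception e) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    // Overall budget used up: report a timeout, keeping the last failure as cause.
                    TimeoutException timeout =
                        new TimeoutException("gave up after " + totalTimeoutMs + " ms");
                    timeout.initCause(e);
                    throw timeout;
                }
                Thread.sleep(Math.min(delayMs, remainingMs));
                delayMs = Math.min(delayMs * 2, maxDelayMs);
            }
        }
    }
}
```

The sketch retries on every exception for brevity; a real caller would normally retry only failures that are known to be transient.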


#5

@subhashni @daschl OK. Do you think it’s worth retrying a second time on my end if I get a timeout, for example by catching the TimeoutException and trying once more? Or would that be wasteful and not beneficial?
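For reference, the retry-once idea could look roughly like the sketch below. `ConnectRetry`, `isTimeout`, and `callWithRetry` are hypothetical names. The sketch assumes the TimeoutException may arrive wrapped inside another exception, so it inspects the cause chain rather than only the top-level type.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the Couchbase SDK.
final class ConnectRetry {
    private ConnectRetry() {}

    /** Returns true if the throwable or any of its causes is a TimeoutException. */
    static boolean isTimeout(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof TimeoutException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the action up to maxAttempts times, retrying only on timeouts.
     * Any other exception is rethrown immediately; if every attempt times
     * out, the last timeout is rethrown.
     */
    static <T> T callWithRetry(Callable<T> action, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!isTimeout(e)) {
                    throw e;
                }
                last = e;
            }
        }
        throw last;
    }
}
```

A second attempt on openBucket would then be `ConnectRetry.callWithRetry(() -> cluster.openBucket(name, 5, TimeUnit.SECONDS), 2)`, with `cluster` and `name` coming from the application.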


#6

@mlblount45 Retrying might help, but it would be worthwhile to investigate the underlying cause (for example, whether the network is slow).