Client disconnects under "Idle", never able to reconnect


#1

We found a bug in the library: the client disconnects after sitting “idle” for more than half an hour and never reconnects. It’s easy to reproduce:

  • Create a client with the Couchbase client library (version 1.3.3.0), used only to read documents; let’s call it the ROC (read-only client).
  • Insert a document from another application (which embeds the same Couchbase client); this returns a key.
  • Use the ROC to read the document by that key. So far so good.
  • Let the ROC sit idle for more than half an hour.
  • Insert another document from the app; this returns a new key.
  • The new document can now be read by the app and by other, non-idle ROCs, but NOT by the first ROC, which had been idle for half an hour.

Here’s how we initialize the Couchbase client:

var cbConfig = new Couchbase.Configuration.CouchbaseClientConfiguration();
cbConfig.Bucket = "doc";
cbConfig.BucketPassword = "doc";
foreach (var s in Servers)
    cbConfig.Urls.Add(new Uri(string.Format("http://{0}:8091/pools/default", s)));
client = new CouchbaseClient(cbConfig);

Here’s how we read the document:

var result = client.ExecuteGet(key);

Here’s how we create the document:

var doc = Guid.NewGuid().ToString("N");
var key = string.Format("doc_{0}", token);
client.StoreJson(StoreMode.Add, key, doc, TimeSpan.FromSeconds(900));

Exception (converted to JSON) we caught in the idle ROC when we call ExecuteGet:

{
  "HasValue": false,
  "Value": null,
  "Cas": 0,
  "Success": false,
  "Message": "Exception reading response - 12.23.42.12:11210",
  "Exception": {
    "ClassName": "System.IO.IOException",
    "Message": "Failed to write to the socket '10.70.10.97:11210'. Error: ConnectionReset",
    "Data": {},
    "InnerException": null,
    "HelpURL": null,
    "StackTraceString": "   at Enyim.Caching.Memcached.ThrowHelper.ThrowSocketWriteError(EndPoint endPoint, SocketError error)   at Couchbase.CouchbasePooledSocket.Write(IList`1 buffers)   at Couchbase.CouchbaseNode.Execute(IOperation op)",
    "RemoteStackTraceString": null,
    "RemoteStackIndex": 0,
    "ExceptionMethod": "8\nThrowSocketWriteError\nEnyim.Caching, Version=1.3.3.0, Culture=neutral, PublicKeyToken=05e9c6b5a9ec94c2\nEnyim.Caching.Memcached.ThrowHelper\nVoid ThrowSocketWriteError(System.Net.EndPoint, System.Net.Sockets.SocketError)",
    "HResult": -2146232800,
    "Source": "Enyim.Caching",
    "WatsonBuckets": null
  },
  "StatusCode": 132,
  "InnerResult": {
    "Cas": 0,
    "Success": false,
    "Message": "Exception reading response - 12.23.42.12:11210",
    "Exception": {
      "ClassName": "System.IO.IOException",
      "Message": "Failed to write to the socket '10.70.10.97:11210'. Error: ConnectionReset",
      "Data": {},
      "InnerException": null,
      "HelpURL": null,
      "StackTraceString": "   at Enyim.Caching.Memcached.ThrowHelper.ThrowSocketWriteError(EndPoint endPoint, SocketError error)   at Couchbase.CouchbasePooledSocket.Write(IList`1 buffers)   at Couchbase.CouchbaseNode.Execute(IOperation op)",
      "RemoteStackTraceString": null,
      "RemoteStackIndex": 0,
      "ExceptionMethod": "8\nThrowSocketWriteError\nEnyim.Caching, Version=1.3.3.0, Culture=neutral, PublicKeyToken=05e9c6b5a9ec94c2\nEnyim.Caching.Memcached.ThrowHelper\nVoid ThrowSocketWriteError(System.Net.EndPoint, System.Net.Sockets.SocketError)",
      "HResult": -2146232800,
      "Source": "Enyim.Caching",
      "WatsonBuckets": null
    },
    "StatusCode": 132,
    "InnerResult": null
  }
}


#2

yli119 -

First of all, thanks for the detailed description and steps to reproduce; it’s much appreciated. Would you mind creating a Jira ticket for it? You can do so here: http://www.couchbase.com/issues/browse/NCBC. As a side note, if you create the NCBC ticket yourself, you’ll be notified as the status of the issue changes, so I suggest you do that.

This is likely a bug, but it could also be caused by other network-related factors. Note that the client does set TCP keep-alives; the interval between keep-alive “ping” requests, however, is defined by the hosting OS. On Windows, the default is 2 hours between pings. What can sometimes happen is that a load balancer (LB) or other network appliance terminates the connection because it sees the connection as idle.
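To illustrate the keep-alive point, here is a minimal sketch of how a shorter keep-alive interval can be requested per socket on Windows through Socket.IOControl with IOControlCode.KeepAliveValues. It only applies to a System.Net.Sockets.Socket that your own code owns; whether the 1.3.x client lets you reach its pooled sockets is not something this sketch assumes. The class and method names (KeepAliveSketch, BuildKeepAliveValues, Apply) are made up for the example.

```csharp
using System;
using System.Net.Sockets;

public static class KeepAliveSketch
{
    // Builds the 12-byte buffer Windows expects for SIO_KEEPALIVE_VALS:
    // three 32-bit values (on/off, idle time in ms, probe interval in ms).
    public static byte[] BuildKeepAliveValues(bool enabled, uint timeMs, uint intervalMs)
    {
        var buffer = new byte[12];
        BitConverter.GetBytes(enabled ? 1u : 0u).CopyTo(buffer, 0);
        BitConverter.GetBytes(timeMs).CopyTo(buffer, 4);
        BitConverter.GetBytes(intervalMs).CopyTo(buffer, 8);
        return buffer;
    }

    // Applies the values to a socket owned by the caller (Windows-specific).
    public static void Apply(Socket socket, uint timeMs, uint intervalMs)
    {
        socket.IOControl(IOControlCode.KeepAliveValues,
                         BuildKeepAliveValues(true, timeMs, intervalMs),
                         null);
    }
}
```

For example, KeepAliveSketch.Apply(socket, 60000, 1000) would ask for the first probe after 60 seconds of idle time and 1 second between unanswered probes. Both numbers are illustrative; they would need to be shorter than whatever idle timeout the network appliance enforces.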

Thanks,

Jeff


#3

I tried to file the bug in Jira, but couldn’t log in; my user/pass only works on the forums.

I’m experiencing the same issue, but it turns out that there’s a workaround for it. If you retry the ExecuteGet about 15 times, you’ll start receiving correct responses again.
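A rough sketch of that retry workaround, written as a generic helper so it does not depend on the client library. RetryHelper, its parameters, and the delay between attempts are my own invention for illustration, not part of the Couchbase API.

```csharp
using System;
using System.Threading;

public static class RetryHelper
{
    // Calls 'attempt' until 'succeeded' accepts the result or maxAttempts is reached.
    // Returns the last result either way, so the caller can still inspect a failure.
    public static T Retry<T>(Func<T> attempt, Func<T, bool> succeeded,
                             int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException("maxAttempts");

        T result = default(T);
        for (int i = 1; i <= maxAttempts; i++)
        {
            result = attempt();
            if (succeeded(result))
                return result;
            if (i < maxAttempts)
                Thread.Sleep(delay); // brief pause before the next try
        }
        return result;
    }
}
```

With the client from the first post it could be called as RetryHelper.Retry(() => client.ExecuteGet(key), r => r.Success, 20, TimeSpan.FromMilliseconds(50)). The limit of 20 and the 50 ms pause are arbitrary; I only know that about 15 attempts were enough in my case.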

Edit: I should mention that I’m connecting to a couchbase server running Linux.