Randomly getting generic network failure

I am running a website on 2 servers that host the website code (in PHP), plus 1 server acting as a load balancer. All 3 also run Couchbase instances that belong to a single cluster. See:

http://i.stack.imgur.com/TijKL.png

In PHP code, I am using couchbase buckets as follows:

$cluster = new \CouchbaseCluster('http://127.0.0.1:8091');
$greyloftWebbucket = $cluster->openBucket('some_bucket');
$query = \CouchbaseViewQuery::from('abcd', 'pqrs');

This arrangement works fine when all Couchbase instances are running. When any one of them is shut down and I try to access buckets, I randomly get the following error:

[2015-07-17 13:46:08] production.ERROR: exception 'CouchbaseException' with message 'Generic network failure. Enable detailed error codes (via LCB_CNTL_DETAILED_ERRCODES, or via `detailed_errcodes` in the connection string) and/or enable logging to get more information' in [CouchbaseNative]/CouchbaseBucket.class.php:282
Stack trace:
#0 [CouchbaseNative]/CouchbaseBucket.class.php(282): _CouchbaseBucket->http_request(1, 1, '/_design/abcd...', NULL, 1)
#1 [CouchbaseNative]/CouchbaseBucket.class.php(341): CouchbaseBucket->_view(Object(_CouchbaseDefaultViewQuery))
#2 /var/www/greyloft-laravel/app/couchbasemodel.php(25): CouchbaseBucket->query(Object(_CouchbaseDefaultViewQuery))
#3 /var/www/greyloft-laravel/app/Http/Controllers/Listing.php(42): App\couchbasemodel::listings()
#4 [internal function]: App\Http\Controllers\Listing->index()

That is, the page sometimes loads correctly and shows the bucket content, and sometimes shows the error above. It doesn’t matter whether I access the load balancer or either of the servers directly. Since I don’t get this error when all 3 Couchbase instances are running, I doubt it is a network failure.

Also, autofailover is enabled in the Couchbase cluster, with replication set to 1. On all 3 servers I have set LCB_LOGLEVEL=5. To set it, I modified my .bashrc on each server as well as /etc/init.d/couchbase:

...
start() {
    export LCB_LOGLEVEL=5
    touch $PIDFILE $NODEFILE $COOKIEFILE
    ...

What is happening? Is it a problem in the Couchbase PHP SDK or something else? I would really appreciate any help at all.

Update:

I have updated the connection string:

$cluster = new \CouchbaseCluster('http://127.0.0.1:8091?detailed_errcodes=1');

With this, the error message that I get is now:

CouchbaseException in CouchbaseBucket.class.php line 74: The remote host refused the connection. Is the service up?

It still pops up randomly (~50% of the time). I also cross-checked that autofailover is working:

Failed over 'ns_1@<ip_address>': ok
Node ('ns_1@<ip_address>') was automatically failovered.

It appears that the PHP SDK is trying to access the failed node. Am I doing something wrong, or is there another reason this error message keeps popping up?

(I also posted this on Stack Overflow.)

Make sure the server has actually been failed over. Autofailover waits for a timeout (2 minutes by default) before it fails over an unreachable node. During this period you will get errors whenever something tries to access that node.
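If you need to ride out that window in application code, one option is to retry the failing call a few times. This is only a rough sketch: retryOnFailure is a hypothetical helper, not part of the Couchbase PHP SDK, and the attempt count and delay are made-up defaults. Whether a retry actually reaches a healthy node is an assumption you would have to verify against your own cluster.

```php
<?php
/**
 * Run $operation and retry it when it throws, up to $maxAttempts times.
 * Hypothetical helper for illustration; not provided by the Couchbase SDK.
 */
function retryOnFailure(callable $operation, $maxAttempts = 3, $delayMs = 200)
{
    if ($maxAttempts < 1) {
        throw new \InvalidArgumentException('maxAttempts must be at least 1');
    }

    $lastError = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            // Success: hand the result straight back to the caller.
            return $operation();
        } catch (\Exception $e) {
            // Remember the failure and pause before the next attempt.
            $lastError = $e;
            if ($attempt < $maxAttempts) {
                usleep($delayMs * 1000);
            }
        }
    }

    // Every attempt failed: rethrow the last error unchanged.
    throw $lastError;
}
```

With the code from the question, the view query could be wrapped as retryOnFailure(function () use ($bucket, $query) { return $bucket->query($query); }), where $bucket stands for the opened bucket object. Retrying only hides the errors during the failover window; it does not shorten the window itself.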

Use detailed_errcodes=1 in your connection string to get more detailed error messages.

I think this would be a good reference for more explanation on what it does and other options that are available.

@mnunberg Can all of these options be passed into the connection string in the PHP SDK?

Yes, that’s the reference, and you should be able to pass any of these options into the connection string.
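To illustrate passing several options at once, here is a minimal sketch that assembles a connection string from a host list and an options array. buildConnectionString is a hypothetical helper, the 192.0.2.x addresses are placeholders, and operation_timeout (in seconds) is used as a second example option alongside detailed_errcodes; check the reference for the options your library version supports. The sketch uses the couchbase:// scheme with a comma-separated host list rather than the single http://127.0.0.1:8091 address from the question, on the assumption that listing every node gives the client more than one node to bootstrap from.

```php
<?php
/**
 * Build a connection string such as
 * couchbase://host1,host2?option=value&other=value
 * Hypothetical helper for illustration; not provided by the Couchbase SDK.
 */
function buildConnectionString(array $hosts, array $options = array())
{
    if (count($hosts) === 0) {
        throw new \InvalidArgumentException('At least one host is required');
    }

    // Comma-separated host list after the scheme.
    $connstr = 'couchbase://' . implode(',', $hosts);

    // Options become a URL-style query string.
    if (count($options) > 0) {
        $connstr .= '?' . http_build_query($options, '', '&');
    }

    return $connstr;
}

$connstr = buildConnectionString(
    array('192.0.2.10', '192.0.2.11', '192.0.2.12'), // placeholder node addresses
    array('detailed_errcodes' => 1, 'operation_timeout' => 5)
);

echo $connstr, PHP_EOL;
// couchbase://192.0.2.10,192.0.2.11,192.0.2.12?detailed_errcodes=1&operation_timeout=5
```

The resulting string would then be passed to the constructor, as in new \CouchbaseCluster($connstr).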