How to use an existing Couchbase connection?

Hi,

I was told by a Couchbase engineer that older versions of the PHP SDK had an optional configuration setting to cache cluster information, so that the PHP SDK does not have to gather cluster information for each process.

It looks like that option is gone and the only way to reuse an existing connection is to use FastCGI, such as PHP-FPM.

While that might work for most use cases, it is still not optimal. Some users are going to see slow performance from time to time.

I understand that caching cluster information on a remote server might not be a good idea, but is there a way to do that?

Or at least, is there a way to confirm that PHP is using an existing connection?

I don’t know why, but I NEVER get an answer to this question. Other people have asked this several times, but there is simply no answer.

Okay, let me describe what you can do with the PHP SDK now.

Let’s say that you have an /etc/hosts record like this:

127.0.0.1   localhost localhost0 localhost1 localhost2 localhost3 localhost4 localhost5 localhost6 localhost7 localhost8 localhost9

By default, PHP will cache and reuse bucket connections that have the same connection string. Compare the following two examples:

for ($i = 0; $i < 10; $i++) {
    $cluster = new CouchbaseCluster("couchbase://localhost1");
    $bucket = $cluster->openBucket('travel-sample');
    $bucket->upsert('foo', 'bar');
}

$ strace -e trace=connect php test.php
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(3, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
+++ exited with 0 +++

Here we can see a single network socket with file descriptor 3. Now connect to localhost using different aliases:

for ($i = 0; $i < 10; $i++) {
    $cluster = new CouchbaseCluster("couchbase://localhost" . $i);
    $bucket = $cluster->openBucket('travel-sample');
    $bucket->upsert('foo', 'bar');
}

$ strace -e trace=connect php test.php
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(3, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(5, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(6, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(6, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(7, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(7, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(8, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(8, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(9, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(9, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(9, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(10, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(10, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(11, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(11, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(12, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(12, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
+++ exited with 0 +++

As expected, we have 10 connections, with file descriptors from 3 to 12.
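If reading the trace by eye gets tedious, the counting can be scripted. The following is only a rough sketch, not part of the SDK: the function name countConnectAttempts is made up for illustration, and it assumes the client uses non-blocking connects as in the traces above, so that every TCP connection attempt shows up as exactly one EINPROGRESS line.

```php
<?php
/**
 * Count TCP connection attempts per destination port in the output of
 * `strace -e trace=connect`. Assumption: sockets are non-blocking, so each
 * attempt produces exactly one "EINPROGRESS" line (as in the traces above).
 * Lines without EINPROGRESS (nscd lookups, final "= 0" results) are ignored.
 */
function countConnectAttempts(string $straceOutput): array
{
    $attempts = [];
    foreach (preg_split('/\R/', $straceOutput) as $line) {
        $pattern = '/^connect\(\d+, \{sa_family=AF_INET, sin_port=htons\((\d+)\).*= -1 EINPROGRESS/';
        if (preg_match($pattern, $line, $m)) {
            $port = (int) $m[1];
            $attempts[$port] = ($attempts[$port] ?? 0) + 1;
        }
    }
    ksort($attempts); // stable, port-ordered result
    return $attempts;
}

// Sample: the first two data connections from the trace above.
$sample = <<<'TRACE'
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(3, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
TRACE;

print_r(countConnectAttempts($sample));
```

strace writes to stderr, so to feed it a real trace you would capture the output with something like `strace -e trace=connect php test.php 2> trace.txt` and pass `file_get_contents('trace.txt')` to the function. A count of 1 for port 11210 means the connection was reused within the process.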

The optimisation you mentioned, caching the shared configuration, still exists, and it makes sense if you are using HTTP to bootstrap the client (i.e. the CCCP provider is disabled with LCB_NO_CCCP=1, or the HTTP provider is selected explicitly). Let’s see it in action:

for ($i = 0; $i < 10; $i++) {
    $cluster = new CouchbaseCluster("couchbase://localhost" . $i . "?bootstrap_on=http");
    $bucket = $cluster->openBucket('travel-sample');
    $bucket->upsert('foo', 'bar');
}

$ strace -e trace=connect php test.php
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(3, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(5, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(6, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(6, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(7, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(7, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(8, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(8, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(9, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(9, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(10, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(10, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(11, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(11, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(12, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(12, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(13, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(13, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(14, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(14, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(15, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(15, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(15, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(16, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(16, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(16, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(17, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(17, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(18, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(18, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(19, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(19, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(20, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(20, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(21, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(21, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(22, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(22, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
+++ exited with 0 +++

Here we can see that for each data connection (port 11210), the library also opens an HTTP connection (port 8091) to listen for configuration changes. We can get rid of these 8091 sockets if we instruct the library to read the configuration from a file:

for ($i = 0; $i < 10; $i++) {
    $cluster = new CouchbaseCluster("couchbase://localhost" . $i . "?bootstrap_on=http&config_cache=/tmp/cache.json");
    $bucket = $cluster->openBucket('travel-sample');
    $bucket->upsert('foo', 'bar');
}

This reduces the HTTP connections to just one and uses the file as a cache for the configuration:

$ strace -e trace=connect php test.php
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(3, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(5, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(6, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(6, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(7, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(7, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(8, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(8, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(9, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(9, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(10, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(10, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(11, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(11, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(12, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(12, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(13, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(13, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
+++ exited with 0 +++

More importantly, other processes see this cache file and will not create an HTTP bootstrap connection at all. The cache is invalidated when the library detects a stale topology.
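Since the benefit depends on other processes being able to see the file, it can be worth checking that the cache exists and is readable by the user your PHP workers run as. A minimal sketch, assuming the /tmp/cache.json path from the example above; describeConfigCache is a hypothetical helper, not an SDK function, and it deliberately does not try to parse the file contents:

```php
<?php
/**
 * Report whether a config cache file is present and readable by the
 * current user. Only file metadata is inspected; the contents are
 * treated as opaque.
 */
function describeConfigCache(string $path): string
{
    clearstatcache(true, $path); // avoid stale stat results in long-running workers
    if (!file_exists($path)) {
        return "missing: $path";
    }
    if (!is_readable($path)) {
        return "not readable by this user: $path";
    }
    $ageSeconds = time() - filemtime($path);
    return sprintf('ok: %s (%d bytes, modified %d s ago)', $path, filesize($path), $ageSeconds);
}

// Path taken from the config_cache option used in the example above.
echo describeConfigCache('/tmp/cache.json'), PHP_EOL;
```

Running this once from the command line and once through the web server shows whether both users see the same file.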

That’s really really helpful. Thank you very much!

If a caching option is still available and the SDK can detect changes, why isn’t it recommended? It sounds like the most optimized way of using the SDK, at least for PHP, given its share-nothing model.

The recommended way is to use CCCP, which is the default, because it doesn’t need an extra connection at all: it uses the same data connection. The cache optimization makes sense for HTTP, where keeping an extra connection for topology updates is quite expensive.

Please correct me if I’m wrong. I’m still not quite sure if I understand it correctly.

  1. The first example in your post uses the same connection because it creates the new CouchbaseCluster objects within the same process. That totally makes sense. I was told that I have to use PHP-FPM to reuse an existing process. If I set pm.max_requests to 1000 in PHP-FPM, does that mean that all 1000 requests will use the same connection? I wasn’t sure how this works.

  2. I was also told that if I only pass the IP address of a Couchbase node, the C library uses couchbase:// by default, which means it uses CCCP (?). Am I correct?

Hello,

This is unrelated to the main thread (I think). I ran your command strace -e trace=connect php aPhp.php on my server and got the following output. It looks like I have a permission problem. Is this a problem? I have two ECONNREFUSED results.

connect(3, {sa_family=AF_LOCAL, sun_path="/tmp/.newrelic.sock"}, 22) = 0
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.x.x.2")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("10.x.x.164")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("10.x.x.11")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("10.x.x.11")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("10.x.x.11")}, 16) = -1 ECONNREFUSED (Connection refused)
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("10.x.x.164")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("10.x.x.164")}, 16) = -1 ECONNREFUSED (Connection refused)
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.x.x.2")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("10.x.x.11")}, 16) = 0
connect(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("10.x.x.164")}, 16) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("10.x.x.11")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(4, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("10.x.x.11")}, 16) = 0
+++ exited with 0 +++

@moon0326 you need to make sure that the right ports are reachable from your client, especially 11210 and 8091-8094 for non-SSL communication.
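A quick way to check this from the client machine is a plain TCP connect to each port. This is only a sketch using PHP’s built-in fsockopen; isPortReachable is a made-up helper name, and the host below is a placeholder that you would replace with the address of each Couchbase node:

```php
<?php
/**
 * Try to open a TCP connection to $host:$port and report whether it
 * succeeded. A refused or timed-out connection returns false.
 */
function isPortReachable(string $host, int $port, float $timeoutSeconds = 2.0): bool
{
    $errno = 0;
    $errstr = '';
    // @ suppresses the warning fsockopen() emits when the connection fails.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeoutSeconds);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

$host = '127.0.0.1'; // placeholder: use your Couchbase node address here
foreach ([8091, 8092, 8093, 8094, 11210] as $port) {
    printf("%s:%d %s\n", $host, $port, isPortReachable($host, $port) ? 'reachable' : 'NOT reachable');
}
```

This only shows that something accepts connections on the port; it does not verify that the service behind it is healthy.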

Hi,

What happens if they are not? My server has been working just fine. I’m just wondering whether it will cause any errors or has been causing performance issues.

@avsej sorry for pinging this late. Could you please answer my questions? I still don’t have an answer for #1.

When I use CCCP, does it save the data (cache) locally and use it for new processes as well, or is this on a per-process basis?