Client connection monitor using C SDK


#1

Hi,
I am using the C SDK to connect to a Couchbase cluster, and before starting detailed hands-on work I would like to know the different options for getting the connection status with respect to the cluster nodes.

So far I have found the following option:

  1. Call lcb_ping3 against the cluster and get the node details for each service. Based on this node information I can decide whether a new node has been added or an existing node is down.

Is there any other way to monitor the connection status so that I can pass the information to my monitoring server?

Thanks & Regards
Jatin


#2

lcb_ping3 is the better choice when you want to actively reach all nodes. That can be expensive, because it requires sending at least one command to each service on each node of the cluster.

The second option you have is lcb_diag:

http://docs.couchbase.com/sdk-api/couchbase-c-client-2.9.0/group__lcb-ping.html#ga7fd5094f85f4c151360372bd520b4999

This function only reports information about the current connections and does not trigger any new ones.

Your third option is the metrics counters, which expose numbers about the amount of data and the operations processed by the library: number of requests, number of failures, etc. You can find a usage example in the cbc-pillowfight tool sources.

What all of these options have in common is that they focus on the health of the SDK and its connections to the cluster. If you want to expose the health of the cluster itself to your monitoring server, you definitely need to consume server stats and metrics. The binary protocol lets you fetch a subset of them through the STATS command, which is available through cbc-stats. But you can get much more information using the server tooling:
https://developer.couchbase.com/documentation/server/current/rest-api/rest-bucket-stats.html
https://developer.couchbase.com/documentation/server/current/cli/cbstats-intro.html
https://developer.couchbase.com/documentation/server/current/monitoring/monitor-intro.html


#3

Hi Sergey,

Thanks for your response!
I will definitely try the suggested cbc-stats option to get the cluster health onto the monitoring server.

However, I have this specific use case, wherein my application will have a monitor thread which will periodically get the cluster nodes information and based on this information, it will send an alert to the end user that a node has been added or is unavailable.
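The comparison step of such a monitor thread can be sketched independently of how the node list is obtained. This is a minimal sketch, assuming each poll produces an array of host strings; the names `node_event`, `diff_nodes`, and the enum values are hypothetical and not part of libcouchbase.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event types for the alert described above. */
typedef enum { NODE_ADDED, NODE_UNAVAILABLE } node_event_kind;

typedef struct {
    node_event_kind kind;
    const char *host; /* points into the caller's snapshot arrays */
} node_event;

/* Returns 1 if `host` appears in `list`, 0 otherwise. */
static int contains(const char *const *list, size_t n, const char *host)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(list[i], host) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Compares the previous and current snapshots of node hosts.
 * Hosts only in `cur` are reported as NODE_ADDED, hosts only in `prev`
 * as NODE_UNAVAILABLE. Writes at most `max_out` events and returns the
 * number of events written. */
static size_t diff_nodes(const char *const *prev, size_t nprev,
                         const char *const *cur, size_t ncur,
                         node_event *out, size_t max_out)
{
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < ncur && n < max_out; i++) {
        if (!contains(prev, nprev, cur[i])) {
            out[n].kind = NODE_ADDED;
            out[n].host = cur[i];
            n++;
        }
    }
    for (size_t i = 0; i < nprev && n < max_out; i++) {
        if (!contains(cur, ncur, prev[i])) {
            out[n].kind = NODE_UNAVAILABLE;
            out[n].host = prev[i];
            n++;
        }
    }
    return n;
}
```

The monitor thread would keep the last snapshot, call `diff_nodes` after each poll, and send one alert per returned event.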

I tried lcb_diag, but I did not get the expected response. Please find below the cases I tried.

I have a cluster with 2 nodes (172.16.130.68 and 172.16.129.68), where 172.16.130.68 is the local host and 172.16.129.68 is the remote host.
In case 1, I did NOT receive the host/IP details of the remote node.
But in case 2, I did receive the host/IP details of the remote node.

case 1: Running the test program on localhost, with connection string pointed to local node
options.v.v3.connstr = "couchbase://172.16.130.68/TestBucket"
The diagnostic response is as below
{"config":[{"id":"0x2199700","last_activity_us":5004743,"local":"172.16.130.68:45866","remote":"172.16.130.68:11210","status":"connected"}],"id":"0x2194c80","sdk":"libcouchbase/2.9.0","version":1}

case 2: Running the test program on localhost, with connection string pointed to remote node
options.v.v3.connstr = "couchbase://172.16.129.68/TestBucket"
The diagnostic response is as below
{"config":[{"id":"0x2199700","last_activity_us":5004743,"local":"172.16.130.68:45866","remote":"172.16.129.68:11210","status":"connected"}],"id":"0x2194c80","sdk":"libcouchbase/2.9.0","version":1}
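For passing such a report on to a monitoring server, the "remote" endpoints can be pulled out of the JSON text. This is only a rough sketch: it assumes the compact layout shown in the responses above (no whitespace around the colon, no escaped quotes inside values), and a real JSON parser would be the more robust choice. `extract_remotes` and `ENDPOINT_MAX` are hypothetical names, not SDK symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENDPOINT_MAX 64 /* assumed upper bound for a "host:port" string */

/* Scans `json` for every "remote":"..." pair and copies the values into
 * `out`. Values that do not fit into ENDPOINT_MAX bytes are skipped.
 * Returns the number of endpoints stored (at most `max_out`). */
static size_t extract_remotes(const char *json,
                              char out[][ENDPOINT_MAX], size_t max_out)
{
    static const char key[] = "\"remote\":\"";
    const char *p = json;
    size_t count = 0;

    assert(json != NULL && out != NULL);
    while (count < max_out && (p = strstr(p, key)) != NULL) {
        const char *end;
        size_t len;

        p += sizeof(key) - 1;  /* skip past the key and opening quote */
        end = strchr(p, '"');  /* closing quote of the value */
        if (end == NULL) {
            break;             /* truncated input, stop scanning */
        }
        len = (size_t)(end - p);
        if (len < ENDPOINT_MAX) {
            memcpy(out[count], p, len);
            out[count][len] = '\0';
            count++;
        }
        p = end + 1;
    }
    return count;
}
```

Applied to either response above, this yields a single entry, because each report lists only one connection.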

Are these responses correct, or am I missing something?

Thanks,
Jatin


#4

As I wrote previously, lcb_diag() does not create any new connections. If you want to see other addresses, you have to send operations to those nodes, which will open new socket connections. Alternatively, use lcb_ping3() to broadcast NOOP-like operations to all services on all nodes, which makes sense if you share the lcb_t instance. But you mentioned threads, so I must warn you that lcb_t is not thread-safe: you need a global mutex that is locked for any operation on the lcb_t (callbacks, direct calls, etc.).
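One way to arrange the global mutex is to route every use of the shared handle through a single locked entry point. The following is a minimal sketch using POSIX threads; `shared_handle`, `shared_handle_run`, and `example_op` are hypothetical names, and the `void *instance` field merely stands in for the application's lcb_t.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper pairing the shared handle with its mutex. */
typedef struct {
    pthread_mutex_t lock;
    void *instance; /* would hold the lcb_t in a real program */
} shared_handle;

/* An operation to run while the lock is held. */
typedef void (*handle_op)(void *instance, void *arg);

/* Initializes the wrapper; returns 0 on success, like pthread_mutex_init. */
static int shared_handle_init(shared_handle *h, void *instance)
{
    assert(h != NULL);
    h->instance = instance;
    return pthread_mutex_init(&h->lock, NULL);
}

/* Runs `op` with the mutex held, so no two threads touch the handle at
 * the same time. Anything `op` triggers synchronously (for example,
 * callbacks fired while it waits for completion) also runs under the lock. */
static void shared_handle_run(shared_handle *h, handle_op op, void *arg)
{
    assert(h != NULL && op != NULL);
    pthread_mutex_lock(&h->lock);
    op(h->instance, arg);
    pthread_mutex_unlock(&h->lock);
}

/* Placeholder operation: a real one would schedule an SDK call and wait
 * for its callback. Here it only increments a counter passed via `arg`. */
static void example_op(void *instance, void *arg)
{
    (void)instance;
    ++*(int *)arg;
}
```

Both the monitor thread and the worker threads would then call `shared_handle_run` instead of using the handle directly.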

Also, you said you want information about the nodes, not about the connections to them. That definitely means you need to use the server REST APIs, because lcb_diag and lcb_ping3 only give you information from the SDK's point of view. For cluster monitoring, look at the last option I described in my previous post: the cluster tooling and its monitoring APIs.