Trouble querying, but can get documents

Using the code below, I can download a full document from a bucket, but issuing a query to that same cluster fails with an exception. I am having difficulty figuring out where to look to fix the issue. I’m using Python 3.9.5 with the 3.1.1 Python SDK and couchbase-ce 6.6.0.

What I have here is a narrowed-down reproduction of the problem. Does anyone have any ideas on what to look for? Couchbase is running inside a docker container, and it uses a docker bridge network to communicate with another container, which is executing this code. Could that network layering be a problem somehow? (I’m not sure why it would only fail for cluster queries, though.)

First here is the exception on query:

couchbase.exceptions.InvalidArgumentException: <RC=0xCB[LCB_ERR_INVALID_ARGUMENT (203)], There was a problem scheduling your request, or determining the appropriate server or vBucket for the key(s) requested. This may also be a bug in the SDK if there are no network issues, C Source=(src/pycbc_http.c,388)>
Full Exception
 >>> cluster.query('select 1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/couchbase/cluster.py", line 557, in query
    return self._maybe_operate_on_an_open_bucket(CoreClient.query,
  File "/usr/local/lib/python3.9/site-packages/couchbase/cluster.py", line 576, in _maybe_operate_on_an_open_bucket
    if self._is_6_5_plus():
  File "/usr/local/lib/python3.9/site-packages/couchbase/cluster.py", line 507, in _is_6_5_plus
    response = self._admin.http_request(path="/pools").value
  File "/usr/local/lib/python3.9/site-packages/couchbase/management/admin.py", line 182, in http_request
    return self._http_request(type=LCB.LCB_HTTP_TYPE_MANAGEMENT,
couchbase.exceptions.InvalidArgumentException: <RC=0xCB[LCB_ERR_INVALID_ARGUMENT (203)], There was a problem scheduling your request, or determining the appropriate server or vBucket for the key(s) requested. This may also be a bug in the SDK if there are no network issues, C Source=(src/pycbc_http.c,388)>

The code:

from couchbase.cluster import Cluster, ClusterOptions, QueryOptions, QueryScanConsistency
from couchbase_core.cluster import PasswordAuthenticator
cluster = Cluster.connect("couchbase://couchbase.my.domain", ClusterOptions(PasswordAuthenticator("Administrator", "password")))
bucket = cluster.bucket("resource")
collection = bucket.default_collection()
collection.get('testdoc')  # works
bucket.get('testdoc')  # works
bucket.query('SELECT 1') # works
cluster.query('SELECT 1') # fails

Wondering how bucket.query is working when it’s not even a thing in SDK 3.x. Something is not right here.
@jcasey, is bucket.query still exposed?


I don’t think it’s supposed to be? It’s not in the docs. I accidentally tried it, and saw that it existed and then worked… so I included it in case it was helpful. I’m using cluster.query() in my code.

I’m still at a loss as to how cluster.query is not working, but bucket/collection.get both are (i.e. it’s not a connection/network issue). Also, when I say “works” above, I just mean it doesn’t raise an exception. I didn’t actually check the output.

Thanks for pointing out that method, I was wondering myself if it was a mistake in the docs or unintentionally exposed.

-Eric

No problem, @erichorne. One thing I can think of: are ports 8093 and 18093 open and accessible from your app machine?

8093 is open as a TCP port and I can connect to it from the other container. However, 18093 is not open, neither from the remote container nor from within the couchbase container itself. These are the ports listening in the couchbase container, and I can establish a connection to each of them from the remote container (I just checked each port):

tcp        0      0 0.0.0.0:11209           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:11210           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:9100            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:21100           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:9101            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:9102            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:9999            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:21200           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:9105            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:4369            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:21300           0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8091            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8092            0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:8093            0.0.0.0:*               LISTEN      -
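
For anyone repeating this port check from inside an app container that has Python but no netstat or telnet, here is a minimal sketch. The helper names (`port_open`, `check_ports`) and the port labels are illustrative assumptions, not part of the Couchbase SDK; the ports are the ones discussed in this thread.

```python
import socket

# Ports discussed in this thread (labels are for readability only):
# 8091 = cluster management REST, 8093 = query service, 11210 = key-value.
PORTS = {"management": 8091, "query": 8093, "kv": 11210}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_ports(host: str) -> dict:
    """Try each port in PORTS and report which ones accept a connection."""
    return {name: port_open(host, port) for name, port in PORTS.items()}
```

Calling `check_ports("<couchbase container>")` from the app container returns a dict of booleans. Note that this only shows that a TCP connection can be opened; it says nothing about whether the SDK resolves the name the same way.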

Hi @erichorne – Can you bash into the app container and try the following curl command (using the appropriate value for the “couchbase container”)?

curl -u Administrator:password http://<couchbase container>:8091/pools

From the error, the query is not actually executing. It is failing on a check to see what version of Couchbase Server is running (the client does this to determine whether it can execute the query at the cluster level). The check is done prior to executing the query, so this seems to point to the app container’s network setup. Another check to confirm: can you run the same code on your local machine, hitting the Couchbase Server instance in the Docker container?
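
If curl is not installed in the app container, the same /pools request can be made with the Python standard library. This is only a sketch of the equivalent HTTP call, not what the SDK does internally; the function names are made up for illustration, and the `implementationVersion` field is the one visible in the /pools response posted in this thread.

```python
import base64
import json
import urllib.request


def build_pools_request(host, username, password, port=8091):
    """Build a GET request for /pools with HTTP basic auth (like curl -u)."""
    url = f"http://{host}:{port}/pools"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def server_version(host, username, password, timeout=5.0):
    """Send the request and return the server's implementationVersion string."""
    req = build_pools_request(host, username, password)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return body["implementationVersion"]
```

Running `server_version("<couchbase container>", "Administrator", "password")` against the server should return its version string if the management port is reachable from where the code runs.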

To @AV25242’s point, bucket.query() should not be used in the 3.x SDK, but unfortunately it is still available at the moment. It will most likely be removed in a future release of the SDK.

Here is the output from the remote container. It was successful.

Output
{
  "isAdminCreds": true,
  "isROAdminCreds": false,
  "isEnterprise": false,
  "allowedServices": [
    "kv",
    "n1ql",
    "index",
    "fts"
  ],
  "isIPv6": false,
  "isDeveloperPreview": false,
  "packageVariant": "ubuntu16.04/docker",
  "pools": [
    {
      "name": "default",
      "uri": "/pools/default?uuid=daf2b149bdbeaceb3afd9d06e9d6e9ce",
      "streamingUri": "/poolsStreaming/default?uuid=daf2b149bdbeaceb3afd9d06e9d6e9ce"
    }
  ],
  "settings": {
    "maxParallelIndexers": "/settings/maxParallelIndexers?uuid=daf2b149bdbeaceb3afd9d06e9d6e9ce",
    "viewUpdateDaemon": "/settings/viewUpdateDaemon?uuid=daf2b149bdbeaceb3afd9d06e9d6e9ce"
  },
  "uuid": "daf2b149bdbeaceb3afd9d06e9d6e9ce",
  "implementationVersion": "6.6.0-7909-community",
  "componentsVersion": {
    "sasl": "3.1.2",
    "os_mon": "2.4.4",
    "inets": "6.5.2.4",
    "crypto": "4.2.2.2",
    "ale": "0.0.0",
    "lhttpc": "1.3.0",
    "stdlib": "3.4.5.1",
    "ssl": "8.2.6.4",
    "kernel": "5.4.3.2",
    "public_key": "1.5.2",
    "asn1": "5.0.5.2",
    "ns_server": "6.6.0-7909-community"
  }
}

I was planning on trying to hit Couchbase directly from the container host, but there’s a bunch of configuration that needs to happen first (installing the libs and such). I’ll get back to you; thanks for the ideas!

-Eric

I ran a packet capture (tcpdump) in the remote container. When I try to get a document, I see it connecting to the couchbase container successfully on port 11210. When cluster.query is tried, there is no network activity. It seems like it doesn’t like something about a setting or the connection string.

My connection string is couchbase://couchbase_db_1.mydomain.org. Changing that to http did not help.

When I replace the hostname with the IP address, it works. So there’s something funny with the connection string or with resolving the FQDN to an IP address. Of course, I’m able to resolve the address in the container via ping or with the other couchbase functions (collection.get), so there isn’t anything obviously wrong with name resolution. I’ll report back when I find out more.

-Eric


I’ve only had a little time to look into this. This is all I’ve found so far:
Using the IP address instead of the FQDN resolves the issue (in a sense).
Using the FQDN or just the hostname both fail for cluster.query, although both names ping and work otherwise in the container.
Using the FQDN or just the hostname, Cluster.connect succeeds in both cases. That is, no error is produced. However, when I use a non-resolvable name, Cluster.connect does throw an exception.

I spent some time tracing the flow of execution. It comes down to a call to an _http_request function that looks like a C call back into libcouchbase. I’m having difficulty finding the actual function, looking both in the libcouchbase github and the python-sdk github. The class “couchbase.management.Admin” is created from the class LCB.Bucket, which comes from couchbase_core._libcouchbase – searching for “Bucket” in there is, as you can imagine, getting a lot of hits.

Do you know where I can find the code for Bucket or ultimately Bucket._http_request? I’m hoping that may expose what’s happening here. The problem, I’m sure, is something in the docker setup, but I need to know how to reproduce the issue outside of couchbase to figure it out.

In the meantime, I just use the IP address to connect – but this is not a great solution.
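
One way to make the IP workaround a little less brittle is to resolve the name at startup instead of hard-coding the address. This is a sketch under assumptions: the helper name `connection_string_via_ip` is made up, it takes the first IPv4 address returned by the OS resolver, and it assumes the resulting string is passed to Cluster.connect as in the code earlier in the thread.

```python
import socket


def connection_string_via_ip(host: str, scheme: str = "couchbase") -> str:
    """Resolve host with the OS resolver and build a connection string from the IP."""
    infos = socket.getaddrinfo(host, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    ip = infos[0][4][0]
    return f"{scheme}://{ip}"
```

Because container IPs can change when containers are recreated, the lookup would need to run each time the application starts rather than being cached in configuration.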

Thanks for your help!

-Eric

Quick ping… Can anyone direct me to the source code for the Python SDK’s Bucket._http_request? I am trying to track down an issue in which _http_request is not able to resolve an FQDN from within a container on a container network, but that’s the only place in the system that cannot resolve the name. All other places (the OS, couchbase’s original connection, and couchbase’s get document functions) are able to resolve the name. Using an IP address does work. Thus I’m left trying to figure out what exactly is happening in the Bucket._http_request function, but I can’t seem to find it.
Thank you for the help!

-Eric

I’ve discovered that the FQDN resolution doesn’t like underscores in the name. For a container started by docker-compose, the hostname defaults to – so in my example, couchbase_db_1. When I force docker-compose to use the hostname couchbase1, the name resolution works. Underscores in DNS names have been problematic for a long time; interesting that cluster.query() uses a different name resolution mechanism than collection.get(). Either way, documenting in case anyone else happens to run into this.
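
To catch this kind of name before it reaches the SDK, a small pre-flight check can flag labels that fall outside the traditional hostname syntax (letters, digits and hyphens, with no leading or trailing hyphen). This is a sketch: the function name `invalid_labels` is made up, and whether the SDK’s HTTP path applies exactly this rule is an assumption based on the behaviour described above.

```python
import re

# One hostname label: 1-63 characters, letters/digits/hyphens only,
# and no hyphen at the start or end. Underscores are not allowed.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def invalid_labels(hostname: str) -> list:
    """Return the dot-separated labels of hostname that break the syntax above."""
    name = hostname.rstrip(".")  # tolerate a trailing root dot
    return [label for label in name.split(".") if not _LABEL.match(label)]
```

For example, `invalid_labels("couchbase_db_1.mydomain.org")` reports the `couchbase_db_1` label, while `invalid_labels("couchbase1.mydomain.org")` returns an empty list.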