Couchbase Client 3.1.3 Bootstrap errors

A few weeks ago I posted about Couchbase Client/Bucket initialization problems using .NET 5 + Couchbase Client 3.1.X. At the time I thought I had fixed the problem by calling the connection config in a certain way, but it turns out the problem wasn’t fixed.

Digging deeper on .NET Core 3.1 + the newest Client 3.1.3, I found the following issues:

3.1.X code has this in ConnectionFactory:
            //The endpoint we are connecting to
            var targetHost = endPoint.Address.ToString();

            ....

            //create the sslstream with appropriate authentication
            await sslStream.AuthenticateAsClientAsync(targetHost, certs,
                    SslProtocols.Tls | SslProtocols.Tls11 | SslProtocols.Tls12,
                    _clusterOptions.EnableCertificateRevocation)
                .ConfigureAwait(false);

This fails with the following exception:
“Cannot bootstrap bucket [BUCKET] as Couchbase. System.Security.Authentication.AuthenticationException: The remote certificate is invalid according to the validation procedure. at System.Net.Security.SslStream.StartSendAuthResetSignal(ProtocolToken message, AsyncProtocolRequest asyncRequest, ExceptionDispatchInfo exception)…”

When comparing to 2.7.X, it seems the problem may lie in using the IP address instead of the hostname in the AuthenticateAsClientAsync call. The 2.7.x code (from SSLConnection) is as follows:

            var targetHost = ConnectionPool.Configuration.Uri.Host;

	....

                    _sslStream.AuthenticateAsClientAsync(targetHost,
                        certs,
                        SslProtocols.Tls | SslProtocols.Tls11 | SslProtocols.Tls12,
                        Configuration.ClientConfiguration.EnableCertificateRevocation).Wait();

  1. As a temporary workaround, to check whether fixing the above is all that is needed, I can get around the auth issue either by changing the Couchbase client code to hardcode the hostname instead of the IP address, or by setting the following options:
    opts.KvIgnoreRemoteCertificateNameMismatch = true;
    opts.HttpIgnoreRemoteCertificateMismatch = true;

However, in doing so I hit a second issue in ClusterNode, where the following call in ExecuteOp never returns:

            ResponseStatus status;
            using (new OperationCancellationRegistration(op, tokenPair))
            {
                status = await op.Completed.ConfigureAwait(false);
            }

This seems to point to two possible issues:

  • an error in the actual operation
  • cancellation doesn’t kick in (I’m not sure what the timeout is, though)
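
One way to tell these two cases apart is to bound the awaited call with an explicit timeout, so that a task that never completes surfaces as an exception instead of hanging. The following is a minimal sketch, not part of the Couchbase SDK: `TaskTimeoutExtensions` and `WithTimeout` are hypothetical names introduced here, and the helper only stops waiting; it does not cancel the underlying operation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not an SDK type): fails fast when a task never completes.
public static class TaskTimeoutExtensions
{
    public static async Task<T> WithTimeout<T>(this Task<T> task, TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource();

        // Race the real task against a timer.
        var delay = Task.Delay(timeout, cts.Token);
        var finished = await Task.WhenAny(task, delay).ConfigureAwait(false);

        if (finished != task)
        {
            // The timer won: the task is still pending, so report a hang.
            throw new TimeoutException($"Operation did not complete within {timeout}.");
        }

        // The task won: stop the timer and propagate its result or exception.
        cts.Cancel();
        return await task.ConfigureAwait(false);
    }
}
```

Assuming the bootstrap is started through a call that returns a `Task<T>`, it could be wrapped as `await someBootstrapTask.WithTimeout(TimeSpan.FromSeconds(30))`. A `TimeoutException` would then point to a hang, while any other exception would point to an error in the operation itself.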

We are on Couchbase 6.0, so I thought it might have to do with the newer ServerFeatures that were introduced in 3.1.x vs 2.7.x. However, commenting them out of the code resulted in the same behaviour.

Any idea how to proceed?

Hello @jmorris / @btburnett3 / @matthew.groves, can you please assist?

Take a look at this post; it likely describes the problem and the fix.

For SDK3 you’re looking at ClusterOptions.KvIgnoreRemoteCertificateNameMismatch, and optionally you can override the validation by providing a custom validation method here: ClusterOptions.KvCertificateCallbackValidation.

-Jeff
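
A custom validation method of the kind mentioned above could be sketched as follows. This assumes that ClusterOptions.KvCertificateCallbackValidation accepts a standard .NET `RemoteCertificateValidationCallback`; the class and method names are hypothetical. The callback tolerates a hostname mismatch (as happens when the client authenticates against an IP address) but still rejects every other certificate error.

```csharp
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper: a validation callback narrower than "ignore everything".
public static class CertificateValidation
{
    // Matches the shape of System.Net.Security.RemoteCertificateValidationCallback.
    public static bool AllowNameMismatchOnly(
        object sender,
        X509Certificate certificate,
        X509Chain chain,
        SslPolicyErrors sslPolicyErrors)
    {
        // A certificate with no reported errors is always accepted.
        if (sslPolicyErrors == SslPolicyErrors.None)
        {
            return true;
        }

        // Accept only when the name mismatch is the sole error reported;
        // chain errors or a missing certificate are still rejected.
        return sslPolicyErrors == SslPolicyErrors.RemoteCertificateNameMismatch;
    }
}
```

Under that assumption, it would be wired up with `opts.KvCertificateCallbackValidation = CertificateValidation.AllowNameMismatchOnly;`. Accepting a name mismatch weakens TLS verification, so this is better treated as a diagnostic aid than as a permanent setting.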

ClusterOptions.KvCertificateCallbackValidation=true did the trick to avoid the sslStream.AuthenticateAsClientAsync error against the IP address. Thank you!

Issue 2) is still plaguing me, though: from what I can tell, the Task never completes or returns. Specifically, the problem occurs when trying to execute the Hello operation in ClusterNode. Everything seems to run swimmingly up to this section of ExecuteOp:

            await sender(op, state, tokenPair).ConfigureAwait(false);

            ResponseStatus status;
            using (new OperationCancellationRegistration(op, tokenPair))
            {
                // ==> doesn't complete/pass this point
                status = await op.Completed.ConfigureAwait(false);
            }

The sender call completes fine and is able to create/configure the operation builder OK (the logs I added indicate no errors there). I added logging in various places in the Operation base class, and after that, GetStatus is called and returns status “Pending”:

            public ValueTaskSourceStatus GetStatus(short token)
            {
                return _valueTaskSource.GetStatus(token);
            }

Then, OnCompleted is called, and I can’t tell what happens after that:

            public void OnCompleted(Action<object?> continuation, object? state, short token,
                ValueTaskSourceOnCompletedFlags flags)
            {
                _valueTaskSource.OnCompleted(continuation, state, token, flags);
            }

I’m not sure whether the environment plays a role in the behaviour I’m seeing:

  • Couchbase Client 3.1.3
  • .Net Core 3.1.12
  • Linux (Alpine 3.13)
  • GKE Docker image

@obawin -

I am not sure about the Hello call not responding; at worst it should error or time out. It’s possible it’s related to Linux/.NET Core 3.1.12, especially if there are no errors from the server. Have you tried running Wireshark to see what the server returns? If the server is returning success and the client is still blocking, it would likely be a client/.NET issue.

-Jeff

Finally got around to looking at things again. For anyone else having problems: the error seemed to be related to including 18091 as the port in the connection. Once I removed it (thanks @btburnett3 for your post here: Unable to connect after upgrading CouchbaseNetClient from v2.x to v3.1.7), everything started working.
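
For readers wondering what that looks like in practice, here is a minimal connection sketch with no explicit port. The host name, credentials, and bucket name are placeholders, and the `couchbases://` scheme is an assumption based on TLS being in use in this thread; the exact form of the original connection setting was not posted.

```csharp
using System.Threading.Tasks;
using Couchbase;

public static class ConnectExample
{
    public static async Task Main()
    {
        // Placeholder host, credentials, and bucket name, for illustration only.
        var options = new ClusterOptions()
            // No ":18091" suffix: the SDK is left to pick its own ports.
            .WithConnectionString("couchbases://cb.example.com")
            .WithCredentials("username", "password");

        var cluster = await Cluster.ConnectAsync(options).ConfigureAwait(false);

        // Opening the bucket is the step that reported the bootstrap error above.
        var bucket = await cluster.BucketAsync("my-bucket").ConfigureAwait(false);

        await cluster.DisposeAsync().ConfigureAwait(false);
    }
}
```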
