.NET Core locks up on Cluster.ConnectAsync

When connecting to a Couchbase Server with the connection string couchbase://{host}:8091 or http://{host}:8091, the .NET SDK locks up while initializing the server connection and never returns. No exception is thrown, and no log message indicates that an error occurred.

Log output

      Bootstrapping with node cos-int-next.cosential.io
dbug: Couchbase.Core.IO.Connections.MultiplexingConnection[0]
      Setting TCP Keep-Alives using SocketOptions - enable keep-alives True, time 00:01:00, interval 00:00:01.
dbug: Couchbase.Core.ClusterNode[0]
      Starting connection initialization on server <redacted>:8091.
dbug: Couchbase.Core.IO.Connections.MultiplexingConnection[0]
      Setting TCP Keep-Alives using SocketOptions - enable keep-alives True, time 00:01:00, interval 00:00:01.
dbug: Couchbase.Core.ClusterNode[0]
      Starting connection initialization on server <redacted>:8091.
dbug: Couchbase.Core.ClusterNode[0]
      Executing op Helo on <redacted>:8091 with key {"i":"96c675c7268b7e18/0000000000000001","a":"couchbase-net-sdk/3.1.4.0 (clr/.NET 5.0.5) (os/Microsoft Windows 10.0.19042)"} and opaque 3.
dbug: Couchbase.Core.ClusterNode[0]
      Executing op Helo on <redacted>:8091 with key {"i":"96c675c7268b7e18/0000000000000002","a":"couchbase-net-sdk/3.1.4.0 (clr/.NET 5.0.5) (os/Microsoft Windows 10.0.19042)"} and opaque 4.

I have not tried other ports or TLS. This was a maddening bug that took me a whole day to narrow down. It did not appear to be an issue in the 2.x SDK, so I only discovered it after upgrading to 3+.

Server Info:

Couchbase Server Community Edition 6.0.0 build 1693

Expected / Desired outcome

An exception should be thrown indicating that the configuration is invalid, or the client should connect. Whatever the solution is, it should certainly not lock up without any additional information.
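Until the SDK reports this case itself, one way to turn a silent hang into a visible failure is to race the awaited call against a timer. The following is a minimal sketch using only the .NET base class library. `WithTimeout` is a hypothetical helper name, not part of the Couchbase SDK. It only stops waiting; it does not cancel the underlying operation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskTimeoutExtensions
{
    // Hypothetical helper (not part of the Couchbase SDK).
    // Throws TimeoutException if the task has not completed within the timeout.
    // The original task keeps running in the background; it is not cancelled.
    public static async Task<T> WithTimeout<T>(this Task<T> task, TimeSpan timeout)
    {
        using var timerCts = new CancellationTokenSource();
        Task delay = Task.Delay(timeout, timerCts.Token);

        // Wait for whichever finishes first: the real work or the timer.
        Task completed = await Task.WhenAny(task, delay).ConfigureAwait(false);
        if (completed != task)
        {
            throw new TimeoutException(
                $"Operation did not complete within {timeout}.");
        }

        // The work finished first, so stop the timer and propagate
        // the result (or the exception) of the original task.
        timerCts.Cancel();
        return await task.ConfigureAwait(false);
    }
}
```

In the repro below, this could wrap the call that hangs, for example `await bucketProvider.GetBucketAsync(bucketName).AsTask().WithTimeout(TimeSpan.FromSeconds(10))`. The `.AsTask()` call assumes the method returns a `ValueTask`; drop it if it returns a `Task`. The 10 second value is arbitrary.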

Example project

ReproduceCouchbaseSdkLockup.csproj

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Couchbase.Extensions.DependencyInjection" Version="3.0.5.931" />
    <PackageReference Include="CouchbaseNetClient" Version="3.1.4" />
    <PackageReference Include="Microsoft.Extensions.Configuration" Version="5.0.0" />
    <PackageReference Include="Microsoft.Extensions.DependencyInjection" Version="5.0.1" />
    <PackageReference Include="Microsoft.Extensions.Hosting" Version="5.0.0" />
    <PackageReference Include="Microsoft.Extensions.Logging" Version="5.0.0" />
    <PackageReference Include="Serilog.Extensions.Logging.File" Version="2.0.0" />
  </ItemGroup>

</Project>

Program.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Couchbase.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

namespace ReproduceCouchbaseSdkLockup
{
    class Program
    {
        private const string key = "ping";
        private const string value = "pong";
        private const string hostname = "localhost";

        private static string bucketName;
        private static string port;
        private static string schema;


        private static Dictionary<string, string> schemaCases = new Dictionary<string, string>
        {
            { "couchbase", "couchbase" },
            { "http", "http" },
        };

        private static Dictionary<string, string> portCases = new Dictionary<string, string>
        {
            { "none", "" },
            { "rest", ":8091" },
            { "memcache", ":11210" },
            { "badport", ":5000" },
        };

        private static Dictionary<string, string> bucketCases = new Dictionary<string, string>
        {
            { "nonexistant", "non_existant_bucket" },
            { "memcache", "memcache_bucket" }, // update this to an existing bucket of type memcache
            { "couchbase", "couchbase_bucket" }, // update this to an existing bucket of type couchbase
        };

        static async Task Main(string[] args)
        {
            /*
            Argument cases:

            Non-existent bucket
                1. http none nonexistant            http://{host}               - runs but throws exception as expected
                2. couchbase none nonexistant       couchbase://{host}          - runs but throws exception as expected
                3. http rest nonexistant            http://{host}:8091          - runs and LOCKS UP 
                4. couchbase rest nonexistant       couchbase://{host}:8091     - runs and LOCKS UP
                5. http memcache nonexistant        http://{host}:11210         - runs but throws exception as expected
                6. couchbase memcache nonexistant   couchbase://{host}:11210    - runs but throws exception as expected
                7. http badport nonexistant         http://{host}:5000          - runs but throws exception as expected
                8. couchbase badport nonexistant    couchbase://{host}:5000     - runs but throws exception as expected
            
            Memcache bucket
                1. http none memcache               http://{host}               - runs but throws exception as expected
                2. couchbase none memcache          couchbase://{host}          - runs but throws exception as expected
                3. http rest memcache               http://{host}:8091          - runs and LOCKS UP 
                4. couchbase rest memcache          couchbase://{host}:8091     - runs and LOCKS UP
                5. http memcache memcache           http://{host}:11210         - runs but throws exception as expected
                6. couchbase memcache memcache      couchbase://{host}:11210    - runs but throws exception as expected
                7. http badport memcache            http://{host}:5000          - runs but throws exception as expected
                8. couchbase badport memcache       couchbase://{host}:5000     - runs but throws exception as expected
            
            Couchbase bucket
                1. http none couchbase              http://{host}               - runs but throws exception as expected
                2. couchbase none couchbase         couchbase://{host}          - runs but throws exception as expected
                3. http rest couchbase              http://{host}:8091          - runs and LOCKS UP 
                4. couchbase rest couchbase         couchbase://{host}:8091     - runs and LOCKS UP
                5. http memcache couchbase          http://{host}:11210         - runs but throws exception as expected
                6. couchbase memcache couchbase     couchbase://{host}:11210    - runs but throws exception as expected
                7. http badport couchbase           http://{host}:5000          - runs but throws exception as expected
                8. couchbase badport couchbase      couchbase://{host}:5000     - runs but throws exception as expected
             */

            string schemaArg = args.FirstOrDefault() ?? "couchbase";
            string portArg = args.Skip(1).FirstOrDefault() ?? "none";
            string bucketArg = args.Skip(2).FirstOrDefault() ?? "couchbase";

            schema = schemaCases.ContainsKey(schemaArg) 
                ? schemaCases[schemaArg] 
                : throw new Exception($"Unknown schema argument. Valid values are {string.Join(',', schemaCases.Keys)}");
            
            port = portCases.ContainsKey(portArg) 
                ? portCases[portArg] 
                : throw new Exception($"Unknown port argument. Valid values are {string.Join(',', portCases.Keys)}");

            bucketName = bucketCases.ContainsKey(bucketArg) 
                ? bucketCases[bucketArg]
                : throw new Exception($"Unknown bucket argument. Valid values are {string.Join(',', bucketCases.Keys)}");

            var host = CreateHostBuilder(args).Build();

            Console.WriteLine($"Create bucket provider.");
            IBucketProvider bucketProvider = host.Services.GetService<IBucketProvider>();

            Console.WriteLine($"Get bucket '{bucketName}'.");
            var bucket = await bucketProvider.GetBucketAsync(bucketName); // <------ LOCKS UP HERE
            Console.WriteLine($"Get default collection.");
            var collection = await bucket.DefaultCollectionAsync();
            Console.WriteLine($"Check if {nameof(key)} '{key}' exists.");
            var existOperation = await collection.ExistsAsync(key);
            if (!existOperation.Exists)
            {
                Console.WriteLine($"The {nameof(key)} '{key}' does not exist.");
                var createOperation = await collection.UpsertAsync(key, value);
                Console.WriteLine($"The {nameof(key)} has been created with value '{value}'.");
            }
            else
            {
                Console.WriteLine($"The {nameof(key)} '{key}' exists.");
            }

            var readOperation = await collection.GetAsync(key);

            Console.WriteLine($"The {nameof(readOperation)} result '{readOperation.ContentAs<string>()}' should be '{value}'.");
        }

        public static IHostBuilder CreateHostBuilder(string[] args) =>
            Host.CreateDefaultBuilder(args)
                .ConfigureAppConfiguration(config =>
                {
                    config.AddInMemoryCollection(new[]
                    {
                        new KeyValuePair<string, string>("ConnectionString", $"{schema}://{hostname}{port}"),
                        new KeyValuePair<string, string>("UserName", "Administrator"),
                        new KeyValuePair<string, string>("Password", "password"),
                    });
                })
                .ConfigureLogging(logging =>
                {
                    logging.ClearProviders();
                    logging.AddConsole();
                    logging.SetMinimumLevel(LogLevel.Trace);
                })
                .ConfigureServices((hostContext, services) =>
                {
                    services.AddCouchbase(hostContext.Configuration);
                });
    }
}

@dferguson -

Thanks for the incredibly detailed bug report! I created a ticket for tracking and will try to get a fix into an upcoming release. If you're interested in contributing or need a fix sooner, shoot us a PR: github.com/couchbase/couchbase-net-client

-Jeff

Thank you. I am fairly certain that I narrowed it down to ClusterNode.cs:470. The await statement on that line never returns. However, I couldn’t make heads or tails of the IOperation abstraction, so I stopped there.

@dferguson -

There is currently no support for bootstrapping over HTTP port 8091 in this manner; it simply hasn’t been implemented. The SDK tries to use port 8091 instead of 11210 to send the Memcache bootstrapping operations, which will always fail (it must use 11210 for that). This is more of a missing feature than a bug, as the initial bootstrapping to get the cluster map should go over 8091 and the Memcache operations should then use 11210. It should throw a NotImplemented/NotSupported exception here.

It is also a bug in that it deadlocks on the Helo call when using the wrong port. This will be fixed.
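Until that lands, an application could fail fast on its own before handing the connection string to the SDK. Here is a minimal sketch, assuming a single-host connection string and the default management port 8091 discussed above. `ConnectionStringGuard` and `EnsureNotManagementPort` are hypothetical names, not SDK APIs, and multi-host strings (comma-separated hosts) are not handled.

```csharp
using System;

public static class ConnectionStringGuard
{
    // Assumption: 8091 is the management (REST) port, as in this thread.
    private const int ManagementPort = 8091;

    // Hypothetical pre-flight check (not part of the Couchbase SDK).
    // Throws NotSupportedException when the connection string points at the
    // management port, which the SDK cannot use for Memcache bootstrapping.
    public static void EnsureNotManagementPort(string connectionString)
    {
        if (!Uri.TryCreate(connectionString, UriKind.Absolute, out Uri uri))
        {
            throw new ArgumentException(
                $"'{connectionString}' is not a valid single-host connection string.",
                nameof(connectionString));
        }

        if (uri.Port == ManagementPort)
        {
            throw new NotSupportedException(
                $"Bootstrapping over port {ManagementPort} is not supported; " +
                "omit the port or use the key-value port (11210 by default).");
        }
    }
}
```

In the repro above this could run just before the host is built, for example `ConnectionStringGuard.EnsureNotManagementPort($"{schema}://{hostname}{port}")`, so the two LOCKS UP cases throw immediately instead of hanging.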

Thanks
Jeff

On a somewhat related note, is there documentation on how to connect using a non-standard port? For example, if Couchbase were run in a container and all the standard port mappings were changed, how could you provide the updated ports in a connection string so the SDK knows how to communicate properly?

@dferguson -

If you're using DNS-SRV, the SDK should handle this for you; you would pass in the SRV service record and it will resolve it and bootstrap with one of the returned values. The cluster map that the server returns will contain the ports to be used. If you're not, and you want to use a custom port directly, there is an open ticket (NCBC-2628) to add this feature to the SDK.

-Jeff

I think more specifically, that’s where alternate addresses help rather than SRV. SRV will help you bootstrap to the alternate addresses, but the alternate ports on the ‘outside’ need to be communicated to the clients. See the docs on Alternate Addresses.

It’s worth noting we do it this way somewhat intentionally to avoid having to proxy requests internally and to deliver the best latency/throughput at the lowest cost.