Roadmap question: are there plans to make the Go SDK context-aware for dynamic request timeouts (and/or distributed tracing)?


#1

For any network call or long-running operation it is important (for CFG prevention) to enforce timeouts. There are global timeouts on the connection to Couchbase, but no timeouts that are specific to particular requests. As some types of queries naturally run longer than others, it is useful to be able to cancel anomalously long-running requests dynamically (e.g. based on query type, current load, or other kinds of anomaly detection).

The usual way to facilitate dynamic timeouts in a Go client package is to provide an API that allows for context.Context to be passed as the first argument (see the Go blog about context).

context.Context is also used by distributed tracing frameworks (e.g. OpenCensus) to thread span information through a program and across network boundaries.

Are there plans to support this kind of API in the existing Go SDK or a future one?


#2

Hi @voutasaurus, for the dynamic timeouts, do you mean for KV requests? N1QL, search, etc. have functions for setting timeouts on a per-query basis.

For distributed tracing we support the OpenTracing API, although this support is currently experimental and so is subject to change. Using Cluster.SetTracer you can set your own tracer implementation, which means you can easily hook up something like Jaeger tracing by doing something like:

import (
...
"github.com/couchbase/gocb"
jaeger "github.com/uber/jaeger-client-go"
jaegercfg "github.com/uber/jaeger-client-go/config"
"github.com/uber/jaeger-lib/metrics/prometheus"
...)

cfg := jaegercfg.Configuration{
	Sampler: &jaegercfg.SamplerConfig{
		Type:  jaeger.SamplerTypeConst,
		Param: 1,
	},
	Reporter: &jaegercfg.ReporterConfig{
		LogSpans: true,
	},
	ServiceName: "gocb",
}

jMetricsFactory := prometheus.New()

// Initialize the tracer with a metrics factory.
tracer, closer, err := cfg.NewTracer(
	jaegercfg.Metrics(jMetricsFactory),
)
if err != nil {
	panic(err.Error())
}
defer closer.Close()

cluster, err := gocb.Connect("couchbase://10.111.180.101")
if err != nil {
	panic("Error connecting to cluster:" + err.Error())
}
cluster.Authenticate(gocb.PasswordAuthenticator{
	Username: "username",
	Password: "password",
})
cluster.SetTracer(tracer)
...

#3

Thanks @chvck, I did look for query-based timeouts, but I guess I didn’t look hard enough. This is the one: https://godoc.org/github.com/couchbase/gocb#N1qlQuery.Timeout

It serves my purposes for now (context has more flexibility for cancellation due to the cancel function but I don’t need that today).

As for the tracing, thanks for letting me know about SetTracer! Ideally, tracing is request-based: it follows a request across every component of the system that the request touches by building a span tree of all of those operations. In particular, I don’t see how that span tree can be constructed without passing the parent span to the query (from the gocb package user to gocb internally), and https://godoc.org/github.com/couchbase/gocb#N1qlQuery doesn’t seem to have a method for adding the parent span (or a context containing that span).


#4

Glad that you found that one :)

You’re right that we don’t expose a way for a user to pass a span context into tracing. This is something that we’re aware of and, along with passing context.Context, it is something we will be considering when we design the next major version of the SDK, which will be released mid next year.