How to raise timeout threshold in Python SDK for analytics_query

naftali · January 25, 2021, 5:31pm

I am trying to execute a long-running analytics query via the Python SDK, version 3.0 I believe (I would love to know a definitive way to check).
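One way to check the installed SDK version from Python itself is to read the package metadata. This is a minimal sketch, assuming Python 3.8+ and that the SDK was installed under the distribution name `couchbase`; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "couchbase") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string for the SDK, or None when the package is missing.
print(installed_version())
```

Running `pip show couchbase` in the same environment should report the same version.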

I’ve verified that the query works in the UI, and that short-running analytics queries work via the SDK.

I’m getting a timeout error (Error_1), and I don’t see an obvious way to raise the timeout in the query options.

The docs don’t mention a timeout property on the AnalyticsOptions.

Thinking that AnalyticsOptions might extend the regular query options, I tried passing in timeout=timedelta(minutes=20) and got Error_2.

Thinking I could use the raw ‘escape hatch’ instead, I passed in {timeout: timedelta}, but it crashed with a “timedelta is not JSON serializable” error.
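The serialization error can be reproduced with the standard library alone: `json.dumps` does not know how to encode a `timedelta`, so it has to be converted to a plain value first. The sketch below uses a hypothetical helper, `to_duration_string`, and assumes that a whole-second duration string such as `"1200s"` is an acceptable format; whether the SDK's raw option forwards it, and whether it helps with this timeout at all, is not established in this thread.

```python
import json
from datetime import timedelta


def to_duration_string(delta: timedelta) -> str:
    """Render a timedelta as a whole-second duration string such as '1200s'."""
    return "{}s".format(int(delta.total_seconds()))


delta = timedelta(minutes=20)

# Passing the timedelta directly reproduces the serialization failure.
try:
    json.dumps({"timeout": delta})
    serialization_error = None
except TypeError as exc:
    serialization_error = str(exc)

# Converting to a string first gives json.dumps something it can encode.
raw_options = json.dumps({"timeout": to_duration_string(delta)})
print(serialization_error)
print(raw_options)
```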

Please help:

Error_1: (Timeout)

TimeoutException: <RC=0xC9[LCB_ERR_TIMEOUT (201)], HTTP Request failed. Examine ‘objextra’ for full result, Results=1, C Source=(src/pycbc_http.c,212), OBJ=ViewResult<rc=0xC9[LCB_ERR_TIMEOUT (201)], value=None, http_status=201, tracing_context=0, tracing_output=None>, Context={‘first_error_code’: 0, ‘http_response_code’: 0, ‘first_error_message’: ‘’, ‘statement’: ‘…’, ‘client_context_id’: ‘076869b7c3f18504’, ‘query_params’: ‘{“args”:[{“timeout”:null}],“metrics”:false,“statement”:"…’, ‘http_response_body’: ‘’, ‘endpoint’: ‘…:8095’, ‘type’: ‘AnalyticsErrorContext’}, Tracing Output={“:nokey:0”: null}>

Error_2 (passing a timedelta as the value of the timeout property, as the docs say to do for a standard query):

TypeError                                 Traceback (most recent call last)
in
     25
     26
---> 27 result = cluster.analytics_query(query1, AnalyticsOptions(timeout=delta))
     28 for row in result:
     29     print(row)

~/anaconda3/lib/python3.8/site-packages/couchbase/cluster.py in analytics_query(self, statement, *options, **kwargs)
    665         return self._maybe_operate_on_an_open_bucket(CoreClient.analytics_query,
    666                                                      AnalyticsException,
--> 667                                                      opt.to_query_object(statement, *opts, **kwargs),
    668                                                      itercls=itercls,
    669                                                      err_msg='Analytics queries require an open bucket')

~/anaconda3/lib/python3.8/site-packages/couchbase/options.py in to_query_object(self, statement, *options, **kwargs)
    312         for k, v in ((k, args[k]) for k in (args.keys() & self.VALID_OPTS)):
    313             for target, transform in self.VALID_OPTS[k].items():
--> 314                 setattr(query, target, transform(v))
    315         return query
    316

TypeError: 'member_descriptor' object is not callable
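The last frame calls `transform(v)`, so this TypeError suggests that the entry registered for `timeout` is an attribute descriptor rather than a function. That is a guess from the traceback, not something confirmed from the SDK source. A minimal reproduction of that kind of failure, using only hypothetical names (`Query`, `BROKEN_OPTS`, `WORKING_OPTS`, `apply_options`):

```python
from datetime import timedelta


class Query:
    """Hypothetical stand-in for a query object that declares __slots__."""
    __slots__ = ("timeout",)


# Hypothetical option tables in the shape the traceback suggests:
# option name -> {target attribute: transform}.
BROKEN_OPTS = {"timeout": {"timeout": Query.timeout}}  # a slot descriptor, not a function
WORKING_OPTS = {"timeout": {"timeout": timedelta.total_seconds}}  # a real callable


def apply_options(valid_opts, **kwargs):
    """Apply each recognised option to a new Query via its transform."""
    query = Query()
    for key in kwargs.keys() & valid_opts.keys():
        for target, transform in valid_opts[key].items():
            setattr(query, target, transform(kwargs[key]))
    return query


delta = timedelta(minutes=20)

# Calling the descriptor fails the same way the traceback does.
try:
    apply_options(BROKEN_OPTS, timeout=delta)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With a callable transform the timeout is stored as seconds.
query = apply_options(WORKING_OPTS, timeout=delta)
print(error_message)
print(query.timeout)
```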

naftali · January 25, 2021, 6:17pm

Ok, I found documentation about making the setting cluster wide: Client Settings | Couchbase Docs

Am trying it out, will update.

UPDATE…

No, it did not work. Also, the docs only mention it affecting N1QL queries, and say that the cluster-wide setting usually shouldn’t be changed but rather overridden by a per-query setting.

So back to the original issue: Please help.

OK, hmm, unfortunately there may be something deeper going on here.

Other SDKs do provide a timeout param for the Analytics query, so I tried it out with the Node SDK.

I am seeing the same behavior, and setting the timeout override doesn’t help.

I am seeing this in the Node docs:

Server Side Timeout: customizes the timeout sent to the server. Does not usually have to be set, as the client sets it based on the timeout on the operation. Uses the timeout option, and defaults to the Analytics timeout set on the client (75s). This can be adjusted at the cluster global config level.

I seem to be running up against the 75-second client-side timeout.

How do I change this magical Analytics client side timeout?

So the docs seem to say this is done by passing a ClusterTimeoutOptions into ClusterOptions.

This isn’t working either. Here is my code:

from datetime import timedelta

from couchbase.bucket import Bucket
from couchbase.cluster import Cluster, ClusterOptions, AnalyticsOptions, ClusterTimeoutOptions
from couchbase.exceptions import CouchbaseException
from couchbase_core.cluster import PasswordAuthenticator

# Raise the query and key-value timeouts at the cluster level.
timeoutOptions = ClusterTimeoutOptions(query_timeout=timedelta(seconds=700), kv_timeout=timedelta(seconds=700))
options = ClusterOptions(PasswordAuthenticator('username', 'password'), timeout_options=timeoutOptions)

cluster = Cluster.connect('couchbase://{}'.format('...'), options)

query1 = "..."

# The analytics query still times out with these settings.
result = cluster.analytics_query(query1)

naftali · January 25, 2021, 8:42pm

Looking at the source code (and comparing with the docs for the .NET SDK), I’m thinking that this override just hasn’t been implemented.

KEY_MAP = {'kv_timeout': 'operation_timeout',
           'query_timeout': 'query_timeout',
           'views_timeout': 'views_timeout',
           'config_total_timeout': 'config_total_timeout'}

Those are the available ClusterTimeoutOptions. (I tried config_total_timeout. Didn’t help)
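To make the gap concrete, here is a small sketch of what a key map like this implies: only the listed names can be translated, so an analytics timeout has nowhere to go. The function `translate_timeouts` and the `analytics_timeout` keyword are hypothetical, and how the SDK really treats unknown keys is not shown in this thread.

```python
from datetime import timedelta

# Copied from the KEY_MAP quoted above.
KEY_MAP = {'kv_timeout': 'operation_timeout',
           'query_timeout': 'query_timeout',
           'views_timeout': 'views_timeout',
           'config_total_timeout': 'config_total_timeout'}


def translate_timeouts(**timeouts):
    """Split timeouts into those KEY_MAP can translate and those it cannot."""
    supported = {KEY_MAP[name]: value
                 for name, value in timeouts.items() if name in KEY_MAP}
    unsupported = sorted(name for name in timeouts if name not in KEY_MAP)
    return supported, unsupported


supported, unsupported = translate_timeouts(
    kv_timeout=timedelta(seconds=700),
    analytics_timeout=timedelta(seconds=700),  # hypothetical key, absent from KEY_MAP
)
print(supported)
print(unsupported)
```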

I do see an Analytics Timeout in the C client:

LCB_CNTL_ANALYTICS_TIMEOUT
Analytics Timeout: this is the global I/O timeout for Analytics queries, issued via lcb_analytics().

Am I correct? If so, do we have a timeline? This makes it very difficult to use Analytics in our pipeline (I assume the Node SDK has the same issue).

naftali · January 25, 2021, 9:34pm

OK, with an assist from some vague Node SDK documentation and a fortunate guess as to the param name, I got this to work: not on a per-query basis (which would be best), but as a global client setting, which is acceptable.

You can pass the config option into the connection string in the Cluster.connect method (and, I’m guessing, also into its constructor).

Here’s an example:

cluster = Cluster.connect('couchbase://{}?analytics_timeout=1000'.format('1.1.1.90'))
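If the timeout needs to be configurable, the connection string can be built by a small helper. This is only a sketch: the function name `connection_string` is made up, and it assumes the `analytics_timeout` value is interpreted in seconds, which this thread does not confirm.

```python
def connection_string(host: str, analytics_timeout_s: float) -> str:
    """Build a couchbase:// connection string carrying an analytics_timeout option."""
    if analytics_timeout_s <= 0:
        raise ValueError("analytics_timeout must be positive")
    return "couchbase://{}?analytics_timeout={}".format(host, analytics_timeout_s)


conn_str = connection_string("1.1.1.90", 1000)
print(conn_str)
```

The resulting string would then be passed to Cluster.connect in place of the hard-coded one, together with whatever ClusterOptions you already use.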

This was waaaaaay harder than it should have been.

AV25242 · January 26, 2021, 4:04pm

@naftali sorry for the inconvenience. I totally agree that the documentation should have covered the analytics_timeout option. I have created a documentation ticket to address this.