SearchError "expected an opening bracket for the rows" while executing one of our FTS queries

We are using Couchbase Community Edition 6.5.0 Build 4966.
We have an FTS index on one of our buckets.
For one of our queries, we get a SearchError while executing the search:
expected an opening bracket for the rows | {"query":{"conjuncts":[{"disjuncts":[{"field":"case_id","prefix"…

We are using the Go SDK: github.com/couchbase/gocb/v2 v2.2.1
Our index status is fine, 100%, etc.
It only happens for certain queries; most of our searches work fine.

Looking at the gocb source code and searching for this error message, it seems like the SDK is struggling to deserialize the JSON results?

We are trying to understand what the cause of that is - can you please help us?
Thanks


@eran-levy, it's better to post SDK-related queries in the relevant SDK forum - https://www.couchbase.com/forums/c/go-sdk

Tagging @chvck for a quick peek.

It could be an unhandled error case too.
It would also be helpful if you could confirm whether the issue happens even with a direct curl query to the FTS server's query endpoint.

Hi @eran-levy, would you be able to capture the response body that the server is sending back to the SDK? It sounds like the SDK is finding the hits field but the value isn't something that we expect (we expect the search response hits to always be an array).
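If it helps, one way to capture the raw body is to post the same search request straight to the FTS query endpoint and print whatever comes back, bypassing the SDK's response parsing. A rough sketch (the host, credentials, index name and request body below are placeholders for your own):

package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
)

func main() {
	// Placeholders: one of your FTS nodes, your credentials, your index name.
	endpoint := "http://<fts-node-ip>:8094/api/index/<index-name>/query"

	// The same search request body the SDK would send (truncated here).
	body := []byte(`{"query":{"conjuncts":[{"disjuncts":[{"field":"case_id","prefix":"..."}]}]}}`)

	req, err := http.NewRequest("POST", endpoint, bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	req.SetBasicAuth("<username>", "<password>")
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	raw, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Print the untouched response so we can see exactly what the SDK has to parse.
	fmt.Println(resp.StatusCode)
	fmt.Println(string(raw))
}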

@sreeks is there a scenario where the hits field is returned as a different datatype than an array?

@chvck, I can't recall a scenario where the hits would be non-array at the moment. But hits would certainly be empty if there are other errors.
My hunch is that it could be a parse error on a failed search request/response.
Hence I'm eager to see the output of a direct curl command here.

@sreeks @chvck in case hits is null, would this error make sense?

@idangazit I wasn’t aware that hits can be null (I thought that the field would just be missing or an empty array) but if it is null then yes I think that this error would be raised. I’ve raised https://issues.couchbase.com/browse/GOCBC-1106 to look into this further.


@chvck @sreeks
Yes, hits is null. Here is the response received when we curl with the same query (truncated a bit to show the important parts):
{"status":{"total":6,"failed":6,"successful":0,"errors":{"case_search_v2_6f8033d7cdb7275b_13aa53f3":"TooManyClauses[9199 \u003e maxClauseCount, which is set to 1024]","case_search_v2_6f8033d7cdb7275b_18572d87":"TooManyClauses[9189 \u003e maxClauseCount, which is set to 1024]",…,"hits":null,"total_hits":0,"max_score":0,"took":121201041,"facets":null}

It seems like hits is null as a result of the TooManyClauses errors, and that's why the SDK isn't unmarshalling the response properly.

Regarding the TooManyClauses errors, we are not totally sure what the root cause is.

Is that because our response can be too large, as explained here: Searching with the REST API | Couchbase Docs?
Let's say I would like to search but only get the first 10 results - is that possible without raising that error?
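To illustrate what we mean by the first 10 results, this is roughly how we would cap the returned hits with the Go SDK (just a sketch - the connection string, credentials, index name and field are placeholders):

package main

import (
	"fmt"
	"log"

	"github.com/couchbase/gocb/v2"
	"github.com/couchbase/gocb/v2/search"
)

func main() {
	// Placeholders: connection string, credentials and index name are examples only.
	cluster, err := gocb.Connect("couchbase://<ip>", gocb.ClusterOptions{
		Username: "<username>",
		Password: "<password>",
	})
	if err != nil {
		log.Fatal(err)
	}

	// Ask for only the first 10 hits. This caps the size of the result set;
	// it does not change how many candidate terms the wildcard expands to.
	query := search.NewWildcardQuery("*gma*").Field("some.field")
	result, err := cluster.SearchQuery("<index-name>", query, &gocb.SearchOptions{
		Limit: 10,
	})
	if err != nil {
		log.Fatal(err)
	}

	for result.Next() {
		fmt.Println(result.Row().ID)
	}
	if err := result.Err(); err != nil {
		log.Fatal(err)
	}
}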

We are performing wildcard queries - for example, searching for gma:
{
"field": "some.field",
"wildcard": "*gma*"
},
{
"field": "another.field",
"wildcard": "*gma*"
},

What is the best practice?

In FTS, any search query results in an index lookup for all the candidate tokens/terms present in the index that match the given query.
With queries like prefix/wildcard/fuzzy/regex etc., a query might have a lot of matching tokens in the index to search for.

In your example, there are more than 1024 matching candidate tokens for the query - *gma*.

Internally, the FTS server puts a restriction on the search when the number of such matching candidate tokens exceeds the default limit of 1024. This is done for two reasons:
1 - To hint to the user that the query isn't well-scoped, as it targets a lot of matching tokens/documents. From a relevance perspective, we think the user could rewrite their query to target more specific documents.
2 - To mitigate the unwarranted memory/resource usage of such queries in the system.

Now, if the user still wants to stick with the query knowing its cost, the above max clause count is configurable.
Users can override it with the following REST command against any node in the cluster.

curl -XPUT -H "Content-type:application/json" \
http://<username>:<password>@<ip>:8094/api/managerOptions \
-d '{"bleveMaxClauseCount": "10000"}'

From the 6.5 release onwards, the above config is a cluster-level, persisted configuration, meaning you can set it on any node in the cluster and it will survive any system reboots.

You can find many threads on this forum if you search for "bleveMaxClauseCount".

Cheers!

Thanks @chvck and @sreeks !

I think that we should also try to make our query more efficient.
Our query looks like this:
"Query": {
  "conjuncts": [
    {
      "disjuncts": [
        { "field": "field1", "wildcard": "*gma*" },
        { "field": "field2", "wildcard": "*gma*" },
        { "field": "field3", "prefix": "gma" }
      ]
    },
    {
      "field": "org_id",
      "inclusive_max": true,
      "inclusive_min": true,
      "max": 33,
      "min": 33
    }
  ]
}
Do we have control over the order of the subqueries? Meaning, first filter by org_id, then do an AND with the disjuncts block?

Thanks

No further control is possible over the order of clauses here.
Another thing I noted: if your org_id is a text field in the source document, then a simple term query ought to be faster than the range query you are using now.
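For example, assuming org_id is indexed as a text field, the reworked query could be put together with the Go SDK's search package along these lines (just a sketch, reusing the field names from your example above):

// Sketch only - import "github.com/couchbase/gocb/v2/search".
// Assumes org_id is indexed as a text field, so a simple term query
// replaces the inclusive numeric range on [33, 33].
func buildCaseSearchQuery() *search.ConjunctionQuery {
	wildcards := search.NewDisjunctionQuery(
		search.NewWildcardQuery("*gma*").Field("field1"),
		search.NewWildcardQuery("*gma*").Field("field2"),
		search.NewPrefixQuery("gma").Field("field3"),
	)
	orgFilter := search.NewTermQuery("33").Field("org_id")

	// FTS still decides the execution order internally; the conjunction just
	// ANDs the org filter with the disjunction block.
	return search.NewConjunctionQuery(wildcards, orgFilter)
}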
