Regexp query not working as expected

I’m trying to run a regexp query on a field, but it’s not working as expected. Here’s my code:

qp.And(cbft.NewConjunctionQuery(cbft.NewRegexpQuery(".*cow.*").Field("Animals")))

This matches:
cow, dog, cat

But not:
dog, cow, cat

Any suggestions on how I can match mid-string?

Hello @benrolfe, can you give us more information so we can get you some help?
Are you using N1QL or FTS?
Which SDK version are you using, and against which server version?
Also, if I understand correctly, you are trying to find the word (in this example, cow) anywhere in the string, not just at the start?

Hi,

Yes, I’m trying to find the word (in this example cow) anywhere in the string.

  • I’m using FTS.
  • couchbase/gocb.v1
  • Community Edition 6.5.0 build 4966

Thanks
Ben

Thanks very much for the information, @benrolfe.

@chvck, is this something you can assist with, please?

Hi @benrolfe, what is the rest of your search query? I’ve just written a small test to reproduce this, using the following document:

{
  "animals": "dog, cow, cat"
}

Using gocb.NewSearchQuery("test", cbft.NewConjunctionQuery(cbft.NewRegexpQuery(".*cow.*").Field("animals"))) seems to match my document. It also matches if I change to “cow, dog, cat”.

Here’s my full query, which returns 0 results, with no errors.

Both .*cow and cow.* work, but that obviously doesn’t help much.

qp := cbft.NewConjunctionQuery(cbft.NewRegexpQuery(".*cow.*").Field("animals"))

ftSearch := gocb.NewSearchQuery("test", qp)

result, err := bucket.ExecuteSearchQuery(ftSearch)

Thanks team!

@benrolfe I suspect the behavior you’re observing has to do with the analyzer you’re using to index the field “animals”.

According to the documentation (Query Types | Couchbase Docs), regexp queries are non-analytic queries, meaning the search terms are interpreted as is. A keyword analyzer is the best fit for non-analytic queries.

Would you share the index definition of your index here?

Here’s a playground we host for testing how an analyzer processes text:
http://bleveanalysis.couchbase.com/analysis
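
To illustrate why the analyzer matters for a regexp query, here is a minimal Go sketch. It is a simplified model, not bleve's or Couchbase's actual code: it assumes the pattern has to match a whole indexed term, and the helper names keywordTerms, simpleTerms and matchesAnyTerm are made up for this example. simpleTerms is only a rough stand-in for a tokenizing analyzer.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

// keywordTerms models a keyword analyzer: the whole field value is kept
// as a single term.
func keywordTerms(value string) []string {
	return []string{value}
}

// simpleTerms is a rough stand-in for a tokenizing analyzer (hypothetical,
// not a real bleve analyzer): it lowercases the value and splits it on
// anything that is not a letter or a digit.
func simpleTerms(value string) []string {
	return strings.FieldsFunc(strings.ToLower(value), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// matchesAnyTerm reports whether the pattern matches at least one term in
// full. Anchoring the pattern is the assumption this sketch is built on.
func matchesAnyTerm(pattern string, terms []string) bool {
	re := regexp.MustCompile(`^(?:` + pattern + `)$`)
	for _, term := range terms {
		if re.MatchString(term) {
			return true
		}
	}
	return false
}

func main() {
	value := "dog, cow, cat"
	for _, pattern := range []string{".*cow.*", "cow"} {
		fmt.Printf("pattern %q: keyword=%v tokenized=%v\n",
			pattern,
			matchesAnyTerm(pattern, keywordTerms(value)),
			matchesAnyTerm(pattern, simpleTerms(value)))
	}
}
```

Under this model, a pattern written for the whole string (such as .*cow.*) suits a keyword-analyzed field, while a tokenized field is matched one token at a time.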

Thanks, those links were helpful.

Unless I’m mistaken, I think I’m using a keyword analyzer. See below.

What do I need to change?

{
  "type": "fulltext-index",
  "name": "PropertyAPISearch",
  "uuid": "221a92802b4a7473",
  "sourceType": "couchbase",
  "sourceName": "PropertyAPI",
  "planParams": {
    "maxPartitionsPerPIndex": 171
  },
  "params": {
    "doc_config": {
      "docid_prefix_delim": "",
      "docid_regexp": "",
      "mode": "type_field",
      "type_field": "type"
    },
    "mapping": {
      "analysis": {},
      "default_analyzer": "keyword",
      "default_datetime_parser": "dateTimeOptional",
      "default_field": "_all",
      "default_mapping": {
        "default_analyzer": "keyword",
        "dynamic": true,
        "enabled": true
      },
      "default_type": "_default",
      "docvalues_dynamic": true,
      "index_dynamic": true,
      "store_dynamic": true,
      "type_field": "_type"
    },
    "store": {
      "indexType": "scorch"
    }
  },
  "sourceParams": {}
}

@benrolfe The definition looks fine to me.
Can you try this curl request directly to your server (replacing the <> with the necessary parameters) and let us know if you see results?

curl -XPOST -H "Content-type:application/json" \
  http://<username>:<password>@<ip>:8094/api/index/<index_name>/query \
  -d '{"query": {"field": "animals", "regexp": ".*cow.*"}}'

Below is the response. The issue looks to be “TooManyClauses”.

{
  "status": {
    "total": 6,
    "failed": 6,
    "successful": 0,
    "errors": {
      "PropertyAPISearch_221a92802b4a7473_13aa53f3": "TooManyClauses[2144 \u003e maxClauseCount, which is set to 1024]",
      "PropertyAPISearch_221a92802b4a7473_18572d87": "TooManyClauses[2167 \u003e maxClauseCount, which is set to 1024]",
      "PropertyAPISearch_221a92802b4a7473_54820232": "TooManyClauses[2178 \u003e maxClauseCount, which is set to 1024]",
      "PropertyAPISearch_221a92802b4a7473_6ddbfb54": "TooManyClauses[2162 \u003e maxClauseCount, which is set to 1024]",
      "PropertyAPISearch_221a92802b4a7473_aa574717": "TooManyClauses[2233 \u003e maxClauseCount, which is set to 1024]",
      "PropertyAPISearch_221a92802b4a7473_f4e0a48a": "TooManyClauses[2134 \u003e maxClauseCount, which is set to 1024]"
    }
  },
  "request": {
    "query": {
      "regexp": ".*OW1.*",
      "field": "PortalOptions"
    },
    "size": 10,
    "from": 0,
    "highlight": null,
    "fields": null,
    "facets": null,
    "explain": false,
    "sort": ["-_score"],
    "includeLocations": false,
    "search_after": null,
    "search_before": null
  },
  "hits": null,
  "total_hits": 0,
  "max_score": 0,
  "took": 19711861,
  "facets": null
}
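
The response above carries its failures inside status.errors, one entry per index partition, while total_hits is simply 0. As a sketch of how such a response could be checked in Go using only the standard library: the struct fields below mirror just the keys visible in that response, and the names searchResponse, partitionErrors and sampleBody are hypothetical (sampleBody is a shortened copy with two of the six partitions).

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// searchStatus mirrors the "status" object of the response shown above.
type searchStatus struct {
	Total      int               `json:"total"`
	Failed     int               `json:"failed"`
	Successful int               `json:"successful"`
	Errors     map[string]string `json:"errors"`
}

// searchResponse keeps only the fields this sketch needs.
type searchResponse struct {
	Status    searchStatus `json:"status"`
	TotalHits int          `json:"total_hits"`
}

// sampleBody is a shortened copy of the response above (two partitions).
const sampleBody = `{"status":{"total":2,"failed":2,"successful":0,"errors":{"PropertyAPISearch_221a92802b4a7473_13aa53f3":"TooManyClauses[2144 \u003e maxClauseCount, which is set to 1024]","PropertyAPISearch_221a92802b4a7473_18572d87":"TooManyClauses[2167 \u003e maxClauseCount, which is set to 1024]"}},"total_hits":0}`

// partitionErrors decodes a search response body and returns one
// "partition: message" line per failed partition, sorted for stable output.
func partitionErrors(body []byte) ([]string, error) {
	var resp searchResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	msgs := make([]string, 0, len(resp.Status.Errors))
	for partition, msg := range resp.Status.Errors {
		msgs = append(msgs, partition+": "+msg)
	}
	sort.Strings(msgs)
	return msgs, nil
}

func main() {
	msgs, err := partitionErrors([]byte(sampleBody))
	if err != nil {
		panic(err)
	}
	for _, m := range msgs {
		fmt.Println(m)
	}
}
```

A check like this makes an empty result caused by failed partitions distinguishable from a query that genuinely matched nothing.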

I can’t really make my search any more specific, as that’s the data I need.

I tried this command:

curl -XPUT -H "Content-type:application/json" \
  http://<username>:<password>@<ip>:8094/api/managerOptions \
  -d '{"bleveMaxClauseCount": "10000"}'

But got this error:
Method Not Allowed

Any suggestions?

Thanks again.
Ben

The cURL request needs to be a PUT request, not POST. Thanks for your help.

curl -X PUT -H "Content-type:application/json" \
  http://<username>:<password>@<ip>:8094/api/index/<index_name>/query \
  -d '{"query": {"field": "animals", "regexp": ".*cow.*"}}'

No. It needs to be a POST request. A PUT is not supported over that endpoint.

Sorry, my mistake. I meant to write that this PUT cURL command increases the MaxClauseCount, which solved my problem. I think I used the wrong method in my previous message.

curl -X PUT -H "Content-type:application/json" http://<username>:<password>@<ip>:8094/api/managerOptions -d '{"bleveMaxClauseCount": "10000"}'
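
For anyone who prefers to send this from Go rather than curl, here is a minimal sketch with net/http that builds the same PUT request. The function name newMaxClauseCountRequest is made up for this example, and the host and credentials in main are placeholders, not real values.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newMaxClauseCountRequest builds a PUT request to /api/managerOptions with
// the same JSON body as the curl command above. baseURL, user and pass are
// supplied by the caller.
func newMaxClauseCountRequest(baseURL, user, pass string, maxClauses int) (*http.Request, error) {
	// The curl command sends the value as a JSON string, so this does too.
	body := fmt.Sprintf(`{"bleveMaxClauseCount": "%d"}`, maxClauses)
	req, err := http.NewRequest(http.MethodPut, baseURL+"/api/managerOptions", strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, pass)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder host and credentials; replace with your own.
	req, err := newMaxClauseCountRequest("http://127.0.0.1:8094", "username", "password", 10000)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```

Sending the request is left as a comment so the sketch runs without a server; check the response status code before assuming the option was applied.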