Query/search performance

Hi,
Our typical workload in one of our applications is to execute an FTS (cbft) search query and then, using the returned keys, run a simple N1QL query with USE KEYS. We are on CB6 and the bucket contains about 250k documents of different types.

My concern is performance and in particular performance consistency…

  1. When I executed a very simple N1QL query on the workbench: SELECT x,y,z FROM bucket USE KEYS ['k1','k2',…,'kn'], I got inconsistent results. For the same query (same keys), the elapsed execution time ranged from 5ms to 50ms and sometimes more.

  2. The initial CBFT query also took longer than I expected.

A couple of questions here:

  1. Is there any difference in performance between SELECT … USE KEYS and a bulk multi-get (considering different document sizes; there are both small and large documents)?

  2. When dealing with cbft, in terms of performance, is there any difference between using the default analyzer and defining a field as “keyword”?

  3. Any advice on the inconsistencies I described above?

Thanks

If you are already doing an FTS search query and have the document keys, and you are not going to filter further, JOIN, or aggregate, but just want to project a few fields, you can try one of the following.

In the FTS index, store the fields you want to project as included_fields
OR
Use the SDK's subdoc API to get the required fields, or use the SDK and fetch them asynchronously
OR
Use N1QL as described above
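The subdoc option above can be sketched in an SDK-agnostic way: request only the named paths per key instead of pulling whole documents over the wire. Here `lookup` is a hypothetical stand-in for a per-key subdoc call (e.g. `lookup_in` in the Couchbase SDKs), and the in-memory `docs` dict plays the role of the server, purely for illustration:

```python
# Conceptual sketch of the subdoc idea: fetch only the named paths per key.
# `lookup` stands in for a per-key subdoc call (e.g. lookup_in in the SDKs).
def project(keys, paths, lookup):
    """lookup(key, path) -> value. Returns {key: {path: value}}."""
    return {k: {p: lookup(k, p) for p in paths} for k in keys}

# Stub "server" for illustration; a real lookup would go to Couchbase.
docs = {
    "k1": {"x": 1, "y": "a", "z": [1, 2], "blob": "..." * 1000},
    "k2": {"x": 2, "y": "b", "z": [3], "blob": "..." * 1000},
}
result = project(["k1", "k2"], ["x", "y", "z"], lambda k, p: docs[k][p])
print(result["k1"])  # {'x': 1, 'y': 'a', 'z': [1, 2]}
```

The point of the design is that the large `blob` field never crosses the network: with subdoc, the server reads only the requested paths of each document.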

Also check out:

@vsr1
thanks.

Not sure I followed your suggestion re storing the fields and projecting them as included_fields. Should I store each of them, or an additional field?

Re my questions, where can I find best practices on:

  • configuring a field as a keyword vs text
  • USE KEYS vs multi-get by keys

thanks

Hi @shaike.marc,

If your requirement is just to retrieve certain fields of the matching documents for a given full text search query,
then the FTS service already provides an option to index/store those fields as part of the FTS index, and you may retrieve the needed fields for each matching document as part of your original FTS query itself.

For this, the user needs to specify the desired x, y, z fields during FTS index creation and enable the “store” checkbox option while adding a child field, as specified here.
ref - Creating Search Indexes | Couchbase Docs

This ensures the original source document contents are stored intact in the FTS index without any text analysis process.

Then at query time, the user may specify the “fields” option in the search request object to retrieve all the necessary fields as part of the search request.
ref - https://docs.couchbase.com/server/current/fts/fts-response-object-schema.html#request

curl -XPOST -H "Content-Type: application/json" -u username:pwd http://host:port/api/index/FTS/query -d '{
  "fields": [
    "x", "y", "z"
  ],
  "query": {"field": "fieldName", "match_phrase": "query text"}
}'
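When the “fields” option is used, each hit in the response carries the stored fields alongside the document id, so no follow-up USE KEYS query is needed. A minimal sketch of consuming such a response; `sample_response` below only mimics the shape of the FTS response object, and its values are made up:

```python
# Sample shaped like an FTS search response (values are made up).
sample_response = {
    "total_hits": 2,
    "hits": [
        {"id": "k1", "score": 1.2, "fields": {"x": 1, "y": "a", "z": True}},
        {"id": "k2", "score": 0.9, "fields": {"x": 2, "y": "b", "z": False}},
    ],
}

def fields_by_key(response):
    # Each hit carries its document id plus the stored/projected fields,
    # so the search result alone answers the "SELECT x,y,z" need.
    return {hit["id"]: hit.get("fields", {}) for hit in response["hits"]}

print(fields_by_key(sample_response)["k1"])  # {'x': 1, 'y': 'a', 'z': True}
```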

When dealing with cbft, in terms of performance, is there any difference between using the default analyzer and defining a field as “keyword”?

keyword and default are text analysers, and I am not sure whether you have explored it from that angle. keyword treats the whole field content as a single non-analysed token, whereas the standard default analyser tokenises the content into multiple analysed terms.
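The difference can be illustrated with a rough Python simulation. This is NOT the actual bleve implementation that FTS uses (the real standard analyser also drops English stop words, omitted here); it only shows what each analyser does to a field value:

```python
import re

def keyword_analyze(text):
    # "keyword": the whole field value becomes one non-analysed token
    return [text]

def standard_analyze(text):
    # "standard" (the default): split on non-alphanumerics and lowercase
    # (the real analyser also removes English stop words, omitted here)
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

print(keyword_analyze("Quick Brown Fox"))   # ['Quick Brown Fox']
print(standard_analyze("Quick Brown Fox"))  # ['quick', 'brown', 'fox']
```

This is why a keyword field only matches on the exact full value, while a default-analysed field matches individual terms regardless of case.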

Details can be found here,
-Full-Text Search Indexing Best Practices & Tips - Part 1
-Understanding Analyzers | Couchbase Docs

You may experiment with analysers here - http://analysis.blevesearch.com/analysis

With the given details, it looks like your performance wrinkles mostly have to do with something other than the chosen analyser. Let us know the perf numbers once you exercise the store option mentioned above.

note - a support ticket with cbcollect_info output would be highly recommended for getting to the bottom of the perf worries. :slight_smile:

Cheers!