Full text search with N1QL query gives incorrect and inconsistent response

Hi, I’m trying to perform a search using FTS from N1QL using Couchbase server version 7.1.1.

From FTS search UI the results are consistent with correct response for the below query string,

{
  "explain": true,
  "fields": [
    "*"
  ],
  "highlight": {},
  "query": {
    "query": "+id:1001 +extension:pdf*"
  },
  "size": 10,
  "from": 0
}

Same query string when tried using N1QL query, it gives no results.

SELECT t.*
FROM test t
WHERE SEARCH(t, { "query": { "query": "+id:1001 +extension:pdf*" }, "size": 10, "from": 0 }, {'index':'fts_idx_test'})

And when the wildcard ‘*’ is removed from the query string, it gives inconsistent results. I’m assuming the query string should work similar in both the cases with or without wildcard if there is a match.
Kindly advise how to make this search work with above N1QL query along with FTS search.

Note: I have tried other syntaxes like ‘match’, ‘prefix’, ‘wildcard’ etc. Each have its own limitation either with case-sensitivity or with partial match.

FTS search UI,

N1QL run result,

Expected result out of N1QL query,

Note: here ‘*’ is removed from the query string to show the expected result.

While running FTS search requests from N1QL, we added a constraint that for non-analytic queries - the FTS index definition will need those fields to be indexed with the “keyword” analyzer. This is for strict consistency when N1QL needs to evaluate documents that it obtains directly from the database.

The keyword analyzer stores the content of the field exact as is (case sensitive).

The wildcard query is one such non-analytic query, you’ll need to the field extension to be indexed using the keyword analyzer to support your search from N1QL.

When you search without the wildcard, a match query (which is an analytic query) runs and works of the analyzer you’ve defined in the index definition. The default is standard which tokenizes text on unicode and applies the lower case filter over those tokens before indexing them. This is the reason why PDF is considered a match when you look for pdf.

1 Like

Thanks Abhinav. Trying to understand, now in my real use-case, I have many such column filters like depicted here as extension which needs case-insensitive search along with partial search (wildcard). How do I configure index for all such fields which are dynamic in nature?

here is my plain FTS_Index configuration (using default standard analyzer),

{
 "name": "fts_idx_test",
 "type": "fulltext-index",
 "params": {
  "doc_config": {
   "docid_prefix_delim": "",
   "docid_regexp": "",
   "mode": "scope.collection.type_field",
   "type_field": "type"
  },
  "mapping": {
   "default_analyzer": "standard",
   "default_datetime_parser": "dateTimeOptional",
   "default_field": "_all",
   "default_mapping": {
    "dynamic": true,
    "enabled": false
   },
   "default_type": "_default",
   "docvalues_dynamic": false,
   "index_dynamic": true,
   "store_dynamic": false,
   "type_field": "_type",
   "types": {
    "_default.test": {
     "dynamic": true,
     "enabled": true
    }
   }
  },
  "store": {
   "indexType": "scorch",
   "segmentVersion": 15
  }
 },
 "sourceType": "gocbcore",
 "sourceName": "TestDb",
 "sourceUUID": "05a25bdffd283bb3b132ee3530f7f78e",
 "sourceParams": {},
 "planParams": {
  "maxPartitionsPerPIndex": 1024,
  "indexPartitions": 1,
  "numReplicas": 0
 },
 "uuid": "aceb82bf288f6f12"
}

Hi @nikhilvjain

I think your issue is “store_dynamic”: should be set to true. But be aware just indexing everything uses more space and CPU rather than building a purpose built index for the fields your interested in.

Setting in the UI it will look something like this (Advanced/DocValues for Dynamic Fields)

Best

Jon Strabala
Principal Product Manager - Server‌

I have many such column filters like depicted here as extension which needs case-insensitive search along with partial search (wildcard).

I see, so that’s do-able. You can retain the dynamic nature of your index alongside specifying the keyword analyzer ONLY for field extension this way:

{
  "name": "fts_idx_test",
  "type": "fulltext-index",
  "params": {
    "doc_config": {
      "docid_prefix_delim": "",
      "docid_regexp": "",
      "mode": "scope.collection.type_field",
      "type_field": "type"
    },
    "mapping": {
      "default_analyzer": "standard",
      "default_datetime_parser": "dateTimeOptional",
      "default_field": "_all",
      "default_mapping": {
        "enabled": false,
        "dynamic": true
      },
      "default_type": "_default",
      "docvalues_dynamic": false,
      "index_dynamic": true,
      "store_dynamic": false,
      "types": {
        "_default.test": {
          "dynamic": true,
          "enabled": true,
          "properties": {
            "extension": {
              "enabled": true,
              "dynamic": false,
              "fields": [
                {
                  "name": "extension",
                  "type": "text",
                  "analyzer": "keyword",
                  "index": true
                }
              ]
            }
          }
        }
      }
    },
    "store": {
      "indexType": "scorch",
      "segmentVersion": 15
    }
  },
  "sourceType": "gocbcore",
  "sourceName": "TestDb",
  "sourceUUID": "05a25bdffd283bb3b132ee3530f7f78e",
  "sourceParams": {
    
  },
  "planParams": {
    "maxPartitionsPerPIndex": 1024,
    "indexPartitions": 1,
    "numReplicas": 0
  },
  "uuid": ""
}