Build Search Index from nested hierarchy


#1

Hi Guys,

We are trying to add search functionality to our application and decided to rely on FTS.
Documents that must be included in “search by name” have a nested structure that looks like this:

 {
  "type": "folders",
  "workspaceId": 100,
  "folders": [
    {
      "id": 107,
      "name": "Birds",
      "assetCount": 0,
      "children": [
        {
          "id": 108,
          "name": "Owl",
          "children": [],
          "assetCount": 10
        },
        {
          "id": 109,
          "name": "Parrot",
          "assetCount": 20,
          "children": [
            {
              "id": 110,
              "name": "Crazy Bird",
              "children": [ ],
              "assetCount": 30
            },
            {
              "id": 111,
              "name": "Penguin",
              "children": [ ],
              "assetCount": 40
            }
          ]
        }
      ]
    }
  ]
}

The use case is pretty simple.
A user searches for “bir” and the result set contains 2 matches:

match 1

{
  "id": 107,
  "name": "Birds",
  "assetCount": 0,
  "workspaceId": 100
}

match 2

{
  "id": 110,
  "name": "Crazy Bird",
  "assetCount": 30,
  "workspaceId": 100
}
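To pin down the expected behaviour, here is a minimal in-memory Python sketch. It is plain Python, not FTS or N1QL, and the helper names `iter_folders` and `search_by_name` are made up for illustration. It assumes "search by name" means a case-insensitive prefix match on each word of the folder name.

```python
# Trimmed copy of the folder document shown above.
doc = {
    "type": "folders",
    "workspaceId": 100,
    "folders": [
        {"id": 107, "name": "Birds", "assetCount": 0, "children": [
            {"id": 108, "name": "Owl", "assetCount": 10, "children": []},
            {"id": 109, "name": "Parrot", "assetCount": 20, "children": [
                {"id": 110, "name": "Crazy Bird", "assetCount": 30, "children": []},
                {"id": 111, "name": "Penguin", "assetCount": 40, "children": []},
            ]},
        ]},
    ],
}


def iter_folders(folders):
    """Yield every folder in the hierarchy, parents before children."""
    for folder in folders:
        yield folder
        yield from iter_folders(folder["children"])


def search_by_name(doc, term):
    """Return flat match records for folders with a word starting with term."""
    term = term.lower()
    return [
        {
            "id": f["id"],
            "name": f["name"],
            "assetCount": f["assetCount"],
            "workspaceId": doc["workspaceId"],
        }
        for f in iter_folders(doc["folders"])
        if any(word.startswith(term) for word in f["name"].lower().split())
    ]


matches = search_by_name(doc, "bir")
```

Here `matches` holds the two records listed above (ids 107 and 110), each carrying the `workspaceId` of the enclosing document.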

Is it possible to achieve this functionality using FTS or should we build a regular index and use N1QL?

Thank you.


#2

FTS usually identifies matches at document-level granularity.
Here the word Bird/Birds appears in the given document; it's just that the occurrences are at different positions within the same document.
FTS can return array positions relative to the whole document's hierarchical structure if you enable the “IncludeLocations” option in the search request. So, in theory, the application can work out every place where the search term appears in the document. (To do that, it can either fetch the document body from KV or use the "store" field option within the FTS index itself.)
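As a rough illustration of that last step, the Python sketch below maps one hit location back to the folder it belongs to. It assumes the location arrives as a dotted field path plus a list of array positions with one index per array crossed along the path; the helper name `resolve_folder` is hypothetical, and the document body is assumed to have been fetched already (for example from KV).

```python
# Trimmed copy of the folder document from the question.
doc = {
    "type": "folders",
    "workspaceId": 100,
    "folders": [
        {"id": 107, "name": "Birds", "assetCount": 0, "children": [
            {"id": 108, "name": "Owl", "assetCount": 10, "children": []},
            {"id": 109, "name": "Parrot", "assetCount": 20, "children": [
                {"id": 110, "name": "Crazy Bird", "assetCount": 30, "children": []},
                {"id": 111, "name": "Penguin", "assetCount": 40, "children": []},
            ]},
        ]},
    ],
}


def resolve_folder(doc, field_path, array_positions):
    """Walk field_path through doc, consuming one array index per list met.

    Returns a flat record for the folder that owns the matched field.
    """
    positions = iter(array_positions)
    node, parent = doc, None
    for key in field_path.split("."):
        parent = node          # remember the object that holds this key
        node = node[key]
        if isinstance(node, list):
            node = node[next(positions)]
    return {
        "id": parent["id"],
        "name": parent["name"],
        "assetCount": parent["assetCount"],
        "workspaceId": doc["workspaceId"],
    }


# Assumed location for the term "bird": folders[0].children[1].children[0].name
match = resolve_folder(doc, "folders.children.children.name", [0, 1, 0])
```

With the assumed positions `[0, 1, 0]`, `match` is the "Crazy Bird" record (id 110); a hit on `folders.name` with positions `[0]` would resolve to "Birds" (id 107).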

But for a very complex (many levels), high-density single document with many occurrences of the given search text, I am not sure how effectively the application could identify those child values.

Please note that if you want to search for partial words (like “bir”), FTS has a rich set of query types to explore
(prefix, regexp, wildcard, and edit distance, to name a few).
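For example, a prefix search for “bir” could be expressed with a request body along these lines. This is only a sketch: the helper `build_prefix_request` is hypothetical, no `field` is given so the query is assumed to run against the index's default field, and the term is lowercased on the assumption that the index analyzer lowercases tokens (prefix queries are not analyzed).

```python
import json


def build_prefix_request(term, size=10):
    """Build an FTS search request body for a prefix query.

    The body is only constructed here, not sent anywhere.
    """
    return {
        "query": {"prefix": term.lower()},  # match terms starting with `term`
        "includeLocations": True,           # ask for term locations in hits
        "size": size,                       # maximum number of hits to return
    }


body = build_prefix_request("Bir")
print(json.dumps(body, indent=2))
```

The resulting JSON would then be posted to the query endpoint of the FTS index; check the exact field names against the documentation for your server version.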

Please refer to the documentation for a detailed view:
https://docs.couchbase.com/server/6.0/fts/fts-queries.html

regards,
Sreekanth