Count/Aggregation

Yes, quicker, but now it’s not returning results. I must have done something wrong … I’ll play around with it.

{
  "name": "faksearch2",
  "type": "fulltext-index",
  "params": {
    "doc_config": {
      "docid_prefix_delim": "",
      "docid_regexp": "",
      "mode": "type_field",
      "type_field": "type"
    },
    "mapping": {
      "default_analyzer": "standard",
      "default_datetime_parser": "dateTimeOptional",
      "default_field": "_all",
      "default_mapping": {
        "dynamic": true,
        "enabled": false
      },
      "default_type": "_default",
      "docvalues_dynamic": true,
      "index_dynamic": true,
      "store_dynamic": false,
      "type_field": "_type",
      "types": {
        "type": {
          "dynamic": false,
          "enabled": true,
          "properties": {
            "type": {
              "enabled": true,
              "dynamic": false,
              "fields": [
                {
                  "docvalues": true,
                  "include_in_all": true,
                  "include_term_vectors": true,
                  "index": true,
                  "name": "type",
                  "type": "text"
                }
              ]
            }
          }
        }
      }
    },
    "store": {
      "indexType": "scorch"
    }
  },
  "sourceType": "couchbase",
  "sourceName": "ice_us",
  "sourceUUID": "021160cf87998bf9e4dc96303a90a13d",
  "sourceParams": {},
  "planParams": {
    "maxPartitionsPerPIndex": 171,
    "indexPartitions": 6,
    "numReplicas": 0
  },
  "uuid": "6f40814d6c9bcb55"
}

Cool, you’d want to make sure that the field name that you’re indexing matches with the field name you’re searching for.
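To illustrate that point, here is a minimal Python sketch that lists the field names an index definition declares and builds a search request against one of them. The helper names, the match query, and the `size` of 0 are assumptions for illustration; the thread does not show the query that was actually run. The helper only looks at top-level properties and ignores whether a mapping is enabled.

```python
import json

def indexed_field_names(index_def):
    """Collect field names from the top-level properties of each mapping."""
    mapping = index_def["params"]["mapping"]
    candidates = [mapping.get("default_mapping", {})]
    candidates += list(mapping.get("types", {}).values())
    names = set()
    for m in candidates:
        for prop in m.get("properties", {}).values():
            for field in prop.get("fields", []):
                names.add(field["name"])
    return names

def build_match_request(field, term):
    """Request body for a match query on one field; size 0 asks for no hit bodies."""
    return {"query": {"field": field, "match": term}, "size": 0}

# Trimmed-down definition with a single indexed field called "type".
index_def = {
    "params": {
        "mapping": {
            "default_mapping": {
                "enabled": True,
                "dynamic": False,
                "properties": {
                    "type": {"fields": [{"name": "type", "type": "text", "index": True}]}
                },
            }
        }
    }
}

request_body = build_match_request("type", "addressbook")
# The field named in the query should be one the index actually declares.
assert request_body["query"]["field"] in indexed_field_names(index_def)
print(json.dumps(request_body))
```

If the query named a field the index does not declare (for example a different "searchable as" name), the check above would fail, which is the mismatch being warned about.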

I’m trying to index literally the “type” value in the JSON:

{

<other data>

  "l7": "Harding",
  "d0": "CUSTACCP",
  "l8": "362 Helmson Ave",
  "type": "addressbook", <=== INDEX THIS
  "l9": "Apt 21",
  "cas": 0,
  "u0": "00918050",
  "u1": 4,
  "u2": "",
  "s1": "",
  "s2": false
}

I select:

JSON Type Field (type)
Unselect the default type mapping
Click “+ Add Type Mapping”
Add “type” as the type name … leave inherit on and select “only index specified fields”
Hit OK
Deselect the default/dynamic mapping
Hover back over the new type mapping and select ‘insert child field’

Select field: type
Select type: text
Select searchable as: type
Select analyzer: inherit

Select all options, but ensure “store” is deselected.

Create index.

Just noticed that your index wouldn’t have indexed anything at all.

  • Do NOT un-select the default type mapping.
  • Do select “Only index specified fields”.
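A rough way to see why the first definition indexed nothing: with "mode": "type_field" and "type_field": "type", a type mapping named "type" only applies to documents whose type value is literally "type", and the default mapping was disabled. The Python sketch below is a simplified model of that selection, not the actual FTS implementation; the function name and the trimmed definitions are assumptions for illustration.

```python
def mapping_for(doc, index_def):
    """Return the mapping that would apply to doc, or None if nothing is indexed.

    Simplified model: handles only the "type_field" mode.
    """
    params = index_def["params"]
    mapping = params["mapping"]
    doc_type = doc.get(params["doc_config"]["type_field"])
    chosen = mapping.get("types", {}).get(doc_type)
    if chosen is None:
        # No type mapping matches the document's type value: use the default.
        chosen = mapping.get("default_mapping", {})
    return chosen if chosen.get("enabled") else None

doc_config = {"mode": "type_field", "type_field": "type"}

# First attempt: default mapping disabled, one type mapping named "type".
first_attempt = {
    "params": {
        "doc_config": doc_config,
        "mapping": {
            "default_mapping": {"enabled": False, "dynamic": True},
            "types": {"type": {"enabled": True, "dynamic": False}},
        },
    }
}

# Suggested fix: default mapping enabled, no type mappings.
suggested = {
    "params": {
        "doc_config": doc_config,
        "mapping": {"default_mapping": {"enabled": True, "dynamic": False}},
    }
}

doc = {"type": "addressbook", "l7": "Harding"}
print(mapping_for(doc, first_attempt))  # None: no enabled mapping applies
print(mapping_for(doc, suggested))      # the enabled default mapping
```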

Thanks - that works but is slow again.

Do I still need to select ‘insert child field’, or not, since I’m applying it to all ‘type’ values?

This is incorrect. Follow these steps:

  • Drop the type mapping “type” and everything within it.
  • Within the default mapping, first select “only index specified fields” and then add a child field “type”.

Or instead, to make this even simpler, let’s do this …

  • Copy this index mapping into a file, say “temp.json”
{
  "name": "faksearch2",
  "type": "fulltext-index",
  "uuid": "",
  "sourceType": "couchbase",
  "sourceName": "ice_us",
  "sourceUUID": "021160cf87998bf9e4dc96303a90a13d",
  "sourceParams": {},
  "planParams": {
    "maxPartitionsPerPIndex": 171,
    "numReplicas": 0,
    "indexPartitions": 6
  },
  "params": {
    "mapping": {
      "default_mapping": {
        "enabled": true,
        "dynamic": false,
        "properties": {
          "type": {
            "enabled": true,
            "dynamic": false,
            "fields": [
              {
                "name": "type",
                "type": "text",
                "store": false,
                "index": true,
                "include_term_vectors": true,
                "include_in_all": true,
                "docvalues": true
              }
            ]
          }
        }
      },
      "default_type": "_default",
      "default_analyzer": "standard",
      "default_datetime_parser": "dateTimeOptional",
      "default_field": "_all",
      "store_dynamic": false,
      "index_dynamic": true
    },
    "store": {
      "indexType": "scorch"
    },
    "doc_config": {
      "mode": "type_field",
      "type_field": "type",
      "docid_prefix_delim": "",
      "docid_regexp": ""
    }
  }
}
  • Next run this command against your node …
curl -XPUT -H "Content-type:application/json" http://<username>:<password>@<ip>:8094/api/index/faksearch2 -d @temp.json
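For anyone scripting this step rather than using curl, a minimal Python sketch of the same PUT request follows. It only builds the request; the host, credentials, and helper name are placeholders I have assumed, and the URL shape is copied from the curl command above.

```python
import base64
import json
import urllib.request

def build_index_put(host, index_name, definition, username, password):
    """Build (but do not send) a PUT request carrying an index definition."""
    url = f"http://{host}:8094/api/index/{index_name}"
    body = json.dumps(definition).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="PUT")
    req.add_header("Content-Type", "application/json")
    # Basic auth header instead of embedding credentials in the URL.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    return req

# Hypothetical values; in practice load the definition with json.load(open("temp.json")).
definition = {"name": "faksearch2", "type": "fulltext-index", "uuid": ""}
req = build_index_put("127.0.0.1", "faksearch2", definition, "user", "pass")
print(req.get_method(), req.full_url)

# To actually send it against a live node:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```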

The UI would look like this …

Works … but back to slow.
Around 7-9k docs per 3-second refresh of the screen.
So let’s call it 3k docs per second …

That’s 4 hours to index the docstore …
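The arithmetic behind that estimate can be sketched as below. The size of the docstore is not stated in the thread, so the document count here is back-calculated from the quoted rate and the 4-hour figure; treat it as an implied number, not a measured one.

```python
# Figures quoted above: 7-9k docs per 3-second UI refresh.
docs_per_refresh = (7_000, 9_000)
refresh_seconds = 3

low_rate, high_rate = (d / refresh_seconds for d in docs_per_refresh)
print(f"observed rate: {low_rate:.0f} to {high_rate:.0f} docs/s")

# The post rounds this to 3k docs/s and estimates 4 hours in total.
rounded_rate = 3_000
estimated_hours = 4
implied_docs = rounded_rate * estimated_hours * 3600
print(f"implied docstore size: {implied_docs:,} docs")

def hours_to_index(doc_count, docs_per_second):
    """Hours needed to index doc_count documents at a steady rate."""
    return doc_count / docs_per_second / 3600

print(hours_to_index(implied_docs, rounded_rate))
```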

CPU is idle.

04:28:25 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
04:28:30 PM  all   15.70    0.00    7.66    0.14    0.00    0.24    0.00    0.00    0.00   76.25
04:28:30 PM    0   14.37    0.00    7.38    0.00    0.00    0.19    0.00    0.00    0.00   78.06
04:28:30 PM    1   17.90    0.00    7.78    0.19    0.00    0.19    0.00    0.00    0.00   73.93
04:28:30 PM    2   16.92    0.00    7.50    0.19    0.00    0.19    0.00    0.00    0.00   75.19
04:28:30 PM    3   13.66    0.00    8.16    0.00    0.00    0.19    0.00    0.00    0.00   77.99

That is pretty slow. Can you confirm that the index definition is exactly identical to what I shared earlier?

Hi @johnfak, have you checked the CPU usage of the cbft process specifically?
How much RAM do the nodes have, and what FTS RAM quota is set?
A healthy RAM/FTS quota helps with faster indexing too.

I believe so … although it formats it differently. Sorry, I was out Monday.

{
  "type": "fulltext-index",
  "name": "faksearch2",
  "uuid": "612e4eeb653ebc1b",
  "sourceType": "couchbase",
  "sourceName": "ice_us",
  "sourceUUID": "021160cf87998bf9e4dc96303a90a13d",
  "planParams": {
    "maxPartitionsPerPIndex": 171,
    "indexPartitions": 6
  },
  "params": {
    "doc_config": {
      "docid_prefix_delim": "",
      "docid_regexp": "",
      "mode": "type_field",
      "type_field": "type"
    },
    "mapping": {
      "analysis": {},
      "default_analyzer": "standard",
      "default_datetime_parser": "dateTimeOptional",
      "default_field": "_all",
      "default_mapping": {
        "dynamic": false,
        "enabled": true,
        "properties": {
          "type": {
            "dynamic": false,
            "enabled": true,
            "fields": [
              {
                "docvalues": true,
                "include_in_all": true,
                "include_term_vectors": true,
                "index": true,
                "name": "type",
                "type": "text"
              }
            ]
          }
        }
      },
      "default_type": "_default",
      "docvalues_dynamic": true,
      "index_dynamic": true,
      "store_dynamic": false,
      "type_field": "_type"
    },
    "store": {
      "indexType": "scorch"
    }
  },
  "sourceParams": {}
}

Basically, each of the 3 servers has 16GB RAM.
Not using MDS.

10GB to data
3GB to query
512MB (default) to search
1GB to analytics

512 MB is very low; try bumping it up to 2-3 GB or so. You may adjust the memory quota of the other services you are not using accordingly.
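A quick budget check with the quotas listed above shows why the other services need adjusting. The sketch uses the poster's own labels for the services and picks 2 GB for search as one point in the suggested 2-3 GB range. The `ftsMemoryQuota` form field is, to my knowledge, what the cluster REST API accepts on `/pools/default`; verify it against the documentation for your version before relying on it.

```python
from urllib.parse import urlencode

RAM_MB = 16 * 1024  # per node, as reported in the thread

# Quotas as listed by the poster (labels kept as written).
quotas_mb = {"data": 10 * 1024, "query": 3 * 1024, "search": 512, "analytics": 1024}

def headroom_mb(quotas, ram_mb=RAM_MB):
    """RAM left over for the OS and everything else after all quotas."""
    return ram_mb - sum(quotas.values())

print("current headroom:", headroom_mb(quotas_mb), "MB")

# Raising search to 2 GB without touching anything else uses up all the headroom.
bumped = dict(quotas_mb, search=2048)
print("headroom after bump:", headroom_mb(bumped), "MB")

# Form body for a quota update (assumed parameter name, value in MB).
form_body = urlencode({"ftsMemoryQuota": bumped["search"]})
print(form_body)
```

With these numbers the bump leaves no RAM unallocated, so some other quota has to shrink by at least as much as you want to keep free for the operating system.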

Definitely quicker, and more acceptable/normal for a large index.

The query is down from 40 seconds (N1QL) to around 3.5 to 4 seconds, pretty much on par with Redis.
Nice job @abhinav @sreeks

Final side note.
There seem to be many more options, and possibly faster access, via FTS than via N1QL and Analytics (based on this basic use case).

Is there a guide on when to use one over the other, based on performance/features?

Thanks again … very insightful.


There’s documentation on various query types supported by FTS here …
https://docs.couchbase.com/server/6.5/fts/fts-query-types.html

And here’s how to leverage FTS from within N1QL in the upcoming release …
https://docs.couchbase.com/server/6.5/n1ql/n1ql-language-reference/searchfun.html

As for guidelines on when to use N1QL+GSI vs FTS, I’ll ping @binh.le to check if he can point you to any available documentation, or if it’s a work in progress.

Thanks @abhinav
That would be great. I’m going to reach out to our Couchbase engagement guys and ask for a team demo on FTS/Analytics as well.

Appreciated.