Geo Search Problems

I created, as per the docs, my geo-based index, where I have an object called geo which holds values for lat and lon.
The first strange thing is that it claims 100% of my docs are indexed, but the doc count is only about 50% of the docs in the bucket.

Here is what my index looks like

{
 "name": "Geo_Index",
 "type": "fulltext-index",
 "params": {
  "doc_config": {
   "docid_prefix_delim": "",
   "docid_regexp": "",
   "mode": "type_field",
   "type_field": "_type"
  },
  "mapping": {
   "default_analyzer": "standard",
   "default_datetime_parser": "dateTimeOptional",
   "default_field": "_all",
   "default_mapping": {
    "dynamic": false,
    "enabled": true,
    "properties": {
     "geo": {
      "enabled": true,
      "dynamic": false,
      "fields": [
       {
        "docvalues": true,
        "include_in_all": true,
        "include_term_vectors": true,
        "index": true,
        "name": "geo",
        "store": true,
        "type": "geopoint"
       }
      ]
     }
    }
   },
   "default_type": "_default",
   "docvalues_dynamic": true,
   "index_dynamic": true,
   "store_dynamic": false,
   "type_field": "_type"
  },
  "store": {
   "indexType": "scorch"
  }
 },
 "sourceType": "couchbase",
 "sourceName": "rets",
 "sourceUUID": "68ba50313fd833d089e0c9704188988f",
 "sourceParams": {},
 "planParams": {
  "maxPartitionsPerPIndex": 171,
  "indexPartitions": 6,
  "numReplicas": 0
 },
 "uuid": "fd96992b583c4a96"
}
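
For reference, the mapping above (with `type_field` set to `_type` and the `geo` property under the default mapping) expects documents shaped roughly like this. This is a hypothetical example; the `_type` value and the `mlsNumber` field are assumptions, and only the `geo` object with numeric `lat`/`lon` matters for the geopoint field:

```json
{
  "_type": "listing",
  "mlsNumber": "12345",
  "geo": {
    "lat": 32.516891,
    "lon": -110.6887945
  }
}
```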

When I run my query like this:

curl -u Administrator:password -X POST \
-H "Content-Type: application/json" \
http://localhost:8094/api/index/Geo_Index/query \
-d '{
  "from": 0,
  "size": 10,
  "query": {
    "location": {
      "lon": -110.6887945,
      "lat": 32.516891
    },
    "distance": "5mi",
    "field": "geo"
  },
  "sort": [
    {
      "by": "geo_distance",
      "field": "geo",
      "unit": "mi",
      "location": {
        "lon": -110.6887945,
        "lat": 32.516891
      }
    }
  ]
}'

I get:

So why do I get errors, and why am I missing docs in the index?

You see a message “pindex not available” when the node that contains those pindexes is not reachable.
As you can see from the search response, 2 of your 6 pindexes aren’t available.
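
For context, those numbers come from the `status` object that every FTS search response carries. Here is a minimal sketch of reading it, using a made-up, abridged response; the pindex names and file name are assumptions:

```shell
# Save a hypothetical, abridged search response; real responses also
# carry "took", facets, etc. The pindex names here are invented.
cat > resp.json <<'EOF'
{
  "status": {
    "total": 6,
    "successful": 4,
    "failed": 2,
    "errors": {
      "Geo_Index_abc_0": "pindex not available",
      "Geo_Index_abc_1": "pindex not available"
    }
  },
  "total_hits": 0,
  "hits": []
}
EOF

# Count how many pindexes failed to answer vs. the total.
python3 - <<'EOF'
import json
status = json.load(open("resp.json"))["status"]
print(f'{status["failed"]} of {status["total"]} pindexes unavailable')
EOF
# -> prints "2 of 6 pindexes unavailable"
```

When `failed` is nonzero, the hit counts are computed only from the pindexes that did respond, so results are partial.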

I’d look into the node that hosts the inaccessible pindexes and make sure all the necessary ports are open so Couchbase can run cleanly. Here’s documentation on the necessary ports:
https://docs.couchbase.com/server/6.6/install/install-ports.html#detailed-port-description
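
As a quick sanity check along those lines, you could probe the FTS port (8094) on every node from each of the others. A sketch with placeholder host names (substitute your real node addresses); this only checks basic reachability, not the full port list from the docs:

```shell
# Placeholder host names -- replace with your actual nodes.
for node in cb-node1 cb-node2 cb-node3 cb-node4; do
  # curl exits 0 if any HTTP response comes back on the FTS port.
  if curl -s --connect-timeout 3 -o /dev/null "http://$node:8094/"; then
    echo "$node: FTS port 8094 reachable"
  else
    echo "$node: FTS port 8094 NOT reachable"
  fi
done
```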

I guess this might happen while it is still building the index, as this error is now gone. Even so, when I ran the query it said 100% completed and the count was off.

And here is a snapshot of the response; I didn’t change any settings on the servers.

@makeawish,

Are you sure there are documents in the bucket that qualify for the given query? If so, can you please share one such sample document here so that we can check further?

Cheers!

It seems like the index reporting on the server is not correct, as it originally reported around 650K docs and 100% completed, and then a few hours later it showed 1.2 million docs and also 100% complete. This bucket has a feed which adds about 10K documents per day, nowhere near 600K in 2 hours. Since I did not make any changes to the index, I can only assume it was not complete even though it reported complete.
FYI, this runs on a 4-node cluster with version 6.6; also, for some time 2 of the pindexes were not available.

From the pindex “not available” issue, it looks like there were intermittent connection issues between the FTS nodes.
Now, with the item count in the bucket showing variations, the chances are higher that the same connectivity issues also exist between your FTS and KV/data nodes.

At a given moment, the index progress is reported based on the stats from the bucket and the index at that moment. If there are connectivity issues between nodes, the source bucket numbers keep changing, which shows up as varying progress percentages.
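
One way to cross-check that reporting, independent of the UI progress figure, is to compare the index's own document count with the bucket's. A sketch assuming the same host and credentials as the query above; `/api/index/{name}/count` is the FTS count endpoint, and the N1QL comparison assumes the query service on port 8093 and a primary index on `rets`:

```shell
# Document count as the FTS index sees it.
curl -s -u Administrator:password \
  http://localhost:8094/api/index/Geo_Index/count

# Item count in the source bucket, via N1QL, for comparison.
curl -s -u Administrator:password http://localhost:8093/query/service \
  --data-urlencode 'statement=SELECT RAW COUNT(*) FROM `rets`'
```

If the two numbers diverge while the UI claims 100%, that points at stale or inconsistent stats rather than a truly complete index.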

How many documents are there in the bucket now? Was it only about 660K (650K plus at most 10K for a day)?
Also, it sounds like your nodes are hosting multiple services (not ideal for production clusters).

There was no connectivity issue, as all nodes sit on the same 10G switch on the same subnet. There was never a network issue; I have 3 systems monitoring it: Veeam One, VM Ops, as well as SolarWinds.
The total doc count is around 1.2 million records in that bucket, but in the early stage it reported around 650K and 100% index completed while the bucket had 1.2 million.


My point is that somehow the reporting must have been incorrect.