How many Indexes should one use and how to create them


#1

The below doc is a sample of one of 100 K doc in my store which are of type tract::id.
I created secondary index on the common fields like type, id, city etc. and it works pretty well when query by that.
But here is where i think i need some more index’s when i want to get all docs which have models with 4 + Bath or with pool, or where the house is on a certain street as i can get that info but it takes quite some time. Also as most my nbr’s are stored in a string i use the tonumber fuction , do i pay a huge penalty special on sort or select ?

So how would i create an secondary Index for the Street element which only stores array of string and as well as for models.Bath and Models.Pool

thanks

{
"type": "tractInfo",
"id": "001c4222-fbf3-4d74-b87d-882d7dc704df",
"tractid": "771",
"county": "Orange",
"Tract Name": "Gallery at Sorrento",
"B/T Code": "GALRY",
"City": "Cypress",
"City Code": "CYP",
"Area": "Cypress North of Katella",
"Area Code": "80",
"Nbhd": "Sorrento",
"Zip Code": "90630",
"created_on": "2018-10-12T00:31:37.828Z",
"created_by": "ap",
"feature": {
"Type": "SFR",
"Units": "",
"Attached": "No",
"Gated": "No",
"Pool": "No",
"Spa": "No",
"Tennis": "No",
"Golf": "No",
"55 +": "No",
"Builder": "Cypress Homes - Fieldstone",
"Notes": "BROCHURE BUTTON\nSUBSCRIBERS: After clicking on the Premium Report, a BROCHURE button will appear. Click on that TAB and PDF documents will appear with all of our information on this project. After that, you may either print or save the whole document or print one page at a time."
},
"street": [
 "Corso Cir",
"Linaro Dr",
"Ravari Dr",
"Ravenna Way",
"Tremezzo Dr (models)",
"Trevi St"
],
"model": [
    {
  "Model Name": "Plan One",
  "Model Code": "A",
  "Bed": "4",
  "Bath": "2.50",
  "SqFt": "2424",
  "Levls": "B",
  "Fam": "Yes",
  "Din": "DA",
  "Nook": "Yes",
  "Den": "No",
  "Bon": "No",
  "Loft": "No",
  "Fpl": "1",
  "Util": "IR",
  "Gar": "3"
},
{
  "Model Name": "Plan Two",
  "Model Code": "B",
  "Bed": "4",
  "Bath": "3.00",
  "SqFt": "2614",
  "Levls": "B",
  "Fam": "Yes",
  "Din": "DA",
  "Nook": "Yes",
  "Den": "No",
  "Bon": "No",
  "Loft": "No",
  "Fpl": "1",
  "Util": "IR",
  "Gar": "3"
},
{
  "Model Name": "Plan Three",
  "Model Code": "C",
  "Bed": "5",
  "Bath": "3.00",
  "SqFt": "2900",
  "Levls": "B",
  "Fam": "Yes",
  "Din": "DA",
  "Nook": "Yes",
  "Den": "No",
  "Bon": "No",
  "Loft": "No",
  "Fpl": "1",
  "Util": "IR",
  "Gar": "3"
}
]

#2

There is no limit on how many indexes you can create. If you know what field you are searching and limited you can create as you mentioned.

If you want to search on random field and want better performance may be Full Text Search might be better by indexing on whole document and querying.

If you are predicate is using TONUMBER(x) , index key also includes TONUMBER(x) the predicate will pushed to indexer and performs better.

For ARRAYS use ARRAY indexing.

https://docs.couchbase.com/server/5.5/n1ql/n1ql-language-reference/adaptive-indexing.html

Also useful info available here

https://blog.couchbase.com/a-guide-to-n1ql-features-in-couchbase-5-5-special-edition/


#3

Ok i went and created a basic index on street Array

CREATE INDEX `idx_tractInfo_street` ON `default`((distinct (`street`))) WHERE (`type` = "tractInfo")

if i now run the following query

SELECT d.tractid,d.City,d.type from default d
UNNEST d.street s
where s like '%Aloma%'

i don’t see large performance boost with the array index. I would have expected this to be much faster.

Without Index it takes around 608 ms for 4K documents which have anywhere from 1 to 15 streets in array .
Also it seems if i execute the same query over and over it sometimes hangs and takes more then a sec up to 5.
here is my plan with index

{
"plan": {
"#operator": "Sequence",
"~children": [
  {
    "#operator": "PrimaryScan3",
    "as": "d",
    "index": "undefined",
    "index_projection": {
      "primary_key": true
    },
    "keyspace": "default",
    "namespace": "default",
    "using": "gsi"
  },
  {
    "#operator": "Fetch",
    "as": "d",
    "keyspace": "default",
    "namespace": "default"
  },
  {
    "#operator": "Parallel",
    "~child": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Unnest",
          "as": "s",
          "expr": "(`d`.`street`)"
        }
      ]
    }
  },
  {
    "#operator": "Parallel",
    "~child": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Filter",
          "condition": "(`s` like \"%Aloma%\")"
        },
        {
          "#operator": "InitialProject",
          "result_terms": [
            {
              "expr": "(`d`.`tractid`)"
            },
            {
              "expr": "(`d`.`City`)"
            },
            {
              "expr": "(`d`.`type`)"
            }
          ]
        },
        {
          "#operator": "FinalProject"
        }
      ]
    }
  }
]
},
"text": "SELECT d.tractid,d.City,d.type from default d\nUNNEST d.street s\nwhere s like '%Aloma%'"
}

#4

Your index (partial index) only has data for (type = “tractInfo”) and query doesn’t have predicate (d.type = “tractInfo”) i.e asking for all the type. So the index will not qualify.

CREATE INDEX `idx_tractInfo_street` ON `default`( DISTINCT street, traceid, City) 
WHERE (`type` = "tractInfo");

SELECT d.tractid,d.City,d.type from default d
UNNEST d.street s
WHERE  s like '%Aloma% AND d.type = "tractInfo";

 SELECT d.tractid,d.City,d.type from default d
WHERE  ANY s  IN d.street SATISIFIES s LIKE '%Aloma% END AND d.type = "tractInfo";

Due to leading % in like you may not see major performance.


#5

Ok adding the d.type to where clause makes a huge difference as long as i don’t use '%like if i use like '%name% the performance is worse then without any array index. if i do ‘name%’ it is almost as fast as =‘name’.


#6

when you have leading % in like query service needs to get all the data from index and do filter on post index scan.

Checkout Example 17, 18 https://docs.couchbase.com/server/5.5/performance/index-scans.html#ex17-like-1

Also Checkout https://dzone.com/articles/a-couchbase-index-technique-for-like-predicates-wi