Array index not being used by query

Hi. I have a document as :

{
"firstName": "Mark",
"lastName": "Summers",
"gender": "M",
"phone": "+1 1769866284",
"birthDate ": "1977-03-10",
"type": "Visitor",
"visits": [
{
"startDate": "2017-12-16T06:12:00",
"endDate": "2017-12-16T21:30:00",
"departmentCode": "HS"
},
{
"startDate": "2017-12-02T06:12:00",
"endDate": "2017-12-02T21:30:00",
"departmentCode": "DL"
}
]
}

The indexes are :

  • CREATE INDEX IX_type_test on test(type)
  • CREATE INDEX IX_departmentCode_startDate_endDate_test ON test((distinct (array [(vs1.departmentCode), date_format_str((vs1.startDate), “1111-11-11”), date_format_str((vs1.endDate), “1111-11-11”)] for vs1 in visits END))) WHERE (type = “Visitor”)

And the query I am using is:

SELECT meta(v).id, v.firstName, v.middleName, v.lastName, v.birthDate, v.genderCode, v.phone
from test v
WHERE 
v.type = 'Visitor'
AND ANY visit IN v.visits SATISFIES 
visit.departmentCode= "HS" 
AND DATE_FORMAT_STR(visit.startDate,"1111-11-11") 
BETWEEN "2017-12-13" AND "2017-12-21"
AND DATE_FORMAT_STR(visit.endDate,"1111-11-11") BETWEEN "2017-12-13"
AND "2017-12-21"
END) 
;

This query is only using ‘type’ index to fetch data. Can anyone please suggest me what is wrong with the array index?

The index doesn’t qualify for query.

Your index has entries [(vs1.departmentCode), date_format_str((vs1.startDate), “1111-11-11”), date_format_str((vs1.endDate), “1111-11-11”)]. So SATISFIES caluse need to have some thing like this.
SATISFIES [(vs1.departmentCode), date_format_str((vs1.startDate), “1111-11-11”), date_format_str((vs1.endDate), “1111-11-11”)] = [“HS”,“2017-12-13”,“2017-12-13”] END
You can only use equality predicate.

The best option you have create array index on single key and apply rest as post scan filter.

This won’t help much as almost all the times, the filtering is done based on all these three fields in this collection. Can’t we make the array flat (something like unnest) and then apply index on these fields like other regular fields?

Hi @vsr1,

Tried covering array index and it has improved the query execution time significantly:

CREATE INDEX IX_visits_test
ON test(DISTINCT ARRAY v.departmentCode FOR v IN visits END, visits) WHERE (type = “Visitor”);

This index covers the all the filters done based on the collection attributes. One more question though: Is adding the fields that are coming in select statement to this index a good idea?

To make query covered you can add (add projections fields as trailing index keys). Run the query and see which one performs better.

when I unnest the visits collection, and select without any predicate based on collection fields, the index is not used. Is it the expected behavior from these type of indexes?
I read somewhere that this index covers UNNEST as well.

Any Index to use query required predicate.

For UNNEST use the index
         * Array Index key needs to be leading Index key
         * Array Index key requires ALL keyword NO DISTINCT keyword (Unnest semantics does not eliminate duplicates)
         * Unnest Alias needs to be matched with Array Index variable
         * Predicate must contain  ARAAY variable and WHEN clause.

Example:

CREATE INDEX IX_visits_test
ON test(ALL ARRAY v.departmentCode FOR v IN visits END, visits) WHERE (type = "Visitor");

SELECT v FROM IX_visits_test AS d UNNEST d.visits AS v WHERE d.type = "Visitor" AND v.departmentCode "HS";

If those doesn’t match you can use Array IndexScan by providing ANY predicate and apply Unnest post scan

Example:

 CREATE INDEX IX_visits_test
    ON test(DISTINCT ARRAY v.departmentCode FOR v IN visits END, visits) WHERE (type = "Visitor");

 SELECT v FROM IX_visits_test AS d UNNEST d.visits AS v WHERE d.type = "Visitor" AND ANY v IN d.visits SATISFIES v.departmentCode  = "HS" END;

The use index predicate must contain leading index key. The reason is Indexer does not maintain data if the leading index key value evaluates to MISSING. Without predicate on leading key query wants to results of MISSING on that field, so index will not be chosen. You can provide explicit predicate WHERE <leading key> IS NOT MISSING