Indexing on specific Fields in an Array of objects

{
  "animals": [
    {
      "type": "mammal",
      "species": "dog",
      "name": "FIDO"
    },
    {
      "type": "reptile",
      "species": "lizard",
      "name": "GEICO"
    },
    {
      "type": "mammal",
      "species": "cat",
      "name": "DUSTY"
    }
  ]
}

Let’s say I have JSON documents of animals in various shelters (example above).
I want to create index on all documents that contain a mammal and also contain a specific species of animal (user-inputted).
Essentially, the query would look something like this:

SELECT name
FROM shelters
USE INDEX(shelter_idx)
WHERE ANY animal in animals SATISFIES animal.type = "mammal"
AND animal.species = ${species_value} END # this could be "cat" for example
limit 100;

My idea for an index would look something like this:

CREATE INDEX `shelter_idx`
  ON `bucket`(DISTINCT ARRAY animal.type FOR animal IN animals WHEN animal.type = "mammal"
    AND animal.species END,
    name)
  WHERE animals IS VALUED
  AND name IS VALUED;

I’m honestly wondering if this can be simplified/improved past what I have currently and any suggestions/improvements on the topic would be appreciated!

Also I’ve looked online and found some relatively similar forum posts, but if anyone knows of a question similar to mine, a link would be appreciated as well!

Please try the index below:
CREATE INDEX shelter_idx
ON shelters(DISTINCT ARRAY animal.species FOR animal IN animals WHEN animal.type = “mammal” END,
name);
And refer to the execution plan as follows:
{
#operator”: “Sequence”,
“~children”: [
{
#operator”: “Sequence”,
“~children”: [
{
#operator”: “DistinctScan”,
“limit”: “100”,
“scan”: {
#operator”: “IndexScan3”,
“covers”: [
“cover ((distinct (array (animal.species) for animal in (shelters.animals) when ((animal.type) = “mammal”) end)))”,
“cover ((shelters.name))”,
“cover ((meta(shelters).id))”
],
“filter_covers”: {
“cover (any animal in (shelters.animals) satisfies (((animal.type) = “mammal”) and ((animal.species) = “cat”)) end)”: true
},
“index”: “shelter_idx”,
“index_id”: “b123cb17bd3ecda8”,
“index_projection”: {
“entry_keys”: [
1
],
“primary_key”: true
},
“keyspace”: “shelters”,
“namespace”: “default”,
“spans”: [
{
“exact”: true,
“range”: [
{
“high”: ““cat””,
“inclusion”: 3,
“low”: ““cat””
}
]
}
],
“using”: “gsi”
}
},
{
#operator”: “Parallel”,
“~child”: {
#operator”: “Sequence”,
“~children”: [
{
#operator”: “Filter”,
“condition”: “cover (any animal in (shelters.animals) satisfies (((animal.type) = “mammal”) and ((animal.species) = “cat”)) end)”
},
{
#operator”: “InitialProject”,
“result_terms”: [
{
“expr”: “cover ((shelters.name))”
}
]
},
{
#operator”: “FinalProject”
}
]
}
}
]
},
{
#operator”: “Limit”,
“expr”: “100”
}
]
}

1 Like

Sorry for the late reply, but I’ll try this, thanks!