Slow index count


#1

We have a very large index where we match content by some standard fields.

Let’s say we have the following index:

CREATE INDEX ON `bucket`(`large_set`, `medium_set`, `small_set`) WHERE `x`=1

Would it be more efficient to refactor the index to be:

CREATE INDEX ON `bucket`(`small_set`, `medium_set`, `large_set`) WHERE `x`=1

Assuming that all of our queries filter on all of the data in the ON clause?

Our current problem is index counts timing out:

SELECT count(1) FROM `bucket` WHERE `x`=1 AND `large_set` = "L" AND  `medium_set` = "M"  AND `small_set` = "L"

This is currently giving us problems (timeouts) with Couchbase 4.5.1…

Thanks - H


#2

If you have equality predicates it doesn’t matter which order you defined. You can try increase index memory and even indexer timeout. cc @deepkaran.salooja
Check query plan it should be using the IndexCountScan. Count might be large. 5.5.0 EE has Index aggregation feature https://blog.couchbase.com/understanding-index-grouping-aggregation-couchbase-n1ql-query/


#3

Thanks. Indeed the query plan looks clean (this is modified to redact proprietary info, so I did the best I could):

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "IndexCountScan",
          "covers": [
            "cover ((`bucket`.`large_set`))",
            "cover ((`bucket`.`medium_set`))",
            "cover ((`bucket`.`small_set`))",
            "cover ((meta(`bucket`).`id`))"
          ],
          "index": "index",
          "index_id": "3c601261cc5ad742",
          "keyspace": "bucket",
          "namespace": "default",
          "spans": [
            {
              "Range": {
                "High": [
                  "\"L\"",
                  "\"M\"",
                  "successor("\"S\"))"
                ],
                "Inclusion": 1,
                "Low": [
                  "\"L\"",
                  "\"M\"",
                  "\"S\""
                ]
              }
            }
          ],
          "using": "gsi"
        },
        {
          "#operator": "IndexCountProject",
          "result_terms": [
            {
              "as": "count",
              "expr": "count(1)"
            }
          ]
        }
      ]
    },
    "text": "SELECT count(1) as count FROM `bucket` WHERE `large_set` = \"L\" AND `medium_set` = \"M\" AND `small_set` = \"S\" AND ...."
  }
]