Slow count query


#1

Hi, we recently started with couchbase and n1ql is the best but we have some problem with pagination.
We use couchbase community edition 5 with one bucket and 2.000.000 documents.

This is the problem:
if we run query "SELECT COUNT(*) FROM bucket" return in 11.83ms, but if we run query “SELECT COUNT(*) FROM bucket WHERE attr_name=1” server response in 17.38s.
All the documents have attr_name equal to 1 and it’s indexed. On production we will have 30.000.000 docs.

Indexes on server :

  1. "CREATE PRIMARY INDEX index-name ON bucket"
  2. “CREATE INDEX second-index ON bucket(attr_name)”

How can i reduce the query count time?


#2

SELECT COUNT(*) FROM bucket;
Above query has no predicate that means it need to count all the documents in the bucket. So it uses bucket statistics. so it is fast.

SELECT COUNT(*) FROM bucket WHERE attr_name=1;
CREATE INDEX second-index ON bucket(attr_name)

Uses IndexCountScan which uses index to count the values. Which is taking time.


#3

ok, how can i count faster?


#4

is there any way to speed up the count process?


#5

Do EXPLAN and check if it uses IndexCountScan. If already using other options using MOI Indexes (Requires EE version)


#6

If we add 2 more index servers and create index with num_replica 2, it will be faster? Or maybe a bigger server with more cors and rams?

{
  "plan": {
    "#operator": "Sequence",
    "~children": [
      {
        "#operator": "IndexCountScan2",
        "covers": [
          "cover ((`logs`.`id_domain`))",
          "cover ((meta(`logs`).`id`))"
        ],
        "index": "logs_domain",
        "index_id": "d95775b56506cbb5",
        "keyspace": "logs",
        "namespace": "default",
        "spans": [
          {
            "exact": true,
            "range": [
              {
                "high": "1",
                "inclusion": 3,
                "low": "1"
              }
            ]
          }
        ],
        "using": "gsi"
      },
      {
        "#operator": "IndexCountProject",
        "result_terms": [
          {
            "expr": "count(*)"
          }
        ]
      }
    ]
  },
  "text": "SELECT COUNT(*) FROM logs WHERE id_domain=1;"
}

#7

Adding more replica would not improve the scan latency of a single scan. You can try with MOI(memory optimized indexes) in EE version or use partitioned indexes(in Couchbase Server 5.5 EE) which would process the query in parallel across partitions and lead to much improved scan latency.