Slow count query

grempelsgabriel · June 25, 2018, 6:42am

Hi, we recently started with couchbase and n1ql is the best but we have some problem with pagination.
We use couchbase community edition 5 with one bucket and 2.000.000 documents.

This is the problem:
if we run query "SELECT COUNT(*) FROM bucket" return in 11.83ms, but if we run query “SELECT COUNT(*) FROM bucket WHERE attr_name=1” server response in 17.38s.
All the documents have attr_name equal to 1 and it’s indexed. On production we will have 30.000.000 docs.

Indexes on server :

"CREATE PRIMARY INDEX index-name ON bucket"
“CREATE INDEX second-index ON bucket(attr_name)”

How can i reduce the query count time?

vsr1 · June 25, 2018, 12:51pm

SELECT COUNT(*) FROM bucket;
Above query has no predicate that means it need to count all the documents in the bucket. So it uses bucket statistics. so it is fast.

SELECT COUNT(*) FROM bucket WHERE attr_name=1;
CREATE INDEX second-index ON bucket(attr_name)

Uses IndexCountScan which uses index to count the values. Which is taking time.

grempelsgabriel · June 25, 2018, 5:40pm

ok, how can i count faster?

grempelsgabriel · June 27, 2018, 4:37am

is there any way to speed up the count process?

vsr1 · June 27, 2018, 5:04am

Do EXPLAN and check if it uses IndexCountScan. If already using other options using MOI Indexes (Requires EE version)

grempelsgabriel · June 27, 2018, 5:36am

If we add 2 more index servers and create index with num_replica 2, it will be faster? Or maybe a bigger server with more cors and rams?

{
  "plan": {
    "#operator": "Sequence",
    "~children": [
      {
        "#operator": "IndexCountScan2",
        "covers": [
          "cover ((`logs`.`id_domain`))",
          "cover ((meta(`logs`).`id`))"
        ],
        "index": "logs_domain",
        "index_id": "d95775b56506cbb5",
        "keyspace": "logs",
        "namespace": "default",
        "spans": [
          {
            "exact": true,
            "range": [
              {
                "high": "1",
                "inclusion": 3,
                "low": "1"
              }
            ]
          }
        ],
        "using": "gsi"
      },
      {
        "#operator": "IndexCountProject",
        "result_terms": [
          {
            "expr": "count(*)"
          }
        ]
      }
    ]
  },
  "text": "SELECT COUNT(*) FROM logs WHERE id_domain=1;"
}

deepkaran.salooja · June 27, 2018, 9:09pm

Adding more replica would not improve the scan latency of a single scan. You can try with MOI(memory optimized indexes) in EE version or use partitioned indexes(in Couchbase Server 5.5 EE) which would process the query in parallel across partitions and lead to much improved scan latency.

alpcan.yildiz · December 19, 2018, 8:18pm

Did you solve the problem ?