Poor query performance on Couchbase 5.0.1 CE

query
n1ql

#1

I’m evaluating Couchbase CE 5.0.1 (build 5003), I have a 4 nodes cluster on AWS (r4.2xlarge) each with 1TB volume and 61GB RAM.

I have a bucket with ~3M documents and all relevant fields are indexed using GSI.
Any COUNT query or any query with criteria or order clause takes a long time (e.g. Select Count(*) query takes over 18 seconds).

Looking at the query plan shows that the query is based on index scan only. I think all the index is in the RAM, there is no swap and yet, this is extremely poor performance (MySQL and PostgreSQL outperform Couchbase in these types of queries).

How can one optimize the query engine? is this the top performance of Couchbase?

Here is the explain plan results:

{
  "plan": {
    "#operator": "Sequence",
    "~children": [
      {
        "#operator": "IndexCountScan2",
        "covers": [
          "cover ((`innovi-events`.`_type`))",
          "cover ((meta(`innovi-events`).`id`))"
        ],
        "index": "idx_type",
        "index_id": "fb34f583f631a2b9",
        "keyspace": "innovi-events",
        "namespace": "default",
        "spans": [
          {
            "exact": true,
            "range": [
              {
                "high": "\"event\"",
                "inclusion": 3,
                "low": "\"event\""
              }
            ]
          }
        ],
        "using": "gsi"
      },
      {
        "#operator": "IndexCountProject",
        "result_terms": [
          {
            "as": "cnt",
            "expr": "count(*)"
          }
        ]
      }
    ]
  },
  "text": "SELECT count(*) cnt FROM `innovi-events` WHERE (_type = 'event')"
}

#2

Query plan is right, cc @deepkaran.salooja


#3

Hi @motty.cohen,

with Couchbase Community Edition, you’re still using forest-db which is a bit older/slower engine.
Even with the index in the RAM, it’s a different data structure than the the ones we have in the enterprise edition: MOI or Plasma. I expect them to be much faster. You can test with 5.0.1 EE.


#4

Thank you @keshav_m,
I’ll try it, but I was expecting the CE edition to be much faster than any other open source NoSQL DB (e.g. Cassandra / Elastic etc) or even RDBMS.
In this simple use case (query indexed documents using criteria and order) it doesn’t seem like Couchbase keeps the promise “World’s fastest NoSQL Database”. I’m disappointed.


#5

@motty.cohen, what is the Index RAM Quota? For good performance, the index ram quota needs to be greater than the sum of “indexed data size” of all indexes on a node.


#6

Index quota is 20GB, Im sure it is greater than the sum of indexed data.
How can I measure the sum of all indexed data? How can I determine if during index scan the query service performs memory swap?


#7

Each index has a “data_size” stat published on the UI mini-graphs. You can add those up. If all the indexed data is memory resident, then using EE storage engine is the only other option.