Count is taking a long time

Hello,
I have a 3-machine cluster with 32 GB RAM and 16 CPU cores each.

I have a bucket with 9M records.
The JSON is simple; an example follows:

{
  "notificationId": "0000052c-7c27-4a37-bf67-9d7e9536094b",
  "createdDate": "2016-02-25T11:26:19.1492023Z",
  "propertyId": 2574,
  "channelId": 1,
  "readedBy": null,
  "isReaded": false,
  "readedDate": null
}

(10 simple fields in total)

I have an index:
CREATE INDEX portalevents_channelId_propertyId ON PortalEvents(channelId,propertyId) USING GSI

When I query the results like this:
select * from PortalEvents where channelId = 252 offset 50 limit 50;

It’s very fast, like 30ms.

But when I do a count:
select count(*) from PortalEvents where channelId = 252;
or
select count(channelId) from PortalEvents where channelId = 252;

It takes 3 seconds.
I ran EXPLAIN, and it's using the index.

explain select count(channelId) from PortalEvents where channelId = 252;
{
  "requestID": "d4caed73-f28d-427e-afba-0e5479da9853",
  "signature": "json",
  "results": [
    {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "IndexScan",
          "index": "portalevents_channelId_propertyId",
          "keyspace": "PortalEvents",
          "limit": 9.223372036854776e+18,
          "namespace": "default",
          "spans": [
            {
              "Range": {
                "High": [
                  "successor(252)"
                ],
                "Inclusion": 1,
                "Low": [
                  "252"
                ]
              },
              "Seek": null
            }
          ],
          "using": "gsi"
        },
        {
          "#operator": "Parallel",
          "~child": {
            "#operator": "Sequence",
            "~children": [
              {
                "#operator": "Fetch",
                "keyspace": "PortalEvents",
                "namespace": "default"
              },
              {
                "#operator": "Filter",
                "condition": "((PortalEvents.channelId) = 252)"
              },
              {
                "#operator": "InitialGroup",
                "aggregates": [
                  "count((PortalEvents.channelId))"
                ],
                "group_keys": []
              }
            ]
          }
        },
        {
          "#operator": "IntermediateGroup",
          "aggregates": [
            "count((PortalEvents.channelId))"
          ],
          "group_keys": []
        },
        {
          "#operator": "FinalGroup",
          "aggregates": [
            "count((PortalEvents.channelId))"
          ],
          "group_keys": []
        },
        {
          "#operator": "Parallel",
          "~child": {
            "#operator": "Sequence",
            "~children": [
              {
                "#operator": "InitialProject",
                "result_terms": [
                  {
                    "expr": "count((PortalEvents.channelId))"
                  }
                ]
              },
              {
                "#operator": "FinalProject"
              }
            ]
          }
        }
      ]
    }
  ],
  "status": "success",
  "metrics": {
    "elapsedTime": "3.560815ms",
    "executionTime": "3.455688ms",
    "resultCount": 1,
    "resultSize": 3166
  }
}

So does anyone have an idea why this is so slow?
How can I improve the count query?

Thanks

Which version are you using? I'd recommend being on 4.1, where the query in the explain can use a covering index; that will dramatically speed up execution.
thanks
-cihan

I'm using Couchbase Server Community Edition 4.0.0.
Covering indexes are in the 4.1 Enterprise version only, right?

9M is our current number of records, but this number grows by +/- 500K per day.
All the data is in memory. The CPU doesn't go higher than 15%, and that is when I send multiple queries at the same time.

Other than covering indexes, is there anything I can do to improve this query?
I believe this time is going to increase every day.

Thanks

Covering indexes are available in Couchbase Server 4.1 Community Edition.

CREATE INDEX portalevents_channelId_propertyId ON PortalEvents(channelId,propertyId) USING GSI;

select count(1) from PortalEvents where channelId = 252;

The above count query is covered. In the upcoming 4.5 Beta, instead of an IndexScan we do an IndexCountScan, pushing the count down to the indexer. This should improve timing.

We also have the following additional improvements:
If the query permits, we push limit and offset down to the indexer.
If the query permits, a query with an ORDER BY will exploit the index order.

Example:
select * from PortalEvents where channelId = 252 offset 50 limit 50;

In the above query, offset + limit is pushed to the indexer, so the indexer only produces 100 records; on the result we apply the offset and limit.

select * from PortalEvents where channelId = 252 order by channelId offset 50 limit 50;

In the above query, the query order matches the index order. We exploit the index order and eliminate the query-level sort. As a result, offset + limit is pushed to the indexer and the indexer only produces 100 records; then we apply the offset and limit.

We plan to release the community edition for 4.1 in a few months, and covering indexes are not an EE-only feature, so you will have that. However, the Community Edition has other throughput limitations under highly concurrent query executions. For the best performance I'd recommend being on the Enterprise Edition. I should also mention the Enterprise Edition is free for development and testing.

For community edition, you can add more nodes to get better throughput.
thanks
-cihan


Thanks for the reply,
I believe that adding another machine won't do the trick.
So my options are waiting for Community Edition 4.1 or 4.5, or going to the Enterprise Edition, is that right?
When you say a few months for the Community Edition, is that 2, 3? Is it possible to know for sure?

Thank you

Yes, 4.1 released in December, so it will likely be around May when CE shows up. You can test with the Enterprise Edition now to see if covering indexes help with the execution latency.
thanks
-cihan

We can get you a build before that. Tell us what you need.

Hi,
If I can get the 4.1 CE version I will have access to covering indexes and reduce the time of the count query, like cihangirb said.
So if I could have a build of that version to update my servers, it would be great.

Thank you very much,
Best Regards

Can you download the EE version, or are you using this in production?

I'm using it in production.

Hello,
So geraldss, is it possible to have a 4.1 CE before May?
It would help me a lot, because I now have 15M records and the count time continues to increase.
Thanks

Best Regards
Rafael Félix

Hi @rafael.felix,
I'll be happy to let you know as soon as 4.1 CE gets out. Have you had a chance to try the covering index in 4.1 or 4.5 to see if it is helping with query latency? One option you could evaluate is to run the developer edition of 4.5 that is available on downloads until we can get the CE edition for you.
thanks
-cihan


Is it possible to upgrade from 4.0 CE to 4.5, or is a fresh installation needed?

Unfortunately the preview does not support upgrades, so that path is not fully tested. I'd recommend a new cluster and XDCR to move your data to 4.5.
thanks
-cihan

By the way, I am cihan@couchbase.com if you need help during the upgrade. Communicating through the forums is also fine.
thanks
-cihan

Hi again,
This weekend I took the time to test the Couchbase Server 4.5 DP.

In my development environment I set up 3 machines with the same hardware and software but a different Couchbase version/configuration.

Machine 1 – Couchbase Server 4.0 CE
Machine 2 – Couchbase Server 4.5 Enterprise DP – Global Indexes
Machine 3 – Couchbase Server 4.5 Enterprise DP – With Memory-Optimized Indexes

I put exactly the same documents on each machine.
The document structure is:
{
  "notificationId": "0191aea8-e181-4c8a-a45a-5cdd7a1430e4",
  "logId": "6bdea683-ae6c-421c-b604-cdc4a223fa3a",
  "createdDate": "2016-03-18T18:19:24.8221794Z",
  "action": 3,
  "subActions": [
    2,
    3
  ],
  "operation": 2,
  "propertyId": 1881,
  "propertyName": "Some Name",
  "channelId": 379,
  "dates": [
    {
      "dateFrom": "2016-03-18T00:00:00",
      "dateTo": "2016-03-31T00:00:00"
    }
  ],
  "rates": "rate 1",
  "rooms": "Q1",
  "prices": "110.00€",
  "allotment": "100",
  "closeSales": 0,
  "readedBy": null,
  "isReaded": false,
  "readedDate": null
}

To perform the count test I chose the channelId property.
The number of documents per channelId is:
Id 1000 -> 100,000
Id 1001 -> 200,000
Id 1002 -> 300,000
Id 1003 -> 400,000
The total number of documents is 1,000,000.
Every document was in memory.

Then I ran the following queries:

Machine 1:
Explain Select count(1) from PortalEvents where channelId = 1000;
{
  "requestID": "869914b1-5cba-42cf-9729-6c8c31801de9",
  "signature": "json",
  "results": [
    {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "IndexScan",
          "index": "portalevents_channelId_propertyId",
          "keyspace": "PortalEvents",
          "limit": 9.223372036854776e+18,
          "namespace": "default",
          "spans": [
            {
              "Range": {
                "High": [
                  "successor(1000)"
                ],
                "Inclusion": 1,
                "Low": [
                  "1000"
                ]
              },
              "Seek": null
            }
          ],
          "using": "gsi"
        },
        {
          "#operator": "Parallel",
          "~child": {
            "#operator": "Sequence",
            "~children": [
              {
                "#operator": "Fetch",
                "keyspace": "PortalEvents",
                "namespace": "default"
              },
              {
                "#operator": "Filter",
                "condition": "((PortalEvents.channelId) = 1000)"
              },
              {
                "#operator": "InitialGroup",
                "aggregates": [
                  "count(1)"
                ],
                "group_keys": []
              }
            ]
          }
        },
        {
          "#operator": "IntermediateGroup",
          "aggregates": [
            "count(1)"
          ],
          "group_keys": []
        },
        {
          "#operator": "FinalGroup",
          "aggregates": [
            "count(1)"
          ],
          "group_keys": []
        },
        {
          "#operator": "Parallel",
          "~child": {
            "#operator": "Sequence",
            "~children": [
              {
                "#operator": "InitialProject",
                "result_terms": [
                  {
                    "expr": "count(1)"
                  }
                ]
              },
              {
                "#operator": "FinalProject"
              }
            ]
          }
        }
      ]
    }
  ],
  "status": "success",
  "metrics": {
    "elapsedTime": "2.872714ms",
    "executionTime": "2.804859ms",
    "resultCount": 1,
    "resultSize": 3061
  }
}

Machine 2/3:
Explain Select count(1) from PortalEvents where channelId = 1000;
[
  {
    "#operator": "Sequence",
    "~children": [
      {
        "#operator": "IndexScan",
        "covers": [
          "cover ((PortalEvents.channelId))",
          "cover ((PortalEvents.propertyId))",
          "cover ((meta(PortalEvents).id))"
        ],
        "index": "portalevents_channelId_propertyId",
        "keyspace": "PortalEvents",
        "namespace": "default",
        "spans": [
          {
            "Range": {
              "High": [
                "successor(1000)"
              ],
              "Inclusion": 1,
              "Low": [
                "1000"
              ]
            }
          }
        ],
        "using": "gsi"
      },
      {
        "#operator": "Parallel",
        "~child": {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "Filter",
              "condition": "(cover ((PortalEvents.channelId)) = 1000)"
            },
            {
              "#operator": "InitialGroup",
              "aggregates": [
                "count(1)"
              ],
              "group_keys": []
            }
          ]
        }
      },
      {
        "#operator": "IntermediateGroup",
        "aggregates": [
          "count(1)"
        ],
        "group_keys": []
      },
      {
        "#operator": "FinalGroup",
        "aggregates": [
          "count(1)"
        ],
        "group_keys": []
      },
      {
        "#operator": "Parallel",
        "~child": {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "InitialProject",
              "result_terms": [
                {
                  "expr": "count(1)"
                }
              ]
            },
            {
              "#operator": "FinalProject"
            }
          ]
        }
      }
    ]
  }
]

Then I executed each of the following queries 11 times and took the average time.
Note: I discarded the first query time, because it was a lot slower than the other 10.

Machine 1:
Select count(1) from PortalEvents where channelId = 1000; - Average 11.24s
Select count(1) from PortalEvents where channelId = 1001; - Average 23.83s
Select count(1) from PortalEvents where channelId = 1002; - Average 34.42s
Select count(1) from PortalEvents where channelId = 1003; - Average 51.30s

Machine 2:
Select count(1) from PortalEvents where channelId = 1000; - Average 1.488s
Select count(1) from PortalEvents where channelId = 1001; - Average 2.907s
Select count(1) from PortalEvents where channelId = 1002; - Average 4.237s
Select count(1) from PortalEvents where channelId = 1003; - Average 5.435s

Machine 3:
Select count(1) from PortalEvents where channelId = 1000; - Average 1.087s
Select count(1) from PortalEvents where channelId = 1001; - Average 2.115s
Select count(1) from PortalEvents where channelId = 1002; - Average 3.25s
Select count(1) from PortalEvents where channelId = 1003; - Average 4.108s

All the machines are virtual machines and do not have the production resources, so we will focus on percentages only.

Results:
From Machine 1 (CB 4.0 CE) to Machine 2 (CB 4.5 Enterprise DP) we can see an average improvement of 87.91%.

From Machine 2 (CB 4.5 Enterprise DP) to Machine 3 (CB 4.5 Enterprise DP – Memory-Optimized Index) we can see an average improvement of 25.48%.

As we can see, from Couchbase Server 4.0 CE to Couchbase Server 4.5 DP there is a great improvement.
My production environment was taking 5.30s to count 90,000 documents out of 9,000,000. With the new version it will take +/- 600ms, if my math is correct. This is very good news.
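As a quick sanity check on my math, here is a small Python sketch (using the average timings from the tables above) that reproduces the quoted percentages and the production extrapolation:

```python
# Average count-query times (seconds) from the tables above.
m1 = [11.24, 23.83, 34.42, 51.30]   # Machine 1: 4.0 CE
m2 = [1.488, 2.907, 4.237, 5.435]   # Machine 2: 4.5 Enterprise DP
m3 = [1.087, 2.115, 3.25, 4.108]    # Machine 3: 4.5 Enterprise DP, memory-optimized

def avg_improvement(before, after):
    """Mean relative speed-up across the four channelId tests."""
    return sum((b - a) / b for b, a in zip(before, after)) / len(before)

print(f"M1 -> M2: {avg_improvement(m1, m2):.2%}")   # ~87.91%
print(f"M2 -> M3: {avg_improvement(m2, m3):.2%}")   # ~25.48%

# Extrapolating the production count query (5.30 s today) with the M1 -> M2 factor:
print(f"Expected production time: {5.30 * (1 - avg_improvement(m1, m2)):.3f} s")  # ~0.64 s
```

So the +/- 600ms estimate comes straight from applying the average 87.91% improvement to the current 5.30s production time.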

What I didn't like to see was this:
From Machine 2 channelId 1000 (count returns 100,000) to Machine 2 channelId 1001 (count returns 200,000) we see an increase of 95.36%.

From Machine 2 channelId 1001 (count returns 200,000) to Machine 2 channelId 1002 (count returns 300,000) we see an increase of 45.75%.

From Machine 2 channelId 1002 (count returns 300,000) to Machine 2 channelId 1003 (count returns 400,000) we see an increase of 28.27%.
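The step-by-step increases can be computed directly from the Machine 2 averages; a small Python sketch:

```python
# Machine 2 average count times (seconds) for 100K, 200K, 300K, 400K matches.
m2 = [1.488, 2.907, 4.237, 5.435]
counts = [100_000, 200_000, 300_000, 400_000]

# Relative increase between consecutive channelIds.
for (t0, t1), n in zip(zip(m2, m2[1:]), counts[1:]):
    print(f"{n:>7} matches: +{(t1 - t0) / t0:.2%}")

# Seconds per 100K matching documents: the per-match cost stays roughly flat,
# i.e. the count time grows close to linearly with the number of qualifying keys.
print([round(t / (n / 100_000), 3) for t, n in zip(m2, counts)])
```

In other words, the big percentage jumps mostly reflect the doubling and tripling of the qualifying document count, not a worsening per-document cost.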

We probably need a bigger test case with more documents, but from this we can see that the count time increases substantially as the number of matching documents increases.

Do you have any more numbers on this? Is this being taken into consideration in the 4.5 version?
Can I expect more improvements on the count query?

I'm really happy with the improvement in the new version. Is there a release date for the Community Edition of the 4.5 version?

Thank you
Best Regards

As the number of qualifying documents increases (channelId 1000 qualifies 100,000 documents, channelId 1001 qualifies 200,000 documents, etc.), the indexer needs to spend more time producing the keys and the query engine needs more time to process the documents. That is why the time varies for different filter values.

The upcoming 4.5 Beta utilizes an Index Count Scan (the indexer does the count, eliminating the need to send the keys to the query engine for processing) in certain cases, which improves timing on count queries.

Hi, we delay the community edition by a few months. The criteria are detailed here: http://www.couchbase.com/editions. We expect 4.5 in late Q2. Obviously timelines can shift, as we release only when a certain quality bar is reached. You can expect CE in Q4 with this plan.
thanks
-cihan

Hello,
The Index Count Scan sounds good.
I will wait for the Beta version to test it.
Thank you for all the help.

Best Regards