Index RAM used is at 0B

Hi, I am running a 2-node Couchbase 6.0 CE cluster and getting constant index scan timeouts after 8 minutes.
Looking at the stats UI page, I see that the “Index RAM used” is at 100%, but when looking under each index it says that the index memory used is 0B (zero).
All indexes are in the ready state. I don’t think this is healthy behavior (even though it’s a fresh install), as I see other CB installs behave differently. What could be the cause of the 0B memory usage by the index, and is it related to the index scan timeouts?
Thanks,
Shay

@shay

Could you please share the indexer.log files from all nodes to help analyse the issue? We may need a full cbcollect if the indexer.log files do not shed light on what is going wrong, so it would be good to run cbcollect now and save the output somewhere in case we need it later. (You can find instructions on how to collect logs at https://docs.couchbase.com/server/6.0/cli/cbcollect-info-tool.html)
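For reference, a typical invocation might look like this (assuming a default Linux install path; adjust for your platform, and run it on every node):

```sh
# Run on each node in the cluster; each run produces one zip of diagnostics
/opt/couchbase/bin/cbcollect_info /tmp/$(hostname)-cbcollect.zip
```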

@shay, the index-level “memory used” stat is available for the EE storage engine only. For CE it will always be 0 and can be ignored.

You may want to check how much memory quota is assigned to the Index Service (UI -> Settings) and increase it depending on how much data has been indexed. It would also be good to check how many rows the query needs to scan; full index scans can take much longer.
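If the quota turns out to be too low, it can also be raised via the REST API rather than the UI. A sketch, assuming the default admin port 8091 and placeholder credentials (the value is in MB):

```sh
# Raise the Index Service memory quota to 8GB (value is in MB)
curl -u Administrator:password -X POST \
  http://localhost:8091/pools/default \
  -d 'indexMemoryQuota=8192'
```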

Thanks. I have 40M rows, all WHERE and GROUP BY fields used by the query are indexed, and the Index Service has 4GB of RAM to use out of the 120GB of RAM on each machine. The documents are very short, so I’d really expect CB to perform better with those numbers, even with a full index scan. I will try to share the indexer log as @jeelan.poola suggested.
thank you @deepkaran.salooja!

There is nothing special in the indexer logs, but the last time I dropped an index I had to remove it manually from the filesystem, otherwise it was just stuck. Same thing now: after dropping and trying to build new indexes, no index is really getting built.

This is what I see in the indexer log at debug level; it repeats all the time:
```
2019-03-20T14:43:17.193+00:00 [Error] KVSender::openMutationStream MAINT_STREAM metrics Error from Projector feed.feeder
2019-03-20T14:43:17.193+00:00 [Error] Indexer::startBucketStream Stream MAINT_STREAM Bucket metrics Error from Projector feed.feeder. Retrying 16.
2019-03-20T14:43:22.352+00:00 [Error] KVSender::sendMutationTopicRequest Projector dap-dev-metricsdb-001.pipl.pro:9999 Topic MAINT_STREAM_TOPIC_a4ef00897d739a0c06bb52a646069f53 metrics Unexpected Error feed.feeder
```

The projector log shows the following errors:

```
[Error] FEED[<=>MAINT_STREAM_TOPIC_a4ef00897d739a0c06bb52a646069f53(127.0.0.1:8091)] ##1052 OpenBucketFeed("metrics"): dcp.invalidBucket
2019-03-20T14:44:26.075+00:00 [Error] DCP[secidx:proj-metrics-MAINT_STREAM_TOPIC_a4ef00897d739a0c06bb52a646069f53-5615393788131434779] ##1053 DcpFeed::connectToNodes StartDcpFeed failed for secidx:proj-metrics-MAINT_STREAM_TOPIC_a4ef00897d739a0c06bb52a646069f53-5615393788131434779/0 with err EOF
2019-03-20T14:44:26.075+00:00 [Error] DCP[secidx:proj-metrics-MAINT_STREAM_TOPIC_a4ef00897d739a0c06bb52a646069f53-5615393788131434779] ##1053 Bucket::StartDcpFeedOver : error dcp.invalidFeed in connectToNodes
2019-03-20T14:44:26.075+00:00 [Error] FEED[<=>MAINT_STREAM_TOPIC_a4ef00897d739a0c06bb52a646069f53(127.0.0.1:8091)] ##1053 OpenBucketFeed("metrics"): dcp.invalidBucket
2019-03-20T14:44:32.312+00:00 [Error] DCP[secidx:proj-metrics-MAINT_STREAM_TOPIC_a4ef00897d739a0c06bb52a646069f53-15874611169441861585] ##1054 DcpFeed::connectToNodes StartDcpFeed failed for secidx:proj-metrics-MAINT_STREAM_TOPIC_a4ef00897d739a0c06bb52a646069f53-15874611169441861585/0 with err write tcp 10.94.198.249:36478->10.94.198.249:11210: write: broken pipe
```

The EE index storage engine has much better performance. You can give that a try.

Also, for EE, GROUP BY and aggregation get pushed down to the indexer, which makes queries a lot more efficient.

https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/groupby-aggregate-performance.html
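To illustrate, here is a sketch of the kind of index and query where EE can push the aggregation down to the indexer (the `metrics` bucket name is taken from your logs; the `status` and `value` fields are hypothetical):

```sql
-- Index leads with the GROUP BY key and carries the aggregated field
CREATE INDEX idx_status_value ON `metrics`(status, value);

-- With EE, a COUNT/SUM grouped on the leading index key can be
-- computed inside the indexer rather than in the query service
SELECT status, COUNT(*) AS cnt, SUM(value) AS total
FROM `metrics`
WHERE status IS NOT MISSING
GROUP BY status;
```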

An index can also be partitioned to allow partitions to be scanned in parallel, reducing query latency.

https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/index-partitioning.html
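A minimal sketch of a partitioned index (EE only; field names are hypothetical, the `metrics` bucket is from your logs):

```sql
-- Hash-partitioning on the document key spreads the index across
-- partitions, which can then be scanned in parallel
CREATE INDEX idx_part ON `metrics`(status, value)
PARTITION BY HASH(META().id);
```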

Errors like the above seem to indicate that the projector has trouble talking to memcached. Is the memcached process up and running fine?
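A quick way to check from the shell on the affected node (Linux; exact command availability varies by distro):

```sh
# Is the memcached process alive?
ps aux | grep [m]emcached

# Is it listening on the data port the projector connects to (11210)?
netstat -tlnp | grep 11210
```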

The broken pipe error went away after restarting CB. The EE edition is not an option for now. Eventually I will need CB to aggregate across 900M docs or so; is that too much for CB? I can increase the indexer timeout, but I didn’t think I would need to do so for 40M docs.

I am creating some other indexes; I will rerun the query and run cbcollect_info during the run. I hope it will give me some idea of what’s going on. Is there any benchmark out there?

You should check whether the index is covering the query or not. All the time may not be getting spent in the index scan; if the query needs to fetch documents from the data service, that will have a significant impact.
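One way to check is EXPLAIN: when the index covers the query, the plan shows an IndexScan with a "covers" clause and no Fetch operator. A sketch (field names hypothetical):

```sql
-- Look for "covers" in the IndexScan section and the absence of a
-- Fetch operator in the resulting plan
EXPLAIN SELECT status, COUNT(*) AS cnt
FROM `metrics`
WHERE status IS NOT MISSING
GROUP BY status;
```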

That said, 900M may be too high a number for the CE storage engine. You can try and see how far you can push it. We have benchmarks for the EE storage engine.