In one of the other similar posts I left a comment, but that post was never answered, so I decided to open a new question.
It seems that Couchbase has a very frustrating limitation on the number of documents that can be stored in a node/cluster, depending on the total memory.
My calculations: taking an EC2 Large instance as a base and using the calculation provided in the Couchbase 2.0 manual, I get the following:
- An EC2 Large instance has 7.5 GB of memory, of which only about 5.5 GB can be dedicated to a bucket, so we use 5.5 GB.
- Suppose we have no replicas (no_of_copies = 1 + number_of_replicas (0) = 1).
- Document key length is 20 bytes (ID_size = 20).
- Metadata is 120 bytes per document (this is the space Couchbase needs to keep metadata for each document; all document keys and their metadata must live in memory at all times and should take no more than 50% of the memory dedicated to a bucket).
- Suppose the intended number of documents is 100M (100,000,000).
- For simplicity, we do not even take the total on-disk size of the documents into account.
Memory needed = (documents_num) * (metadata_per_document + ID_size) * (no_of_copies)
100,000,000 * (120 + 20) * 1 = 14,000,000,000 bytes ≈ 13 GB
13 GB * 2 (since keys and metadata may take at most 50% of the bucket memory) = 26 GB of memory needed to keep a 100M-document bucket operating
26 GB / 5.5 GB ≈ 4.73, i.e. 5 EC2 Large instances
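To make the arithmetic easy to play with, here is a small Python sketch of the same back-of-the-envelope formula. This is not an official Couchbase sizing tool; the function name, its defaults, and the 50% metadata ratio are simply the assumptions listed above:

```python
import math

def instances_needed(documents, key_size=20, metadata_size=120,
                     replicas=0, bucket_ram_per_node_gb=5.5,
                     metadata_ram_ratio=0.5):
    """Rough sizing: RAM (GB) and node count for key+metadata residency."""
    copies = 1 + replicas
    # Keys and metadata stay in RAM and may use only
    # `metadata_ram_ratio` of the bucket's memory quota.
    metadata_bytes = documents * (metadata_size + key_size) * copies
    total_ram_gb = metadata_bytes / metadata_ram_ratio / 1024**3
    nodes = math.ceil(total_ram_gb / bucket_ram_per_node_gb)
    return total_ram_gb, nodes

ram, nodes = instances_needed(100_000_000)
print(f"No replica:  {ram:.1f} GB RAM, {nodes} EC2 Large nodes")   # ~26.1 GB, 5 nodes
ram, nodes = instances_needed(100_000_000, replicas=1)
print(f"One replica: {ram:.1f} GB RAM, {nodes} EC2 Large nodes")   # ~52.2 GB, 10 nodes
```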
So, that means that to store 100M documents on EC2 Large instances we need at least 5 of them. That number of documents is nothing for any more or less serious project, yet the cost of the servers alone (I am not even talking about Couchbase support license fees) will be very hard to justify.
And if we want to have a replica, then the numbers double: roughly 52 GB of memory and about 10 instances.
These calculations are very rough. Using the full, more elaborate sizing formula would not change the result significantly, and might even make it worse.
I really hope that my calculations are wrong and I am looking forward to being disproved, because I like the product very much in a number of respects and would really like to use it in our projects.