I had read in some documentation that CB does not recommend more than 3-4 buckets on a small cluster.
What constitutes a small cluster, though, is a bit of a mystery. Can someone clarify what is considered a small cluster?
My question is: how do I determine how many buckets a given cluster (hardware) configuration will support? Also, does increasing the CPUs on a node allow for more buckets?
We have 4 CPU nodes, and they can handle 5 buckets at best. If we create more buckets than that, nodes fail every now and then, and we have to keep adding them back and rebalancing the cluster.
In Couchbase Server 2, most of the "background processes" used for replication, I/O, disk/file compaction, views, and other cleanup run per bucket.
So the reason the documentation recommends limiting the number of buckets per cluster as a best practice is really to avoid consuming too many resources on your servers.
We do not have a magic formula for the relation between nodes and buckets, as it depends a lot on the volume of data and the type of operations you run on them (for example: lots of mutations or not? views or not?).
I usually take another approach when I talk to developers about their projects:
- You want multiple buckets, OK, but why? Can you explain why you need more?
Then we see if this "best practice" limit is an issue or not.
Some interesting reading in relation to your question:
- Sizing 1: http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster
- Sizing 2: http://blog.couchbase.com/how-many-nodes-part-2-sizing-couchbase-server-20-cluster
Thanks for your response. Your question about why we need multiple buckets is a fair one. In our case, the answer is multi-tenancy. We are looking at hosting multiple consumers (different apps) on a single cluster, and the best way to segregate them is to allocate each one its own bucket.
I'll go through the links you have sent and come back if I have any further questions.
Fewer buckets (i.e. fewer than 4) and better machines (dedicated servers) is always the best way to go. It depends on how CB is being used: lots of SET/REPLACE/DELETE operations mean lots of compaction, and in that case having lots of buckets is not a good idea.
Are you going to be using views?
How large do you expect the views to be relative to the document data: 1X, 2X, 3X, etc.?
Right, got that, househippo.
And no views on these clusters. We are using these clusters purely for caching, so pretty much a key/value store. No funky document management happening on these nodes!
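For a pure key/value caching workload like this, one alternative to a bucket per tenant is to share a single bucket and namespace the keys per app. The sketch below is only an illustration under stated assumptions: `TenantCache`, the `::` separator, and the plain Python dict standing in for a bucket are all hypothetical, and no real Couchbase client call is used. It also gives up whatever isolation separate buckets provide, so it may not suit every multi-tenant setup.

```python
class TenantCache:
    """Hypothetical wrapper: prefixes every key with a tenant id so that
    several apps can share one key/value store without key collisions."""

    SEPARATOR = "::"  # assumed separator; any string not used in tenant ids works

    def __init__(self, store, tenant_id):
        if self.SEPARATOR in tenant_id:
            raise ValueError("tenant_id must not contain the separator")
        self._store = store  # any dict-like object (stand-in for a bucket)
        self._prefix = tenant_id + self.SEPARATOR

    def _key(self, key):
        # Build the namespaced key actually written to the shared store.
        return self._prefix + key

    def set(self, key, value):
        self._store[self._key(key)] = value

    def get(self, key, default=None):
        return self._store.get(self._key(key), default)

    def delete(self, key):
        # Deleting a missing key is a no-op, as is usual for a cache.
        self._store.pop(self._key(key), None)


# A plain dict stands in for the single shared bucket.
shared_bucket = {}
app_a = TenantCache(shared_bucket, "app_a")
app_b = TenantCache(shared_bucket, "app_b")

# Both apps use the same logical key without overwriting each other.
app_a.set("user:1", "alice")
app_b.set("user:1", "bob")
```

Each app only ever sees its own keys through its wrapper, while the cluster runs the per-bucket background processes for one bucket instead of one per tenant.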