I have a scenario where I want more than 10 buckets, but I have read that having more than 10 buckets carries some performance penalties. I would like to know what those performance penalties are and whether there is any way to minimize the impact on performance when the number of buckets is more than 10.
Each bucket in Couchbase consists of 1024 vBuckets, which are used to shard the data amongst the nodes in the cluster. These vBuckets are redistributed when adding/removing nodes to the cluster (this is called rebalancing). This is one of the main reasons why it’s recommended to use as few buckets as possible.
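To make the sharding concrete, here is a minimal sketch of how a key could be mapped to one of a bucket's 1024 vBuckets, and how the number of vBuckets the cluster has to manage grows with the bucket count. The CRC32-based formula and the function names are assumptions for illustration only, not a statement of the exact server or SDK implementation, and replica vBuckets are not counted.

```python
import zlib

NUM_VBUCKETS = 1024  # vBuckets per bucket, as described above


def vbucket_for_key(key: str) -> int:
    """Map a document key to a vBucket ID.

    Illustrative CRC32-based mapping (an assumption for this sketch);
    the point is only that the key alone decides the vBucket.
    """
    crc = zlib.crc32(key.encode("utf-8"))
    return ((crc >> 16) & 0x7FFF) % NUM_VBUCKETS


def total_vbuckets(num_buckets: int) -> int:
    """Active vBuckets the cluster manages for a given number of buckets."""
    return num_buckets * NUM_VBUCKETS


if __name__ == "__main__":
    for key in ("a_123", "a_789", "b_123"):
        print(key, "-> vBucket", vbucket_for_key(key))
    # Every extra bucket adds another 1024 vBuckets to redistribute on rebalance.
    print("10 buckets:", total_vbuckets(10), "vBuckets")
    print("30 buckets:", total_vbuckets(30), "vBuckets")
```

The second function is the relevant part for this question: the work done during a rebalance scales with the number of buckets, because each one brings its own full set of vBuckets.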
Does your use case really need that many buckets? Are you working on a multi-tenant project? You may be able to combine buckets by using discriminator values in your documents.
Further, keep an eye on blog.couchbase.com, because we are working on adding scope/collection capabilities to Couchbase that give you a more efficient way to organize documents and support multi-tenancy.
Thanks for the info.
The use case is that we have different types of documents, with a prefix in each key: documents of type “a” have keys like “a_123” and “a_789”, and documents of type “b” have keys like “b_123” and “b_789”. There are 10 million to 100 million documents of each type.
Keeping each type of document in a separate bucket would let us keep a preferred number of documents of each type in RAM, so the number of buckets might exceed the 10-bucket limit.
Do you have any suggestion on how I can tell Couchbase to keep documents in RAM based on document type (key prefix)? That might be another option, and it would keep the number of buckets low as well.
The same approach might also help when we go multi-tenant.
Also, can you explain what a “discriminator value” is?
Okay, so in your case it isn’t about organization, per se, it’s about being able to customize memory usage for different types of data. I’m going to tag @chinhong as someone who might be able to help you, as there are some upcoming features that may help you. Multiple clusters might be a good option for you as well.
As for “discriminator”: something like a “type” field in your documents that helps organize them. But it sounds like you already have your documents organized.
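For illustration, here is a rough sketch of what a “type” discriminator could look like when several document types share one bucket. It uses plain Python dictionaries rather than a real SDK; the bucket name `shared`, the helper names, and the sample fields are all hypothetical, and the N1QL string is only an example of the kind of filter you would write.

```python
def with_type(key: str, body: dict) -> dict:
    """Return a copy of the document with a "type" discriminator field.

    Here the type is taken from the key prefix ("a_123" -> "a"),
    matching the key scheme described earlier in the thread.
    """
    doc_type = key.split("_", 1)[0]
    return {**body, "type": doc_type}


def docs_of_type(bucket: dict, doc_type: str) -> dict:
    """Select documents from the shared bucket by their discriminator."""
    return {k: d for k, d in bucket.items() if d.get("type") == doc_type}


# Hypothetical sample documents that would otherwise live in separate buckets.
raw_docs = {
    "a_123": {"name": "first"},
    "a_789": {"name": "second"},
    "b_123": {"name": "third"},
}

# One shared "bucket" holding every type, each document tagged with its type.
shared_bucket = {key: with_type(key, body) for key, body in raw_docs.items()}

# The equivalent filter as a N1QL query (bucket name `shared` is hypothetical).
QUERY = 'SELECT d.* FROM `shared` AS d WHERE d.type = "a"'

if __name__ == "__main__":
    print(sorted(docs_of_type(shared_bucket, "a")))
    print(QUERY)
```

This keeps the bucket count down, but it does not by itself give each type its own memory quota, which is the part of your use case that still needs a different answer.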
Sounds interesting. Is there more info about this somewhere?
It’s very much alpha-release right now, but you can learn some things about upcoming scope/collection capabilities in this blog post by @jmorris: https://blog.couchbase.com/introducing-the-couchbase-net-sdk-3-0-alpha-releases/
Thanks @matthew.groves. I believe it would be really useful in the scenario I am trying to solve. Looking forward to it.
Using collections to organize data has many benefits including:
Ability to refer to similar documents as a unit for various purposes such as building an index, setting up replication, querying, backup/restore etc.
Does it also provide the ability to configure hardware at the collection level, or at the scope level? I mean configuring the working dataset at the collection level.
It’s good to know that up to 30 buckets would be supported in Couchbase 6.5.