Distributing a bucket over multiple servers, and creating a view on it

Hi there,

I have created my Couchbase cluster over a few servers, and because I didn’t want to create different views over more than one bucket, I tried to keep all my data in one bucket. The problem I’m facing is that while inserting the data, after exceeding 50% of the available space on all servers, Couchbase started to alert me:
"Metadata overhead warning. Over 52% of RAM allocated to bucket “default” on node “…” is taken up by keys and metadata."
I know I could solve the problem by creating more than one bucket, but then querying data from an application would be a problem, as I would need to query the views on every bucket and gather the results instead of querying one view.

Is there a way to keep all my data in one bucket? If not, is there a way to create a view over multiple buckets so that Couchbase queries them all automatically? And if not, what would be the best way to deal with this?
Again, my aim is to reduce the amount of processing I’m doing inside my application.

Many thanks,

How many bytes are your keys and values, typically? How many replicas do you have? What is the total memory size of your cluster?

I’m feeding my data from a Java application.
My key is a string with a maximum of 13 characters, and the values are objects of the following class:
{id int,
string(20),
string(20),
string(100),
string(20),
string(18),
string(20),
string(100),
int,
string(200)
}
I have around 7000000 records, but it started giving those alerts after 3500000 records, so I had to stop it.

There is only one replica of this bucket (just the default replica).

The cluster has 8 nodes with about 4GB RAM each, and the server nodes tab was showing around 40% free RAM on four of them.
The other issue I’ve just noticed is that after deleting all buckets it still shows over 80% RAM usage on two nodes and over 50% on the rest. After stopping couchbase-server and rebooting them all, the number fell to around 11%. Is this normal?

Hello,

Let me answer your question in two parts:

  • Buckets
  • Views

Buckets
When you have an issue with memory, adding a new bucket will not help. You just need to give your bucket more RAM, and for this you have two options:

  • add new nodes to your cluster
  • add RAM to your bucket
  • or both :wink:

Since you are ready to add a new bucket, you must have more memory available in your cluster, so just allocate more of that RAM to your existing bucket and that will be it. The message is only telling you that metadata is starting to take up a large share of RAM compared to your working set. As you know, Couchbase keeps all keys and metadata in RAM so that it can find information in the database very quickly. By default this warning is raised when keys and metadata reach 50% of the RAM allocated to the bucket. You can change the threshold, but it is usually a very useful warning that prompts you to look at how your application uses the bucket.
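
As a rough check on the numbers in this thread, here is a minimal Java sketch that estimates how much RAM keys and metadata take up. It is only a back-of-the-envelope calculation under stated assumptions: the per-item metadata overhead (`METADATA_BYTES_PER_ITEM`, set to 56 bytes here) is an assumed figure that is not given in this thread and varies by server version, keys are taken as 13 single-byte characters, each replica is assumed to hold its own copy of the key and metadata in RAM, and items are assumed to be spread evenly over the 8 nodes. The 128 MiB quota in `main` is purely hypothetical.

```java
// Back-of-the-envelope estimate of RAM used by keys and metadata.
class MetadataEstimate {

    // Assumption: per-item metadata overhead in bytes. Not stated in this
    // thread; check the sizing guidelines for your server version.
    static final long METADATA_BYTES_PER_ITEM = 56;

    // Bytes of keys plus metadata across the whole cluster.
    // Each replica is assumed to keep its own copy in RAM.
    static long metadataBytes(long items, int replicas, int keyBytes) {
        long copies = 1L + replicas;
        return items * copies * (keyBytes + METADATA_BYTES_PER_ITEM);
    }

    // Fraction of a bucket quota taken up by keys and metadata.
    static double metadataShare(long metadataBytes, long bucketQuotaBytes) {
        return (double) metadataBytes / bucketQuotaBytes;
    }

    public static void main(String[] args) {
        // Figures from the thread: 3,500,000 items, 1 replica, 13-byte keys, 8 nodes.
        long total = metadataBytes(3_500_000L, 1, 13);
        long perNode = total / 8; // assumes an even spread across nodes

        System.out.println("Cluster-wide key + metadata bytes: " + total);
        System.out.println("Per-node key + metadata bytes:     " + perNode);

        // Hypothetical per-node bucket quota, only to show how the share is computed.
        long hypotheticalQuota = 128L * 1024 * 1024;
        System.out.printf("Share of a 128 MiB per-node quota: %.0f%%%n",
                100 * metadataShare(perNode, hypotheticalQuota));
    }
}
```

If the per-node figure comes out small compared with the physical RAM on each node, that would suggest the warning is driven by the RAM quota given to the bucket rather than by the machines themselves, which is why raising the bucket quota is the first thing to try.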

The fewer buckets you have in your cluster, the better. We usually say do not create more than 10 buckets in one cluster. Keep in mind that most operations, such as indexing, replication and compaction, are done per bucket, so the more buckets you have, the more server resources you will consume. See:
http://docs.couchbase.com/couchbase-manual-2.2/#setting-maximum-buckets-for-clusters

Views
You are asking whether it is possible to use a view (and query it) over 2 buckets. The answer is simple: no. It is not possible to write a view that spans 2 buckets and query it.
Most of the time people create separate buckets when:

  • they have security constraints and want to be sure that the data is kept separate
  • they have really different use cases: for example, one part of the application uses documents with views, and another part uses simple key/value access with TTL and no views.

Views are also automatically distributed across all the nodes of the cluster, since indexing is done on the node where the document is located.

You can find more information about Couchbase architecture in this white paper:
http://info.couchbase.com/couchbase-server-architecture-review.html

Let me know if you have any questions.

Regards
Tug
@tgrall

Many thanks, this is really helpful information.
By the way, I’m trying to collect some statistics from my Java application, and for this I’m using client.getStats().
However, it returns about 15 pages of server statistics. I’ve been searching online and in the PDF "Couchbase Server Manual 2.0" for how to request only certain statistics for certain servers, so it would be really handy if you could point me to some documentation.

Thanks again,
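
On the statistics question, one option is to filter the result on the client side. As far as I know, in the 1.x Java client `getStats()` returns a `Map<SocketAddress, Map<String, String>>` with one inner map of stat name to value per server, so a small helper can keep only the servers and stat names you care about. The sketch below is not part of the client library: `StatsFilter`, `select` and `sample` are hypothetical names, and the host names, stat names and values in `sample()` are made-up data in the same shape, so the code runs without a cluster.

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: trims a per-server stats map down to chosen hosts and stat names.
class StatsFilter {

    // Keep only the requested stat names, and only for the requested hosts.
    static Map<SocketAddress, Map<String, String>> select(
            Map<SocketAddress, Map<String, String>> allStats,
            Set<String> hosts,
            Set<String> statNames) {
        Map<SocketAddress, Map<String, String>> out = new LinkedHashMap<>();
        for (Map.Entry<SocketAddress, Map<String, String>> node : allStats.entrySet()) {
            SocketAddress addr = node.getKey();
            if (!(addr instanceof InetSocketAddress)) {
                continue; // no host name to compare for other address types
            }
            String host = ((InetSocketAddress) addr).getHostString();
            if (!hosts.contains(host)) {
                continue;
            }
            Map<String, String> picked = new TreeMap<>();
            for (String name : statNames) {
                String value = node.getValue().get(name);
                if (value != null) {
                    picked.put(name, value);
                }
            }
            out.put(addr, picked);
        }
        return out;
    }

    // Made-up data standing in for the client's stats map (hosts and values are hypothetical).
    static Map<SocketAddress, Map<String, String>> sample() {
        Map<SocketAddress, Map<String, String>> all = new LinkedHashMap<>();
        Map<String, String> n1 = new TreeMap<>();
        n1.put("curr_items", "437500");
        n1.put("mem_used", "98000000");
        n1.put("uptime", "3600");
        all.put(InetSocketAddress.createUnresolved("node1.example", 11210), n1);
        Map<String, String> n2 = new TreeMap<>();
        n2.put("curr_items", "437400");
        n2.put("mem_used", "97000000");
        n2.put("uptime", "3590");
        all.put(InetSocketAddress.createUnresolved("node2.example", 11210), n2);
        return all;
    }

    public static void main(String[] args) {
        Set<String> hosts = new HashSet<>(Arrays.asList("node1.example"));
        Set<String> names = new HashSet<>(Arrays.asList("curr_items", "mem_used"));
        for (Map.Entry<SocketAddress, Map<String, String>> e
                : select(sample(), hosts, names).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With a real client you would pass the map returned by `client.getStats()` as the first argument instead of `sample()`. Filtering on the client side still fetches everything from every server, so it tidies the output but does not reduce the work done by the call.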