Performance got very slow with a lot of data


#1

Hi,

I’ve noticed that my server’s performance became very slow all of a sudden. The only thing that has changed is the number of documents in my bucket.

I have 5,697,082 documents in my bucket, and I see that ‘get’ performance has become very slow.

I have just one server for testing.

Is there anything I can do?


#2

Having 5 million documents shouldn’t slow down gets.

How slow are your gets?

One guess would be that you’ve outgrown the cache and you’re doing many more gets from disk rather than RAM.

So, my questions are:

  • Which version of Couchbase are you using?
  • How big is your data set in terms of size, not number of docs?
  • How big is your cache?
  • What happens to the get speed if you add more than one node to the test cluster?
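To make the cache guess above concrete, here is a minimal sketch of the arithmetic in Python. It is not a Couchbase tool, and every constant is an assumption: the 56-byte per-item metadata overhead, the 85% high water mark, the 32-byte average key, the candidate document sizes, and the 8 GB quota are placeholders to be replaced with real values. The function name `resident_ratio` is hypothetical.

```python
# Rough estimate of how much of a data set can stay resident in RAM.
# All constants are assumptions for illustration, not measured values.

METADATA_BYTES_PER_ITEM = 56   # assumed per-item metadata overhead
HIGH_WATER_MARK = 0.85         # assumed fraction of the quota usable before ejection


def resident_ratio(num_docs, avg_doc_bytes, avg_key_bytes, ram_quota_bytes):
    """Return the fraction (0.0 to 1.0) of document values that fit in RAM."""
    values = num_docs * avg_doc_bytes
    if values == 0:
        return 1.0  # nothing to store, so everything "fits"
    usable = ram_quota_bytes * HIGH_WATER_MARK
    # Keys and metadata are assumed to stay in memory for every item.
    metadata = num_docs * (METADATA_BYTES_PER_ITEM + avg_key_bytes)
    room_for_values = max(usable - metadata, 0)
    return min(room_for_values / values, 1.0)


if __name__ == "__main__":
    quota = 8 * 1024**3  # hypothetical 8 GB bucket quota
    for avg_size in (512, 2048, 8192):  # hypothetical average document sizes
        ratio = resident_ratio(5_697_082, avg_size, 32, quota)
        print(f"avg doc {avg_size:>5} B -> about {ratio:.0%} of values resident")
```

If the ratio comes out well below 100% for your real numbers, a growing share of gets will have to go to disk, which would match the slowdown you describe.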

Cheers,

Matthew.


#3

One other question: what are the size and type of your documents, and are you using views?
By type I mean JSON or non-JSON, and by size I mean the average size (100 KB, 100 bytes, etc.).

Austin


#4

Hi Matthew,

It’s actually not that slow if I do just a single get operation. I’m using multiple PHP scripts to import a MySQL database into Couchbase through PHP classes (the classes contain the business logic and know how to format the MySQL data as JSON documents). I was getting around 3k documents/second. All of a sudden, I started getting 1.4k/second. I removed two gets from the PHP classes and the speed came back.
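To put those numbers in perspective, here is a back-of-the-envelope sketch in Python. It assumes the rates are documents per second and treats the import as one serial stream, which is a simplification because several scripts run in parallel. The function names are hypothetical.

```python
# Back-of-the-envelope cost of the two removed gets, derived from throughput.
# Assumption: the import behaves like a single serial stream of documents.


def per_doc_ms(docs_per_second):
    """Average wall-clock time spent per document, in milliseconds."""
    return 1000.0 / docs_per_second


def estimated_get_cost_ms(fast_rate, slow_rate, gets_removed):
    """Extra time per document at the slow rate, split across the removed gets."""
    extra_ms = per_doc_ms(slow_rate) - per_doc_ms(fast_rate)
    return extra_ms / gets_removed


if __name__ == "__main__":
    cost = estimated_get_cost_ms(fast_rate=3000, slow_rate=1400, gets_removed=2)
    print(f"each get adds roughly {cost:.2f} ms per document")
```

Under those assumptions each get accounts for roughly 0.19 ms per document. With N scripts running in parallel, the real latency of an individual get would be about N times that figure.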

This is my setup.

  • Version 3.0.3-1716-rel on Mac OS
  • I have one bucket with 8 GB of RAM assigned to it
  • (Question) Where do I check the cache size…?
  • I haven’t tried adding another node to the cluster yet. I will test it out.

I’ve also noticed that increment/decrement operations have slowed down.

The bucket currently has 20 million documents.

*** I think I should change the title. The performance is okay; it’s just slower than it was. ***

Thanks,
Moon


#5

Since I’m in the process of converting MySQL data to Couchbase, I’ve removed all the views from the bucket for now.

All the documents are in JSON format.

I had very slow performance on relatively large documents (250KB). That’s expected, so I’m not concerned about it.


#6

If you can start to pare down the document sizes and increase the number of documents, it may make life easier. 250KB is a very large doc, as you noted. Smaller documents with more lookups generally mean higher operation rates and more documents held in RAM, so lower overall latency in the long run.

That is a document modeling exercise, though, so it might be out of scope for your current activity, but I wanted to mention it.
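To illustrate the point about document size and RAM, here is a minimal Python sketch that counts how many documents of a given size fit in an 8 GB quota. It deliberately ignores keys, metadata, and any server overhead, and the 2 KB size is a hypothetical comparison point, so treat the results as upper bounds only.

```python
# How many document values fit in a fixed amount of RAM, ignoring all overhead.
# The sizes below are illustrative; only 250 KB and 8 GB come from this thread.

KIB = 1024
GIB = 1024**3


def docs_that_fit(ram_bytes, doc_bytes):
    """Upper bound on the number of documents of doc_bytes that fit in ram_bytes."""
    return ram_bytes // doc_bytes


if __name__ == "__main__":
    quota = 8 * GIB
    for label, size in (("250 KB", 250 * KIB), ("2 KB", 2 * KIB)):
        print(f"{label:>6} docs: at most {docs_that_fit(quota, size):,} in RAM")
```

At 250 KB per document only a few tens of thousands of documents can be cached, while at 2 KB the same quota holds several million.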

Austin


#7

Thank you for the suggestion, Austin. Those big documents will be processed in the background and used as a backup when needed. I don’t plan to use them as production data.