How to optimise persistent disk creates

Hello experts.
Can you suggest a way to optimize the number of operations per second, and to be specific the disk creates per second with synchronous persistence?

I’m trying to optimize the operations per second on our Couchbase Server (2.2).
When I use an asynchronous set command I see the disk creates reach 4000 or 5000 disk creates per second. With this method the disk write queue keeps growing the whole time.
From this I conclude that the disk IO of the Couchbase machine can reach this number of disk creates per second.
When I change the code to synchronous with persistence, the disk create rate drops under 100 per second.
I’ve tried changing the connection pool and number-of-threads settings but couldn’t find anything that would sustain more than 100 creates per second.

I’ve used both the Ruby SDK and the .NET SDK.
Couchbase Server 2.2
AWS machine r3.large (2 vCPU, 15.25 GB RAM) + 100GB EBS 1000 IOPS (OS = Ubuntu 14.04)

Thank you for your kind help.

Are you trying to do a bulk load as fast as possible? If so, then generally what you want to do is actually keep that system as full as possible within the maximum memory. You’ll find a couple of bulk loading examples in the documentation. The basic idea is to keep loading until TMPFAIL, then do an exponential backoff/retry.
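
For illustration, a minimal sketch of that backoff/retry pattern with the Ruby 1.x SDK could look like the code below. The Couchbase::Error::TemporaryFail class name and the delay/retry numbers are my assumptions here, so adjust to taste:

require 'couchbase'

bucket = Couchbase.connect(:bucket => "example")

# Store one document, backing off exponentially whenever the server
# reports a temporary failure (TMPFAIL) because its queues are full.
def set_with_backoff(bucket, key, doc, max_retries = 10)
  delay = 0.01 # seconds; arbitrary starting point
  retries = 0
  begin
    bucket.set(key, doc)
  rescue Couchbase::Error::TemporaryFail
    raise if retries >= max_retries
    retries += 1
    sleep(delay)
    delay *= 2 # double the wait before the next attempt
    retry
  end
end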

The reason you likely saw lower throughput with per-operation persistence than under the asynchronous load has to do with the way the data is stored on disk. It’s split into partitions, and when there is more data queued for each partition, the system can batch the IO more efficiently.

Thanks for the answer @ingenthr .
I’m not trying to do a bulk load. I’m trying to handle the basic case of writing each web request into Couchbase, so each write stands on its own.
I would like each write to be persisted so that I know the data won’t be lost in case of a server failure.
It seems that in this basic case I’m limited to less than 100 requests per second. I hope there is some way to solve this, otherwise we won’t be able to build a high-scale solution.

Which SDK and code are you using?

@daschl

Ruby SDK 1.3.
Couchbase Server 2.2.
Here is some example code to simulate the problem:

require 'couchbase'
require 'digest'

doc = {
  "aaaaaa" => "aaaaa",
  "bbbbbb" => "bbbb",
  "ccccccc" => "ccccc",
  "ddddddd" => "dddddd",
  "eeeeee" => "eeeeee",
  "fffffff" => "ffffffff",
  "ggggggg" => "ggggggg",
  "hhhhhhhh" => "hhhhhh"
}

# `hostname` is set elsewhere in our code
example_bucket = Couchbase.connect(:bucket => "example", :hostname => hostname)

# Write each document synchronously, waiting for it to be persisted to disk.
(0..100_000).each do |i|
  key = Digest::SHA1.hexdigest(i.to_s)
  example_bucket.set(key, doc, :observe => { :persisted => 1 }, :quiet => false)
end

Guys, any thoughts about this issue?

In my opinion, using persistence constraints for every operation is not the right way to work with Couchbase. The constraint is implemented on the client side by polling the key state on the server, so combined with synchronous calls it slows down the client, not the server. I recommend adapting your application with this in mind.

Thanks @avsej.
How would you suggest writing the client to ensure that the documents were persisted while maintaining >1000 IOPS?
Is there a way to write every (let’s say) 100 messages in a bulk and then check that all 100 were persisted?

Actually you might write your own check. For example, the code below sets 100 keys and then polls their state in bulk. When setting the keys I deliberately didn’t use multi-set, to imitate the single-request operations you mentioned:

require 'couchbase'

bucket = Couchbase.connect

keys = []
100.times do |t|
  key = "foo-#{t}"
  keys.push(key)
  bucket.set(key, val: t)
end

# Poll the observe state of all 100 keys in bulk until every key
# reports :persisted, i.e. no key is still only :found in memory.
loop do
  sleep(0.1)
  stats = Hash.new(0)
  bucket.observe(keys).each do |_key, nodes|
    nodes.each { |node| stats[node.status] += 1 }
  end
  puts stats.inspect
  break if stats[:found].zero?
end
# >> {:persisted=>20, :found=>80}
# >> {:persisted=>40, :found=>60}
# >> {:persisted=>60, :found=>40}
# >> {:persisted=>77, :found=>23}
# >> {:persisted=>97, :found=>3}
# >> {:persisted=>100}

There is also an API function which does almost the same thing:

require 'couchbase'

bucket = Couchbase.connect

keys = []
100.times do |t|
  key = "foo-#{t}"
  keys.push(key)
  bucket.set(key, val: t)
end

# Block until every key has been persisted on at least one node.
bucket.observe_and_wait(keys, persisted: 1)
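
Putting the two together for your original example, a sketch like the one below writes in batches and blocks on persistence once per batch instead of once per operation. The batch size of 100 is just a number to tune, and I’m reusing a stand-in for your sample document:

require 'couchbase'
require 'digest'

bucket = Couchbase.connect(:bucket => "example")
doc = { "aaaaaa" => "aaaaa" } # stand-in for your sample document

keys = []
(0..100_000).each do |i|
  key = Digest::SHA1.hexdigest(i.to_s)
  keys.push(key)
  bucket.set(key, doc, :quiet => false)
  if keys.size == 100
    # One bulk persistence check per batch instead of one per set.
    bucket.observe_and_wait(keys, :persisted => 1)
    keys.clear
  end
end
# Don't forget the final partial batch.
bucket.observe_and_wait(keys, :persisted => 1) unless keys.empty?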

Thanks @avsej.
It seems that your second example can work for us.
Do you know if there is a similar way in the .NET SDK?

Thanks,
Gil

pinging @jmorris, because he probably knows!