How to perform REAL multi Get

Hi,
We are experiencing the same issue with the 2.01 client. In our tests we run a bulk get operation of 1000 items, repeated 100 times. With the 1.3.10 client the whole operation took about 10 sec; with the 2.01 client it took more than 50 sec. For the tests I used a remote cluster consisting of 3 nodes (3.01 community edition). I played around with MaxDegreeOfParallelism to no avail.

We are using Couchbase as a distributed cache in a performance-critical role. Unfortunately the 2.01 bulk get performance is a deal breaker for us, so we can’t make the switch until it is sorted out (although we really like the new features 2.01 offers, especially the replica reads).
Can you guys please vote for the Jira issue mentioned by jmorris? Maybe it will get higher priority.
Thanks,
Bence

krumplib430 Thank you for your input. I have just voted for the Jira issue.

Follow up here

Hi jmorris,

Do you have any planned release date for 2.1? We would really like to use the new client library because of the new features, but we can’t make the move until the BulkGet performance is sorted out.

Thank you,
Bence Farkas

@krumplib430

2.1.0 is planned for the first week of May: https://issues.couchbase.com/browse/NCBC/fixforversion/12504

-Jeff

Hey Jeff,

This issue keeps being moved from version to version. Is it possible to know when it is going to be implemented?

Regards
Piotr

@drak25 -

It keeps getting bumped due to other higher priority tickets; I can’t give a fixed date ATM, but most likely 2.1.3 (July) or 2.1.4 (Aug).

Note that 2.1.2 will likely be a very small follow up release to 2.1.1, with the only feature being support for multidimensional scaling for Couchbase 4.0.

-Jeff

I notice that there’s no equivalent overload for GetAsync<T> that takes a list of keys. Is that by design, with the intention that async multi-get is achieved with the below?

var operations = keys.Select(bucket.GetAsync<T>);
var results = await Task.WhenAll(operations).ConfigureAwait(false);

Cheers,
Fraser

@frasdav -

Yes, that is correct.

-Jeff

Is supporting multi get in a single request still on the backlog? In my use case I am multi-getting millions of keys in a single call, which takes ~20 seconds, so the 2.x SDK’s change to issuing each get individually in parallel adds way too much overhead.

Some logs from my application (which is doing bulk data processing) showing it querying almost 2 million keys in 17 seconds:

14:39:24.525 CouchbaseReader: 28,272 queried 1,584 returned 38,016 values 13.400 query 0.045 stock
14:39:26.663 CouchbaseReader: 56,616 queried 215 returned 20,640 values 2.512 query 0.169 stock
14:39:48.763 CouchbaseReader: 1,975,697 queried 285,773 returned 15,119,375 values 16.973 query 8.747 stock

@brandonagr -

Yes, the changes that need to be made to do this as single request are in progress. That being said I don’t have an exact date; likely it will be later this year or early next.

-Jeff

Any updates on getting a true multi-get call for the .NET SDK?

I’m trying to pull 2500+ documents out, and issuing multithreaded, one-call-per-document requests gives horrible performance since it’s linear.

@radleta

From the example above, you should get good performance:

 var operations = keys.Select(bucket.GetAsync<T>);
 var results = await Task.WhenAll(operations).ConfigureAwait(false);

I am not sure I understand what you mean by “horrible performance since it’s linear”. Perhaps you can provide an example? (Also, which SDK and server version?)

No update here yet. If it happens, it will probably be in v3.0, and we don’t have a timeline other than “in the future” ATM.

All of the existing solutions involve making one request to the server per key. That is what appears to happen: inspecting the IL with ILSpy shows that Get(IList) just loops through the keys and issues a separate fetch for each one instead of submitting them as a batch. So throughput appears to be capped by the number of cores and threads available to handle the requests.

The SDK version is CouchbaseNetClient 2.5.3 via NuGet.

Are there any other options like an HTTP REST call that can return bulk keys in one request?

@radleta -

There is a difference between Get(IList) and GetAsync(IList), GetAsync(string key), or GetDocumentsAsync(IList). GetAsync and its cohorts run asynchronously on the thread pool. The SDK also uses multiplexing IO, so the reads and writes are all interleaved: nothing is synchronous, and tasks complete independently. It’s efficient in terms of serialization/deserialization, it keeps the application’s working set of memory smaller, and GCs generally occur in gen 0.

I don’t suggest you use the Get(IList) implementations, because they are dependent upon cores and available threads (although in some situations, if tuned properly, they can be performant). Note that they are flagged as obsolete on IBucket and the description points you to the async equivalents.

From what we have seen in the .NET and Java SDKs, this gives comparable performance to a bulk implementation, which requires that a large memcached packet be created and then that the large response be unpacked and deserialized. If very large requests/responses are generated, in .NET you’ll find the working set of memory creeps up, which opens the possibility of OOM, memory fragmentation, and compaction issues because objects end up being stored in the large object heap (LOH). You can also run into this situation if your List<string> keys is very large; in that case you should batch on the application side. It’s also more efficient for the server to process smaller requests/responses.
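As a rough illustration of batching on the application side, here is a minimal sketch. It is not part of the SDK: the class name, the fetchOne delegate (a stand-in for a per-key call such as bucket.GetAsync<T>), and the batchSize parameter are all hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BatchedMultiGet
{
    // fetchOne is a hypothetical stand-in for a per-key async call
    // (for example, a lambda wrapping bucket.GetAsync<T>).
    // batchSize is an illustrative knob, not an SDK setting.
    public static async Task<List<TResult>> GetInBatchesAsync<TResult>(
        IReadOnlyList<string> keys,
        Func<string, Task<TResult>> fetchOne,
        int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var results = new List<TResult>(keys.Count);
        for (var offset = 0; offset < keys.Count; offset += batchSize)
        {
            // Start at most batchSize operations for this slice of keys.
            var count = Math.Min(batchSize, keys.Count - offset);
            var batch = new Task<TResult>[count];
            for (var i = 0; i < count; i++)
                batch[i] = fetchOne(keys[offset + i]);

            // Wait for the whole slice before starting the next one, so only
            // one batch of requests and responses is in flight at a time.
            results.AddRange(await Task.WhenAll(batch).ConfigureAwait(false));
        }
        return results; // same order as the input keys
    }
}
```

A call might look like GetInBatchesAsync(keys, k => bucket.GetAsync<MyDoc>(k), 500), where MyDoc and the batch size of 500 are placeholders; a suitable size would need to be measured for the workload.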

It’s important to note that you may need to tune your configuration to get better performance; the simplest change is to increase the number of connections that the client uses to process requests. The MaxSize property should be your starting point: the default is 2 connections, so try bumping it up to 5 or more.
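For reference, a configuration fragment along these lines might look like the sketch below. It assumes the 2.x CouchbaseNetClient package and its ClientConfiguration, BucketConfiguration, and PoolConfiguration types; the server URI, the bucket name "default", and the pool sizes are placeholders rather than recommendations.

```csharp
using System;
using System.Collections.Generic;
using Couchbase;
using Couchbase.Configuration.Client;

// Assumption: SDK 2.x configuration types; all values are illustrative.
var config = new ClientConfiguration
{
    Servers = new List<Uri> { new Uri("http://localhost:8091/pools") },
    BucketConfigs = new Dictionary<string, BucketConfiguration>
    {
        {
            "default", new BucketConfiguration
            {
                BucketName = "default",
                PoolConfiguration = new PoolConfiguration
                {
                    MinSize = 2,
                    MaxSize = 5 // raised from the default of 2 mentioned above
                }
            }
        }
    }
};

// Create the cluster once with this configuration and reuse it.
var cluster = new Cluster(config);
```

Whether a larger pool helps depends on the workload, so it is worth measuring before and after the change.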

Yes, you can use N1QL, but K/V is generally faster because it’s straight from memory in most cases; you can get creative with indexing and improve it, though. If you do go the N1QL route, make sure streaming is enabled for large responses, or you will definitely run into OOM exceptions, as the entire response is pulled across the wire and deserialized when streaming is not used.

Thanks for the in-depth explanation.

I switched to your recommendation of using the below.

var operations = keys.Select(bucket.GetAsync<T>);
var results = await Task.WhenAll(operations).ConfigureAwait(false);

I had better performance: this reduced the call overhead from 20-30 seconds to 10-11 seconds. It is still very slow compared to the batched performance of the Perl module we’re comparing against, which consistently runs in <1 second.

I set up a Node.js HTTP REST server to proxy the calls from .NET to Couchbase via the Node.js SDK using the getMulti call and was successful in getting similar performance (<1 second) to the Perl module. There is a slight overhead from the extra call, but it’s close.

I hate having to build a proxy interface to Couchbase for the .NET SDK. This feels like an unfair advantage for other languages and a huge disadvantage to anyone who works in Microsoft .NET with Couchbase. Is there any way we can get the same kind of performance out of the .NET SDK as we have with the Perl SDK and Node.js SDK?

@radleta -

I get < 1s on my machine with default settings. Can you post an example of the code you are using to get that metric? Even better, send me a VS project with an example including a POCO and JSON.

-Jeff

@jmorris I built from scratch a .NET console app to test and was able to replicate your performance of less than 1 second.

I’m slowly tearing my app apart trying to determine what is going on. It appears to be some kind of bottleneck with Async calls. I’ll post here if I find a definitive answer to my problem to help those who end up in my position in the future.

I appreciate your help and looking into this for me.

@radleta -

Any chance you’re not reusing the Cluster/Bucket? You might be bootstrapping on every request, which will kill performance; you should create the Cluster/Bucket when the app starts up and close/dispose of it when the app shuts down. More info here.
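One way to make sure the expensive bootstrap happens only once is to keep the opened object behind a single shared, lazily created holder. The sketch below is a generic illustration, not SDK code: SharedConnection<T> and its open delegate are hypothetical names, and in a real application the delegate would wrap whatever call opens the bucket.

```csharp
using System;
using System.Threading;

// Holds one shared instance for the lifetime of the application.
// SharedConnection<T> is a hypothetical helper, not part of the Couchbase SDK.
public sealed class SharedConnection<T> : IDisposable
    where T : class, IDisposable
{
    private readonly Lazy<T> _instance;

    public SharedConnection(Func<T> open)
    {
        // ExecutionAndPublication ensures open() runs at most once,
        // even when several threads request the instance concurrently.
        _instance = new Lazy<T>(open, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // The first access pays the bootstrap cost; later accesses reuse the same object.
    public T Instance => _instance.Value;

    public void Dispose()
    {
        // Only dispose if the instance was actually created.
        if (_instance.IsValueCreated)
            _instance.Value.Dispose();
    }
}
```

Reading Instance once during application startup moves the bootstrap cost out of the first real request, and disposing the holder at shutdown closes the underlying object.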

@jmorris Nailed it! The 10 seconds was the time to connect. The bucket hadn’t been opened yet, so it was being initialized on the first call in the application, which paid the extra cost of the bootstrap. I moved the initialization of the bucket to the beginning of the app, and voila, my performance was magically less than 1 second.

Many thanks!
