Bulk get for 8000+ records by document ID

Hello,

I’m trying to figure out the best way to fetch 8000+ documents from a Couchbase bucket using their document IDs. I have tried multiple approaches but am still looking for the best one.

My test configuration: Java 8, Spring Boot 1.4.0.RELEASE, Couchbase 4.5.0-2601 Enterprise Edition (build-2601).

  1. couchbaserepo.findAll() - led to a timeout

  2. Asynchronous N1QL query - I used this approach because the whole document is not required

    // Select only the two needed fields, addressing documents by key (USE KEYS)
    bucket.async()
        .query(Select
            .select("quote[0].bestPrice.bestPriceInd,"
                + "quote[0].bestPrice.bestPriceGrp")
            .from(Expression.i("databucket"))
            .useKeysValues(couchbaseIds.toArray(new String[couchbaseIds.size()])))
        // Turn any N1QL errors into an exception; otherwise emit the result rows
        .flatMap(result -> result.errors()
            .flatMap(e -> Observable
                .<AsyncN1qlQueryRow>error(new CouchbaseException("N1QL Error/Warning: " + e)))
            .switchIfEmpty(result.rows()))
        .map(AsyncN1qlQueryRow::value)
        .toBlocking().single();

This approach returned results in 3.2 minutes when the document IDs were supplied one at a time, but timed out when they were supplied as an array.

  3. Asynchronous key-value get for each document ID:

    Observable.from(couchbaseIds).flatMap(new Func1<String, Observable<JsonDocument>>() {
        @Override
        public Observable<JsonDocument> call(String id) {
            return bucket.async().get(id);
        }
    }).toList().toBlocking().single();

This took 17336 ms, which is the best time I have got so far. However, since I need just a few fields from the documents, I have to map each JsonDocument to the Document POJO class, which adds overhead to the overall processing time.

Note: the bucket is not indexed.

Kindly excuse any errors or vagueness in the question.

If you know the document IDs, then pretty much always a low-level KV lookup will be faster (assuming you don’t want to do any additional transformations on the documents before fetching).

Given you only want some fields, you should look at the Sub-document API, as that can fetch just specific fields. However, I don’t know off the top of my head how you do bulk subdoc lookups in Java.
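That said, the "bulk" part can be sketched independently of the SDK. The following is a minimal sketch, not a definitive implementation: it starts the lookups for one batch of IDs, waits for them, and moves on to the next batch, so only a bounded number of requests is in flight at once. The names `BulkLookup`, `fetchAll` and `batchSize`, the `doc::` IDs and the batch size of 500 are all hypothetical, and `CompletableFuture` is used here only as a stand-in for the SDK's `Observable`. In real code the `lookup` function would wrap a per-key call such as the `bucket.async().get(id)` from the question or, assuming your SDK version exposes it, a sub-document lookup built with `lookupIn(id)`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class BulkLookup {
    private BulkLookup() {}

    /**
     * Looks up every id via the supplied async function, with at most
     * batchSize lookups in flight at a time, and returns the results keyed by id.
     */
    static <V> Map<String, V> fetchAll(List<String> ids,
                                       Function<String, CompletableFuture<V>> lookup,
                                       int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Map<String, V> results = new LinkedHashMap<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            List<String> batch = ids.subList(start, Math.min(start + batchSize, ids.size()));

            // Start all lookups of this batch before waiting on any of them.
            List<CompletableFuture<V>> inFlight = new ArrayList<>();
            for (String id : batch) {
                inFlight.add(lookup.apply(id));
            }

            // Wait for the batch; join() rethrows a failed lookup as CompletionException.
            for (int i = 0; i < batch.size(); i++) {
                V value = inFlight.get(i).join();
                if (value != null) { // assumption: null means "document not found"
                    results.put(batch.get(i), value);
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 8000; i++) {
            ids.add("doc::" + i); // hypothetical document IDs
        }
        // Stand-in lookup; a real one would call the Couchbase SDK.
        Map<String, String> found = fetchAll(
                ids, id -> CompletableFuture.supplyAsync(() -> "value-of-" + id), 500);
        System.out.println(found.size()); // prints 8000
    }
}
```

The batch size is worth tuning against your own cluster; I have not measured which value works best for 8000+ keys.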

(Note that there are some known issues with concurrent sub-document updates in 4.5.0; I’d recommend upgrading to 4.6.0 before you go into production if you end up using sub-document for mutating documents.)