Pagination in bulk get

Hi,

Can we implement pagination in a bulk get with Java?

@ppliatsik can you provide a little more detail on what you need? Since a bulk get is done on a discrete set of document IDs, there is no “next page” to load from. Only you know the next set of keys.
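For context, a plain bulk get over a fixed set of keys with the async API usually looks something like the following minimal sketch (the IDs and the default bucket are just placeholders):

Cluster cluster = CouchbaseCluster.create();
Bucket bucket = cluster.openBucket();

// the complete set of keys is known up front - there is no “next page”
List<String> ids = Arrays.asList("id1", "id2", "id3", "id4");

Observable
    .from(ids)
    .flatMap(id -> bucket.async().get(id)) // one async get per key
    .toBlocking()
    .forEach(System.out::println);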

Sorry @daschl my question was not so clear!

I mean whether we can implement pagination on the list of document IDs we provide to the bulk get, instead of providing all the IDs we want to get from the beginning. For example, I want to take the first 10 documents of the list, but the list will contain more IDs. But I think you answered my question!

Also, the bulk get does not check for duplicate document IDs in the list we provide, is this right?

@ppliatsik ah I see - so you mean you have a huge list and you want to load them “batch by batch” and not all at once - is that correct?

Is the list changing at runtime?

Yes, the list may have many IDs. No, I suppose the list will contain the up-to-date IDs.

@ppliatsik you could use the RxJava operators to achieve that batching effect.

So, for example, the following code

List<String> ids = Arrays.asList("id1", "id2", "id3", "id4");

Observable
    .from(ids)               // emit each ID individually
    .buffer(2)               // collect the IDs into batches of 2
    .subscribe(new Action1<List<String>>() {
        @Override
        public void call(List<String> strings) {
            System.out.println(strings); // called once per batch
        }
    });

Prints out

[id1, id2]
[id3, id4]

Optionally, you can zip it with an interval to “delay” each batch:

List<String> ids = Arrays.asList("id1", "id2", "id3", "id4");

Observable<Long> waitTime = Observable.interval(0, 1, TimeUnit.SECONDS);

Observable
    .from(ids)
    .buffer(2)
    .zipWith(waitTime, (strings, aLong) -> strings) // emit one batch per interval tick
    .subscribe(System.out::println);

And then you can feed each batch into a new Observable and run a bulk get. The following example fetches the docs and prints them through a blocking forEach:

Cluster cluster = CouchbaseCluster.create();
Bucket bucket = cluster.openBucket();

List<String> ids = Arrays.asList("id1", "id2", "id3", "id4");

Observable<Long> waitTime = Observable.interval(0, 1, TimeUnit.SECONDS);

Observable
    .from(ids)
    .buffer(2)                                      // batches of 2 IDs
    .zipWith(waitTime, (strings, aLong) -> strings) // one batch per second
    .flatMap(Observable::from)                      // unpack each batch into single IDs
    .flatMap(id -> bucket.async().get(id))          // async get for every ID in the batch
    .toBlocking()
    .forEach(System.out::println);                  // block and print each fetched document

Also, you are correct that this pattern of bulk get doesn’t remove duplicate keys (it handles your list as is).

But once again you can use the operators RxJava provides:
for that case you can chain distinct() right after the from(...) (or simply use a Set as the collection of IDs from the get-go), as sketched below.
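For example, here is a minimal sketch of the earlier batching snippet with distinct() chained in (the repeated "id1" is only there to illustrate the deduplication):

List<String> ids = Arrays.asList("id1", "id1", "id2", "id3", "id4");

Observable
    .from(ids)
    .distinct()  // drop duplicate keys before batching
    .buffer(2)
    .subscribe(System.out::println);

Prints out

[id1, id2]
[id3, id4]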

The code really helped me understand how it works.

Thanks @daschl and @simonbasle!
