Storing and retrieving binary data from Couchbase


#1

Hi, the JSON documents I am storing in the database have grown in size, and because of that I am no longer able to store a large number of documents. I need to convert the documents into binary form (byte[]) in order to handle the space efficiently. I saw a similar question, but it is for the Node.js SDK.

Can someone please help me with storing and retrieving binary data in Couchbase Server?


#2

I’m not sure why storing as byte[] instead of JSON (which under the covers is of course also just a byte array) would help in terms of size. Can you please elaborate on your use case a bit more?


#3

Hi, I am sorry, the requirement has changed. I no longer need to convert a JSON document into a ByteBuffer. Instead, I convert my user-defined object into a ByteBuffer, which is in turn wrapped in a ByteBuf to be stored in Couchbase Server.
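The object-to-ByteBuffer step can be sketched with standard Java serialization (MyCustomer here is a hypothetical Serializable class standing in for the user-defined object; the original poster's actual serialization mechanism may differ):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

public class ObjectToByteBuffer {

    // Hypothetical user-defined object; any Serializable class works here.
    static class MyCustomer implements Serializable {
        final String name;
        MyCustomer(String name) { this.name = name; }
    }

    // Serialize the object and wrap the resulting bytes in a ByteBuffer.
    static ByteBuffer toByteBuffer(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return ByteBuffer.wrap(bos.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer buf = toByteBuffer(new MyCustomer("alice"));
        // From here, Netty's Unpooled.wrappedBuffer(buf) produces the
        // ByteBuf that gets handed to the SDK.
        System.out.println(buf.remaining());
    }
}
```
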

However, I am stuck upserting the ByteBuf documents as a batch. Here is my RxJava code, which tries to upload the documents in batches. Can you please help me out?

public void createMultipleCustomerDocuments(final String docId, ByteBuffer myCust, long numDocs, int batchSize) {
        ByteBuf buffer = Unpooled.wrappedBuffer(myCust);
        binaryDocuments.add(buffer);
        documentCounter++;
        System.out.println("Batch size: " + batchSize + " Document Counter: " + documentCounter);
        if(documentCounter >= batchSize){
            System.out.println("Document counter: " + documentCounter);
            Observable
            .from(binaryDocuments)
            .flatMap(new Func1<ByteBuf, Observable<ByteBuf>>() {
                public Observable<ByteBuf> call(final ByteBuf docToInsert) {
                    return theBucket.async().upsert(BinaryDocument.create(docId, docToInsert));
                }
            })
            .last()
            .toBlocking()
            .single();
            documentCounter = 0;
        }
}

I have a static ArrayList<ByteBuf> binaryDocuments.

The line return theBucket.async().upsert(BinaryDocument.create(docId, docToInsert)); fails to compile.


#4

I was able to find the flaw by myself. BinaryDocument.create() returns a BinaryDocument, not a ByteBuf, so upsert() emits a BinaryDocument and I had to change the Func1's return type accordingly.
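For later readers, the corrected pipeline looks roughly like this (a sketch keeping the original field and method names, assuming the Java SDK 2.x API where async().upsert() returns Observable&lt;BinaryDocument&gt;):

```java
// Corrected return types: upsert() emits the stored BinaryDocument,
// not a ByteBuf, so the Func1 must declare Observable<BinaryDocument>.
Observable
    .from(binaryDocuments)
    .flatMap(new Func1<ByteBuf, Observable<BinaryDocument>>() {
        public Observable<BinaryDocument> call(final ByteBuf docToInsert) {
            return theBucket.async().upsert(BinaryDocument.create(docId, docToInsert));
        }
    })
    .last()
    .toBlocking()
    .single();
```

Note that every document in the batch is stored under the same docId here, so each upsert overwrites the previous one; a per-document id is probably what is actually wanted.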


#5

A few things here: if you can, please use RawJsonDocument instead. BinaryDocument sets different flags, and if you work with raw ByteBufs you need to make sure you free them once you load them. RawJsonDocument allows you to pass in the already-stringified JSON document.
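For comparison, the RawJsonDocument path is a one-liner (a sketch; the bucket reference, document id, and JSON string are assumptions):

```java
// The content is already a JSON String, so the SDK does no extra
// transcoding and the document keeps the standard JSON flags.
String json = "{\"name\":\"alice\"}";
bucket.upsert(RawJsonDocument.create("customer::1", json));

// Reading it back returns the raw JSON string unchanged.
RawJsonDocument doc = bucket.get("customer::1", RawJsonDocument.class);
String content = doc.content();
```
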


#6

I receive the documents as ByteBuffers only; that is the reason I am using it. Earlier I was using JsonDocument. I am freeing the buffers after loading in order to avoid memory leaks.


#7

Okay, but keep in mind that the internal flags for a BinaryDocument are different, so by default you won’t be able to read the written documents with JsonDocument or RawJsonDocument (or the other JSON variants).
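Concretely, a document written as a BinaryDocument has to be read back as one, and the returned ByteBuf must be released afterwards (a sketch against the 2.x API; docId is assumed):

```java
// Request BinaryDocument explicitly; a plain bucket.get(docId) would
// try to decode the content as JSON and fail on the binary flags.
BinaryDocument doc = bucket.get(docId, BinaryDocument.class);
ByteBuf content = doc.content();
try {
    // Copy the bytes out before releasing the buffer.
    byte[] bytes = new byte[content.readableBytes()];
    content.readBytes(bytes);
    // ... deserialize bytes ...
} finally {
    // The SDK hands ownership of the ByteBuf to the caller;
    // release it to avoid a reference-count leak.
    ReferenceCountUtil.release(content);
}
```
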


#8

Yes. I am using Kryo to deserialize the written ByteBuf documents in order to read their contents.
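The Kryo read side can be sketched like this (MyCustomer is a hypothetical class name, and bytes is assumed to hold the data copied out of the ByteBuf before it was released):

```java
Kryo kryo = new Kryo();
// Registering the class up front keeps serialization stable and compact.
kryo.register(MyCustomer.class);

// Kryo's Input wraps the raw bytes read from the BinaryDocument.
try (Input input = new Input(bytes)) {
    MyCustomer customer = kryo.readObject(input, MyCustomer.class);
}
```
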