Major Compression Use Case Missing in Couchbase

The compression modes (Off, Passive and Active) offered in Couchbase 5.5 and later do not cover a very common use case, described below:

Use Case: Data in memory (RAM) should always be uncompressed so that updates are not slowed down by decompression (which is often followed by recompression). In practice, RAM is cheaper than the cost of regular compression and decompression.

Moreover, the client (consumer) should have the option to specify whether data is sent to and received from the Couchbase server in compressed or uncompressed form. (Compressed should be the default, in order to save bandwidth and client-device resources.)

Now compare the use case above with the documentation of the modes at the link below. None of the documented compression modes satisfies this use case:
https://docs.couchbase.com/server/6.0/learn/buckets-memory-and-storage/compression.html

Hi @fakhar.anwar, thanks for your post!

A few points:

  • Couchbase will automatically maintain an uncompressed copy of the data in RAM while it is being mutated frequently, to avoid the overhead of constant compression/decompression.
  • These modes really only apply to compression over the wire and in RAM. The data is always compressed when being written to disk.
  • The client SDKs already negotiate with the Server whether they can support compression, and this can also be controlled by the application. It’s on by default and has some thresholds to try to prevent compression when not appropriate (e.g. documents too small, binary docs).
  • Even with the mode being set to “Off”, the client can still send compressed documents and they will be uncompressed and stored uncompressed (in RAM). This is primarily for compatibility reasons when mixing clients that do or do not support compression, but I think can also satisfy your use case to force data to be uncompressed in RAM. The one limitation here is that the server will not re-compress it when sending back to the application.
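To make the points above concrete, here is a minimal toy model of how the three modes treat a document. It is only a sketch based on my reading of the linked documentation, not server code: `Doc`, `pack`, `unpack`, `store` and `send` are hypothetical names, zlib stands in for Snappy (which is not in the Python standard library), and it ignores the size thresholds and the uncompressed copy kept for frequently mutated documents.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Doc:
    body: bytes
    compressed: bool


def pack(raw: bytes) -> Doc:
    # zlib is a stand-in for Snappy in this sketch.
    return Doc(zlib.compress(raw), True)


def unpack(doc: Doc) -> Doc:
    # Return an uncompressed version of the document.
    if doc.compressed:
        return Doc(zlib.decompress(doc.body), False)
    return doc


def store(mode: str, incoming: Doc) -> Doc:
    """Form in which the toy server keeps the document in RAM."""
    if mode == "off":
        # Compressed input is accepted but stored uncompressed.
        return unpack(incoming)
    if mode == "passive":
        # Stored exactly as the client sent it.
        return incoming
    if mode == "active":
        # Server compresses documents that arrive uncompressed.
        return incoming if incoming.compressed else pack(incoming.body)
    raise ValueError(f"unknown mode: {mode}")


def send(mode: str, stored: Doc, client_supports_compression: bool) -> Doc:
    """Form in which the toy server returns the document to a client."""
    if mode == "off" or not client_supports_compression:
        # "Off" never re-compresses; older clients always get plain data.
        return unpack(stored)
    if mode == "active":
        return stored if stored.compressed else pack(stored.body)
    return stored  # passive: returned as stored
```

With the mode set to "off", `store("off", pack(raw))` yields an uncompressed document in RAM, which is the behavior the use case asks for, and `send` returns it uncompressed as well.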

The other big thing to keep in mind is that, in general, there is very little performance impact from compression/recompression. We are leveraging the Snappy algorithm which is very efficient, and because Couchbase is already so fast at sending data from RAM, a little bit of extra CPU usually doesn’t have a noticeable impact. Certainly, if you are doing hundreds of thousands of operations per second it will be more noticeable than at lower throughputs.
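If you want a rough feel for the per-document cost on your own hardware, a micro-benchmark along these lines can help. It is only a sketch: it uses zlib at its fastest level as a stand-in for Snappy, so the numbers are not what the server would show, and the sample document and the `roundtrip_seconds` helper are made up for illustration.

```python
import json
import time
import zlib

# Hypothetical sample document, roughly 1 KB of JSON.
sample = json.dumps(
    {"type": "order", "items": [{"sku": i, "qty": i % 5} for i in range(40)]}
).encode()


def roundtrip_seconds(payload: bytes, rounds: int = 10_000) -> float:
    """Average seconds for one compress + decompress cycle."""
    restored = payload
    start = time.perf_counter()
    for _ in range(rounds):
        restored = zlib.decompress(zlib.compress(payload, 1))
    elapsed = time.perf_counter() - start
    # Sanity check: the round trip must be lossless.
    assert restored == payload
    return elapsed / rounds


per_cycle = roundtrip_seconds(sample)
print(f"{len(sample)} bytes, {per_cycle * 1e6:.1f} microseconds per cycle")
```

Multiplying the per-cycle time by your expected operations per second gives a rough upper bound on the extra CPU time spent on compression.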

I agree the documentation could be improved to help explain this better and I will ask the team to do so.

Let me know if 1) that all makes sense, 2) you have any further questions or feedback or 3) you are seeing different behavior from your experience with Couchbase.

Thank you!

Perry

Thanks, Perry, for the quick response and the good explanation. The reasoning in your response makes perfect sense.

Regards,

Fakhar.

Hi @perry,
It looks like you have more insight into data compression and decompression.
I need help with storage compression. Could you please share any insights on my existing thread?

Thanks, much appreciated.