Can't find Redis Sorted Set (ZSET) alternative in Couchbase


#1

Hello. I want to know, whether Couchbase allows him to do what in Redis called “Sorted set” For instance, I have millions of sets each of them contains tens of documents. In Redis it is absolutely no problem to achieve, just make as many sets as I want.
But in Couchbase I did not found such feature, only Secondary Index. So I can only make one bucket and put all documents there, separated by index. For instance, in Redis:

set1: [doc1, doc2] set2: [doc1, doc2, doc3]

in Couchbase:

{ "set":"set1", "order": 1, "doc1_json" }, { "set":"set1", "order": 2, "doc2_json" }, { "set":"set2", "order": 1, "doc1_json" }, { "set":"set2", "order": 2, "doc2_json" }

Then I make composite index on (set, order) and query like

"SELECT * FROM my_bucket WHEREset='set1' ORDER BYorderDESC"

Will this work as Redis sets? Will it be fast on tens of millions documents with many inserts? I need ultra fast inserts and selects with same predicted speed regardless of the number of documents in buckets.


#2

Also it would be very helpful to make article describing how to implement Redis types (set, sorted set, list) in Couchbase in best way.


#3

Hi @ravlio,

Can you give more details about how you need to query the data stored in the sorted sets? Specifically:

  • Does every set have a unique name/id?
  • Do individual documents appear in more than one set?
  • Do you want to query on the contents of the documents? Or just by set name/id?
  • How often do the contents or order of a set change, if ever?

Couchbase uses JSON as its data model, and JSON supports arrays, which are sorted collections of any data type. Thus you could have:

  • Document Key: “set1”, JSON: { “contents”: [“doc1_json”, “doc2_json”] }

  • Document Key: “set2”, JSON: { “contents”: [“doc1_json”, “doc2_json”, “doc3_json”] }

By putting the set name/id as the document key, lookup for an individual set uses an extremely fast key-value lookup, e.g.:

SELECT contents FROM my_bucket USE KEYS "set1";

If documents can appear in many different sets, it may be more space-efficient to store each document in Couchbase, and then put the document keys inside the set.


#4

P.s. The JSON data model is quite flexible, and the best way to implement types like “set” or “list” depends a great deal on your specific requirements. If you implement something as an array,

{ "contents": [ "item1", "item2", "item3"] }

you get ordering for free, but searching for an item is O(size of array). But JSON objects behave a lot like Hashes or associative maps, but if order is important you would need to add that explicitly:

{ "contents": { "item1": 1, "item2": 2, "item3": 3 } }

In this approach you can find out if an item is present or absent quickly (e.g. ‘where contents.item1 is not missing’, but ordering requires a sort.

So to decide how to model the data, you need to figure out which operations you need to perform on the data.