Custom Conflict resolution in Couchbase Lite 2.x

replica

#1

I am currently working on an Android app using Couchbase Lite 1.5. When looking to upgrade to 2.x, I noticed that the conflict resolution code has changed. One of the features in our app is that users can work offline on separate devices and sync later, at which point all of their changes are merged in chronological order. For this use case, last-write-wins (LWW) is not an acceptable merge strategy.

Based on the documentation, it appears that Couchbase Lite 2.x no longer lets the device apply a custom conflict resolution strategy when merging a document. Is it still possible to override the merge strategy for specific documents?


#2

No, it is not, but it is one of the features being considered based on demand (and so far there has been a fair amount). @priya.rajagopal Here is another vote for this.


#3

@borrrden, with all due respect, how did you not anticipate that this would be an issue? One of the major reasons that we decided to go with Couchbase was that you provided a conflict resolution framework that allowed us to distribute our app in areas with low internet connectivity. The ability to handle conflict resolution ourselves is absolutely central to the success of our product.

We serve pharmacies in low-income and rural areas in East Africa. It seems like all of the decisions your team made were based on the “edge case” of someone losing an internet connection going into a subway or something like that. I can’t imagine that if you had consulted your customers you wouldn’t have heard more complaints like ours. It sounds like you talked to your major clients operating in developed markets who were complaining about database size.

We now have to use deprecated software until you put out a release that reinstates custom conflict resolution…

Thanks


#4

If you are an enterprise customer, then voice your concerns through the support / sales channels as well. It will help give an accurate representation of which features are most desired, and will be more visible to the people who, as you put it, “talk to the customers.”


#5

@borrrden, there’s not a ton of money in serving bottom-of-the-pyramid pharmacies, so we’re not enterprise customers.


#6

I hear you, Sam. Dropping conflict resolution was a painful decision. What it came down to was that the API we had in 2.0 beta, which called an app handler function with the conflicting revisions and the common ancestor, would have been too difficult for a lot of developers. Resolving conflicts this way is hard, and it has the potential for mistakes that can cause infinite loops where different clients resolve the conflict differently, creating another conflict, ad infinitum.
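To make the infinite-loop risk concrete, here is a minimal sketch of a handler of that general shape (two conflicting revisions plus their common ancestor). It is plain Python over dictionaries, not the Couchbase Lite API; the function name `three_way_merge` and the field names are hypothetical. The point it illustrates is that the result must not depend on which side a client calls "mine", otherwise two clients can keep producing different winners.

```python
def three_way_merge(base, mine, theirs):
    """Merge two conflicting revisions field by field against their common ancestor.

    Assumption for this sketch: revisions are flat dicts, and a value of None
    is treated the same as a missing (deleted) field.
    """
    merged = {}
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(key), mine.get(key), theirs.get(key)
        if m == t:
            value = m          # both sides agree (or both deleted the field)
        elif m == b:
            value = t          # only "theirs" changed this field
        elif t == b:
            value = m          # only "mine" changed this field
        else:
            # Both sides changed the field differently. Pick a winner by a rule
            # that ignores argument order, so every client picks the same one.
            value = max(m, t, key=repr)
        if value is not None:
            merged[key] = value
    return merged


base = {"name": "Aspirin", "price": 10, "stock": 5}
mine = {"name": "Aspirin", "price": 12, "stock": 5}    # edited the price
theirs = {"name": "Aspirin", "price": 10, "stock": 3}  # edited the stock
print(three_way_merge(base, mine, theirs))
# {'name': 'Aspirin', 'price': 12, 'stock': 3}
```

Swapping `mine` and `theirs` gives the same result, which is the property that keeps two clients from resolving the same conflict in different ways.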

The specific problem Couchbase has with difficult/dangerous APIs is that we make our money from support contracts. If something generates a large volume of support requests, it becomes a big pain point for our support engineers, and ends up costing us money. So we have an incentive not to release such APIs. Which is a good thing in the long run, but in this case it’s currently painful.

What we’re doing is listening to developers who need conflict resolution, identifying the specific things they’d need, and deciding on what sort of API to provide. We are leaning toward things like conflict-free replicated data types (CRDTs) that allow conflict-resolution strategies to be specified declaratively, but we’re open to other ideas too.
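For readers who have not met CRDTs, a grow-only counter is about the smallest example. The sketch below is the textbook construction in plain Python, not anything Couchbase has announced; the function names and the till IDs are made up for illustration. Each device only ever increases its own slot, and merging takes the per-device maximum, so replicas converge no matter the order in which they sync.

```python
def increment(counter, device_id, amount=1):
    """Return a copy of the counter with this device's own slot increased."""
    updated = dict(counter)
    updated[device_id] = updated.get(device_id, 0) + amount
    return updated


def merge_counters(a, b):
    """Merge two replicas by taking the per-device maximum."""
    return {d: max(a.get(d, 0), b.get(d, 0)) for d in set(a) | set(b)}


def counter_value(counter):
    """The counter's value is the sum of all per-device slots."""
    return sum(counter.values())


till_1 = increment(increment({}, "till-1"), "till-1")  # two sales on till 1
till_2 = increment({}, "till-2", 3)                    # three sales on till 2
print(counter_value(merge_counters(till_1, till_2)))   # 5
```

The merge is commutative and idempotent, so there is no conflict for an app handler to resolve.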

I’m sorry about the current situation, especially for your product, but I’m looking forward to being able to fix it before long…

PS: I also hear what you’re saying about not being able to afford EE licenses. We’re funded by those licenses, but we’re also proud of being open source, and of the open-source community around mobile in particular.


#7

Hey Jens,

Thanks! I really appreciate the thoughtful answer.

Here’s our use case; hopefully it’ll help with your future decisions. Our system is a POS, and we support multiple tills, which means that two or more devices running the system can be selling, editing, or doing anything else they want to their inventory, offline. When they go online we manually resolve any conflicts between items so that after each device is synced, the timeline of events (sales, receiving new items, etc.) appears in chronological order, as though everything had happened on a single device. So if device 1 generated events a, c and e and device 2 generated events b and d, then after sync the “timeline” of changes would look like a-b-c-d-e (the exact rules for ordering would vary depending on the use case). We can do this because we keep track of every change in our own internal “revision” objects.

All we need to continue to use this strategy is access to the full maps of all conflicting revisions. As you probably noticed, an LWW strategy would lose a lot of information for our clients, especially given that only a small percentage of them are online all of the time.
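The a-b-c-d-e interleaving described above can be sketched as a merge of per-device event logs. This is an illustration only, in plain Python: the event fields `at`, `device` and `name` are hypothetical, and it assumes each device's log is already in chronological order and that timestamps from different devices are comparable.

```python
import heapq


def merge_timelines(*device_logs):
    """Interleave per-device event logs into one chronological timeline.

    Ties on the timestamp are broken by device ID so that every device
    computes the same order.
    """
    return list(heapq.merge(*device_logs, key=lambda e: (e["at"], e["device"])))


device_1 = [
    {"at": 1, "device": "till-1", "name": "a"},
    {"at": 3, "device": "till-1", "name": "c"},
    {"at": 5, "device": "till-1", "name": "e"},
]
device_2 = [
    {"at": 2, "device": "till-2", "name": "b"},
    {"at": 4, "device": "till-2", "name": "d"},
]
print("-".join(e["name"] for e in merge_timelines(device_1, device_2)))  # a-b-c-d-e
```

A real ordering rule would likely be more involved than a timestamp comparison, as the post notes, but whatever rule is used, it needs every conflicting revision's events as input.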


#8

Have you considered redesigning the data schema so it doesn’t create conflicts?

I don’t know the details, but it sounds like you have each client adding transactions to an array in a shared document? If instead each client updates a separate document, or if each transaction is a unique document, there won’t be conflicts. On the flip side, of course, you may need some more complex queries to assemble the data in the form you need, but 2.0’s more powerful query engine will help.
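A rough sketch of the one-document-per-transaction idea, in plain Python rather than the Couchbase Lite API. The ID scheme `txn::<device>::<uuid>` and the field names are assumptions made for the example. Because no two devices ever write the same document ID, combining their documents is a plain union, and the totals are computed by a query over the documents instead of being stored in one shared document.

```python
import uuid


def new_transaction_doc(device_id, item, quantity):
    """Create a transaction as its own document with a globally unique ID."""
    doc_id = f"txn::{device_id}::{uuid.uuid4()}"
    body = {"type": "transaction", "device": device_id, "item": item, "quantity": quantity}
    return doc_id, body


def sync(*device_stores):
    """Union the documents of all devices; IDs never collide, so nothing conflicts."""
    merged = {}
    for store in device_stores:
        merged.update(store)
    return merged


def stock_change(item, docs):
    """Assemble the net stock change for an item from its transaction documents."""
    return sum(d["quantity"] for d in docs.values()
               if d["type"] == "transaction" and d["item"] == item)


till_1 = dict([new_transaction_doc("till-1", "aspirin", -2)])   # sold 2
till_2 = dict([new_transaction_doc("till-2", "aspirin", -1),    # sold 1
               new_transaction_doc("till-2", "aspirin", 10)])   # received 10
print(stock_change("aspirin", sync(till_1, till_2)))  # 7
```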


#9

Just to chime in on this topic…

We are also in the same boat as @sam.wilks92. One of the primary reasons we chose Couchbase as our NoSQL platform was Couchbase Lite’s ability to identify and resolve conflicts, allowing us to incorporate changes from multiple users who operate in areas of low internet connectivity.

Our use case is that we have potentially many medical professionals documenting the care of a patient within our app, and naturally this causes conflicts which must be resolved. The changes from each documenter need to be incorporated into the patient’s chart, and we do this by performing an n-way merge during conflict resolution.

@jens, to your point, we could redesign the data model and break it into several other documents to reduce conflicts, but that wouldn’t completely remove our need to resolve them. It would be a massive architectural change to our application, and one that I would have a very difficult time getting approved, since the very reason we chose CB was that it supported our use case out of the box.

We have voiced our desire to have this feature added in CBLite 2.x through the proper channels. As of now, we can’t upgrade and have to stay on 1.4 until conflict resolution is supported.


#10

We use custom conflict resolution as a DR mechanism. If we were to lose the CB cluster for some reason, and the new cluster were restored with the same document IDs but with a generation ID of “1-”, it would fail to sync to mobile clients, which often have much higher generation IDs. We rely on detecting a conflict and letting the document with the latest update timestamp win. It works well, not just for DR tests but also because we have critical edge cases where documents that may have been purged (60-day auto-purged tombstones in CB 5.x, for example) are brought back to life under certain business conditions.
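The rule described here, where the latest update timestamp wins regardless of generation, can be sketched as follows. This is plain Python, not the Couchbase Lite API, and it assumes each revision carries a hypothetical `updated_at` field in its body holding an ISO 8601 UTC string (which sorts correctly as text), plus a `rev_id` used only as a deterministic tie-breaker.

```python
def latest_update_wins(revisions):
    """Pick the revision with the newest application-level timestamp.

    The generation number in the revision ID is deliberately ignored, so a
    freshly restored "1-..." revision can beat a "7-..." revision on a device.
    """
    return max(revisions, key=lambda rev: (rev["updated_at"], rev["rev_id"]))


restored = {"rev_id": "1-abc", "updated_at": "2018-06-01T10:00:00Z", "status": "active"}
on_device = {"rev_id": "7-def", "updated_at": "2018-05-20T08:00:00Z", "status": "closed"}
print(latest_update_wins([restored, on_device])["rev_id"])  # 1-abc
```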

We’re upgrading to CBL 2.1 for performance reasons on iOS, but we now have to revise our DR and edge-case approaches, something we hadn’t planned or budgeted for. We’re eagerly awaiting the return of custom conflict resolution!


#11

Thank you all for your input and specifics on the use case - keep them coming!

Here is a little bit of background on the decision to only support automatic conflict resolution in 2.0.

As Jim and Jens alluded to, custom conflict resolution is definitely on our radar. We are carefully weighing the pros and cons of various approaches to determine the best way forward.

Will keep folks posted as we have more specifics.


#12

Sorry for digging up such an old thread but @priya.rajagopal is there any update on this topic?

We are in the process of updating from CB 5.0.1/Mobile 1.5 to CB 6.0/Mobile 2.1 as part of the preparations to move from Community Edition to Enterprise Edition. During a recent call, the sales team told me that CB Lite 2.1 supports custom conflict resolution again.

I tried to implement it, but I could not work out how it is supposed to work, and I did not find any documentation on it either. Am I missing something? (At the moment we haven’t signed the contract yet and are still using Community Edition.)

In CB Lite 1.5 we use custom conflict resolution to merge the entries of an array from all conflicting revisions. While I suspect that we could redesign our data model to accommodate the “new” conflict resolution, it would be much easier to continue to use the current approach.
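A minimal sketch of that kind of array merge, in plain Python rather than the CBL 1.5 API. It assumes the array lives under a hypothetical `entries` key and that every entry carries a unique `id`; both names are made up for the example. It takes the union of entries across all conflicting revisions, keeps the first copy seen of any entry that appears more than once, and copies the remaining fields from the first revision.

```python
def merge_array_entries(revisions, array_key="entries", id_key="id"):
    """Union the array entries of all conflicting revisions, keyed by entry ID."""
    by_id = {}
    for rev in revisions:
        for entry in rev.get(array_key, []):
            by_id.setdefault(entry[id_key], entry)  # first copy seen wins
    merged = dict(revisions[0])                     # other fields: first revision
    merged[array_key] = [by_id[k] for k in sorted(by_id)]
    return merged


rev_a = {"_id": "order-1", "entries": [{"id": "e1", "text": "x"}, {"id": "e2", "text": "y"}]}
rev_b = {"_id": "order-1", "entries": [{"id": "e2", "text": "y"}, {"id": "e3", "text": "z"}]}
print([e["id"] for e in merge_array_entries([rev_a, rev_b])["entries"]])
# ['e1', 'e2', 'e3']
```

A pure union like this cannot express a deleted entry, and it depends on the order of `revisions` when two revisions edit the same entry differently, so a real resolver would need rules for both cases.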


#13

We do not support custom conflict resolution in 2.1. This is how conflicts are handled. This capability is on our radar but we do not have a committed timeline yet. I will direct message you so you can share specifics of the Account to ensure there was nothing lost in translation.

While we are working on bringing back custom conflict resolution, do you think you can model your data to avoid such conflicts? It looks like you are already considering it. You mentioned that you have an embedded array, the entries of which are merged from conflicting documents. For instance, could each of those array entries be an individual document that refers back to the main doc? I am sure the Accounts team can guide you through that process.

As you have probably seen, CBL 2.x has a significant number of enhancements over 1.x, and it is the version that will evolve and continue to be supported. I would strongly encourage you to upgrade, so I want to determine whether there is a way for you to transition without this being a blocker, especially since we plan on bringing custom conflict resolution back in the future.


#14

Thanks for your clarification.

You are correct. Moving the array entries to individual documents is the alternative I am considering. In general I think this would also be a cleaner solution.

However, my main concern with this is performance, which is something that we need to test (I only have experience with 1.x so far and do not know how 2.x behaves regarding performance):

  1. Initial pull performance when the mobile database is empty and a large number of documents is downloaded from the server. With the new architecture the number of documents would increase considerably. (I know about the recommendation to prepare a database and ship it with the application, but that is not applicable to our use case.)

  2. Query performance: Right now all the data that I need during the performance-critical part of the application I can fetch directly in one simple operation (I retrieve the main document directly by key). With the new architecture I would have to retrieve the main document and then query all the new “array-entry” documents and then sort them.


#15

As you noted, there are tradeoffs.

  1. Whether you have a single document or split it up, you are still pulling down all that content, so bandwidth usage should not vary much between the two cases. Very large docs impact memory and processing usage on clients. If you split it up, you have the option of doing more fine-grained filtering on the docs (getting just specific docIds), so in that sense you can in fact save on bandwidth. Of course, at the end of the day, it depends on the network connectivity. You should also reconsider modeling your data and splitting it up so that public docs that can be shared are bundled with the app and only user-specific/private data is pulled down. The talk “Architecting Business-Critical Applications With Couchbase Mobile – Connect Silicon Valley 2018” gives several useful tips; I recommend watching it.

  2. You should look into JOINs
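To show the shape of the result such a JOIN would produce, here is an in-memory sketch in plain Python. It is not Couchbase Lite query syntax; the `type`, `parent` and `position` fields are hypothetical names for how an entry document might refer back to its main document and be ordered.

```python
def load_with_entries(docs, main_id, sort_key="position"):
    """Fetch the main document and attach its entry documents, sorted.

    `docs` maps document ID to body. Entry documents are assumed to carry
    type == "entry" and a "parent" field holding the main document's ID.
    """
    main = docs[main_id]
    entries = sorted(
        (d for d in docs.values()
         if d.get("type") == "entry" and d.get("parent") == main_id),
        key=lambda d: d[sort_key],
    )
    return {**main, "entries": entries}


docs = {
    "order-1": {"type": "order", "customer": "c-42"},
    "entry-b": {"type": "entry", "parent": "order-1", "position": 2, "text": "second"},
    "entry-a": {"type": "entry", "parent": "order-1", "position": 1, "text": "first"},
    "entry-x": {"type": "entry", "parent": "order-9", "position": 1, "text": "other order"},
}
print([e["text"] for e in load_with_entries(docs, "order-1")["entries"]])
# ['first', 'second']
```

In a real database the filter on `parent` would be served by the query engine, ideally with an index on that field, rather than by a scan over every document.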