Why I gave up Couchbase, and why I still believe in it

mastohhh · April 15, 2016, 6:40am

We needed to add a local sync service to our app, which was built on CoreData. Our code was too tightly coupled to CoreData’s ManagedObjects and FetchedResultsControllers.

Couchbase with CBLIncrementalStore and local replication looked like the best choice. At the beginning everything looked promising. Some things were a little buggy, but the developer team was responsive and our pull requests were resolved. Thanks to @pasin and @jens. That’s why we believed in Couchbase.

We thought that Couchbase was smart to allow developers to use their CoreData backend with a plug-and-play sync system.

We spent months understanding channels, sync_gateway and its JavaScript function, Couchbase Server’s views, etc.

Our first need was a simple local sync service, and Couchbase gave us a full sync service that we could plug into our JavaScript backend, and that let us think about an Android version of our app. We saw the time we spent making Couchbase work as an investment.

We encountered two problems that were never solved:

performance
data loss

To be clear: I’m not saying the whole framework is slow and unstable. I’m only saying that the combination of CouchbaseLite+CoreData with CBLIncrementalStore and a lot of data is. And maybe we just missed something in our integration; I’m not blaming Couchbase.

Performance: our users can create 5000 documents in a single day. That was too much, and our app slowed down. I’ll let you imagine the situation after a week, or a month… We spent another month creating channels that contain only the data for the days the user is working on, and purging data that is not in use. It was tricky work, and it forced us to drop some features (access to history).

Data loss: we never understood it. Our testers, one after the other, lost data that must not be lost. Some of them now hate us. I can understand.

That’s why we just gave up Couchbase after 8 months.

What did we do next?

We solved our primary problem, local sync, with sockets, data exchange, and a custom sync solution that does the job. But we don’t have a real sync service with revisions and web access.

At the beginning we decided to create our own subclass of NSIncrementalStore, partly to better understand what was wrong with CBLIncrementalStore. We looked for other examples of NSIncrementalStore and saw that many were deprecated (e.g. Parse’s PFIncrementalStore, AFNetworking’s AFIncrementalStore).

Our NSIncrementalStore subclass was a client-server design: the server had a real CoreData stack with SQLite, and the client had the incremental store, which asked the server for data.

This was too slow!

And then we understood why CBLIncrementalStore was slow too. The incremental store is not optimized at all. Here is an example: a simple table view with 10 cells generates 21 requests.

the first request returns an array of the IDs of the elements in the table view
then the table view knows the number of results, and for each cell there are two requests: one for the values of the object and one for its relationships.

You need two requests every time you load an object, and CoreData makes every request synchronously. In our case that means: open a socket, send data, receive data, and parse the result, hundreds of times per action. Too slow.
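The count above can be reproduced with a small toy model. This is only a sketch in Python, not the CoreData or NSIncrementalStore API: `CountingStore`, `fetch_ids`, `fetch_values`, `fetch_relationship`, and `load_table` are hypothetical names, and each call is assumed to cost one synchronous round trip.

```python
class CountingStore:
    """Toy stand-in for an incremental store that asks a server for data."""

    def __init__(self, rows):
        self.rows = rows      # object ID -> (values, related IDs)
        self.requests = 0     # assumption: one round trip per call

    def fetch_ids(self):
        self.requests += 1
        return list(self.rows)

    def fetch_values(self, object_id):
        self.requests += 1
        return self.rows[object_id][0]

    def fetch_relationship(self, object_id):
        self.requests += 1
        return self.rows[object_id][1]


def load_table(store):
    """One request for the IDs, then two requests per cell."""
    cells = []
    for object_id in store.fetch_ids():
        cells.append((store.fetch_values(object_id),
                      store.fetch_relationship(object_id)))
    return cells


store = CountingStore({i: ({"title": f"row {i}"}, []) for i in range(10)})
cells = load_table(store)
print(store.requests)  # 1 + 2 * 10 = 21
```

With n cells the model makes 1 + 2n requests, so the cost grows with every row shown.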

That’s why we tried adding a cache, and discovered that with a cache we didn’t need the incremental store at all.

This adventure gave us an idea for you at Couchbase. Maybe we could have a full CoreData with SQLite and a full CouchbaseLite for sync working side by side, with an object between them that syncs the two worlds. This object only knows a CBLDatabase instance and an NSManagedObjectContext (the one used on the main thread for the UI).

At launch, a method checks whether the objects are in sync. Then, on every change made on the NSManagedObjectContext side (observed with NSManagedObjectContextObjectsDidChangeNotification), we update the corresponding CBLDocument. The same goes for the other side: every time a CBLDocument changes, we update the corresponding NSManagedObject.

This new bridge object is responsible for maintaining both sides in sync.
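The bridge idea can be sketched with a toy model. This is Python rather than real CoreData or Couchbase Lite code: `Store`, `Bridge`, and `check_at_launch` are hypothetical names, each store is just a dictionary with change listeners, and the guard that stops a change from bouncing between the two sides is my own assumption about what such a bridge would need. Conflict resolution is deliberately left open.

```python
class Store:
    """Toy record store: ID -> properties, with change listeners."""

    def __init__(self, records=None):
        self.records = dict(records or {})
        self.listeners = []

    def put(self, record_id, properties):
        self.records[record_id] = dict(properties)
        for listener in list(self.listeners):
            listener(record_id, properties)


class Bridge:
    """Keeps two stores in sync; it knows nothing else about the app."""

    def __init__(self, core_data, couchbase):
        self.core_data, self.couchbase = core_data, couchbase
        core_data.listeners.append(lambda i, p: self._mirror(couchbase, i, p))
        couchbase.listeners.append(lambda i, p: self._mirror(core_data, i, p))

    def _mirror(self, target, record_id, properties):
        # Skip writes that would change nothing, so one change does not
        # bounce back and forth between the two sides forever.
        if target.records.get(record_id) != properties:
            target.put(record_id, properties)

    def check_at_launch(self):
        """Copy records missing on one side; report IDs that differ."""
        conflicts = []
        all_ids = set(self.core_data.records) | set(self.couchbase.records)
        for record_id in all_ids:
            a = self.core_data.records.get(record_id)
            b = self.couchbase.records.get(record_id)
            if a is None:
                self.core_data.put(record_id, b)
            elif b is None:
                self.couchbase.put(record_id, a)
            elif a != b:
                conflicts.append(record_id)  # resolution policy left open
        return sorted(conflicts)


core_data = Store({"a": {"n": 1}})
couchbase = Store({"b": {"n": 2}})
bridge = Bridge(core_data, couchbase)
conflicts = bridge.check_at_launch()   # both sides now hold "a" and "b"
core_data.put("a", {"n": 3})           # mirrored to the other side
```

After these calls both dictionaries hold the same two records, and neither side ever reads through the other.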

I only see advantages. As we use the full CoreData with the well-optimized SQLite store, every save and request will be as fast as if Couchbase were not there. And against data loss, we have two files that contain the data.

Don’t put CouchbaseLite below CoreData, put them side by side.

To go further, maybe we could use CoreData’s built-in NSInMemoryStoreType for super fast requests and saves.

I found only one problem with this solution: on the CoreData side, every entity must have a unique ID that will be used as the documentID. We must be able to quickly find an NSManagedObject given a CBLDocument, and vice versa. I think everything else is already in CBLIncrementalStore, for example the “type” property that matches the CoreData entity.
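As a rough illustration of that mapping, here is a toy Python sketch, not CoreData or Couchbase Lite code. `to_document` and `ObjectIndex` are hypothetical names; the only things taken from the description above are that the unique ID doubles as the document ID and that a “type” property carries the entity name.

```python
def to_document(entity_name, unique_id, attributes):
    """Build a (document ID, properties) pair from a local record."""
    properties = dict(attributes)
    properties["type"] = entity_name   # matches the CoreData entity
    return unique_id, properties       # the unique ID doubles as document ID


class ObjectIndex:
    """Finds a local object from a document without scanning everything."""

    def __init__(self):
        self._by_key = {}

    def register(self, entity_name, unique_id, obj):
        self._by_key[(entity_name, unique_id)] = obj

    def object_for(self, document_id, properties):
        return self._by_key.get((properties["type"], document_id))


index = ObjectIndex()
task = {"title": "Buy milk"}                      # stands in for a managed object
index.register("Task", "task-42", task)
doc_id, props = to_document("Task", "task-42", task)
found = index.object_for(doc_id, props)           # the same object as `task`
```

A dictionary keyed on (entity, ID) gives a constant-time lookup in both directions, which is the property the bridge would need.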

The only thing that could be expensive is the initial check at launch that verifies that the data match. But you can create a sub-context in the background to do the job, and then save that context to inform the main context.

@pasin: there will be no need for refreshObject: anymore, because every change will be made directly on the main context.

I hope this post will have an impact. I don’t know if CoreData users are your primary target, but here is some feedback from someone who uses CoreData intensively.

I hope the idea I just gave you is a good one. Please tell me whether this can work “in theory”, or whether I’m totally wrong and you unfortunately have no other choice than to use NSIncrementalStore.

pasin · April 15, 2016, 4:53pm

Hello Mastohhh,

I’m sorry to hear that you are moving away. It was good working with you and discussing some of the Core Data topics. In any case, this feedback is very valuable. I will comment on some of the topics below.

Performance:
In general, I agree that using NSIncrementalStore will have some overhead compared to using pure CoreData or Couchbase Lite. CBLIncrementalStore itself requires optimization, especially around JOIN queries. For JOIN predicates, CBLIncrementalStore currently just queries the documents through the relationship view and then post-filters them based on the predicate. We need a better solution for this. CBLIncrementalStore uses two caches, one for general queries and one for relationship queries. However, if the data changes frequently, those two caches might not be very useful. There could be some optimizations around cache invalidation to help improve this.

Data loss:
I just do not have enough information to comment much on this. I have never heard of data loss issues in pure Couchbase Lite usage. However, CBLIncrementalStore was originally designed to work with only one context (the main context), so when used with multiple contexts it had some concurrency issues when accessing the Couchbase Lite database. I don’t know whether this contributed to the data loss you experienced. Recently I made a fix to CBLIncrementalStore so that it has its own dispatch queue for Couchbase Lite operations, and I believe that should solve the concurrency issue.

Bridge solution:
I think this solution is a valid alternative. One of our team members (@wayne) is using a very similar approach in one of his apps. The downsides of the approach are that the data will be duplicated and could get out of sync (so it requires an initial sync-up). For the initial sync, you could query docs with kCBLBySequence mode to get only the documents changed after a certain checkpoint sequence. About refreshing the main context directly: if there are a lot of changes coming from the replicators, this could impact performance on the UI thread. So if you are anticipating that, you might consider updating CoreData on a background context instead.
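To illustrate the checkpoint idea, here is a toy Python sketch. It is not the Couchbase Lite query API: `changes_since` is a hypothetical helper, and each document is assumed to be a dictionary carrying the sequence number of its last change.

```python
def changes_since(documents, checkpoint):
    """Return (docs changed after `checkpoint`, ordered by sequence; new checkpoint)."""
    changed = sorted((d for d in documents if d["sequence"] > checkpoint),
                     key=lambda d: d["sequence"])
    # If nothing changed, the checkpoint stays where it was.
    new_checkpoint = changed[-1]["sequence"] if changed else checkpoint
    return changed, new_checkpoint


docs = [
    {"id": "a", "sequence": 3},
    {"id": "b", "sequence": 7},
    {"id": "c", "sequence": 5},
]
changed, checkpoint = changes_since(docs, 4)
print([d["id"] for d in changed], checkpoint)  # ['c', 'b'] 7
```

Storing the returned checkpoint after each run means the next launch only has to look at what changed in between.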

– Pasin

mastohhh · May 3, 2016, 7:32am

Hello Pasin,

Thanks for your reply.

As you understood, I only blame NSIncrementalStore, and I’m pleased to hear that a bridge solution is a valid alternative.

Do you think this solution can be embedded in the CouchbaseLite-ios framework? Is someone working on it, or is @wayne’s app only a proof of concept?

Is @wayne’s project open source?

Are CoreData users your primary target?

Thanks