Sync Gateway Replication Interface Inconsistent with CouchDB


#1

I have been having issues with filtered pull replication from a CouchDB server using Couchbase Lite for Android as the client. This issue has to do with Couchbase Lite using POST to “get” the _changes feed.

The _changes feed from both CouchDB and the Couchbase Sync Gateway can be accessed using a POST or a GET. How the POST is used, however, appears to be different when comparing CouchDB/Cloudant documentation vs. the Sync Gateway documentation. As you can see below, CouchDB always expects params to be in the query string, but the Sync Gateway is OK with them being in the JSON body. If the params are in the JSON body and not also in the query string, CouchDB ignores them, producing an unexpected result.

Since Couchbase Lite for Android uses the POST to _changes when doing a pull replication, this causes the replication to work incorrectly if the pull is from CouchDB. This difference makes Couchbase Lite incompatible with CouchDB as long as Couchbase Lite is using POST to query _changes. Switching to using a GET in Couchbase Lite would side-step this incompatibility.

CouchDB/Cloudant Documentation:

Using POST to get changes
Instead of GET, you can also use POST to query the changes feed. The only difference if you are using POST and you are using either of the docs_ids or selector filters, is that it is possible to include the "doc_ids" : [...] or "selector": {...} parts in the request body. All other parameters are expected to be in the query string, the same as using GET.

Couchbase Sync Gateway Documentation:

POST /{db}/_changes
Implementation Notes
Same as the GET /_changes request except the parameters are in the JSON body.


#2

@philmay,

There are usually are limitations as far as string length for _changes via GET. While POST is not dependent on url string length as much.

Here are a few community members that have made their custom builds and have done a PULL request for code changes. It be great to have your contributions.
https://github.com/couchbase?utf8=✓&q=couchbase-lite&type=&language=


#3

Which version of CBL are you using? We’ve worked around these incompatibilities in the 1.4 release, I believe.


#4

@househippo,
I’m betting the URL limit issue was the result of trying to run the _doc_ids filter as a GET, as that could get really big.

I have looked at the client code, and I see where it’s dropping the filter params from the query string (the filter name gets placed in the query string in all cases, but the filter params get skipped for a POST). I think the right fix may be to set the request to a POST only if docIDs is set in ChangeTracker, but still do GET for all other filtered _changes queries. This way the query string doesn’t get out of hand due to large doc lists, and we’re still compatible with CouchDB. I’ll look into this change.

Thanks!


#5

@jens,
I’m using CBL 1.4. I found the place in the code (master branch on GitHub) where the query params are being dropped. The filter name gets placed in the query string, but the extra params for the filter function do not. I think I may have a fix for it.

There’s an additional issue with CouchDB 2.0, where it times out after a minute if there is a POST message body and the filter name isn’t _doc_ids or _selector. That seems like a bug on their part, and I am looking into that too and have logged an issue with them. This worked in CouchDB 1.6. I bring this up only because it means that simply adding query params to the POST won’t immediately fix the issue, and to let others using CBL + CouchDB 2.0 know.


#6

@jens,

I implemented a fix to this bug that we are using internally, but when I inquired about a pull request, the dev team didn’t seem interested. Also, I gathered that they are not particularly interested in maintaining CouchDB/Cloudant compatibility with the Couchbase Clients. Is this your understanding? I’d like to get a feel for Couchbase’s stance on CouchDB compatibility, since it affects how we design our system.

Thanks!

  • Phil

#7

The fix should be to include the query params in the POST body only if the filter name is _doc_ids or selector … when the server is known to be CouchDB.

I’d like to get a feel for Couchbase’s stance on CouchDB compatibility, since it affects how we design our system.

Sync Gateway will continue to support the CouchDB replication protocol. Couchbase Lite 2 won’t; our new protocol is a lot faster and IMHO better designed. (But we will continue to support Couchbase Lite 1.x for another year.)


#8

@jens,

Thanks for the reply. The fix I implemented uses GET for _changes for all cases except when the filter is _doc_ids or _selector, when it uses POST. It looks like the sync gateway supports _changes GET, so I figured this would work for either CouchDB and Couchbase without having to know which you are talking to?

Either way, I appreciate the information about your compatibility roadmap. Do you have any feel for approximately how many others are using Couchbase Lite as a client library for CouchDB/Cloudant?

  • Phil

#9

Hi, @philmay i’m also using couchdb 1.6, with CBL 1.4 and my filtered replication isn’t working. Could you share the fix you cameup with


#10

@philmay sorry for the late response. We don’t track (and can’t, that I know of) CBL + CouchDB/Cloudant users. Also, maintaining support for CouchDB/Cloudant requires extra resources for testing, etc., so it’s not as straightforward a choice as it might seem.

Are you currently using your fix?

Can I ask if you’ve looked at alternatives (maybe AWS and Couchbase Server Community Edition) and what the barriers are, if any?

Just to note, too, CBL 1.x support I believe will continue for a year after the official 2.x release. (Not a year from Jens’ reply.)

Thanks,
Hod


#11

@hod.greeley we are currently using my fix in our projects, but I am no longer on that project. Our use of Apache CouchDB is pretty well locked in for now, as much of our backend service layer code is written around it.

Will your 2.x release explicitly break CouchDB compatibility, or is it simply going to be un-prioritized and allowed to go out of spec (i.e., it may work for a while)? Also, when is the official 2.x release so I can track when to expect support to stop.

Thanks!


#12

@wizlif144 let me look. I have been off that project for a while, but I should be able to dredge up my git diffs from my changes and pass those along…


#13

2.0 will explicitly break compatibility as it is no longer based around HTTP. Furthermore, as far as “support” for this, for enterprise subscriptions we have an explicit support policy that is followed. For open source / community users, though, issues will be less prioritized. It’s hard to pin down an exact release date for 2.0, but current trends seem to put it in Q1 2018.


#14

@wizlif144, First let me say that these changes were made around March or April of this year, and I haven’t merged any changes from Couchbase since then, so things may have changed a bit. I also had to change some of the Android test cases to get all the tests to pass.

Here are the git logs for the changes I made:

coudhbase-lite-java-core (parent = 1ee944aee15ca597b07431e69b7779dafe54cc67):

commit d133b4376cdfbc0c7aa93a7665908bc68a62a3eb


    Fixed the issue where filter parameters are not being passed in the
    query string of a POST message within the replicator. This change is
    in the code that tracks the _changes channel from CouchDB/Cloudant.

diff --git a/src/main/java/com/couchbase/lite/replicator/ChangeTracker.java b/src/main/java/com/couchbase/lite/replicator/ChangeTracker.java
index f0f96927..3ebf40de 100644
--- a/src/main/java/com/couchbase/lite/replicator/ChangeTracker.java
+++ b/src/main/java/com/couchbase/lite/replicator/ChangeTracker.java
@@ -105,7 +105,11 @@ public class ChangeTracker implements Runnable {
         this.limit = 0;
         this.requestHeaders = new HashMap<String, Object>();
         this.heartBeatSeconds = Replication.DEFAULT_HEARTBEAT;
-        this.usePOST = true;
+        // Set the default for _changes queries to GET, and only use POST
+        // when there are docIDs present or when the filter is set to
+        // sync_gateway/bychannel. This prevents query strings from
+        // getting prohibitively long when long docIDs or channel lists are set.
+        this.usePOST = false;
     }
 
     public boolean isContinuous() {
@@ -118,6 +122,12 @@ public class ChangeTracker implements Runnable {
 
     public void setFilterName(String filterName) {
         this.filterName = filterName;
+        // If the filter is set to filter by channels, then use the POST
+        // to query _changes. This keeps things as they were for channel
+        // filtering, and prevents long query strings.
+        if (ReplicationInternal.BY_CHANNEL_FILTER_NAME.equals(filterName)) {
+            setUsePOST(true);
+        }
     }
 
     public void setFilterParams(Map<String, Object> filterParams) {
@@ -322,6 +332,10 @@ public class ChangeTracker implements Runnable {
                         .addHeader("Accept-Encoding", "gzip")
                         .post(RequestBody.create(JSON, changesFeedPOSTBody()));
             }
+            else {
+                builder.header("User-Agent", Manager.getUserAgent())
+                        .addHeader("Accept-Encoding", "gzip");
+            }
             addRequestHeaders(builder);
 
             // Perform BASIC Authentication if needed
@@ -577,6 +591,16 @@ public class ChangeTracker implements Runnable {
 
     public void setDocIDs(List<String> docIDs) {
         this.docIDs = docIDs;
+        if (docIDs == null) {
+            return;
+        }
+        if (docIDs.size() == 0) {
+            return;
+        }
+        // If docIDs has been set to a list with at least one element, then
+        // use a POST to query _changes. Otherwise, if docIDs has been set
+        // to null or an empty List, then stay with GET.
+        setUsePOST(true);
     }
 
     public String changesFeedPOSTBody() {

couchbase-lite-android (parent = 603259066dad077f71884df774fc37bad5938bfe):

commit d83e695a6e4d68d01eddefd97d0ed6df7754096e (HEAD -> fix/942)


    Modified unit tests so that the use of GET for some _changes queries
    is acceptable and the tests still pass.

diff --git a/src/androidTest/java/com/couchbase/lite/replicator/ChangeTrackerTest.java b/src/androidTest/java/com/couchbase/lite/replicator/ChangeTrackerTest.java
index e9e1ef96..0e3e7146 100644
--- a/src/androidTest/java/com/couchbase/lite/replicator/ChangeTrackerTest.java
+++ b/src/androidTest/java/com/couchbase/lite/replicator/ChangeTrackerTest.java
@@ -203,11 +203,11 @@ public class ChangeTrackerTest extends LiteTestCaseWithDB {
 
         ChangeTracker changeTracker = new ChangeTracker(testURL,
                 ChangeTracker.ChangeTrackerMode.LongPoll, false, 0L, null);
-        changeTracker.setUsePOST(false);
         List<String> docIds = new ArrayList<String>();
         docIds.add("doc1");
         docIds.add("doc2");
         changeTracker.setDocIDs(docIds);
+        changeTracker.setUsePOST(false);
 
         String docIdsUnencoded = "[\"doc1\",\"doc2\"]";
         String docIdsEncoded = URLEncoder.encode(docIdsUnencoded);
diff --git a/src/androidTest/java/com/couchbase/lite/replicator/ReplicationMockWebServerTest.java b/src/androidTest/java/com/couchbase/lite/replicator/ReplicationMockWebServerTest.java
index 517133ba..085dd487 100644
--- a/src/androidTest/java/com/couchbase/lite/replicator/ReplicationMockWebServerTest.java
+++ b/src/androidTest/java/com/couchbase/lite/replicator/ReplicationMockWebServerTest.java
@@ -377,7 +377,7 @@ public class ReplicationMockWebServerTest extends LiteTestCaseWithDB {
             assertTrue(getCheckpointRequest.getMethod().equals("GET"));
             assertTrue(getCheckpointRequest.getPath().matches(MockHelper.PATH_REGEX_CHECKPOINT));
             RecordedRequest getChangesFeedRequest = dispatcher.takeRequest(MockHelper.PATH_REGEX_CHANGES);
-            assertTrue(getChangesFeedRequest.getMethod().equals("POST"));
+            assertTrue(getChangesFeedRequest.getMethod().equals("GET"));
             assertTrue(getChangesFeedRequest.getPath().matches(MockHelper.PATH_REGEX_CHANGES));
 
             // wait until the mock webserver receives a PUT checkpoint request with doc #2's sequence
@@ -503,12 +503,12 @@ public class ReplicationMockWebServerTest extends LiteTestCaseWithDB {
             assertTrue(getCheckpointRequest.getMethod().equals("GET"));
             assertTrue(getCheckpointRequest.getPath().matches(MockHelper.PATH_REGEX_CHECKPOINT));
             RecordedRequest getChangesFeedRequest = dispatcher.takeRequest(MockHelper.PATH_REGEX_CHANGES);
-            assertTrue(getChangesFeedRequest.getMethod().equals("POST"));
+            assertTrue(getChangesFeedRequest.getMethod().equals("GET"));
             assertTrue(getChangesFeedRequest.getPath().matches(MockHelper.PATH_REGEX_CHANGES));
             if (serverType == MockDispatcher.ServerType.SYNC_GW) {
-                Map<String, Object> jsonMap = Manager.getObjectMapper().readValue(getChangesFeedRequest.getUtf8Body(), Map.class);
-                assertTrue(jsonMap.containsKey("since"));
-                Integer since = (Integer) jsonMap.get("since");
+                Map<String, String> queryMap = query2map(getChangesFeedRequest.getPath());
+                assertTrue(queryMap.containsKey("since"));
+                Integer since = Integer.valueOf(queryMap.get("since"));
                 assertEquals(2, since.intValue());
             }