How do I configure webSockets on couchbase lite to pull in all a user's documents when replication begins?


#1

Hi, I am trying to setup a websockets connection between my iOS couchbase lite 1.1.0 application and sync gateway 1.1.0 and I can’t find any documentation about how I should do this. Is there any up to date documentation on what we should do? Sync gateway does have documentation and I’ve set it up to correctly receive websocket requests, it’s just the couchbase lite side that has nothing. It looks like the default connection is websocket, but this doesn’t work for us. If we don’t set customProperties.websocket = 0 then our app won’t pull all a user’s documents when they login. Our current setup for the pull replicator is:

pull = [_database createPullReplication:_remoteURL];
pull.continuous = YES;

I can see in the app it is making requests when we start the pull replication:

2015-10-23 12:32:14.856 PUSH[868:137593] Sync: CBL_Puller[https://host.com/dbws/]: Replicating from lastSequence=(null)
2015-10-23 12:32:14.863 PUSH[868:137593] SyncVerbose: CBL_Puller[https://host.com/dbws/] starting ChangeTracker: mode=3, since=(null)
2015-10-23
 12:32:14.865 PUSH[868:137593] SyncVerbose: 
CBLWebSocketChangeTracker[0x15f1d4c0 dbws]: GET 
//host.com/dbws/_changes?feed=websocket

Then I see nothing else in the logs and on the sync gateway side it shows that it’s only looking for changes since a very high value, instead of 0:

16:31:42.539035 2015-10-23T16:31:42.539Z Cache: getCachedChanges("d0287c90-2c7a-4daf-86db-23650d74ca91_4215", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539062 2015-10-23T16:31:42.539Z Cache: getCachedChanges("2bde2091-66de-4048-9e74-f73b8085e27d_4115", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539085 2015-10-23T16:31:42.539Z Cache: getCachedChanges("d0287c90-2c7a-4daf-86db-23650d74ca91_4315", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539103 2015-10-23T16:31:42.539Z Cache: getCachedChanges("team_91f52b7d14a3554ae63836c4ffda1d8e_sync", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539120 2015-10-23T16:31:42.539Z Cache: getCachedChanges("55bb0a19-87da-4c37-91a9-f80fa6dac933_4115", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539136 2015-10-23T16:31:42.539Z Cache: getCachedChanges("605c3293-a0b5-4811-afb5-8f9242977bd9_4215", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539155 2015-10-23T16:31:42.539Z Cache: getCachedChanges("team_0d4f7df40f3585fa5370501c1b5987c1_sync", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539172 2015-10-23T16:31:42.539Z Cache: getCachedChanges("d0287c90-2c7a-4daf-86db-23650d74ca91_sync", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539189 2015-10-23T16:31:42.539Z Cache: getCachedChanges("55bb0a19-87da-4c37-91a9-f80fa6dac933_4315", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539205 2015-10-23T16:31:42.539Z Cache: getCachedChanges("d0287c90-2c7a-4daf-86db-23650d74ca91_4115", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539222 2015-10-23T16:31:42.539Z Cache: getCachedChanges("605c3293-a0b5-4811-afb5-8f9242977bd9_4115", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539238 2015-10-23T16:31:42.539Z Cache: getCachedChanges("60a5f471-1134-4a5f-8f8c-10bf2ed1ba54_4315", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539257 2015-10-23T16:31:42.539Z Cache: getCachedChanges("team_5265805abd427a6c7ee85bd151228b23_sync", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539277 2015-10-23T16:31:42.539Z Cache: getCachedChanges("60a5f471-1134-4a5f-8f8c-10bf2ed1ba54_sync", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539294 2015-10-23T16:31:42.539Z Cache: getCachedChanges("2bde2091-66de-4048-9e74-f73b8085e27d_4315", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539310 2015-10-23T16:31:42.539Z Cache: getCachedChanges("55bb0a19-87da-4c37-91a9-f80fa6dac933_sync", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539339 2015-10-23T16:31:42.539Z Cache: getCachedChanges("605c3293-a0b5-4811-afb5-8f9242977bd9_sync", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539359 2015-10-23T16:31:42.539Z Cache: getCachedChanges("605c3293-a0b5-4811-afb5-8f9242977bd9_4315", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539376 2015-10-23T16:31:42.539Z Cache: getCachedChanges("60a5f471-1134-4a5f-8f8c-10bf2ed1ba54_4115", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539393 2015-10-23T16:31:42.539Z Cache: getCachedChanges("60a5f471-1134-4a5f-8f8c-10bf2ed1ba54_4215", {0 0 250123}) --> 0 changes valid from #248963
16:31:42.539409 2015-10-23T16:31:42.539Z Cache: getCachedChanges("55bb0a19-87da-4c37-91a9-f80fa6dac933_4215", {0 0 250123}) --> 0 changes valid from #248963

So how are we supposed to configure the websocket connection so it actually asks for all changes since 0, then going forward it receives new changes that come in?


#2

The fact that it’s a WebSocket connection instead of REST is an implementation detail that you shouldn’t need to know/care about, unless you’ve got some kind of middleware in front of your SG that doesn’t support WebSockets.

I wouldn’t look at or even enable the Cache logs from SG — they’re not at all relevant to this. You’d only need them if you’re debugging something that seems to be related to caching.

Instead look at the HTTP logs (enabling HTTP+ helps too.) You should see something like:

HTTP:  #398: GET /cbl_auth_test/_changes?feed=websocket  (as test)
HTTP+: #398:     --> 101 Upgraded to WebSocket protocol  (0.0 ms)

#3

I don’t see either of those messages in my sync gateway logs. I have http+ enabled for logs.


#4

Sounds like something in between CBL and SG is blocking WebSocket connections. What’s the connection like? In particular, is it going through a cell network, is the server hosted by AWS or something similar, or do you have a gateway/proxy in front of SG?


#5

I have nginx sitting in front of sync gateway, but it’s configured as specified in the sync gateway documentation:
location / {
proxy_pass http://sync_gateway;
proxy_pass_header Accept;
proxy_pass_header Server;
proxy_http_version 1.1;
keepalive_requests 1000;
keepalive_timeout 360s;
proxy_read_timeout 360s;
}

I also have an azure load balancer, but it doesn’t strip stuff out. nginx is configured to only listen to 443 though, if someone tries to hit the server over port 80 it just redirects them to 443. I don’t even see a request come in to nginx though. I’m watching the access log and I just see a bunch of 400 responses. From our devices not using websockets I can see a lot of successful POST requests:

141.117.116.30 - - [23/Oct/2015:17:47:28 +0000] “POST /dbws/_changes HTTP/1.1” 200 54 “-” “CouchbaseLite/1.1 (iOS)”

But I don’t see any from the websockets device hitting nginx. As mentioned above CBL_Puller lists the correct host url, but then the CBLWebSocketChangeTracker doesn’t show the protocol used. Can I safely assume it is using https as specified in the CBL_Puller connection, or do we need to specify somewhere that the websockets should go over https?


#6

Yes, the WebSocket connection uses the same server URL, i.e. `https://host.com/’. If you’re not seeing a GET request at nginx, something earlier may be blocking it; is there anything else?

Another possibility is that the device is running iOS 9 and your SSL server doesn’t meet the requirements of App Transport Security. This would cause a connection failure on the client side that would be reported by the CBLReplication; is your client code handling errors?


#7

Sorry about this, I was hitting our production server instead of our development server. I can see the requests in sync gateway now, but not the websocket connection logs you were mentioning. I just see:

18:11:05.829685 2015-10-23T18:11:05.829Z HTTP:  #20935: GET /todos/_local/b4a82e4d1ac1a690c79230258b46ce05232d986a  (as d00c3bd6-3d39-4a3d-91f2-5cdc51d35386)
18:11:05.830262 2015-10-23T18:11:05.830Z HTTP: #20935:     --> 404 missing  (4.1 ms)
18:11:05.868489 2015-10-23T18:11:05.868Z HTTP:  #20936: GET /todos/_local/9d1b2bb550d33faed6deb1e241f55dca2e3a7692  (as d00c3bd6-3d39-4a3d-91f2-5cdc51d35386)
18:11:05.868681 2015-10-23T18:11:05.868Z HTTP: #20936:     --> 404 missing  (3.8 ms)
18:11:05.942412 2015-10-23T18:11:05.942Z HTTP:  #20937: GET /todos/_changes?feed=websocket  (as d00c3bd6-3d39-4a3d-91f2-5cdc51d35386)
18:11:05.942675 2015-10-23T18:11:05.942Z HTTP+: #20937:     --> 200   (4.1 ms)

I can see that nginx returned a 400 from that GET request though, and the app shows a 400 response as well.


#8

Note: The nginx sync gateway logs also say to add this to the nginx config to enable web sockets:

location / {
.
.
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection “upgrade”;
}

If I add those two lines though I don’t see any requests in nginx, but I do see the sync gateway upgraded to WebSocket protocol log. Then nothing happens though, The app receives no data and sync gateway shows no other logs after:

18:44:06.176872 2015-10-23T18:44:06.176Z HTTP:  #20987: GET /todos/_changes?feed=websocket  (as d00c3bd6-3d39-4a3d-91f2-5cdc51d35386)
18:44:06.176996 2015-10-23T18:44:06.176Z HTTP+: #20987:     --> 101 Upgraded to WebSocket protocol  (0.0 ms)

Eventually the “–> WebSocket closed” message comes up, about 6 minutes later when the connection would timeout. It doesn’t seem to be replicating anything over this connection, or picking up any changes.


#9

Oh, I thought you’d already added all the necessary stuff to the nginx config.

Sounds like WebSockets are working now. If the app isn’t getting changes, than something else is wrong. If you enable ChangeTracker logs in CBL you’ll get more info about what changes it’s receiving from SG.

Are you sure the app hasn’t already synced? Is its replicator still logging Replicating from lastSequence=(null) when it starts? Does the user account you’re authenticating as have access to the docs?


#10

I enabled the WS and ChangeTracker logs. I’m running the app by deleting the application and then running it from xcode each time, so I’m assuming deleting the app should clear any old couchbase dbs. With both those logs enabled I see the following in the app when I startup the pull replicator (not showing the push replicator logs):

2015-10-23 15:17:43.237 PUSH[1056:160666] Sync: CBL_Puller[http://host.com/dbws/]: Replicating from lastSequence=(null)
2015-10-23 15:17:43.239 PUSH[1056:160666] SyncVerbose: CBL_Puller[http://host.com/dbws/] starting ChangeTracker: mode=3, since=(null)
2015-10-23 15:17:43.242 PUSH[1056:160666] ChangeTracker: CBLWebSocketChangeTracker[0x18007620 dbws]: Starting...
2015-10-23 15:17:43.245 PUSH[1056:160666] SyncVerbose: CBLWebSocketChangeTracker[0x18007620 dbws]: GET //host.com/dbws/_changes?feed=websocket
2015-10-23 15:17:43.250 PUSH[1056:160666] WS: TRACE: -[CBL_WebSocket init]
2015-10-23 15:17:43.253 PUSH[1056:160666] WS: TRACE: -[CBL_WebSocketClient _connect:]
2015-10-23 15:17:43.258 PUSH[1056:160666] ChangeTracker: CBLWebSocketChangeTracker[0x18007620 dbws]: Started... <http://host.com/dbws/_changes?feed=websocket>
2015-10-23 15:17:43.298 PUSH[1056:160523] WS: TRACE: -[CBL_WebSocket socket:didWriteDataWithTag:] 

It doesn’t show any errors, but it doesn’t show anything being sent from the server either. I have ChangeTrackerVerbose enabled as well, but I see no output from it.


#11

If we disable web sockets then everything generally replicates properly when the user logs in. It is kind of buggy in that sometimes it will report it is done replicating before it is finished, and even though the replicator is set to continuous it will sometimes just stop replicating and we have to kill and restart the app to get the replication working again. We’re trying to switch to WebSockets for push and pull replication because it seems like it might be less buggy than the non-web sockets version. So I know the user has documents and they aren’t already synced to the device because everything works ok when we don’t use web sockets. I can see in sync gateway that the app requests some _local docs, but I’m not sure what they are for. Do you know if the two 404s below are the reason our web sockets aren’t working?

20:42:05.924683 2015-10-23T20:42:05.924Z HTTP:  #21676: GET /todos/_local/35425015dc28ff5fd60a9df16ead7dd96423717c  (as d00c3bd6-3d39-4a3d-91f2-5cdc51d35386)
20:42:05.924978 2015-10-23T20:42:05.924Z HTTP: #21676:     --> 404 missing  (4.1 ms)
20:42:05.967473 2015-10-23T20:42:05.967Z HTTP:  #21677: GET /todos/_local/f94bfd96dac0aae6250fab922385080a9d8a1d96  (as d00c3bd6-3d39-4a3d-91f2-5cdc51d35386)
20:42:05.967746 2015-10-23T20:42:05.967Z HTTP: #21677:     --> 404 missing  (3.8 ms)
20:42:06.064037 2015-10-23T20:42:06.064Z HTTP:  #21678: GET /todos/_changes?feed=websocket  (as d00c3bd6-3d39-4a3d-91f2-5cdc51d35386)
20:42:06.064220 2015-10-23T20:42:06.064Z HTTP+: #21678:     --> 101 Upgraded to WebSocket protocol  (0.0 ms)

#12

Using WebSockets probably isn’t going to fix other issues with replication; all it changes is the mechanism by which change notifications arrive. It’s generally better because it’s more efficient, but it’s only one small piece of replication. (And it’s only used for pull, not push.)

What I’m saying is that other problems you’re having with replication should be investigated; it shouldn’t be intermittently buggy like that. (I’d recommend upgrading both client and server to 1.1.1 before filing any bug reports, though, since a number of issues have been fixed.)

I don’t know what’s going on with the WebSockets, although it looks as though the messages are getting lost. Maybe it has something to do with nginx. I’ll see if I can drag in someone here who knows about nginx.

The _local/... docs that the replicator GETs are server-side checkpoints that store replication state. It’s not an error for these requests to return 404; it just means that this replication has never been run before.


#13

If you enable “Changes+” logging on Sync Gateway, it will shed some light on whether Sync Gateway is generating the changes (and they just aren’t making it back through nginx), or if the changes aren’t getting sent in the first place.

I also second the suggestion from Jens about moving to 1.1.1 - there were several fixes to changes processing on the Sync Gateway side.


#14

I upgraded sync gateway to v1.1.1 and enabled Changes+, but nothing showed up in the logs after the WebSocket connection was established, it looked the same as what I posted above. I haven’t had a chance to upgrade couchbase lite to 1.1.1 but it doesn’t look like it’s going to help since there isn’t anything WebSocket related in it, and the issue seems like it’s on the server since sync gateway doesn’t do anything. Are there any other logs or steps I can take that you would recommend to try and isolate the problem to either the server or the app? Currently on sync gateway I have these log flags:

"log": ["CRUD", "REST+", "Events+", "Access","Changes+","HTTP+"],

#15

I enabled Changes+ on sync gateway, upgraded couchbase lite to v1.1.1 and sync gateway to v1.1.1, but I still see nothing in the sync gateway logs after “101 Upgraded to WebSocket protocol (0.0ms)”. I tried running a simple unit test on the client to see if it was the client code but it has the same issue, it just doesn’t pull anything from the database. I have two unit tests, the NoWebsockets pull replication works, but the web sockets one doesn’t:

- (void) testLoggedInPullReplicationCookie {
    NSURL *syncUrl = [[NSURL alloc] initWithString:SYNC_GATEWAY_URL];
    _pull = [_database createPullReplication:syncUrl];
    
    _pull.continuous = YES;
    
    //setup authentication with stored cookie value (cookie never expires)
    NSHTTPCookie* cookie = [NSHTTPCookie cookieWithProperties:
                             @{ NSHTTPCookieName: UNIT_TEST_LOGIN_COOKIE_NAME,
                                NSHTTPCookieOriginURL: syncUrl,
                                NSHTTPCookiePath: syncUrl.path,
                                NSHTTPCookieValue: UNIT_TEST_LOGIN_COOKIE_VALUE,
                                NSHTTPCookieExpires: [NSDate dateWithTimeIntervalSinceNow: 100]
                                }];
    
    [_pull setCookieNamed: cookie.name
               withValue: cookie.value
                    path: cookie.path
          expirationDate: cookie.expiresDate
                  secure: cookie.isSecure];
    
    NSNotificationCenter* nctr = [NSNotificationCenter defaultCenter];
    [nctr addObserver: self selector: @selector(replicationProgress:)
                 name: kCBLReplicationChangeNotification object: _pull];
    
    _expectReplicationComplete = [self expectationWithDescription:@"Expectation that replication with cookie login will complete"];
    
    [_pull start];
    
    //wait 30 seconds for replication to complete
    [self waitForExpectationsWithTimeout:30.0 handler:^(NSError *error) {
     
     if(error)
     {
     XCTFail(@"Expectation Failed with error: %@", error.localizedDescription);
     }
     _expectReplicationComplete = nil;
     }];
}

- (void) testLoggedInPullReplicationCookieNoWebsockets {
    NSURL *syncUrl = [[NSURL alloc] initWithString:SYNC_GATEWAY_URL];
    _pull = [_database createPullReplication:syncUrl];
    
    _pull.continuous = YES;
    _pull.customProperties = @{@"websocket":@0};
    
    //setup authentication with stored cookie value (cookie never expires)
    NSHTTPCookie* cookie = [NSHTTPCookie cookieWithProperties:
                            @{ NSHTTPCookieName: UNIT_TEST_LOGIN_COOKIE_NAME,
                               NSHTTPCookieOriginURL: syncUrl,
                               NSHTTPCookiePath: syncUrl.path,
                               NSHTTPCookieValue: UNIT_TEST_LOGIN_COOKIE_VALUE,
                               NSHTTPCookieExpires: [NSDate dateWithTimeIntervalSinceNow: 100]
                               }];
    
    [_pull setCookieNamed: cookie.name
                withValue: cookie.value
                     path: cookie.path
           expirationDate: cookie.expiresDate
                   secure: cookie.isSecure];
    
    NSNotificationCenter* nctr = [NSNotificationCenter defaultCenter];
    [nctr addObserver: self selector: @selector(replicationProgress:)
                 name: kCBLReplicationChangeNotification object: _pull];
    
    _expectReplicationComplete = [self expectationWithDescription:@"Expectation that replication with cookie login and no web sockets will complete"];
    
    [_pull start];
    
    //wait 30 seconds for replication to complete
    [self waitForExpectationsWithTimeout:30.0 handler:^(NSError *error) {
        
        if(error)
        {
            XCTFail(@"Expectation Failed with error: %@", error.localizedDescription);
        }
        _expectReplicationComplete = nil;
        
    }];
}

This looks like the issue is on the server since that is the same code as the couchbase mobile tutorials and unit tests. I’ve already enabled pretty much all logging on sync gateway and it looks like it doesn’t do anything after the initial websocket connection. Is there anything else on the server side I can do to debug this?


#16

It seems most likely to be a case where nginx is preventing the websocket connection from being properly established.

To test this, is it possible to point the client directly at the SG (bypassing nginx)? If that’s successful, we can focus on possible nginx configuration steps.


#17

Thanks for the tip! I tried hitting sync gateway directly and web sockets worked, so it’s definitely an nginx issue. I’ll look more deeply into how to configure nginx in general to work with web sockets and see if that’s the problem. Thanks for all your help, especially responding at such a late hour. You guys have awesome customer service!


#18

Turns out I had a really old version of nginx on my server. Once I upgraded to the latest version websockets started working. Thanks again for all your help in narrowing the problem down to nginx!


#19

Yay! Great to hear. Could you let us know the old version number? We can add a note about that to the config page.

(And thanks! But this isn’t even real customer support … Adam and I are dev engineers helping out in spare time. Our real support engineers are professional ninjas who will parachute into your data center at 3am on a Sunday and rebalance all of your JSONs or whatever.)


#20

I am using Ubuntu 12.04, which came with nginx v1.1.19 by default. Minimum required version is 1.3 for websocket support. To check installed version you can run “nginx -v’”. upgraded by running (as root user):

service nginx stop
add-apt-repository ppa:nginx/stable
apt-get update
apt-get install nginx