SyncGW PUT request, response taking 5 to 8 min - Log Analysis


#1

Following is the log snippet from syncGW log for a single document PUT operation. Response from syncGW is getting delayed during the PUT operation, Log snippet is provided at the bottom for a document. From the below log snippet the below is the understanding

  1. HTTP: #135: PUT —> PUT operation from client [ 18:49:40.571745 ]
    2.CRUD+: Invoking sync —> sync function is invoked by syncGW [ 18:49:40.572226 ]
  2. SAVING #1378921 [ 18:49:40.572881 ] -----> save request is sent to CB - Is my understanding correct?
  3. CRUD: Stored doc [ 18:49:40.573059 ] -----> what is happening in this?? - Does this mean request has been successfully sent to CouchBase to store the document in CBS?
    What is happening in #3 and #4 and #6 ?
  4. Events+: Event queue worker sending event [ 18:49:40.573168 ] -------> Event Handler flow hit for this document
    6 Cache: Received #1378921 [ 18:49:42.246648 ] -------> Does this mean that SyncGW got response from CBS that document is stored?
  5. HTTP+: #135: --> 201 (5967.6 ms) [ 18:49:46.539320 ] -------> response is sent to the client PUT request

Following is the log snippet

  1. 18:49:40.571745 2016-04-14T18:49:40.571+05:30 HTTP: #135: PUT /ptxdata/s918553899999_ptx100614042016064915_access.doc (ADMIN)
    18:49:40.572226 2016-04-14T18:49:40.572+05:30 CRUD+: Invoking sync on doc “s918553899999_ptx100614042016064915_access.doc” rev 1-b3642bc299c4a2768b8aa51b4404d9a0
    18:49:40.572881 2016-04-14T18:49:40.572+05:30 Cache: SAVING #1378921
    18:49:40.573059 2016-04-14T18:49:40.573+05:30 CRUD: Stored doc “s918553899999_ptx100614042016064915_access.doc” / "1-b3642bc299c4a2768b8aa51b4404d9a0"
    18:49:40.573168 2016-04-14T18:49:40.573+05:30 Events+: Event queue worker sending event Document change event for doc id: s918553899999_ptx100614042016064915_access.do to: Webhook handler [http://prod-C1-AMS-V1.service.kodiakptt.com:8080/kodiakams/changesFeed/PTX]
    18:49:42.246648 2016-04-14T18:49:42.246+05:30 Cache: Received #1378921 after 1667ms (“s918553899999_ptx100614042016064915_access.doc” / “1-b3642bc299c4a2768b8aa51b4404d9a0”)
    18:49:46.539320 2016-04-14T18:49:46.539+05:30 HTTP+: #135: --> 201 (5967.6 ms)

How to find out what is causing the delay??. Access() is getting called for this document in sync function. Will the access call cause delay in PUT operation of a document.?

When running traffic the response is getting delayed too much after sometime and the application is just hanging waiting for response. Unable to proceed with traffic run.

– Gracelin
In the log what is “Deferring #1378937 (2 now waiting for #1378934#1378935)” mean?? Will it cause any delay in response??


#2

The request is actually taking 5-8 seconds (not minutes), based on the logs, correct?

Based on the logs, it looks to me like it’s the webhook processing that’s causing the delay. The ‘Cache:’ logging indicates that the doc has already been successfully written to Couchbase Server, and is being seen on the server’s mutation feed (used by Sync Gateway to build the channel index).

Have you tuned any of the webhook parameters? If not, it looks a lot like it’s reaching the default wait timeout for webhook event processing (5 seconds). Some details on the parameters are here:
http://developer.couchbase.com/documentation/mobile/1.2/develop/guides/sync-gateway/server-integration/webhooks/index.html

In general it looks to me like the webhook processing can’t keep up with your write load. You could validate that by either disabling webhooks or setting wait_for_process to zero.

If that’s the case, your options include:

  • identify whether your webhook endpoint response time can be reduced
  • limit the number of webhook events raised (using a filter)
  • increase the number of processes used for webhook processing (at a cost of CPU/memory on SG)
  • distribute your write load across more than one SG node

#3

Thanks a lot Adamf for the quick response.

I am extremely sorry…Response is taking 5 to 8 seconds not minutes…Thanks for pointing that…

There is no webhook for this document type. Following is the web hook code in the syncgw config file

     "event_handlers": { "max_processes":200, "wait_for_process":"30000", "document_changed":[
            {
                "handler":"webhook",
                "url":"http://xxxxx.com:8080/ktest/changesFeed/Ptx",
                "timeout":30,
                "filter":`function(doc) {
                            if (((doc.type == "ptxxxx") || (doc.type == "ptxyyy")) && (doc.state == "pending")) {
                                return true;
                            }
                            return false;
                        }`
            } ]
        }

Since this document [ s918553899999_ptx100614042016064915_access.doc] has type as ptxaccess, webhook filter function will not return true and hence it wont be posted in the URL

Events+: Event queue worker sending event Document change event for doc id: s918553899999_ptx100614042016064915_access.doc --> This entry in syncgw log will get printed for every document whether the webhook filter function returned true or false. Is it not?

When you mentioned in the reply “In general it looks to me like the webhook processing can’t keep up with your write load” - My understanding about webhook is it is a call made by syncgw asynchronously. when client does a PUT operation, does the response for the PUT is returned to the client only after webhook processing is completed fully or when the document is written to CBS and then webhook queue it is put then client will get a success message. Can you please point me to some URL or document or explain the sequence of operations that happens when a PUT request[ new document ] is made by the client?

Regards,
Gracelin