SyncGW Stopped Processing Abruptly

query

#1

Sync GW 1.2 Community Edition

Traffic was pumped using Jmeter @ 20cps. It started and was going on. After around an hour, the below error is seen in syncGW log.

sync_gateway_error.log:21:27:52.304034 2016-04-14T21:27:52.304+05:30 WARNING: Skipped Sequence 1379240 didn’t show up in MaxChannelLogMissingWaitTime, and isn’t available from the * channel view. If it’s a valid sequence, it won’t be replicated until Sync Gateway is restarted. – db.(*changeCache).CleanSkippedSequenceQueue.func1() at change_cache.go:220
sync_gateway_error.log:21:27:52.312309 2016-04-14T21:27:52.312+05:30 WARNING: Skipped Sequence 1716576 didn’t show up in MaxChannelLogMissingWaitTime, and isn’t available from the * channel view. If it’s a valid sequence, it won’t be replicated until Sync Gateway is restarted. – db.(*changeCache).CleanSkippedSequenceQueue.func1() at change_cache.go:220

Can someone please elaborate under which scenario syncGW will throw this error and why syncGW stopped processing abruptly after an hour of traffic??

– Gracelin


#2

There isn’t enough information there to know why SG stopped processing - that error on it’s own won’t stop processing.

That error indicates that the specified sequence numbers (1379240, 1716576) were expected but didn’t arrive on the server’s mutation feed, and also weren’t found when querying the server via view.

This usually means there was a previous write error for the documents with the sequences listed, and isn’t necessarily a problem (if the previous writes failed, the caller would have been notified and could resend).

More importantly, these shouldn’t cause Sync Gateway to stop processing. When you say ‘stop processing’, what exactly happened? The SG process crashed? Requests started returning error?


#3

Thanks for the update Adamf. I will provide the required information.

Do u mean to say that the below error is harmless for syncgw and syncgw will continue to work as expected even after throwing the below error?

WARNING: Skipped Sequence 1379240 didn’t show up in MaxChannelLogMissingWaitTime, and isn’t available from the * channel view. If it’s a valid sequence, it won’t be replicated until Sync Gateway is restarted

Traffic was started arnd 20:20:00 and data was pumped using JMeter. Fom syncgw log, PUT requests were coming to syncgw till 20:37:01. Around 20:37:02 the below errors ie deferring #xxx / Cache: Received #1907039 after xx ms / Received deduplicated #1906905 for (“s918553899955_ptx27540514042016081618_meta.doc” etc are seen every other second of operation but the processing of PUT request in the syncgw log is not seen after 20:37:02… then @ 21:27 the skipped sequence error was seen. syncgw process was up and running but i am clueless why the above errors are thrown and there is no PUT request handling in syncgw. Can you please help to debug? From Jmeter side what happened @ 20:37, I will provide the detail after a couple of hours.

Can you please explain the scenario when the below message can be seen in syncgw log?

  1. Cache: Deferring #1906951 (49 now waiting for #1906694#1906694)
  2. Cache: Received deduplicated #1906766 for (“s918553899999_ptx7537814042016081618_A7537814042016081618_Attachment.doc” / “2-142295ec58a463ec2ffff5bc62b8fd5e”)
  3. Cache: Received #1907034 after 88ms (“s918553899999_ptx27541414042016081618_access.doc” / “2-9aaf44a550a9b14d59587160eccacce1”)

20:37:02.927175 2016-04-14T20:37:02.927+05:30 HTTP: #1811775: GET /pttdata/s918553899955_ClientProfile.doc (ADMIN)
20:37:02.927225 2016-04-14T20:37:02.927+05:30 Events+: Webhook handler ran for event. Payload posted to URL http://prod-C1-AMS-V1.service.kodiakptt.com:8080/kodiakams/changesFeed/PTX, got status 200 OK
20:37:02.927259 2016-04-14T20:37:02.927+05:30 HTTP: #1811774: GET /pttdata/s918553899999_ClientProfile.doc (ADMIN)
20:37:02.927437 2016-04-14T20:37:02.927+05:30 Cache: Received #1906716 after 1796ms (“s918553899955_ptx27536914042016081618_meta.doc” / “3-c1fd2df584a177044972f9ebc70e0d62”)
20:37:02.927463 2016-04-14T20:37:02.927+05:30 Cache: Deferring #1906716 (1 now waiting for #1906693#1906715)
20:37:02.927529 2016-04-14T20:37:02.927+05:30 HTTP: #1811776: GET /pttdata/s918553899955_ClientProfile.doc (ADMIN)
20:37:02.927566 2016-04-14T20:37:02.927+05:30 Cache: Received deduplicated #1906917 for (“s918553899999_ptx7540014042016081618_A7540014042016081618_Attachment.doc” / “2-2c9d7866d38ea70c242c082c7396f64a”)
20:37:02.927580 2016-04-14T20:37:02.927+05:30 Cache: Deferring #1906917 (2 now waiting for #1906693#1906715)
20:37:02.927532 2016-04-14T20:37:02.927+05:30 HTTP+: #1811775: --> 200 (0.4 ms)
20:37:02.927595 2016-04-14T20:37:02.927+05:30 Cache: Received #1906920 after 528ms (“s918553899999_ptx7540014042016081618_A7540014042016081618_Attachment.doc” / “2-2c9d7866d38ea70c242c082c7396f64a”)
20:37:02.927612 2016-04-14T20:37:02.927+05:30 Cache: Received #1907039 after 72ms (“s918553899999_ptx7541814042016081618_A7541814042016081618_Attachment.doc” / “1-3941a2934aa3c3b6f298681d1228fb72”)
20:37:02.927625 2016-04-14T20:37:02.927+05:30 Cache: Deferring #1906920 (3 now waiting for #1906693#1906715)
20:37:02.927636 2016-04-14T20:37:02.927+05:30 Cache: Deferring #1907039 (4 now waiting for #1906693#1906715)
20:37:02.927762 2016-04-14T20:37:02.927+05:30 Cache: Received deduplicated #1906905 for (“s918553899955_ptx27540514042016081618_meta.doc” / “3-acd89d4b7243b5b21e0f6fea5aa704fe”)
20:37:02.927775 2016-04-14T20:37:02.927+05:30 Cache: Deferring #1906905 (5 now waiting for #1906693#1906715)
20:37:02.927783 2016-04-14T20:37:02.927+05:30 Cache: Received deduplicated #1906923 for (“s918553899955_ptx27540514042016081618_meta.doc” / “3-acd89d4b7243b5b21e0f6fea5aa704fe”)
20:37:02.927788 2016-04-14T20:37:02.927+05:30 Cache: Received deduplicated #1906891 for (“s918553899955_ptx27540314042016081618_meta.doc” / “3-59113691397e7dbe748e837ffa286496”)
20:37:02.927790 2016-04-14T20:37:02.927+05:30 Cache: Deferring #1906923 (6 now waiting for #1906693#1906715)

Regards,
Gracelin