I am running Couchbase Server 4.0 with Sync Gateway 1.1.1 on 5 nodes (Community Edition).
(After this issue started, I upgraded Couchbase from 3.0.1 to 4.0 and Sync Gateway from 1.1.0 to 1.1.1.)
Since last week, Sync Gateway CPU usage suddenly goes high and never comes back down unless I restart the server or restart Sync Gateway. It happens randomly on 1 or 2 nodes, while CPU utilization on the rest of the nodes stays below 5%.
When this happens, Sync Gateway sometimes stops responding entirely, and the only option is to restart the server and start Sync Gateway again.
Please see the Sync Gateway logs captured during high CPU:
15:48:10.052627 2016-03-29T15:48:10.052+11:00 HTTP: #788877: --> 599 Write error: write tcp 172.16.1.76:17733: broken pipe (0.0 ms)
15:48:10.150459 2016-03-29T15:48:10.150+11:00 HTTP: #786922: --> 599 Write error: write tcp 172.16.1.76:9273: broken pipe (0.0 ms)
15:48:10.150843 2016-03-29T15:48:10.150+11:00 HTTP: #786921: --> 599 Write error: write tcp 172.16.1.76:62977: broken pipe (0.0 ms)
15:48:10.151158 2016-03-29T15:48:10.151+11:00 HTTP: #786923: --> 599 Write error: write tcp 172.16.1.76:54098: broken pipe (0.0 ms)
15:48:10.152551 2016-03-29T15:48:10.152+11:00 HTTP: #786924: --> 599 Write error: write tcp 172.16.1.76:31611: broken pipe (0.0 ms)
15:48:10.207185 2016-03-29T15:48:10.207+11:00 HTTP auth failed for username=“user_1750535”
BTW, 172.16.1.76 is the IP of our hardware load balancer; external traffic comes in through the load balancer, which forwards it to the Sync Gateway nodes.
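To quantify how often these errors occur, a rough sketch like the following could tally the broken-pipe 599 responses per second and per peer address, along with auth failures. The regular expression is inferred only from the log excerpt above, so it is an assumption about the log format and may need adjusting for other Sync Gateway versions; the names `summarize`, `BROKEN_PIPE`, and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpt above (an assumption, not an official format).
BROKEN_PIPE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2})\.\d+ .*--> 599 Write error: "
    r"write tcp (?P<ip>[\d.]+):\d+: broken pipe"
)

def summarize(lines):
    """Count broken-pipe 599 responses per second and per peer IP, plus auth failures."""
    per_second = Counter()
    per_ip = Counter()
    auth_failures = 0
    for line in lines:
        match = BROKEN_PIPE.match(line.strip())
        if match:
            per_second[match.group("time")] += 1
            per_ip[match.group("ip")] += 1
        elif "HTTP auth failed" in line:
            auth_failures += 1
    return per_second, per_ip, auth_failures

# Three lines copied from the excerpt above, used only as sample input.
SAMPLE = [
    "15:48:10.052627 2016-03-29T15:48:10.052+11:00 HTTP: #788877: --> 599 Write error: write tcp 172.16.1.76:17733: broken pipe (0.0 ms)",
    "15:48:10.150459 2016-03-29T15:48:10.150+11:00 HTTP: #786922: --> 599 Write error: write tcp 172.16.1.76:9273: broken pipe (0.0 ms)",
    '15:48:10.207185 2016-03-29T15:48:10.207+11:00 HTTP auth failed for username="user_1750535"',
]

if __name__ == "__main__":
    per_second, per_ip, auth_failures = summarize(SAMPLE)
    print("broken pipes per second:", dict(per_second))
    print("broken pipes per peer IP:", dict(per_ip))
    print("auth failures:", auth_failures)
```

Because all external traffic arrives via the load balancer, the per-IP count will mostly show 172.16.1.76 in this setup; the per-second count is the more useful number for comparing a node with high CPU against a healthy one.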
I have tried setting the revision limit to 20, increasing “maxFileDescriptors” from the default of 5000 to 10000, and increasing the open files limit to “20000” in CentOS.
Please let me know what else I can try to fix this issue, as it is making the system unstable and taking it offline quite often. If you need more information, please let me know.