Windows 2016 Goes Unresponsive after a week or two


#1

We loaded Couchbase CE 5.0.1 Build 5003 on 10 servers running Windows Server 2016 in April of this year (2018). We have not been able to use them consistently because 2 or 3 of the servers go unresponsive every week. An affected server does not blue screen, but you cannot manage it remotely or log into it via RDP or the console.

We have worked with Microsoft and Dell, and they have not found any issues. The only clue we have is that none of the servers has gone unresponsive in over a week since we disabled the Couchbase service, so we are looking for a fix on the Couchbase side.

The last incident occurred on a server that was rebooted after patching at 7 PM and went unresponsive at 10:00 PM the same day. Nothing was actively getting or setting data in Couchbase after the reboot, so the server was idle for 2-3 hours before it went unresponsive. The pic shows the logs for the patching window and for when a technician rebooted the server after it went unresponsive.

Any ideas?

These servers are beasts…
Physical server
640 GB RAM
36 CPU Cores and 72 logical processors