We have a 10-node cluster and have been monitoring the count of healthy nodes as reported by the API. We collect this metric from every node's API, and we've noticed the following odd behaviour: every once in a while, one of the nodes (the same one each time) reports another node (also the same one each time, but not the reporting node itself) as being in status warmup.
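For anyone wanting to reproduce this kind of check, here is a minimal sketch of it in Python. It assumes the metric comes from the `/pools/default` REST endpoint on port 8091 with basic auth, and that each entry in the `nodes` array carries `hostname` and `status` fields. The function names are hypothetical, and this is not necessarily how our monitoring is actually implemented.

```python
import base64
import json
import urllib.request


def non_healthy_nodes(pool_info):
    """Return (hostname, status) pairs for nodes not reported as healthy."""
    return [
        (node.get("hostname"), node.get("status"))
        for node in pool_info.get("nodes", [])
        if node.get("status") != "healthy"
    ]


def fetch_pool_info(host, user, password, port=8091, timeout=5):
    """Fetch one node's view of the cluster (assumed endpoint: /pools/default)."""
    request = urllib.request.Request(f"http://{host}:{port}/pools/default")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def cluster_views(hosts, user, password):
    """Map each queried host to the non-healthy nodes that host reports."""
    return {
        host: non_healthy_nodes(fetch_pool_info(host, user, password))
        for host in hosts
    }
```

Calling `cluster_views` with all ten hostnames on a schedule and logging the result with a timestamp would show which node reports the warmup status and when, which can then be lined up against the logs.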
Looking through the documentation, it seems that the warmup status should occur when the service restarts, but that is not the case here: the Couchbase services have an uptime equal to the machine uptime.
Checking the error logs, we found various entries, but these appear on all nodes, so we guess they are unrelated to the warmup status issue.
Any pointers on why this would be happening and/or where to look next?