Operations over threshold warning


#1

Hello,

We have been observing warnings in our logs:

WARN  ThresholdLogReporter:150 - Operations over threshold: 
[
  {
    "top": [
      {
        "operation_name": "view",
        "last_local_id": "7830659CD9EC346E/000000003AC83B6A",
        "last_local_address": "127.0.0.1:54246",
        "last_remote_address": "127.0.0.1:9001",
        "last_dispatch_us": 4863675,
        "total_us": 4864145
      },
      {
        "operation_name": "view",
        "last_local_id": "7830659CD9EC346E/000000003AC83B6A",
        "last_local_address": "127.0.0.1:54246",
        "last_remote_address": "127.0.0.1:9001",
        "last_dispatch_us": 6028562,
        "total_us": 6033743
      }
    ],
    "service": "view",
    "count": 2
  }
]
  • Can someone help us understand the reason of this warning?
  • Although nothing seems to break during our executions, we would still like to know if this is something to be worried about?

Regards,
c


#2

Hello @cdz117,

The newer versions of the SDK has a new feature called Tracing where operations that are considered slow are logged. Depending on the environment this can mean one of two things:

  1. The environment has a performance issue that later could cause an timeout. (This should not “break” anything in the application as it should be design to handle timeouts) Which should be investigated further.
  2. The threshold logging is set to low for the environment and those operations are always expect to preform at this level. In those cases the threshold might need to be increased to stop excessive logging.

I would recommend having a read over the Response Time Observability with the Java SDK blog and the SDK documentation on Tracing from the Java SDK with Couchbase Server I have linked to the Java SDK but each SDK doc has that information.

In this case there were two views operations that were above the threshold. Focusing just on the first one:

      {
        "operation_name": "view",
        "last_local_id": "7830659CD9EC346E/000000003AC83B6A",
        "last_local_address": "127.0.0.1:54246",
        "last_remote_address": "127.0.0.1:9001",
        "last_dispatch_us": 4863675,
        "total_us": 4864145
      }

It took a total time of 4.8 (“total_us”: 4864145) seconds to complete, depending on the how complex the view query is, how much data is being returned and what hardware/network the system is deploy on this performs could be expected.


#3

Thank you @pvarley. This was something I thought as well, but it is good to get it confirmed.

Regards,
c