Inconsistent behaviour on N1QL/Index

Hello,

I’ve been helping a customer get the most out of their Couchbase/Sync Gateway servers and we’ve been encountering some issues that we would like to clarify in order to get an understanding of what is going on with inconsistent behaviours/performance when using different parts of the Couchbase stack.

So, we have a query that goes something along these lines:

SELECT *
FROM process
WHERE meta().id NOT LIKE "_sync%"
AND type = "workflow-job"
AND entity.state = "Pending"
AND entity.runAfter = "2018-02-26T09:07:57.693Z"
AND array_count(process.entity.dependsOnJobs) = 0
AND entity.runOn = "BCG-BE-6C4AE161"
ORDER BY entity.startDate ASC
LIMIT 10

Execution times for this query, without any index have been in the order of 10 to 40s, so in order to optimize things we’ve introduced an index for this specific query that goes like this:

CREATE INDEX idx_test ON `process`(type, entity.state) WHERE type = "workflow-job" AND entity.state = "Pending"

We also have another similar index that goes like this:

CREATE INDEX process_workflow_index2 ON `process`(entity.state) WHERE meta().id NOT LIKE "_sync%" AND type = "workflow-job" AND entity.state = "Pending"

The explain plan goes something like this:

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "IntersectScan",
              "scans": [
                {
                  "#operator": "IndexScan",
                  "index": "idx_test",
                  "index_id": "b177b75eb23e40f6",
                  "keyspace": "process",
                  "namespace": "default",
                  "spans": [
                    {
                      "Range": {
                        "High": [
                          "\"workflow-job\"",
                          "\"Pending\""
                        ],
                        "Inclusion": 3,
                        "Low": [
                          "\"workflow-job\"",
                          "\"Pending\""
                        ]
                      }
                    }
                  ],
                  "using": "gsi"
                },
                {
                  "#operator": "IndexScan",
                  "index": "process_workflow_index2",
                  "index_id": "500a23cdebc0904a",
                  "keyspace": "process",
                  "namespace": "default",
                  "spans": [
                    {
                      "Range": {
                        "High": [
                          "\"Pending\""
                        ],
                        "Inclusion": 3,
                        "Low": [
                          "\"Pending\""
                        ]
                      }
                    }
                  ],
                  "using": "gsi"
                },
                {
                  "#operator": "IndexScan",
                  "index": "#primary",
                  "index_id": "f7935ad2b645fee2",
                  "keyspace": "process",
                  "namespace": "default",
                  "spans": [
                    {
                      "Range": {
                        "Inclusion": 0,
                        "Low": [
                          "null"
                        ]
                      }
                    }
                  ],
                  "using": "gsi"
                }
              ]
            },
            {
              "#operator": "Fetch",
              "keyspace": "process",
              "namespace": "default"
            },
            {
              "#operator": "Parallel",
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "Filter",
                    "condition": "((((((not ((meta(`process`).`id`) like \"_sync:%\")) and ((`process`.`type`) = \"workflow-job\")) and (((`process`.`entity`).`state`) = \"Pending\")) and (((`process`.`entity`).`runAfter`) < \"2018-02-26T09:07:57.693Z\")) and (array_count(((`process`.`entity`).`dependsOnJobs`)) = 0)) and (((`process`.`entity`).`runOn`) = \"BGC-BE-6C4AE161\"))"
                  },
                  {
                    "#operator": "InitialProject",
                    "result_terms": [
                      {
                        "expr": "self",
                        "star": true
                      }
                    ]
                  }
                ]
              }
            }
          ]
        },
        {
          "#operator": "Order",
          "limit": "10",
          "sort_terms": [
            {
              "expr": "((`process`.`entity`).`startDate`)"
            }
          ]
        },
        {
          "#operator": "Limit",
          "expr": "10"
        },
        {
          "#operator": "FinalProject"
        }
      ]
    },
    "text": "SELECT * \nFROM process \nWHERE  meta().id not like \"_sync:%\"  \nAND type = \"workflow-job\"  \nAND entity.state = \"Pending\" \nAND entity.runAfter < \"2018-02-26T09:07:57.693Z\" \nAND array_count(process.entity.dependsOnJobs) = 0 \nAND entity.runOn = \"BGC-BE-6C4AE161\"\nORDER BY entity.startDate ASC \nLIMIT 10"
  }
]

Now, when using the query workbench, we’ve been able to consistently get responses within 100 to 800ms (quite an interval, but way better than it used to be).
However, when running the same exact query from the application code using Couchbase SDK, the response times will always increase back to 10 to 40s to get a reply and sometimes we will even get OOM errors that stop the application execution.

Why is this happening? Why do we get such inconsistent behaviour?

Also, why does the index have to have (type, entity.state) when we already use a WHERE clause in the index definition? At least the type should be able to be dismissed from the index (because of the WHERE) however once we remove it the query performance goes down significantly and the explain shows that the used index is the primary one (which should not exist, in the first place).

We are using:

  • Couchbase Community 4.5.1 (soon to be upgraded to Enterprise 4.6) and
  • Couchbase JS SDK v2.3.5

PS:
As a sidenote, I know that both presented indexes have, more or less, the same information (they are alike), thus the reason why the explain uses both indexes

Drop both the indexes. create following index. If you have primary index drop that also or use USE INDEX hint.

CREATE INDEX ix1 ON `process`(entity.runOn, entity.runAfter,  array_count(entity.dependsOnJobs), entity.startDate)  
WHERE meta().id NOT LIKE "_sync%" AND type = "workflow-job" AND entity.state = "Pending";

Hello,

Thank you. That is a more complete index than using the other two indeed (both were created to test this specific use-case).

Meanwhile while discussing the issue it seems that the network may be having some impact on the response times as I’m testing from the local machine to CB Server that lives somewhere else in the network that has proxy servers and/or firewalls in between.

Thanks

I’m still seeing some performance issues when passing in the query via my application code vs executing it via curl or the query workbench.

The only difference from the app code to the sample query that I left here is that the filters are passed in as parameters and instead of having entity.runAfter = something my query uses entity.runAfter < something

So, like this:

SELECT *
FROM process
WHERE meta().id NOT LIKE "_sync%"
AND type = $1
AND entity.state = $2
AND entity.runAfter < $3
AND array_count(process.entity.dependsOnJobs) = 0
AND entity.runOn = $4
ORDER BY entity.startDate ASC
LIMIT $5

When the runAfter filter is switched from = to < the index is completely ignored and I start getting response times greater than 10s…
I was not expecting that to be the case so why is the index completely ignored by that little change? (I still have the primary index there and I know it must be removed but it’s helpful for development purposes)

These are pat of index conditions you can’t pass as query parameters (unless you use adhoc=true).
In that case change index to

CREATE INDEX ix1 ON `process`(type,entity.state, entity.runOn, array_count(entity.dependsOnJobs),  entity.runAfter)  
WHERE meta().id NOT LIKE "_sync%"

Due to ORDER BY it needs to fetch all qualified documents. Check this out Designing Index for Query in Couchbase N1QL - DZone

Hi vsr1,

Thank you but I found the error:
My index used type = "_sync%" but the app code was passing in type="_sync:%" (notice the :% difference