Named Parameters in the Web Console

Hello -

We have a N1QL query that a handful of our data folks use for adhoc analysis. The queries have a few large arrays of values we use in an IN statement (example below). They need to be able to switch between 5 or so code sets when running their query.

We have an index on v.drgCode, such that performance is GREAT when the array of codes is included in the query. However if we use a sub-select inside the WHERE clause to source the codes out of a document, performance is terrible.

Upon further inspection, when using the sub-select, the drgCodes aren’t making it into the range criteria for the drgCode index. We then tried using the “Named Parameters” in the Web Console, but we couldn’t get this to work either. So our two questions are:

  1. Is there a way to get the query optimizer to execute the sub-selects first to get the data inside the sub-select to help the parent query qualify for better indexing?

  2. If #1 isn’t possible, is it possible to used name parameters to store the drgCode sets?

SELECT v.tenantName,v.visitNumber, v.drgCode, SUBSTR(v.dischargeDate,0,10) As dischargeDate,meta(c).id
FROM cloud_med v
JOIN cloud_med c ON KEY "visit-" || c.tenantName || "-" || c.visitNumber FOR v
WHERE v.type = "visit" AND c.type = "charges" 
AND v.tenantName IN ["testTenant"]
AND v.dischargeDate >= SUBSTR(DATE_ADD_STR(CLOCK_STR(),-120,"day"),0,10)
AND v.drgCode IN ["013","017","022","027","030","033","036","039","042","053","060","063","066","072","076","079","084","087","090","093","096","099","114","117","122","130","132","134","136","138","148","156","159","165","168","179","182","185","188","192","195","198","201","203","218","221","230","241","244","254","257","262","272","282","285","290","293","295","298","301","310","316","328","331","334","337","340","343","346","349","352","355","358","370","373","376","379","382","387","390","395","407","410","413","416","419","422","425","434","437","440","443","446","455","458","465","468","473","476","479","482","487","489","494","497","499","502","505","508","512","514","517","520","538","541","544","547","550","561","566","572","575","578","581","583","585","594","599","601","615","618","621","624","627","630","639","645","655","658","661","664","667","670","672","675","684","688","692","700","708","710","712","714","716","718","724","730","735","738","741","743","745","747","750","756","759","761","766","801","804","810","816","822","825","828","830","836","839","842","845","848","855","858","869","903","905","909","921","929","941","946","950","959","965","976","983","986","989"]

If sub SELECT in where clause executed only when outer table is scanned (index Scan+Fetch). So those values not available and can’t be pushed as index ranges. Instead it needs fetch full range from index on that key and filter needs to be applied as post indexscan/fetch filter.

Alter native if subquery is independent execute standalone generate as array and then use as named parameter and execute as adhoc query.
v.drgCode IN $dgcodes
dgcodes=[“013”,“017”,“022”,“027”]
Non adhoc query will be same issue because because during prepare time dgcodes doesn’t know it need to full range scan on that index key.

Click Preferences in Query Work bench and set like below
13%20PM

2 Likes

What’s the limit of the array length for a named parameter?

I’m unable to make it work for 100000 values. It won’t paste

As such there is no limit on the number of elements. If not pasting in UI use curl or cbq shell.

Thank you! Would you mind helping how to define such a long list of values in the curl query?

curl -v http://localhost:8093/query/service -H "Content-Type: application/json" -d '{"statement":"SELECT d.* FROM default AS d WHERE d.f1 IN $ids;","$names":[1,2,3,4]}'

https://docs.couchbase.com/server/current/n1ql/n1ql-rest-api/index.html

So I’ll have to just use all the 100k values (it’s 380k in fact) in the curl query there? No way to define it somewhere? maybe a file? :frowning:

You can use curl’s -d @/path/to/file option and place the JSON in the file.

e.g.

curl -v http://localhost:8093/query/service -H "Content-Type: application/json" -d @/tmp/t.sql

with

$ cat /tmp/t.sql
{
"statement":"select $val",
"$val":[1,2,3]
}

HTH.

As @dh mentioned you can use file.

FYI:Not sure how the query look like with 100K list of named parameter. If used in predicate index scan might not be optimal. Checkout IN List Handling Improvements in Couchbase Server 6.5 - The Couchbase Blog

I just tried it with 1 value. Exactly as @dh mentioned. Even that is taking forever. Don’t know why.

provide the query , explain and index definitions

Query:

{
"statement":"SELECT t.studentId, t.houseId, ARRAY_LENGTH(t.points) AS points FROM hogwarts t WHERE t.type=\"record-txn\" and t.houseId = \"gryffindor\" and t.studentId IN $students;",
"$students": ["b393c367-ed05-445e-9795-fde7099d07e4"]
}

Index:

CREATE INDEX adv_houseId_array_length_points_type_studentId ON hogwarts`(`houseId`,array_length(`points`),`studentId`) WHERE (`type` = 'record-txn')`
CREATE INDEX ix1 ON hogwarts`(`houseId`,`studentId`, array_length(`points`)) 
WHERE (`type` = 'record-txn')

Use above index. It should not take that long. What version of CB. Did you check out IN List Handling Improvements in Couchbase Server 6.5 - The Couchbase Blog

Created the above index, still the same issue :frowning: , even with one element in the array.

I read the blog and I see the optimisation for the IN list i.e. the dynamic span generation won’t happen if the IN array length is more than 8192.

So do I need to break the list down to groups of 8192 values?

I somehow can’t see the dynamic_in:true in the ‘explain’ section as well.

Unlike the blog saying that there will be a single span when a named parameter is used, I see 2 spans, one for each element in the list.

"spans": [
          {
            "exact": true,
            "range": [
              {
                "high": "\"gryffindor\"",
                "inclusion": 3,
                "low": "\"gryyfindor\""
              },
              {
                "high": "\"2d623615-fcce-417a-be77-f31f41035fe8\"",
                "inclusion": 3,
                "low": "\"2d623615-fcce-417a-be77-f31f41035fe8\""
              }
            ]
          },
          {
            "exact": true,
            "range": [
              {
                "high": "\"gryffindor\"",
                "inclusion": 3,
                "low": "\"gryffindor\""
              },
              {
                "high": "\"b393c367-ed05-445e-9795-fde7099d07e4\"",
                "inclusion": 3,
                "low": "\"b393c367-ed05-445e-9795-fde7099d07e4\""
              }
            ]
          }
        ]

It will use. What you are showing expanded spans.
One element in the array doesn’t take that long. If you using EE execute with one value in UI and post results from Plan Text Tab.

{
    "#operator": "Sequence",
    "~children": [
        {
            "#operator": "IndexScan3",
            "as": "t",
            "covers": [
                "cover ((`t`.`houseId`))",
                "cover (array_length((`t`.`points`)))",
                "cover ((`t`.`studentId`))",
                "cover ((meta(`t`).`id`))"
            ],
            "filter_covers": {
                "cover ((`t`.`type`))": "record-txn"
            },
            "index": "ix1",
            "index_id": "d69be1a58c07aa9",
            "index_projection": {
                "entry_keys": [
                    0,
                    1,
                    2
                ]
            },
            "keyspace": "points",
            "namespace": "default",
            "spans": [
                {
                    "exact": true,
                    "range": [
                        {
                            "high": "\"gryffindor\"",
                            "inclusion": 3,
                            "low": "\"gryffindor\""
                        },
                        {
                            "inclusion": 0
                        },
                        {
                            "high": "\"9e757b11-488c-45f4-bb9a-9936e06e4b8e\"",
                            "inclusion": 3,
                            "low": "\"9e757b11-488c-45f4-bb9a-9936e06e4b8e\""
                        }
                    ]
                }
            ],
            "using": "gsi"
        },
        {
            "#operator": "Parallel",
            "~child": {
                "#operator": "Sequence",
                "~children": [
                    {
                        "#operator": "Filter",
                        "condition": "(((cover ((`t`.`type`)) = \"record-txn\") and (cover ((`t`.`houseId`)) = \"gryffindor\")) and (cover ((`t`.`studentId`)) in $students))"
                    },
                    {
                        "#operator": "InitialProject",
                        "result_terms": [
                            {
                                "expr": "cover ((`t`.`studentId`))"
                            },
                            {
                                "expr": "cover ((`t`.`houseId`))"
                            },
                            {
                                "as": "points",
                                "expr": "cover (array_length((`t`.`points`)))"
                            }
                        ]
                    },
                    {
                        "#operator": "FinalProject"
                    }
                ]
            }
        }
    ]
}

drop the ix1 and re create

CREATE INDEX ix1 ON hogwarts`(`houseId`,`studentId`, array_length(`points`)) 
WHERE (`type` = 'record-txn')

Wow, this worked! How is different from the previous index?

Also this is working on the console, but the curl way is still taking forever to execute :frowning: