create index idx_example1_form on bucket(form) where form='example1' using GSI;
create index idx_example2_form on bucket(form) where form='example2' using GSI;
create index idx_example3_form on bucket(form) where form='example3' using GSI;
I’m now suspecting that my index definition might not be correct, but it still does not explain why I see IndexScan in cbq.
Both of these take PrimaryScan
explain select * from default where form=$1;
explain select * from default use index (ix2) where form=$1;
Update: created a new index without where clause and everything works as expected now. Is this considered a bug or am I understanding how index works incorrectly?
When I run the exact query in cbq without positional parameters, it takes index without specifying index condition. It looks like to me that it is a bug where rest api endpoint does not parse the positional parameters before sending the query to the query service.
Hi @moon0326, so everything works correctly as intended. If you use a parametrized query, we cannot choose a conditional / partial index, because the query plan must work with any value of the parameter.
So you have two options:
(1) Use a conditional index and fully specify the query, without parameters for the index keys.
(2) Use a full index, and you can use a parametrized query for the index keys.
Select * from bucket where form=‘example1’; – this chooses the index idx_example1_form because we know the value of the predicate.
Select * from bucket where form= $1; – this cannot choose any of the index during plan creation since your value could be ANYTHING. The argument is only used to bind the predicate at runtime and not choose the index.
Separately, your actual definitions is likely different. In this example, you have both index key and the index condition as ‘form’, which is not highly useful. If you do actually have distinct keys, you can use the condition in the statement you’re preparing.
Example:
create index idx_example1_form on bucket(quantity) where form=‘example1’ using GSI;
Select * from bucket where form=‘example1’ and quantity=$1; – this will use the right index.
I don’t think I can comment on it further as I don’t fully understand how query planner works. I think this use case should be mentioned in the documentation though. It makes sense from the technical point of view, but it really does not make sense as a enduser. Thank you for the help. I appreciate it.
When you issue a query with strong consistency of REQUEST_PLUS, the indexer will ensure it catches up to the changes made to the bucket up to the request point before the results are returned.
The create indices with WHERE condition will only help you with managing indexes (like index range partitioning in Oracle) and not with speeding up REQUEST_PLUS latency.
But if I create two indexes for ‘form’ separately, aren’t they technically two indexes?
How does that affect my query on #2 when index of #1 is being used? I thought REQUEST_PLUS only waits for the index that must be synced to ensure consistency.
Hi @moon0326, all the feedback you are getting here is spot on.
One more thing in response to the question you asked me: for your case, you should only use parametrized queries together with prepared statements. If you are sending the full query statement each, there is no advantage to using parameters, and there are the disadvantages you encountered.
And so, please avoid parametrized queries, and then you can use any indexing strategy you choose.
You should not take user input and directly inject it into query strings. When you use parameters, the planner does not use the parameter values in generating the query plan. We will consider that for the future.
Also, N1QL generally uses a single index per query (with some exceptions). With that in mind, one good option for you would be to not use conditional indexes, or not use parameters, and instead provide your users with a select list of inputs, so that the users are not injecting arbitrary input into your queries.
Both good options. Let us know if you run into other issues. As for this particular issue, there are no additional workarounds for now.
Given that you know the possible values of “form” upfront (in order to create the conditional indexes), you should put those values in a selection list and then avoid parametrized queries. This solves all the issues.
It’s a bit more complicated than that. In 4.1, the answer is yes. In 4.5, there is also a read-your-own-writes setting, which means query 2 would only have to wait for query 1 if they were issued from the same client session. Anyway, separate subject.