ParameterizedN1qlQuery with IN statement


#1

Continuing the discussion from How to query JsonStringArray:


#2

Hi, not sure if you saw my suggestion – can you try named parameters?

They both work, but named parameters might be simpler for your case.


#3

That totally worked !! with below syntax (for future forum browsers)
Edit: That did not work actually, may be my syntax is not right. But it is not doing what IN statement would do. It is actually working only with the first where clause i.e date = $1 . Its getting me all docs which satisfies the date criteria with out heed to IN statement in my syntax.

ParameterizedN1qlQuery
 query1 = N1qlQuery.parameterized("select * from `mybucket` WHERE date =
 $1 and $2", placeholderValues);```

**Note that the query structure has slightly changed**

#4

I’m trying to do the same with with the .NET SDK (2.2.8), but I’ve got a problem.

These are two example documents I’m working with:

{
    "groupProviderId": "LaZ6QC2S1EmohJeADuTQdQ",
    "created": "2014-03-21T13:52:46Z",
    "docType": "Group"
}
{
    "groupProviderId": "ZJZiBujFCEyzb8DN4_kd8w",
    "created": "2015-12-16T15:57:29Z",
    "docType": "Group",
}

This is my query to find ‘Group’ documents based on the groupProviderId attribute:

SELECT META().id
FROM `core_north-europe`
WHERE docType="Group" AND groupProviderId IN ["LaZ6QC2S1EmohJeADuTQdQ","ZJZiBujFCEyzb8DN4_kd8w"];

When I execute it in cbq.exe, both the example documents appear in the results. I’m attempting to execute the same query, parameterized, from the .NET SDK with the following code:

var statement =
    "SELECT META().id " +
    "FROM `core_north-europe` " +
    "WHERE docType=\"Group\" AND groupProviderId IN [$g];";

var request = new QueryRequest()
    .Statement(statement)
    .AddNamedParameter("g", "LaZ6QC2S1EmohJeADuTQdQ,ZJZiBujFCEyzb8DN4_kd8w");

var result = _bucket.Query<JObject>(request);
if (!result.Success) throw result.Exception;

return result.Rows.Select(r => r["id"].Value<string>()).ToList();`

I don’t get any results from this query. I’ve tried many variations for the parameter, like:

    .AddNamedParameter("g", "\"LaZ6QC2S1EmohJeADuTQdQ\",\"ZJZiBujFCEyzb8DN4_kd8w\"")

…and:

    .AddNamedParameter("g", new[] { "LaZ6QC2S1EmohJeADuTQdQ", "ZJZiBujFCEyzb8DN4_kd8w" })

…and:

    .AddNamedParameter("g", JArray.Parse("[\"LaZ6QC2S1EmohJeADuTQdQ\",\"ZJZiBujFCEyzb8DN4_kd8w\"]"))

…but still no results. If I only include a single value as the named parameter however, i.e.:

    .AddNamedParameter("g", "ZJZiBujFCEyzb8DN4_kd8w")

…I get one of the documents in the results, so it feels like I’ve just not got the format of passing an array as a parameter correct.

Has anyone managed to do this?


#5

Hello,

You should remove the brackets around $g.

var statement =
    "SELECT META().id " +
    "FROM `core_north-europe` " +
    "WHERE docType=\"Group\" AND groupProviderId IN $g;";

#6

Thanks, @geraldss.

I had tried that initially, but I get:

4000 - No primary index on keyspace core_north-europe. Use CREATE PRIMARY INDEX to create one.

I’ve tried all formats of AddNamedParameter and it makes no difference. The closest I’ve gotten to a good result is groupProviderId IN [$g] and a single value in AddNamedParameter.

Can you confirm what I should be passing to AddNamedParameter please, i.e. a comma separated string, an array of strings, a JArray, or something else?

I’ve tried to up the logging level in CB in an effort to see the actual queries that are being run, but can’t see them recorded anywhere. Is there something I can do to see that, so that I can see what the client is actually sending to CB?


#7

Can you run

CREATE PRIMARY INDEX ON \core_north-europe`;`

And then rerun my earlier suggestion (remove brackets)?

You should be passing an array of strings.


#8

That worked.

I stumbled across Getting meta().id with Covering Index and went back to look at my index.

I changed my create index statement from:

CREATE INDEX Idx_Core_Group_GroupProviderId
ON `core_north-europe`(`groupProviderId`)
WHERE docType="Group" USING GSI;

…to:

CREATE INDEX Idx_Core_Group_GroupProviderId
ON `core_north-europe`(`docType`,`groupProviderId`,META().id)
WHERE docType="Group" USING GSI;

…and now the query executes correctly from the client, and as a cover index, without a primary index present. It looks like the index needed both the META().id key and the docType key included (even though I’ve got a where clause on docType to behave).

Thanks for your help!


#9

The index usage is better in 4.5 GA. With 4.5 GA, you can use your original index, and it will be a covering index and work correctly.


#10

Any indication on when 4.5.0 will be released?


#11

We are targeting the release of 4.5 in the next few weeks. The goal is before the end of June but all depends on how fast we can hit the quality bar we set.
thanks
-cihan


#12

Has anyone seen varying performance using different syntax of the query criteria?

"SELECT META().id " +
"FROM core_north-europe " +
“WHERE docType=“Group” AND groupProviderId IN $g;”;

vs.

"SELECT META().id " +
"FROM core_north-europe " +
“WHERE docType=“Group” AND ANY g IN $g SATISFIES g = groupProviderId END;”;

vs.

"SELECT META().id " +
"FROM core_north-europe " +
“WHERE docType=“Group” AND ARRAY_CONTAINS($g, groupProviderId);”;

I am having trouble getting strong performance using input arrays. Just flipping the = operator to an IN operator against a single array value causes dramatically higher response time.


#13

Hi @Ross.Sanger,

You should use 4.5 with array indexing.


#14

Thanks Gerald. In this case my inputs are arrays but my data is flat. So, I have a field on each record with 1 value but I need to find all records that have any of 10 different values from the caller’s perspective. I’m not sure how array indexing would affect that. There are a lot of records (~30 Million) and I have 2 main fields to index. Would you suggest building a different type of document (normalized) that has an array of values?


#15

Interesting. Can you generate a chain of ORs after you receive the input parameter? Even better, can you generate the full query after you receive the input parameter, so that the array parameter becomes a static array? Your pattern is valid. We will keep it in mind for future improvement.

And you need an index on (groupProviderId, docType).


#16

Sorry - I was using the original thread’s query and data to present my question. That is not my structure, we just have the same topic. I have tried an OR style syntax and it behaves identical to the IN [] as best I can tell. The other query styles do not perform well at all (same result just slow). I am going to try to break into several indexes by range and hope that works a little better next.


#17

I think I found some knowledge in other threads that may be somewhat pertinent. If your query spans multiple criteria, .i.e field1 IN [?,?] AND field2 IN [?,?]; it seems if you should create multiple indices for each field in the where clause. This seems to perform much better then a single index with both fields.


#18

Hi @Ross.Sanger,

A single index should perform better than two separate indexes, all other things being equal. You should also try using a covering index.