Secondary index entries for docs with missing key values

query
n1ql

#1

I would like to create an index such as CREATE INDEX TestIndex ON Test(field) but I cannot query it directly without the condition field IS NOT MISSING.

This is a problem because as schema changes and fields are added, old documents (with new fields missing) will not show up in the index.

Needless to say(?) it’s also annoying to always have to do IS NOT MISSING with every key on every index on every query. And God forbid I miss one and the query fails (because I’m not willing to succumb to using primary indices) or works with poor performance (unexpectedly) for those that do have primary indices.

Here is a use case: I have users in my system, and now want to mark some users as “special”. So I add a boolean field to all my users called “IsSpecial”. This works great when loading the document because if it’s missing, it deserializes to false (at least in my case), which is desired.

Only problem is, if I then add that field as a key to an index (for better coverage), all of my older users that have not been updated recently will have that value missing, rendering the index useless.

Any ideas on how to get around this?
Thanks.


#2

Sure. You can the IFMISSING() function in your indexes and queries.

CREATE INDEX TestIndex ON Test( IFMISSING( IsSpecial, FALSE ), __other index keys__ );

EXPLAIN SELECT __exprs__
FROM Test
WHERE IFMISSING( IsSpecial, FALSE ) ) = TRUE
AND __other conditions__;

If desired, you can use the other index keys to create a covering index.


#3

Thank you. Could I use IFMISSING(IsSpecial, null) ?


#4

Yes, you can use NULL or any other replacement value.


#5

Unfortunately I can’t find a way to target this index. Do you know of any way that I can do that?
CREATE INDEX TestIndex ON Test(IFMISSING(IsSpecial, null))

I can get it to work using false instead of null but false doesn’t work nicely when it’s a non boolean property. And the queries are getting pretty funky. I was trying to get rid of the IS NOT MISSING parts, now I am running IFMISSING computations on the query and that cannot be a good thing.

All your help is really appreciated.


#7

You found a bug. We have filed a ticket at https://issues.couchbase.com/browse/MB-20195

For now, unfortunately, you can use IFMISSING( field, someValue ) but someValue cannot be NULL. We will fix this.


#8

Okay, thanks! Will be looking forward to the next release then.

Referring again to this index: CREATE INDEX TestIndex ON Test(IFMISSING(IsSpecial, null))
After the bug is fixed, will I be able to query that index with:
SELECT * FROM Test, without an IS NOT MISSING part?

If not, will I at least be able to target it by specifying WHERE IFMISSING(IsSpecial, NULL) IS NOT MISSING?


#9

You do not need IS NOT MISSING. My sample query above does not have IS NOT MISSING, and it works with the sample index. You do not need IS NOT MISSING if you have other conditions in your query that refer to the index.


#10

That sounds good. Just to be sure I am understanding correctly:

With index: CREATE INDEX TestIndex ON Test(IFMISSING(IsSpecial, NULL), IFMISSING(IsActive, NULL)) WHERE type="user"

Get all special users:
SELECT * FROM TEST WHERE type="user" AND IsSpecial = TRUE

Get all active users:
SELECT * FROM TEST WHERE type="user" AND IsActive = TRUE

Get all users:
SELECT * FROM TEST WHERE type="user"

Will these 3 queries target TestIndex?

Also, on a similar note, if you are willing I would like to hear what you think about the idea of allowing documents with MISSING key values into an index, therefore making the use of IFMISSING(IsSpecial, NULL) as an index key expression (vs just IsSpecial) unnecessary.


#11

The following two indexes will satisfy all your queries, including MISSING values. The reason is that index keys after the leading key ( after type below ) are allowed to have MISSING values.

Index 1: CREATE INDEX IdxSpecial ON Test( type, IsSpecial ) WHERE type = "user";

Index 2: CREATE INDEX IdxActive ON Test( type, IsActive ) WHERE type = "user";

Query 1: SELECT * FROM TEST WHERE type="user" AND IsSpecial = TRUE;

Query 2: SELECT * FROM TEST WHERE type="user" AND IsActive = TRUE;

Query 3: SELECT * FROM TEST WHERE type="user";

#12

Ok wow that is awesome and makes sense now! So this basically means that indexing on IFMISSING(Prop, NULL) is unnecessary for all my use cases.
Thanks.