Couchbase indexer will not index the document if the leading index key evaluates MISSING.
Assume you have document doc1 with height MISSING field, block_height will not have the doc1.
But query semantics tells give all documents with _class = “acme.domain.Block”. It suppose to give doc1 too.
As Index doesn’t have it will not qualify for this query even you give explicit hint via USE INDEX.
Hint is ignored if not qualified.
This is weird, I would have bet that the index had the documents even if the field was missing. Initially I even started my response with that assumption. How can I have an index that includes all documents of type Block that have no height ? (in my case there are none but just for the example). I would have to write the index backward like this ?
CREATE INDEX missingheight ON mybucket (reward)
WHERE _class = "Block"
AND height IS MISSING
NOTE: Couchbase bucket can have any type of documents (even unrelated). If start indexing first leading key MISSING also, every index will have all the entries in the bucket. This makes index huge. Indexer will index MISSING for other keys once first key value is present.
CREATE INDEX missingheight ON mybucket (_class, height)
WHERE _class = "Block";
Leading index key must evaluate not MISSING.
7.0 has bucket/scope/collection. you can separate different documents as separate collection. MISSING restriction still there. MB-30249
There is other techniques if you want index height MISSING
CREATE INDEX ` block_height `ON` mybucket `( _class, `height` DESC)
WHERE (`_class ` = "acme.domain.Block");
SELECT *
FROM mybucket
WHERE `_class` = "acme.domain.Block"
ORDER BY height DESC
LIMIT 1;
OR
-- height is number if MISSING indexed as "" as functional key
CREATE INDEX ` block_height `ON` mybucket `( IFMISING(`height`, "") DESC)
WHERE (`_class` = "acme.domain.Block");
SELECT *
FROM mybucket
WHERE `_class` = "acme.domain.Block" AND IFMISING(`height`, "") IS NOT MISSING
ORDER BY IFMISING(`height`, "") DESC
LIMIT 1
Thank you for all the information ! I find it hard to get documentation about scopes and collections, I don’t even know how to declare one, what benefit they bring, how are they different than another index, and with what drawback they come. Maybe it will come later ?
I understand the feature is very recent and therefore the doc will follow in the near future.
After reading the pages you still don’t know how to use it with the SDK nor how they behave performance wise or if you still need an index to fetch those data . You simply cannot start using the features after reading the current doc.
So let’s say today I have documents with a field “type”, and an index for this field exists so I can query documents. If I create a collection “A” and put all my documents of type=A in it. Can I query them without the need for an index on type A ?
At current time , you need at least primary index. If you have secondary index (leading index key must be present in the query predicate to choose the index). MB-30249
@JesusTheHun Here is the correct syntax and an example showing you do need an index. Secondary indexes are always on collections (so indexes created without specifying a collection are for the bucket._default._default collection, which is also the collection holding documents whose scopes and collections were not specified). Thus with collections, secondary indexes can perform better because building and updating an index on a collection ignores all documents that are not in that specific collection.
select * from default.myScope.myCollection where type="A"
[
{
"code": 4000,
"msg": "No index available on keyspace `default`:`default`.`myScope`.`myCollection` that matches your query. Use CREATE PRIMARY INDEX ON `default`:`default`.`myScope`.`myCollection` to create a primary index, or check that your expected index is online.",
"query": "select * from default.myScope.myCollection where type=\"A\""
}
]
create index myIndex on default.myScope.myCollection(type)
select * from default.myScope.myCollection where type="A"
[
{
"myCollection": {
"type": "A",
"val1": "Value 1"
}
}
]
So in the end - organisation aside - collections are just semantic groups, that can be used to optimize indexing processes, XDCR replications also I guess, but doesn’t change the requirements around data querying. Is there any performance benefit using collections ?
@JesusTheHun Collections bring performance benefits (reduction in CPU usage) for both index initial builds and ongoing index maintenance. This write-up explains how this works: