Can I create an index from function output?

moon0326 · April 30, 2016, 12:10am

Hi,

Is it possible to create a covered index from function output? Let’s say I have documents with “id” field and the value of “id” can be either an integer or string type.

Can I create a covered index that can be used for the following query?

select id from mybucket where isstring(id) = true

Without understanding how index expression works, I’ve tried the following.

create index idx_test on mybucket(isstring(id))

Explain out:

{
    "requestID": "c5e53221-5b58-4d2d-a9b2-ec969fd25b61",
    "signature": "json",
    "results": [
        {
            "#operator": "Sequence",
            "~children": [
                {
                    "#operator": "IndexScan",
                    "index": "idx_test",
                    "keyspace": "mybucket",
                    "namespace": "default",
                    "spans": [
                        {
                            "Range": {
                                "High": [
                                    "true"
                                ],
                                "Inclusion": 3,
                                "Low": [
                                    "true"
                                ]
                            }
                        }
                    ],
                    "using": "gsi"
                },
                {
                    "#operator": "Parallel",
                    "~child": {
                        "#operator": "Sequence",
                        "~children": [
                            {
                                "#operator": "Fetch",
                                "keyspace": "mybucket",
                                "namespace": "default"
                            },
                            {
                                "#operator": "Filter",
                                "condition": "(is_string((`mybucket`.`id`)) = true)"
                            },
                            {
                                "#operator": "InitialProject",
                                "result_terms": [
                                    {
                                        "expr": "(`mybucket`.`id`)"
                                    }
                                ]
                            },
                            {
                                "#operator": "FinalProject"
                            }
                        ]
                    }
                }
            ]
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "2.770835ms",
        "executionTime": "2.734089ms",
        "resultCount": 1,
        "resultSize": 1969
    }
}

geraldss · April 30, 2016, 12:53am

Yes, you are using the index. In the EXPLAIN output, IndexScan is the indicator of using the index.

moon0326 · April 30, 2016, 1:44am

Is there a way to use it as a covering index?

geraldss · April 30, 2016, 3:20am

Several options. One option is:

create index idx_test2 on mybucket(isstring(id), id)

moon0326 · May 15, 2016, 1:13am

Hi Geraldss,

Just tested it out. I think it does not work.

Documents

Document 1
{
“published_at”: “value1”,
“form”: “test”
}

Document 2
{
“published_at”: “value2”,
“form”: “test”
}

Index

create index tmp_index on mybucket(type(published_at), published_at, form) where form='test';

select still uses fetch.

Am I doing something wrong?

geraldss · May 15, 2016, 1:53am

Hi @moon0326,

Please post the SELECT and the EXPLAIN, along with EXPLAIN output.

moon0326 · May 15, 2016, 1:57am

Sorry about that.


cbq> explain select published_at from mybucket where type(published_at) = 'missing' and form='test'
   > ;
{
    "requestID": "be124a4a-639f-4ec0-ac11-dc3108f774fa",
    "signature": "json",
    "results": [
        {
            "#operator": "Sequence",
            "~children": [
                {
                    "#operator": "IndexScan",
                    "index": "tmp_a",
                    "keyspace": "mybucket",
                    "namespace": "default",
                    "spans": [
                        {
                            "Range": {
                                "High": [
                                    "successor(\"missing\")"
                                ],
                                "Inclusion": 1,
                                "Low": [
                                    "\"missing\""
                                ]
                            }
                        }
                    ],
                    "using": "gsi"
                },
                {
                    "#operator": "Parallel",
                    "~child": {
                        "#operator": "Sequence",
                        "~children": [
                            {
                                "#operator": "Fetch",
                                "keyspace": "mybucket",
                                "namespace": "default"
                            },
                            {
                                "#operator": "Filter",
                                "condition": "((type((`mybucket`.`published_at`)) = \"missing\") and ((`mybucket`.`form`) = \"test\"))"
                            },
                            {
                                "#operator": "InitialProject",
                                "result_terms": [
                                    {
                                        "expr": "(`mybucket`.`published_at`)"
                                    }
                                ]
                            },
                            {
                                "#operator": "FinalProject"
                            }
                        ]
                    }
                }
            ]
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "27.622287ms",
        "executionTime": "27.530502ms",
        "resultCount": 1,
        "resultSize": 2023
    }
}

keshav_m · May 15, 2016, 5:18am

What’s your Couchbase version?

In Couchbase 4.5 (now in Beta), I do see the covering index path.

create index ix on beer-sample(type(junk), name) where type = ‘beer’;
explain select name from beer-sample where type(junk) = ‘missing’ and type = ‘beer’;
[
{
“plan”: {
"#operator": “Sequence”,
"~children": [
{
"#operator": “IndexScan”,
“covers”: [
“cover (type((beer-sample.junk)))”,
“cover ((beer-sample.name))”,
“cover ((meta(beer-sample).id))”
],
“filter_covers”: {
“cover ((beer-sample.type))”: “beer”
},
“index”: “ix”,
“index_id”: “1c90c2a1c4f0f20b”,
“keyspace”: “beer-sample”,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
“successor(“missing”)”
],
“Inclusion”: 1,
“Low”: [
"“missing”"
]
}
}
],
“using”: “gsi”
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Filter”,
“condition”: “((cover (type((beer-sample.junk))) = “missing”) and (cover ((beer-sample.type)) = “beer”))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “cover ((beer-sample.name))”
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
},
“text”: “select name from beer-sample where type(junk) = ‘missing’ and type = ‘beer’;”
}
]

moon0326 · May 15, 2016, 8:52am

I’m on 4.1.

I think there is no point of testing out 4.5 for me as I cannot afford using a beta in a production environment.

I hope there is a way to do it in 4.1.

geraldss · May 15, 2016, 1:50pm

Hi @moon0326,

A few points.

(1) In 4.5, the query plan for the above index and query uses a covering index.

(2) In 4.1, you can try removing the WHERE clause from the index definition:

create index tmp_index on default(type(published_at), published_at, form);

(3) Instead of type(published_at) = ‘missing’, you should use published_at IS MISSING. You have found a bug here.

Let us know if this helps.

moon0326 · May 15, 2016, 9:57pm

Thank you Geraldss,

I can’t remove where caluse as there are just so many other documents without published_at field.

Sounds like I have to wait for 4.5. I can solve this with view for now.

Thank you for the help!

Moon