I have some doubt about N1ql

#1

Hey guys,

I have some questions:

1 - How many Index I can have?
2 - How many fields I can have by index?
3 - My creteria to include fields in index is get all field that I use at where sentence, Its ok?

By Example:

select b.costCode 
	from default as b use index (IndexSecondary using GSI)
	where 
	b.type = "CostoCodeCumulative" 
	and b.period = 5 
	and b.category = "Porcentaje" 
	and b.iwpParent IN [result]
	and b.actual > 0

CREATE INDEX IndexSecondary ON default(type,period,category,iwpParent,actual) using GSI;

and...

select b.period, b.category,sum(b.sumActual) 
	from default as b
	where 
	b.type = "CostoCodeCumulative" 
	and b.category != "Porcentaje" 
	and b.costCode IN [result]

CREATE INDEX IndexSecondary2 ON default(type,category,costCode) using GSI;

4 - I have the index and data in the same server, Do It worse at the performance?.

Regards.

Using N1QL, Timeout
#2

In development, you can use multiple indexes and run everything on one node. You can also use a composite index which contains all the fields in your WHERE clause.

#3

@geraldss,

Why when I use a hint is slower and when I do not use it but was created is faster?

And Another question…

I believe that you ROCK! hahahaha you always requested the explain execution.

Well I want to know read perfectly these json, give me your knowledge hahaha.

I don’t know if exist this quote but in spanish is something:
Give me the tools, not give me the solution.

Regards.

#4

@Antx2207, gracias :slight_smile:

To read the output of EXPLAIN, look for “*Scan” near the top. You will see PrimaryScan, IndexScan, IntersectScan, or UnionScan, depending on your query. This will tell you how your query will execute.

PrimaryScan is a full bucket scan, so it is best avoided. You can try building a single composite index with all your WHERE attributes. If your IndexScan uses your composite index, this should give you good performance.

Thanks,
Gerald

#5

Hey @geraldss,

I have a question for you.

I have a query so simple:

SELECT * FROM default WHERE actual = 30 

"status": "success",
    "metrics": {
        "elapsedTime": "428.056921ms",
        "executionTime": "427.883453ms",
        "resultCount": 4221,
        "resultSize": 1769869
    }

Explain plan:

cbq> explain select * from default where actual = 30;
{
    "requestID": "e9cb8173-4043-4596-b11d-9aca4d8d2053",
    "signature": "json",
    "results": [
        {
            "#operator": "Sequence",
            "~children": [
                {
                    "#operator": "IndexScan",
                    "index": "indextest",
                    "keyspace": "default",
                    "limit": 9.223372036854776e+18,
                    "namespace": "default",
                    "spans": [
                        {
                            "Range": {
                                "High": [
                                    "30"
                                ],
                                "Inclusion": 3,
                                "Low": [
                                    "30"
                                ]
                            },
                            "Seek": null
                        }
                    ],
                    "using": "gsi"
                },
                {
                    "#operator": "Parallel",
                    "~child": {
                        "#operator": "Sequence",
                        "~children": [
                            {
                                "#operator": "Fetch",
                                "keyspace": "default",
                                "namespace": "default"
                            },
                            {
                                "#operator": "Filter",
                                "condition": "((`default`.`actual`) = 30)"
                            },
                            {
                                "#operator": "InitialProject",
                                "result_terms": [
                                    {
                                        "star": true
                                    }
                                ]
                            },
                            {
                                "#operator": "FinalProject"
                            }
                        ]
                    }
                }
            ]
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "2.506426ms",
        "executionTime": "2.295134ms",
        "resultCount": 1,
        "resultSize": 2009
    }

When I did using Elasticsearch.

"query": {
"bool": {
"must": [
{
"term": {
"couchbaseDocument.doc.actual": "30"
}
}
],
"must_not": [ ],
"should": [ ]
}
}

4221 hits. 0.054 seconds.

what is the advantage when I am using n1ql over elasticsearch? or they are different concept.

Elasticsearch always be more faster than n1ql or Am I wrong?

Regards.

#6

They are different concepts, even though there is some overlap.

One is not always faster than the other. It depends on the use case. For example, if you do actual BETWEEN 20 AND 40, you will see differences.

#7

@geraldss,

The result:

{
"query": {
"bool": {
"must": [
{
"range": {
"couchbaseDocument.doc.actual": {
"from": "20",
"to": "40"
}
}
}
],
"must_not": [ ],
"should": [ ]
}
},
"from": 0,
"size": 25000,
"sort": [ ],
"facets": { }
}

87934 hits. 0.477 seconds

The N1QL:

 "status": "success",
    "metrics": {
        "elapsedTime": "21.424015321s",
        "executionTime": "21.423889537s",
        "resultCount": 87934,
        "resultSize": 36879829
    }


cbq> explain select * from default where actual between 20 and 40;
{
    "requestID": "3f8d8d33-6a9b-4021-8942-293f3ec02737",
    "signature": "json",
    "results": [
        {
            "#operator": "Sequence",
            "~children": [
                {
                    "#operator": "IndexScan",
                    "index": "indextest",
                    "keyspace": "default",
                    "limit": 9.223372036854776e+18,
                    "namespace": "default",
                    "spans": [
                        {
                            "Range": {
                                "High": [
                                    "40"
                                ],
                                "Inclusion": 3,
                                "Low": [
                                    "20"
                                ]
                            },
                            "Seek": null
                        }
                    ],
                    "using": "gsi"
                },
                {
                    "#operator": "Parallel",
                    "~child": {
                        "#operator": "Sequence",
                        "~children": [
                            {
                                "#operator": "Fetch",
                                "keyspace": "default",
                                "namespace": "default"
                            },
                            {
                                "#operator": "Filter",
                                "condition": "((`default`.`actual`) between 20 a                                                                             nd 40)"
                            },
                            {
                                "#operator": "InitialProject",
                                "result_terms": [
                                    {
                                        "star": true
                                    }
                                ]
                            },
                            {
                                "#operator": "FinalProject"
                            }
                        ]
                    }
                }
            ]
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "4.368175ms",
        "executionTime": "4.184228ms",
        "resultCount": 1,
        "resultSize": 2022
    }
}

Well, I will try to do this at elasticsearch:

    select 
    t.category,
    t.period,
    sum(t.sumActual) 
    from 
    default as q 
    inner join default as p on keys q.parent 
    inner join default as t on keys p.iwpCumulatives 
    where
    q.type = 'IwpCumulative' 
    and q.period = 50
    and q.sumActual > 0 
    and q.category = 'Porcentaje' 
    group by t.category,t.period
    order by t.period,t.category;

"status": "success",
    "metrics": {
        "elapsedTime": "12.758127084s",
        "executionTime": "12.757850405s",
        "resultCount": 250,
        "resultSize": 25339,
        "sortCount": 250
    }

{
    "requestID": "7c817808-e1f3-45b6-8be7-d8400d30d467",
    "signature": "json",
    "results": [
        {
            "#operator": "Sequence",
            "~children": [
                {
                    "#operator": "Sequence",
                    "~children": [
                        {
                            "#operator": "IndexScan",
                            "index": "SecondIndex",
                            "keyspace": "default",
                            "limit": 9.223372036854776e+18,
                            "namespace": "default",
                            "spans": [
                                {
                                    "Range": {
                                        "High": [
                                            "50",
                                            "\"IwpCumulative\"",
                                            "successor(\"Porcentaje\")"
                                        ],
                                        "Inclusion": 0,
                                        "Low": [
                                            "50",
                                            "\"IwpCumulative\"",
                                            "\"Porcentaje\"",
                                            "0"
                                        ]
                                    },
                                    "Seek": null
                                }
                            ],
                            "using": "gsi"
                        },
                        {
                            "#operator": "Parallel",
                            "~child": {
                                "#operator": "Sequence",
                                "~children": [
                                    {
                                        "#operator": "Fetch",
                                        "as": "q",
                                        "keyspace": "default",
                                        "namespace": "default"
                                    },
                                    {
                                        "#operator": "Join",
                                        "as": "p",
                                        "keyspace": "default",
                                        "namespace": "default",
                                        "on_keys": "(`q`.`parent`)"
                                    },
                                    {
                                        "#operator": "Join",
                                        "as": "t",
                                        "keyspace": "default",
                                        "namespace": "default",
                                        "on_keys": "(`p`.`iwpCumulatives`)"
                                    },
                                    {
                                        "#operator": "Filter",
                                        "condition": "(((((`q`.`type`) = \"IwpCumulative\") and ((`q`.`period`) = 50)) and (0 \u003c (`q`.`sumActual`))) and ((`q`.`category`) = \"Porcentaje\"))"
                                    },
                                    {
                                        "#operator": "InitialGroup",
                                        "aggregates": [
                                            "sum((`t`.`sumActual`))"
                                        ],
                                        "group_keys": [
                                            "(`t`.`category`)",
                                            "(`t`.`period`)"
                                        ]
                                    }
                                ]
                            }
                        },
                        {
                            "#operator": "IntermediateGroup",
                            "aggregates": [
                                "sum((`t`.`sumActual`))"
                            ],
                            "group_keys": [
                                "(`t`.`category`)",
                                "(`t`.`period`)"
                            ]
                        },
                        {
                            "#operator": "FinalGroup",
                            "aggregates": [
                                "sum((`t`.`sumActual`))"
                            ],
                            "group_keys": [
                                "(`t`.`category`)",
                                "(`t`.`period`)"
                            ]
                        },
                        {
                            "#operator": "Parallel",
                            "~child": {
                                "#operator": "Sequence",
                                "~children": [
                                    {
                                        "#operator": "InitialProject",
                                        "result_terms": [
                                            {
                                                "expr": "(`t`.`category`)"
                                            },
                                            {
                                                "expr": "(`t`.`period`)"
                                            },
                                            {
                                                "expr": "sum((`t`.`sumActual`))"
                                            }
                                        ]
                                    }
                                ]
                            }
                        }
                    ]
                },
                {
                    "#operator": "Order",
                    "sort_terms": [
                        {
                            "expr": "(`t`.`period`)"
                        },
                        {
                            "expr": "(`t`.`category`)"
                        }
                    ]
                },
                {
                    "#operator": "FinalProject"
                }
            ]
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "4.846268ms",
        "executionTime": "4.704478ms",
        "resultCount": 1,
        "resultSize": 6133
    }
}

Regards.

#8

Interesting. It’s possible that ElasticSearch is just scanning its index, and not fetching the resulting documents from Couchbase. N1QL currently fetches each document after scanning the index.

Thanks,
Gerald

#9

@geraldss,

We very much appreciate your response; We are trying to decide whether to go ahead and use N1QL or elastic search on two of our projects. Speed is of the essence. It is imperative that we compare actual execution speeds and document them. I would appreciate it very much if you were to run a comparison test between the two on a large set f documents.

e.g create 1,000,000 documents in couchbase with the following fields:
Name
Description
% Complete: contains randomly generate a number from 1 to 100%

The Test:

Using N1QL: Filter results by % complete field for items that have more than 50%
Using Elastic Search: Filter results by % complete field for items that have more than 50%

For both results show the results in an array with all document attributes shown

Many thanks in advanced,