N1QL 2 JOINs and Indexes

hector · August 6, 2016, 11:02pm

Hi experts, I’m starting with Couchbase and I have some doubts about indexing and a N1QL query with two JOINs, this is my query:

SELECT twamp.date AS date, utp2.co_name AS rnc_name, utp.co_name AS wbts_name, 
twamp.txtwampmessages AS tx_msg, twamp.losttwampmessages AS lost_msg FROM default as twamp JOIN default AS utp ON KEYS TO_STRING(twamp.wbts_id) JOIN default utp2 ON KEYS TO_STRING(twamp.rnc_id) WHERE utp.type="utp_data" AND twamp.type="twamp_data" 
AND twamp.date >= 1469664000000 AND twamp.date <= 1569664900000 AND utp.co_name="WBTS-California";

This is the index that I’ve created:

CREATE INDEX rep_twamp ON default(type,date) WHERE (type = "twamp_data");

This is part of the EXPLAIN:

                   "#operator": "IndexScan",
                    "index": "rep_twamp",
                    "index_id": "48c2876997b32c37",
                    "keyspace": "default",
                    "namespace": "default",
                    "spans": [
                        {
                            "Range": {
                                "High": [
                                    "\"twamp_data\"",
                                    "1569664900000"
                                ],
                                "Inclusion": 3,
                                "Low": [
                                    "\"twamp_data\"",
                                    "1469664000000"
                                ]
                            }
                        }
                    ],
                    "using": "gsi"

I’m trying to improve the query because it is taking lots of time. Regrading this example, is it possible to index by utp.co_name?

When I try to look for twamp.date the answer time is not bad, but when I try to look for twamp.date and utp.co_name is very slow, I think that it is beacause I’m not indexing utp.co_name.

Thanks for any comments!!!

geraldss · August 7, 2016, 1:17am

Hi, you can try the following:

CREATE INDEX idx_wbts ON default( wbts_id );
CREATE INDEX idx_co_name ON default( co_name );

SELECT twamp.date AS date, utp2.co_name AS rnc_name, utp.co_name AS wbts_name, twamp.txtwampmessages AS tx_msg, twamp.losttwampmessages AS lost_msg
FROM default AS utp
JOIN default AS twamp ON KEY TO_STRING(twamp.wbts_id) FOR utp
JOIN default utp2 ON KEYS TO_STRING(twamp.rnc_id)
WHERE utp.type="utp_data" AND twamp.type="twamp_data" AND twamp.date >= 1469664000000 AND twamp.date <= 1569664900000 AND utp.co_name="WBTS-California";

atom_yang · August 7, 2016, 4:53pm

Are there some principles for create index when I want to improve the query?
As your suggestion,this N1QL will use 3 index to improve query.

dose this means more index will improve query?
If not,should I create index for more properties in one index?
In this case, type, date in twamp and type, co_name in utp should create index,and wbts_id and rnc_id is the JOIN KEY, Does this the better way to create index?

CREATE INDEX idx_wbts ON default( wbts_id );
CREATE INDEX idx_rnc_id ON default( rnc_id );
CREATE INDEX idx_co_name ON default( type,co_name ) WHERE (type == "utp_data");
CREATE INDEX rep_twamp ON default(type,date) WHERE (type = "twamp_data");

geraldss · August 7, 2016, 9:10pm

Hi @atom_yang,

Are you querying the same data as @hector?

If so, what is the performance of the indexes and query that I posted?

atom_yang · August 8, 2016, 2:48am

No,I just want to try to analyze how to create index to improve this query.My N1QL need to improve too.

geraldss · August 8, 2016, 3:07am

I see. No, I was not offering general guidelines, just addressing this specific use case. The performance also depends on data distribution, so the best approach is to test with actual data for the given use case.