N1QL query Performance is getting low irrespective of iterations

4.1
.net
n1ql
#1

Hello Guys,

I am using below Environment:

Version 4.1.0-5005 Enterprise Edition (build-5005)
Couchbase .NET SDK version: 2.2.1

I have 100000 documents in my bucket which is having different types of documents.

I am using Secondary index on fields according to the query.

I am trying to fetch the data in chunks using Limit = 10 and Offset starting from 0.

I am using the below statement within the loop and changing the offset value according to the iteration.

var result = objCouchBucket.Query(query);

Now my problem is that the query response time increases as the iteration increases.

eg.

Time taken for executing CouchBaseQuery: 00:01.190 PageNo: 1
Time taken for executing CouchBaseQuery: 00:00.832 PageNo: 2
Time taken for executing CouchBaseQuery: 00:00.534 PageNo: 3
Time taken for executing CouchBaseQuery: 00:00.638 PageNo: 4
Time taken for executing CouchBaseQuery: 00:01.721 PageNo: 5
Time taken for executing CouchBaseQuery: 00:01.675 PageNo: 6
Time taken for executing CouchBaseQuery: 00:01.960 PageNo: 7
Time taken for executing CouchBaseQuery: 00:03.102 PageNo: 8
Time taken for executing CouchBaseQuery: 00:01.872 PageNo: 9
Time taken for executing CouchBaseQuery: 00:02.035 PageNo: 10

Time taken for executing CouchBaseQuery: 01:17.503 PageNo: 261
Time taken for executing CouchBaseQuery: 01:01.168 PageNo: 262
Time taken for executing CouchBaseQuery: 01:36.633 PageNo: 263
Time taken for executing CouchBaseQuery: 01:14.127 PageNo: 264
Time taken for executing CouchBaseQuery: 01:20.398 PageNo: 265
Time taken for executing CouchBaseQuery: 01:13.244 PageNo: 266
Time taken for executing CouchBaseQuery: 01:37.856 PageNo: 267
Time taken for executing CouchBaseQuery: 01:29.662 PageNo: 268
Time taken for executing CouchBaseQuery: 01:39.530 PageNo: 269
Time taken for executing CouchBaseQuery: 01:18.455 PageNo: 270
Time taken for executing CouchBaseQuery: 01:13.062 PageNo: 271
Time taken for executing CouchBaseQuery: 01:25.460 PageNo: 272

I have also tried with increasing chunk size (i.e. Limit = 100),but then also I am facing the same problem.

Can any one please help me on this.

Regards
Mukesh Raghuwanhsi

#2

@mukeshraghu1c

Can you provide the results of an EXPLAIN of your query? This might shed some light on the situation.

Thanks,
Brant

#3

Hi @btburnett3,

Thanks for quick reply.

please find below the result of an EXPLAIN of my query.

{
"#operator": “Sequence”,
"~children": [
{
"#operator": “Sequence”,
"~children": [
{
"#operator": “UnionScan”,
“scans”: [
{
"#operator": “IndexScan”,
“index”: “PROD_INVOICE_Rule1_idx”,
“keyspace”: “N1QL_PROD”,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
“successor(“INVOICE”)”
],
“Inclusion”: 1,
“Low”: [
"“INVOICE”"
]
}
},
{
“Range”: {
“High”: [
“successor(“INVOICE”)”
],
“Inclusion”: 1,
“Low”: [
"“INVOICE”"
]
}
}
],
“using”: “gsi”
}
]
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Fetch”,
“as”: “INVOICE_0”,
“keyspace”: “N1QL_PROD”,
“namespace”: “default”
},
{
"#operator": “Filter”,
“condition”: “(((((INVOICE_0.DOCUMENT_TYPE) = “INVOICE”) and ((INVOICE_0.IH_DID_ID_LOC) like “D%”)) and (not (ifnull((INVOICE_0.IH_BTP_C_PRI_NAME), “”) = “Thorsten Weinands”))) and ((INVOICE_0.VERSION) between 1 and 1))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “(INVOICE_0.ID)”
},
{
“expr”: “(INVOICE_0.CONNECTIONID)”
},
{
“expr”: “(INVOICE_0.INITIAL_EXTRACT_DATE)”
},
{
“expr”: “(INVOICE_0.LAST_EXTRACT_DATE)”
},
{
“expr”: “(INVOICE_0.ACTIVE_FLAG)”
},
{
“expr”: “(INVOICE_0.CUSTOMER_ID)”
},
{
“expr”: “(INVOICE_0.INVOICE_ID)”
},
{
“expr”: “(INVOICE_0.LOAD_KEY)”
},
{
“expr”: “(INVOICE_0.IH_BTP_C_PRI_NAME)”
},
{
“expr”: “(INVOICE_0.IH_BTP_PID_ID)”
},
{
“expr”: “(INVOICE_0.IH_BTP_PID_NID)”
},
{
“expr”: “(INVOICE_0.IH_CP_C_PRI_NAME)”
},
{
“expr”: “(INVOICE_0.IH_CP_PID_ID)”
},
{
“expr”: “(INVOICE_0.IH_CP_PID_NID)”
},
{
“expr”: “(INVOICE_0.IH_DCMNT_DATE)”
},
{
“expr”: “(INVOICE_0.IH_DID_ID)”
},
{
“expr”: “(INVOICE_0.IH_DID_ID_AE)”
},
{
“expr”: “(INVOICE_0.IH_DID_ID_LID)”
},
{
“expr”: “(INVOICE_0.IH_DID_ID_LOC)”
},
{
“expr”: “(INVOICE_0.IH_DID_ID_VARIATION_ID)”
},
{
“expr”: “(INVOICE_0.IH_DID_NID)”
},
{
“expr”: “(INVOICE_0.IH_DSC)”
},
{
“expr”: “(INVOICE_0.IH_EXTND_AMT_TXN)”
},
{
“expr”: “(INVOICE_0.IH_EXTND_AMT_TXN_ID)”
},
{
“expr”: “(INVOICE_0.IH_LAST_MODIFICATION_DATE)”
},
{
“expr”: “(INVOICE_0.IH_PT_D_AMT_TXN)”
},
{
“expr”: “(INVOICE_0.IH_PT_DSC)”
},
{
“expr”: “(INVOICE_0.IH_PT_P_AMT_TXN_01)”
},
{
“expr”: “(INVOICE_0.IH_PT_T_ID)”
},
{
“expr”: “(INVOICE_0.IH_S_CD)”
},
{
“expr”: “(INVOICE_0.IH_SELF_BLNG_INDCTR)”
},
{
“expr”: “(INVOICE_0.IH_TOTAL_AMT_TXN)”
},
{
“expr”: “(INVOICE_0.IH_TOTAL_AMT_TXN_ID)”
},
{
“expr”: “(INVOICE_0.CUSTOMER_ID_TEMP_FK)”
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
},
{
"#operator": “Offset”,
“expr”: “0”
},
{
"#operator": “Limit”,
“expr”: “10”
}
]
}

#4

Hi, two points. First, you can try using a covering index. by adding the INVOICE doc as an additional field in the index. If you run EXPLAIN, you will see cover() in the output for covering indexes.

Second, we have added a performance fix for ORDER BY + LIMIT. This will be available in the next release, as well as the developer preview for the next release.

#5

Hi @geraldss,

Thanks for the reply.

I have gone through the documentation for covering index but till now I am not much clear about covering index.
Can you please explain how can I create covering index.

My documents structure is like -

{
“DOCUMENT_TYPE”: “INVOICE”,
“DOCUMENT_ID”: “INVOICE:100:1”,
“VERSION”: 1,
“ID”: 100,
“CONNECTIONID”: 4,
“INITIAL_EXTRACT_DATE”: “2015-04-02T06:47:57.78-04:00”,
“LAST_EXTRACT_DATE”: “2015-04-02T06:47:57.78-04:00”,
-
-
-
-
-
“INVOICE_ID”: 100,
“INVOICE_LINE”: [
{
“DOCUMENT_TYPE”: “INVOICE_LINE”,
“DOCUMENT_ID”: “INVOICE_LINE:154:1”,
“VERSION”: 1,
“ID”: 154,
“CONNECTIONID”: 4,
“INITIAL_EXTRACT_DATE”: “2015-04-02T06:47:57.78-04:00”,
“LAST_EXTRACT_DATE”: “2015-04-02T06:47:57.78-04:00”,
“ACTIVE_FLAG”: “Y”,
“INVOICE_ID”: 100,
“INVOICE_LINE_ID”: 154,
“ITEM_ID”: 117
},
{
“DOCUMENT_TYPE”: “INVOICE_LINE”,
“DOCUMENT_ID”: “INVOICE_LINE:155:1”,
“VERSION”: 1,
“ID”: 155,
“CONNECTIONID”: 4,
“INITIAL_EXTRACT_DATE”: “2015-04-02T06:47:57.78-04:00”,
“LAST_EXTRACT_DATE”: “2015-04-02T06:47:57.78-04:00”,
“ACTIVE_FLAG”: “Y”,
“INVOICE_ID”: 100,
“INVOICE_LINE_ID”: 155,
“ITEM_ID”: 117
}
],
“ACTIVE_FLAG”: “Y”,
“CUSTOMER_ID”: 125
}

My current index statement is -

CREATE INDEX PROD_INVOICE_Rule1_idx ON N1QL_PROD(DOCUMENT_TYPE,DOCUMENT_ID,VERSION,CONNECTIONID,IH_DID_ID_LOC,IH_BTP_C_PRI_NAME) WHERE (DOCUMENT_TYPE = “INVOICE”) USING GSI

Here is my query -

SELECT INVOICE_0.ID,INVOICE_0.CONNECTIONID,INVOICE_0.INITIAL_EXTRACT_DATE,INVOICE_0.LAST_EXTRACT_DATE,INVOICE_0.ACTIVE_FLAG,INVOICE_0.CUSTOMER_ID,INVOICE_0.INVOICE_ID,INVOICE_0.LOAD_KEY,INVOICE_0.IH_BTP_C_PRI_NAME,INVOICE_0.IH_BTP_PID_ID,INVOICE_0.IH_BTP_PID_NID,INVOICE_0.IH_CP_C_PRI_NAME,INVOICE_0.IH_CP_PID_ID,INVOICE_0.IH_CP_PID_NID,INVOICE_0.IH_DCMNT_DATE,INVOICE_0.IH_DID_ID,INVOICE_0.IH_DID_ID_AE,INVOICE_0.IH_DID_ID_LID,INVOICE_0.IH_DID_ID_LOC,INVOICE_0.IH_DID_ID_VARIATION_ID,INVOICE_0.IH_DID_NID,INVOICE_0.IH_DSC,INVOICE_0.IH_EXTND_AMT_TXN,INVOICE_0.IH_EXTND_AMT_TXN_ID,INVOICE_0.IH_LAST_MODIFICATION_DATE,INVOICE_0.IH_PT_D_AMT_TXN,INVOICE_0.IH_PT_DSC,INVOICE_0.IH_PT_P_AMT_TXN_01,INVOICE_0.IH_PT_T_ID,INVOICE_0.IH_S_CD,INVOICE_0.IH_SELF_BLNG_INDCTR,INVOICE_0.IH_TOTAL_AMT_TXN,INVOICE_0.IH_TOTAL_AMT_TXN_ID,INVOICE_0.CUSTOMER_ID_TEMP_FK
FROM N1QL_PROD AS INVOICE_0
WHERE INVOICE_0.DOCUMENT_TYPE = “INVOICE” AND INVOICE_0.IH_DID_ID_LOC LIKE “D%” AND IFNULL(INVOICE_0.IH_BTP_C_PRI_NAME,"") <> “Thorsten Weinands” AND INVOICE_0.VERSION BETWEEN 1 AND 1 LIMIT 10 OFFSET 0;

Can you please provide me a covering index statement for above query.

Regards
Mukesh Raghuwanshi

#6

Thanks @geraldss,

I tried with the below index statement and now it is showing covering in explain result -

CREATE INDEX PROD_INVOICE_Rule1_Cover_idx ON N1QL_PROD(DOCUMENT_TYPE,DOCUMENT_ID,VERSION,CONNECTIONID,IH_DID_ID_LOC,IH_BTP_C_PRI_NAME,ID,INITIAL_EXTRACT_DATE,LAST_EXTRACT_DATE,ACTIVE_FLAG,CUSTOMER_ID,INVOICE_ID,LOAD_KEY,IH_BTP_PID_ID,IH_BTP_PID_NID,IH_CP_C_PRI_NAME,IH_CP_PID_ID,IH_CP_PID_NID,IH_DCMNT_DATE,IH_DID_ID,IH_DID_ID_AE,IH_DID_ID_LID,IH_DID_ID_VARIATION_ID,IH_DID_NID,IH_DSC,IH_EXTND_AMT_TXN,IH_EXTND_AMT_TXN_ID,IH_LAST_MODIFICATION_DATE,IH_PT_D_AMT_TXN,IH_PT_DSC,IH_PT_P_AMT_TXN_01,IH_PT_T_ID,IH_S_CD,IH_SELF_BLNG_INDCTR,IH_TOTAL_AMT_TXN,IH_TOTAL_AMT_TXN_ID,CUSTOMER_ID_TEMP_FK) WHERE (DOCUMENT_TYPE = “INVOICE”) USING GSI;

Performance is better than using without covering index ,But here also as the iteration increases query response time increases.

I have one doubt.

Is covering index is applicable with joins and unnest.

If yes then how ?

Regards
Mukesh Raghuwanshi

#7

Hi @geraldss,

Is covering index applicable with joins and unnest ?

can you please tell ?

#8

Hi @geraldss,

When I am executing below query with expalin then it is showing that it is using covering index but I am getting exception.

SELECT COUNT(*) AS Count FROM N1QL_PROD AS INVOICE_0 WHERE INVOICE_0.DOCUMENT_TYPE = “INVOICE” AND INVOICE_0.IH_DID_ID_LOC LIKE “D%” AND IFNULL(INVOICE_0.IH_BTP_C_PRI_NAME,"") <> “Thorsten Weinands” AND INVOICE_0.VERSION BETWEEN 1 AND 1;

Error: Error Code: 12015
Message: Index scan timed out - cause: Index scan timed out

Csn you please help me on this.

Regards,
Mukesh Raghuwanshi

#9

Hi @geraldss,

Please reply me on this my whole POC got stuck because of this.

Now we are in confusion that we should go with N1QL or with ElasticSearch.

Regards
Mukesh Raghuwanshi

#10

Hi @mukeshraghu1c, covering indexes do apply to multi bucket operations like JOIN.

Index scan timeout is returned in cases where the index scan takes a long time. That may be because the index scan range is too large or the HW you have is not powerful enough to finish the scan in a given amount of time. To answer the first issue I need a little more data from you; Could you send out the explain for this query above query and tell me the total item count you’d expect when filtering only on each predicate as well?
INVOICE_0.DOCUMENT_TYPE = "INVOICE"
INVOICE_0.IH_DID_ID_LOC LIKE “D%“
IFNULL(INVOICE_0.IH_BTP_C_PRI_NAME,””) <> "Thorsten Weinands"
INVOICE_0.VERSION BETWEEN 1 AND 1

thanks
-cihan

#11

Hi @cihangirb,

Thanks for the reply.

Below is the explain for above query.

I have total 100000 documents in my bucket and by using elastic search I am getting around 60000 documents as output for above query conditions.

{“Object” :{
"#operator": “Sequence”,
"~children": [
{
"#operator": “UnionScan”,
“scans”: [
{
"#operator": “IndexScan”,
“covers”: [
“cover((meta(INVOICE_0).id))”,
“cover((INVOICE_0.DOCUMENT_TYPE))”,
“cover((INVOICE_0.DOCUMENT_ID))”,
“cover((INVOICE_0.VERSION))”,
“cover((INVOICE_0.ID))”,
“cover((INVOICE_0.CONNECTIONID))”,
“cover((INVOICE_0.INITIAL_EXTRACT_DATE))”,
“cover((INVOICE_0.LAST_EXTRACT_DATE))”,
“cover((INVOICE_0.ACTIVE_FLAG))”,
“cover((INVOICE_0.CUSTOMER_ID))”,
“cover((INVOICE_0.INVOICE_ID))”,
“cover((INVOICE_0.LOAD_KEY))”,
“cover((INVOICE_0.IH_BTP_C_PRI_NAME))”,
“cover((INVOICE_0.IH_BTP_PID_ID))”,
“cover((INVOICE_0.IH_BTP_PID_NID))”,
“cover((INVOICE_0.IH_CP_C_PRI_NAME))”,
“cover((INVOICE_0.IH_CP_PID_ID))”,
“cover((INVOICE_0.IH_CP_PID_NID))”,
“cover((INVOICE_0.IH_DCMNT_DATE))”,
“cover((INVOICE_0.IH_DID_ID))”,
“cover((INVOICE_0.IH_DID_ID_AE))”,
“cover((INVOICE_0.IH_DID_ID_LID))”,
“cover((INVOICE_0.IH_DID_ID_LOC))”,
“cover((INVOICE_0.IH_DID_ID_VARIATION_ID))”,
“cover((INVOICE_0.IH_DID_NID))”,
“cover((INVOICE_0.IH_DSC))”,
“cover((INVOICE_0.IH_EXTND_AMT_TXN))”,
“cover((INVOICE_0.IH_EXTND_AMT_TXN_ID))”,
“cover((INVOICE_0.IH_LAST_MODIFICATION_DATE))”,
“cover((INVOICE_0.IH_PT_D_AMT_TXN))”,
“cover((INVOICE_0.IH_PT_DSC))”,
“cover((INVOICE_0.IH_PT_P_AMT_TXN_01))”,
“cover((INVOICE_0.IH_PT_T_ID))”,
“cover((INVOICE_0.IH_S_CD))”,
“cover((INVOICE_0.IH_SELF_BLNG_INDCTR))”,
“cover((INVOICE_0.IH_TOTAL_AMT_TXN))”,
“cover((INVOICE_0.IH_TOTAL_AMT_TXN_ID))”,
“cover((INVOICE_0.CUSTOMER_ID_TEMP_FK))”,
“cover((INVOICE_0.INVOICE_LINE))”
],
“index”: “PROD_INVOICE_Rule1_Cover_idx”,
“keyspace”: “N1QL_PROD”,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
“successor(“INVOICE”)”
],
“Inclusion”: 1,
“Low”: [
"“INVOICE”"
]
}
},
{
“Range”: {
“High”: [
“successor(“INVOICE”)”
],
“Inclusion”: 1,
“Low”: [
"“INVOICE”"
]
}
}
],
“using”: “gsi”
}
]
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Filter”,
“condition”: “((((cover((INVOICE_0.DOCUMENT_TYPE)) = “INVOICE”) and (cover((INVOICE_0.IH_DID_ID_LOC)) like “D%”)) and (not (ifnull(cover((INVOICE_0.IH_BTP_C_PRI_NAME)), “”) = “Thorsten Weinands”))) and (cover((INVOICE_0.VERSION)) between 1 and 1))”
},
{
"#operator": “InitialGroup”,
“aggregates”: [
“count()"
],
“group_keys”: []
}
]
}
},
{
"#operator": “IntermediateGroup”,
“aggregates”: [
"count(
)”
],
“group_keys”: []
},
{
"#operator": “FinalGroup”,
“aggregates”: [
“count()"
],
“group_keys”: []
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “InitialProject”,
“result_terms”: [
{
“as”: “count”,
“expr”: "count(
)”
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
}}

I am using only one bucket with different type of documents.
So can you please tell me any example that how can i use covering index for self join and unnest.
Or otherwise please provide me any documentation link where I can find this.

I have searched alot for same but did not find anything relative to join with covering.

Regards
Mukesh Raghuwanshi

#12

Hi @mukeshraghu1c,

Here is the pattern:

CREATE INDEX idx1 ON b1(expr4, expr0, expr2, expr3);

SELECT b1.expr0, b2, b3
FROM mybucket b1 UNNEST b1.expr2 AS b2 JOIN mybucket b3 ON KEYS b1.expr3
WHERE b1.expr4 = val4;

This will use idx1 as a covering index.