Perfomance issues

Oren_Deri · January 19, 2016, 2:38pm

Hi,

I have 2M documents, each contains 30 properties.
I use a single node on a strong machine with 40 logical CPU and 32GB RAM.

When filter by time range and 6 additional properties the query took more then 1 minutes.
I have all relevant properties indexed (GSI)

The N1QL looks somthing like:

and it return 100 documents after 1 minutes.

How can I improve it? On SQL Server it takes 5-10 seconds

daschl · January 19, 2016, 3:09pm

Can you show the EXPLAIN for the query?

Also, your query might benefit from Covered Indexes in 4.1… @geraldss anything specific in that query in combination with covered indexes that one needs to be careful of?

Oren_Deri · January 20, 2016, 8:18am

Hi,

Thanks for your support
I suspect that when query need to return more documents it takes more time. For example if query found only 50 matched documents it takes 10 seconds, but if it found 10000 matched document it takes 60 seconds. I always limit it to 10000 documents since this is the maximum that my application need per query (no paging allowed since I need all the data per query call).

I created 3 indexes:

document primery key
Definition: CREATE PRIMARY INDEX #primary ON suspectentity USING GSI
by time and channelid
Definition: CREATE INDEX IndexBasic ON suspectentity(time,channelId) USING GSI
by additional metadata properties
Definition: CREATE INDEX IndexMetadata ON suspectentity((metadata.clothingShirtColor),(metadata.clothingPantsColor),(metadata.faceBeard),(metadata.faceEyeglasses),(metadata.faceHairColor),(metadata.faceHat),(metadata.faceMustache),(metadata.height)) USING GSI

see the EXPLAIN:
cbq.exe> {6-01-19T12:00’ and channelId=12 and metadata.faceHat=true and metadata.faceBeard=false and metadata.clothingShirtColor=1 and metadata.clothingPantsColor=1 LIMIT 10000;
{
“requestID”: “53779823-ff46-49b3-990d-bb48271c2f00”,
“signature”: “json”,
“results”: [
{
"#operator": “Sequence”,
"~children": [
{
"#operator": “Sequence”,
"~children": [
{
"#operator": “IntersectScan”,
“scans”: [
{
"#operator": “IndexScan”,
“index”: “IndexBasic”,
“keyspace”: “suspectentity”,
“limit”: 9.223372036854776e+18,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
"“2016-01-19T12:00"”,
“12”
],
“Inclusion”: 3,
“Low”: [
"“2016-01-19T09:00"”,
“12”
]
},
“Seek”: null
}
],
“using”: “gsi”
},
{
"#operator": “IndexScan”,
“index”: “IndexMetadata”,
“keyspace”: “suspectentity”,
“limit”: 9.223372036854776e+18,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
“1”,
“1”,
“successor(false)”
],
“Inclusion”: 1,
“Low”: [
“1”,
“1”,
“false”
]
},
“Seek”: null
}
],
“using”: “gsi”
}
]
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Fetch”,
“keyspace”: “suspectentity”,
“namespace”: “default”
},
{
"#operator": “Filter”,
“condition”: “(((((((“2016-01-19T09:00” \u003c= (suspectentity.time)) and ((suspectentity.time) \u003c= “2016-01-19T12:00”)) and
((suspectentity.channelId) = 12)) and (((suspectentity.metadata).faceHat) = true)) and (((suspectentity.metadata).faceBeard) = false)) and (((suspectentity.meta data).clothingShirtColor) = 1)) and (((suspectentity.metadata).clothingPantsColor) = 1))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “(suspectentity.id)”
},
{
“expr”: “(suspectentity.time)”
},
{
“expr”: “(suspectentity.signature)”
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
},
{
"#operator": “Limit”,
“expr”: “10000”
}
]
}
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “15.0141ms”,
“executionTime”: “15.0141ms”,
“resultCount”: 1,
“resultSize”: 5340
}
}

geraldss · January 20, 2016, 5:30pm

Hello,

Yes, you need to switch to 4.1 and use covering indexes. They will show up in your EXPLAIN output.

Oren_Deri · January 21, 2016, 7:44am

I cannot switch to couchbase 4.1 since it cost money and community edition support only version 4.0, we are in the phase of evaluation and look for the appropriate NoSQL technology. If it was free then I can try it.

I already use covering indexes and see that EXPLAIN use it - Definition: CREATE INDEX IndexBasic ON suspectentity(time,channelId) USING GSI.
but when create query that filter only fields in the index I still get very slow results.
75 seconds for fetching 10K elements based on time and channelId on a 2M document repository.

cihangirb · January 21, 2016, 1:14pm

Hi @Oren_Deri,
Enterprise Edition is free for development, test and POC efforts. Production use of EE do require a purchase. Besides covering indexes there are a number of other performance and scale capabilities in EE that would make it perform queries faster under load. So I highly recommend using EE for benchmarking.
thanks
-cihan

Oren_Deri · January 31, 2016, 1:41pm

I’m now on CB 4.1 with better hardware.

See here: Couchbase 4.1 N1QL - perfomance issue

Oren_Deri · January 31, 2016, 1:41pm

I’m now on CB 4.1 with better hardware.still have bad perfomance

See here: Couchbase 4.1 N1QL - perfomance issue