Perfomance issues

query
#1

Hi,

I have 2M documents, each contains 30 properties.
I use a single node on a strong machine with 40 logical CPU and 32GB RAM.

When filter by time range and 6 additional properties the query took more then 1 minutes.
I have all relevant properties indexed (GSI)

The N1QL looks somthing like:

and it return 100 documents after 1 minutes.

How can I improve it? On SQL Server it takes 5-10 seconds

#2

Can you show the EXPLAIN for the query?

Also, your query might benefit from Covered Indexes in 4.1… @geraldss anything specific in that query in combination with covered indexes that one needs to be careful of?

#3

Hi,

Thanks for your support :sunglasses:
I suspect that when query need to return more documents it takes more time. For example if query found only 50 matched documents it takes 10 seconds, but if it found 10000 matched document it takes 60 seconds. I always limit it to 10000 documents since this is the maximum that my application need per query (no paging allowed since I need all the data per query call).

I created 3 indexes:

  1. document primery key
    Definition: CREATE PRIMARY INDEX #primary ON suspectentity USING GSI
  2. by time and channelid
    Definition: CREATE INDEX IndexBasic ON suspectentity(time,channelId) USING GSI
  3. by additional metadata properties
    Definition: CREATE INDEX IndexMetadata ON suspectentity((metadata.clothingShirtColor),(metadata.clothingPantsColor),(metadata.faceBeard),(metadata.faceEyeglasses),(metadata.faceHairColor),(metadata.faceHat),(metadata.faceMustache),(metadata.height)) USING GSI

see the EXPLAIN:
cbq.exe> {6-01-19T12:00’ and channelId=12 and metadata.faceHat=true and metadata.faceBeard=false and metadata.clothingShirtColor=1 and metadata.clothingPantsColor=1 LIMIT 10000;
{
“requestID”: “53779823-ff46-49b3-990d-bb48271c2f00”,
“signature”: “json”,
“results”: [
{
"#operator": “Sequence”,
"~children": [
{
"#operator": “Sequence”,
"~children": [
{
"#operator": “IntersectScan”,
“scans”: [
{
"#operator": “IndexScan”,
“index”: “IndexBasic”,
“keyspace”: “suspectentity”,
“limit”: 9.223372036854776e+18,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
"“2016-01-19T12:00"”,
“12”
],
“Inclusion”: 3,
“Low”: [
"“2016-01-19T09:00"”,
“12”
]
},
“Seek”: null
}
],
“using”: “gsi”
},
{
"#operator": “IndexScan”,
“index”: “IndexMetadata”,
“keyspace”: “suspectentity”,
“limit”: 9.223372036854776e+18,
“namespace”: “default”,
“spans”: [
{
“Range”: {
“High”: [
“1”,
“1”,
“successor(false)”
],
“Inclusion”: 1,
“Low”: [
“1”,
“1”,
“false”
]
},
“Seek”: null
}
],
“using”: “gsi”
}
]
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “Fetch”,
“keyspace”: “suspectentity”,
“namespace”: “default”
},
{
"#operator": “Filter”,
“condition”: “(((((((“2016-01-19T09:00” \u003c= (suspectentity.time)) and ((suspectentity.time) \u003c= “2016-01-19T12:00”)) and
((suspectentity.channelId) = 12)) and (((suspectentity.metadata).faceHat) = true)) and (((suspectentity.metadata).faceBeard) = false)) and (((suspectentity.meta data).clothingShirtColor) = 1)) and (((suspectentity.metadata).clothingPantsColor) = 1))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “(suspectentity.id)”
},
{
“expr”: “(suspectentity.time)”
},
{
“expr”: “(suspectentity.signature)”
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
},
{
"#operator": “Limit”,
“expr”: “10000”
}
]
}
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “15.0141ms”,
“executionTime”: “15.0141ms”,
“resultCount”: 1,
“resultSize”: 5340
}
}

#4

Hello,

Yes, you need to switch to 4.1 and use covering indexes. They will show up in your EXPLAIN output.

#5

I cannot switch to couchbase 4.1 since it cost money and community edition support only version 4.0, we are in the phase of evaluation and look for the appropriate NoSQL technology. If it was free then I can try it.

I already use covering indexes and see that EXPLAIN use it - Definition: CREATE INDEX IndexBasic ON suspectentity(time,channelId) USING GSI.
but when create query that filter only fields in the index I still get very slow results.
75 seconds for fetching 10K elements based on time and channelId on a 2M document repository.

#6

Hi @Oren_Deri,
Enterprise Edition is free for development, test and POC efforts. Production use of EE do require a purchase. Besides covering indexes there are a number of other performance and scale capabilities in EE that would make it perform queries faster under load. So I highly recommend using EE for benchmarking.
thanks
-cihan

#7

I’m now on CB 4.1 with better hardware.

See here: Couchbase 4.1 N1QL - perfomance issue

#8

I’m now on CB 4.1 with better hardware.still have bad perfomance

See here: Couchbase 4.1 N1QL - perfomance issue