Very low performance and 100% CPU usage when running simple N1QL query by multiple workers

maciej.tomaszewski · September 2, 2016, 5:17pm

Hi there,

I wrote similar post long time ago, but it was related to Couchbase Server 4.0 CE. I got a feedback from you it’s probably because of some common bugs and I should wait until 4.1 appear cause it should be already fixed.
So I was waiting and waiting and finally upgraded CB to 4.1 CE. Great! Unfortunately my problem still exist.

I’m trying to query for limited number of documents (<100) from bucket that has 9000 elements but I can’t get it work… fast (i mean, less than 30 seconds). It’s ok when only 1 worker is executing this query (few ms to get the result) but when I run for example 60, it’s getting very slow (60s per query (CBQ shows: Elapsed 30s, Execution: 7s, Result Size: 71343B)) and I can see 100% CPU usage in Couchbase metrics. What the hell? I tried with different consistency level - without difference.

Some details:

each document has about 15 properties - it’s basically small, short varchars and integers
bucket has 9,000 records but there might be >2,000,000 in the future
I have primary index and 3 secondary indexes on this bucket (filtering the data by one property which is integer - 0, 1 or 2)
secondary index: CREATE INDEX queue-test-index ON queue(service) WHERE service = 1 USING GSI
query is very simple: SELECT * FROM queue USE INDEX (queue-test-index USING GSI) WHERE service = 1 ORDER BY created LIMIT 60 OFFSET 0
each worker is using different offset but the same secondary index
machine has 4 core CPU, 32GB RAM

Below you can find EXPLAIN result:

[
  {
    "#operator": "Sequence",
    "~children": [
      {
        "#operator": "Sequence",
        "~children": [
          {
            "#operator": "IndexScan",
            "index": "queue-test-index",
            "keyspace": "queue",
            "namespace": "default",
            "spans": [
              {
                "Range": {
                  "High": [
                    "1"
                  ],
                  "Inclusion": 3,
                  "Low": [
                    "1"
                  ]
                }
              }
            ],
            "using": "gsi"
          },
          {
            "#operator": "Parallel",
            "~child": {
              "#operator": "Sequence",
              "~children": [
                {
                  "#operator": "Fetch",
                  "keyspace": "queue",
                  "namespace": "default"
                },
                {
                  "#operator": "Filter",
                  "condition": "((`queue`.`service`) = 1)"
                },
                {
                  "#operator": "InitialProject",
                  "result_terms": [
                    {
                      "expr": "self",
                      "star": true
                    }
                  ]
                }
              ]
            }
          }
        ]
      },
      {
        "#operator": "Order",
        "sort_terms": [
          {
            "expr": "(`queue`.`created`)"
          }
        ]
      },
      {
        "#operator": "Offset",
        "expr": "0"
      },
      {
        "#operator": "Limit",
        "expr": "60"
      },
      {
        "#operator": "FinalProject"
      }
    ]
  }
]

Can somebody help me? I can still use views (which are perfect by the way) but it’s not an ideal solution for me…
Cheers

keshav_m · September 3, 2016, 2:51pm

create index q1 on queue(created, service) where service=1;

select * from
(select meta().id as dockey
from queue use index(q1)
where created is not missing and service=1
order by created limit 60 offset 0) as myqueue
INNER JOIN queue ON KEYS myqueue.docid

Try create the index below and your original query on 4.5.
4.5 has significant improvements here.

create index q1 on queue(created, service) where service=1;

maciej.tomaszewski · September 3, 2016, 6:39pm

Hi, will it work on 4.1? 4.5 CE is not available yet. If yes - I will check your solution today. I made some tests yesterday and it came out that “order by” caused about 30% drop in performance. I hate N1QL so far, really. Everything works perfect except queries. Best, Maciej

keshav_m · September 3, 2016, 9:21pm

Let me just answer to the technical problem.

If you don’t have the right index, no query engine can help the query.
You can try the suggestion above in 4.1 itself. 4.5 will run the original query with better with the right index

maciej.tomaszewski · September 4, 2016, 12:27pm

@keshav_m I tried your solution, unfortunately without luck. Your query (with INNER JOIN) is slower than my original one.
Here are some screenshots:

My original query:

Your query (with INNER JOIN):

It’s not a big difference and I could live with that if it would work on “heavy” load (I mean more than 1 request per second…).
The problem is when I run more than 10 (for example 20, 30, 60 etc) concurrent workers (threads) which are querying Couchbase Server 4.1 CE (as I mentioned, this is the latest CE version I can use and I’m not able to run 4.5 CE without official docker container).

How long does it take to run N1QL query when there are multiple threads in the background? Check this out:

I guess it’s connected to CPU usage and OPS:

Red arrow marks the moment of running multiple workers.

Do you have any idea why views are so fast and does not cause such resources usage when N1QL can’t handle even the simpliest queries?

By the way: there were so many views of this topic during last days - I believe I am not the only one having this problem.
Maybe @cihangirb/@geraldss have something to say?

Best

maciej.tomaszewski · September 6, 2016, 7:46am

Anything? I ran out of ideas.

geraldss · September 6, 2016, 4:51pm

Try the following on 4.5:

CREATE INDEX queue-test-index ON queue(service, created) WHERE service = 1;

SELECT * FROM queue USE INDEX (queue-test-index) WHERE service = 1 ORDER BY service, created LIMIT 60 OFFSET 0;

maciej.tomaszewski · September 6, 2016, 5:28pm

Hi and thanks for your reply @geraldss.
How many times should I repeat - I can’t use 4.5. I’m using latest version of community edition - 4.1.
Why can’t you give us 4.5 CE? EE is available for few months already…
Back to the topic - anything on 4.1?

geraldss · September 7, 2016, 5:33pm

Hi, @prasad can comment on 4.5 CE availability.

hanswong · September 8, 2016, 7:15am

Even the 4.5 the “order by” has bug, so don’t use “order by” in N1QL. it will fetch all the records from data node instead of the query/index node(even I already indexed the field which I used in ‘order by’).
If you need ‘order by’ and pick the top record as I do. I use View to do it as it is the fastest and stable way to make the job done.

my 2 cents.

maciej.tomaszewski · September 8, 2016, 9:56am

@hanswong Thanks for your reply. Actually I’m using views as you suggested - I fetch all the data (with order), then cut it down on application side. But it’s not the best way to do it - sometimes I have to get 2mln records to cut it to the last 60 - it takes a lot of RAM to keep all the data for a while. ORDER BY cause performance drop but it’s not significant - I can live with 3-5 seconds per query, but 30s is a “litte” to much! And This is not my fault, Couchbase Server is not stable when using N1QL and I want to know why and will it be better on 4.5?

@prasad Any news?

hanswong · September 8, 2016, 10:48am

Its better in some cases but not all.
if you are doing like ordering a indexed field with more than 100K filtered records, N1QL will be very slow.
I’ve tried more than 500K - 1.4M indexed records and the result is time-out.
even 4.5 can’t solve this.
View has better performance on sorting.

maciej.tomaszewski · September 8, 2016, 11:47am

I agree - views are incomparably faster and that’s the reason I prefer to model data and then fetch it easily.
In this case I need to extend that functionality (probably) using N1QL to ORDER and apply LIMIT and OFFSET. It can’t be done without SQL-like engine (N1QL in this case) or application (cutting the results after it fetch everything). The good thing is I can still use views limit because I know where is the “end” of expected set of data - position. But it doesn’t change the fact application shouldn’t be responsible for that. It’s still database layer.

If you don’t get the point:

This is my bucket with 2mln ordered documents (let’s say each “x” = set of 60 documents):
±-----------------------------------------------------------------------------------------------+
| x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x |
±-----------------------------------------------------------------------------------------------+
I want to get exactly 60 documents marked with “o”:
±-----------------------------------------------------------------------------------------------+
| x x x x x x x x x x x x x x x x x x x x x x x x x x x x o x x x x x x x x x x x x x x |
±-----------------------------------------------------------------------------------------------+
Currently I have to get all the data before (on the left side of “o”) and then apply an offset manually <- it sucks.

Do you have any idea how to achieve the impossible?

geraldss · September 8, 2016, 5:15pm

Hi @hanswong, I’m not aware of any bug with ORDER BY.

4.5 provides index ordering, which you are not using.

Hi @maciej.tomaszewski, you can test with 4.5 EE, and then deploy on 4.5 CE when available. Running on 4.5 will give you actual information as opposed to speculation.

prasad · September 8, 2016, 7:36pm

Hi @maciej.tomaszewski,
4.5 CE should be available in Dec’2016.

maciej.tomaszewski · September 9, 2016, 9:15am

Hi @geraldss, I believe 4.5 will be faster and better optimized - this is how product development work. Each version should be better than previous one. I will test it on EE in the meantime. I was looking for the best solution because I need it now. It look like I have to use views instead.

Hi @prasad, Thanks for the update! Can’t wait to test it.

keshav_m · September 15, 2016, 9:30pm

@hanswong

Couchbase 4.5 exploits the index ordering to avoid the full fetch and sorting.
If you care to post your query and explain on 4.5, we can take a look. Thanks.