Slow query, 4 seconds, for 300 elements returned in a 40k dataset

spatial

#1

I find Couchbase spatial queries very slow for the amount of data I have.

I have an Amazon EC2 xlarge instance with 16 GB RAM and 4 vCores.

I use two buckets:
cold_storage: keeps all the documents.
location: documents expire after 12 hours. It’s used as a cache bucket and should be small and fast.

Number of documents in each bucket:
http://puu.sh/kS0B1/d7504afd70.png

top in ubuntu
http://puu.sh/kS1l0/2a62a167bb.png

http://puu.sh/kS1nF/9ad03955d6.png

CPU usage in the last minute:
http://puu.sh/kS1MU/6f249c53b2.png

The following stats are measured DAILY:
http://puu.sh/kS0Ky/f05c65a73c.png

cold_storage bucket (big one)
http://puu.sh/kS0Sy/64b016a6b6.png

http://puu.sh/kS0TL/e6694a3244.png

location bucket (smaller one)
http://puu.sh/kS0XG/0b13b6093e.png

http://puu.sh/kS0ZH/2cb0e1ccb9.png

I have the following view file in both buckets.
http://puu.sh/kS1uZ/a61ba129c9.png

Reads
http://puu.sh/kS1ba/e67221657c.png

http://puu.sh/kS1cR/f9fb1e17cd.png

Writes
http://puu.sh/kS1ff/9a302295c7.png

I would appreciate any insight. I want to move our entire infrastructure to Couchbase, but first I want to make sure I completely understand its performance.


#2

Hi bogdantirca,

thanks for the detailed description and sorry for coming back to you so late.

I wouldn’t expect it to be so slow. From your code it looks like you’re using the default staleness (which should be update_after). Can you please try with stale=ok, just to make sure it’s not the indexing that slows things down?

Another note: using the location bucket as a kind of cache won’t help indexing performance. Expiry of a document is the same as a delete, which means the index needs to be updated.
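To rule indexing in or out, the staleness can be set explicitly on a raw view request. The following is a minimal sketch, assuming the view REST API is reachable on port 8092 and that spatial views accept the `bbox` and `stale` query parameters. The design document and view names in the usage note are hypothetical placeholders, so check them and the parameters against the documentation for your server version.

```javascript
// Builds the URL for a spatial view query with an explicit staleness setting.
// Assumptions: view REST API on port 8092, `bbox` and `stale` query parameters.
function buildSpatialQueryUrl({
  host = 'localhost',
  port = 8092,
  bucket,
  designDoc,
  view,
  bbox,
  stale = 'ok',
}) {
  const allowedStale = ['ok', 'update_after', 'false'];
  if (!allowedStale.includes(stale)) {
    throw new Error(`stale must be one of ${allowedStale.join(', ')}`);
  }
  // bbox is [west, south, east, north]; all four values must be finite numbers.
  if (!Array.isArray(bbox) || bbox.length !== 4 || !bbox.every(Number.isFinite)) {
    throw new Error('bbox must be an array of four finite numbers');
  }
  const path = [bucket, '_design', designDoc, '_spatial', view]
    .map(encodeURIComponent)
    .join('/');
  return `http://${host}:${port}/${path}?bbox=${bbox.join(',')}&stale=${stale}`;
}
```

For example, `buildSpatialQueryUrl({ bucket: 'location', designDoc: 'locations', view: 'by_position', bbox: [-10, -10, 10, 10] })` gives a URL that can be pasted into curl or a browser. Running the same query with `stale: 'false'` and comparing the timings shows how much of the latency comes from index updates.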


#3

In the Node.js SDK, the default stale value is “ok”. I also set it to “ok” manually, with no visible change.

I’m using the smaller “location” bucket as a cache because most of my gets are for recent locations, which stay in the “location” bucket.

Lately I’m getting the following results on average:

  • 137ms for 10 items, 100k bucket
  • 142ms for 27 items, 100k bucket
  • 551ms for 12 items, 1.5M bucket
  • 1739ms for 100 items, 1.5M bucket
  • 4064ms for 300 items, 1.5M bucket
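A quick way to read the 1.5M-bucket numbers above is to split them into a per-row cost and a fixed overhead. The sketch below only does arithmetic on the three reported timings. The two-point slope is a rough estimate, not a proper fit.

```javascript
// Timings reported above for the 1.5M-document bucket.
const measurements = [
  { items: 12, ms: 551 },
  { items: 100, ms: 1739 },
  { items: 300, ms: 4064 },
];

// Average latency per returned row for one measurement.
function msPerItem({ items, ms }) {
  return ms / items;
}

// Extra latency per additional row between two measurements.
function marginalMsPerItem(a, b) {
  return (b.ms - a.ms) / (b.items - a.items);
}

for (const m of measurements) {
  console.log(`${m.items} items: ${msPerItem(m).toFixed(1)} ms/item`);
}

const slope = marginalMsPerItem(measurements[0], measurements[2]);
console.log(`marginal cost, 12 to 300 items: ${slope.toFixed(1)} ms/item`);
```

The marginal cost comes out at roughly 12 ms per additional row, which suggests the latency grows with the number of rows returned rather than being dominated by a fixed per-query cost.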

#4

@bogdantirca Hi, I’m also finding spatial view queries slow. How is your cluster set up?


#5

I tried things out locally, using a dataset with 1M rectangles and queries that return result sets of different sizes. I benchmarked it with wrk. With 10 concurrent connections I got the following (I’m posting the maximum latency of each 60s run):

  • 229.19ms for 4 items
  • 552.07ms for 122 items
  • 1006ms for 1334 items

Is your client in the same data center as your Couchbase instance? Could it be that the network connection is slow? You might want to try raw HTTP requests, to see whether it’s a client or a server issue.


#6

The Node.js server is on the same machine as the Couchbase instance, and I’m only using one Couchbase server.


#7

My test was also on a single Couchbase instance on my local laptop. Could you try doing the HTTP requests directly, either with Node.js or some other tool?
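One way to do the direct requests from Node.js without the SDK is sketched below, using only the built-in `http` module. The function names are made up for illustration, and the URL passed in is whatever spatial view URL your server exposes.

```javascript
const http = require('http');

// Issues one GET request and resolves with status, body size and elapsed ms.
function timeRequest(url) {
  return new Promise((resolve, reject) => {
    const start = process.hrtime.bigint();
    http.get(url, (res) => {
      let bytes = 0;
      res.on('data', (chunk) => { bytes += chunk.length; });
      res.on('end', () => {
        const ms = Number(process.hrtime.bigint() - start) / 1e6;
        resolve({ status: res.statusCode, bytes, ms });
      });
    }).on('error', reject);
  });
}

// Summarises a list of latencies (in ms) as min, median and max.
function summarize(latencies) {
  if (latencies.length === 0) throw new Error('no samples');
  const sorted = [...latencies].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
  return { min: sorted[0], median, max: sorted[sorted.length - 1] };
}

// Runs the same query `runs` times, one after another, and summarises it.
async function benchmark(url, runs = 10) {
  const latencies = [];
  for (let i = 0; i < runs; i++) {
    const { ms } = await timeRequest(url);
    latencies.push(ms);
  }
  return summarize(latencies);
}
```

Calling `benchmark(url).then(console.log)` with the view URL and comparing the result with the SDK timings should show whether the time is spent in the client or in the server.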


#8

Another idea (thanks to @simonbasle): can you try to compact your spatial view and see if it improves the query response time?
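Compaction of spatial views can be triggered over REST. The sketch below assumes the endpoint is `POST /<bucket>/_design/<ddoc>/_spatial/_compact` on the view port 8092; please verify that against the documentation for your server version. The helper names and the credentials format are illustrative assumptions.

```javascript
const http = require('http');

// Builds the request options for compacting the spatial views of one
// design document. Assumed endpoint: /<bucket>/_design/<ddoc>/_spatial/_compact
function buildCompactRequest({ host = 'localhost', port = 8092, bucket, designDoc }) {
  return {
    host,
    port,
    method: 'POST',
    path: `/${encodeURIComponent(bucket)}/_design/${encodeURIComponent(designDoc)}/_spatial/_compact`,
    headers: { 'Content-Type': 'application/json' },
  };
}

// Sends the compaction request and resolves with the HTTP status code.
// `auth` is an optional 'user:password' string for HTTP basic auth.
function compactSpatialViews(options, auth) {
  return new Promise((resolve, reject) => {
    const req = http.request({ ...buildCompactRequest(options), auth }, (res) => {
      res.resume(); // discard the response body
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end();
  });
}
```

After triggering compaction, rerun the same query a few times to see whether the response time changes.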


#9

None of those fixed the problem.
I recently discovered that my index tab is empty: http://puu.sh/l5juX/7cfba2de24.png
It might be that my views are not indexed?


#10

Sorry for the late reply, I somehow didn’t get notified that you posted. The indexing tab is for secondary indexes and not for views. The spatial views will show up in the views tab.

I’m running out of ideas. If I could get access to your dataset (or a dataset that shows similar issues), that would be great, as I could then debug the issue more easily.