Time Series Query View

java
query

#1

Hi,

I have a bucket with a lot of time series data: about one year's worth, at one document per minute.
I also have a view where the emitted key is the timestamp as an array and the value is an object whose fields are summed by a custom reduce function.

My problem is view query performance.
My code follows the documentation's suggestion (I'm using Java SDK 2.2.6):

ViewResult result = bucket.query(ViewQuery.from("doc_name", "view_name"));

The performance of this call is OK, but the ViewResult class has a really useful method called allRows() that returns all the rows of the query. When I call this method, it seems that the object is lazily evaluated, and it takes too much time to load the data.

I already tried slicing the work across several threads to decrease the total time. I tested with 1, 2, 4, 8, 16 and 32 threads. This strategy is still slow, and all the tests revealed another problem.
The first threads take only milliseconds to finish the job (e.g. 400 ms), but the last threads take about 16 seconds to do exactly the same job.

I need better performance when fetching all the rows, or a way to slice the job so that every slice takes about the same time.

Any help would be appreciated.

Regards,
Angelo


#2

@angeloassis in this case I recommend you take a look at our async API (bucket.async().query(…)), which returns an Observable and asynchronously streams the rows to you as they arrive. This will be the fastest way to access records as the server streams them back.
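
If the work still needs to be split across workers, one possible approach (a sketch, not something confirmed in this thread) is to give each worker a contiguous time range and pass its bounds as the view's start and end keys, for example through ViewQuery's startKey/endKey options with a non-inclusive end. The helper below only computes those bounds and uses nothing but the JDK. The class and method names (TimeSlices, Slice, split, toKey) are hypothetical, and the key layout [year, month, day, hour, minute] is an assumption about how the view emits its timestamp array; adjust toKey to match the real map function.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimeSlices {

    /** Hypothetical holder for one slice, covering keys in [startKey, endKey). */
    public static final class Slice {
        public final int[] startKey;
        public final int[] endKey;

        Slice(int[] startKey, int[] endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // ASSUMPTION: the view key is [year, month, day, hour, minute] with a 1-based month.
    public static int[] toKey(LocalDateTime t) {
        return new int[] {
            t.getYear(), t.getMonthValue(), t.getDayOfMonth(), t.getHour(), t.getMinute()
        };
    }

    /** Splits [start, end) into `parts` contiguous slices of near-equal length in minutes. */
    public static List<Slice> split(LocalDateTime start, LocalDateTime end, int parts) {
        long totalMinutes = Duration.between(start, end).toMinutes();
        if (parts <= 0 || totalMinutes <= 0) {
            throw new IllegalArgumentException("need parts > 0 and start before end");
        }
        List<Slice> slices = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            // Integer arithmetic keeps the boundaries contiguous: slice i ends where i + 1 begins.
            LocalDateTime from = start.plusMinutes(totalMinutes * i / parts);
            LocalDateTime to = start.plusMinutes(totalMinutes * (i + 1) / parts);
            slices.add(new Slice(toKey(from), toKey(to)));
        }
        return slices;
    }

    public static void main(String[] args) {
        // Example dates are illustrative only: one year split into 4 slices.
        List<Slice> slices = split(
            LocalDateTime.of(2015, 1, 1, 0, 0),
            LocalDateTime.of(2016, 1, 1, 0, 0),
            4);
        for (Slice s : slices) {
            System.out.println(Arrays.toString(s.startKey) + " -> " + Arrays.toString(s.endKey));
        }
    }
}
```

Each slice could then be issued as its own query (blocking or through bucket.async()), with the end key treated as exclusive so that no row is read twice. Whether this evens out the per-thread timings depends on how the original slicing was done, which the thread does not say.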