Couchbase querying performance issue

java
n1ql

#1

Hi,

We are migrating to use couchbase for storage and i was doing some tests to mesure the performance.
We have installed server 4.6 and I’m using the java client 2.4.3.

The query I’m using is the following:

SELECT T., Product., Account.* FROM MyBucket T
JOIN MyBucket Product ON KEYS T.product_ref
JOIN MyBucket Account ON KEYS T.account_ref
where T.type=‘Transaction’ and Product.type = ‘Product’ and Account._type=‘Account’

I have indexed the field _type and my bucket contains 300 000 documents.

I’m using the following code for test:

N1qlQuery query = N1qlQuery.simple(queryString);
final AtomicInteger count=new AtomicInteger(0);
CountDownLatch countDownLatch=new CountDownLatch(1);
bucket.async().query(query)
.timeout(60000, TimeUnit.SECONDS) //global timeout for the query
//.flatMap(asyncN1qlQueryResult -> asyncN1qlQueryResult.rows()) // //browse the resulting rows
.flatMap(asyncN1qlQueryResult ->asyncN1qlQueryResult.errors()
.flatMap(error ->
Observable.error(
new RuntimeException("Error while performing query: "
+ error)))
.switchIfEmpty(asyncN1qlQueryResult.rows()))
.map(asyncN1qlQueryRow -> asyncN1qlQueryRow.value()) ////extract the value of rows that were obtained
.subscribe(new Subscriber() {
@Override
public void onCompleted() {
System.out.println(“complete”);
countDownLatch.countDown();
}
@Override
public void onError(Throwable e) {
System.out.println(“Error…”);
e.printStackTrace();
countDownLatch.countDown();
//throw new RuntimeException(e);

  	}
  	@Override
  	public void onNext(JsonObject t) {
  		//System.out.println("next.."+t);
  		count.incrementAndGet();				
  	}
  });		
  try {
  	countDownLatch.await();
  	System.out.println("Finished in "+((System.currentTimeMillis()-start)/1000));
  	System.out.println("Found "+count.intValue());
  } catch (InterruptedException e1) {
  	e1.printStackTrace();
  }

When testing on my local couchbase server, the query finishes in 37 seconds which is high!

But, when testing the same code on a remote couhbase server,I’m surprised about the result:

java.lang.RuntimeException: Error while performing query: {“msg”:“Timeout 1h0m0s exceeded”,“code”:4080}
Finished in 3620
Found 0
at …
at rx.internal.operators.OnSubscribeMap$MapSubscriber.onNext(OnSubscribeMap.java:69)
at rx.internal.operators.OnSubscribeMap$MapSubscriber.onNext(OnSubscribeMap.java:77)
at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber.call(OperatorObserveOn.java:224)
at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

The query didnn’t finish even after 1 houre (I have set 1 hour as a query timeout when building my CouchbaseEnvironment.

Any help please about this results?


#2

Post the EXPLAIN output of query and index definitions.
Also set pretty=false and run the query in cbq shell and measure timings.


#3

Thanks vsr for your quick reply

Here is the execution plan;

[{
“plan” : {
"#operator" : “Sequence”,
"~children" : [{
"#operator" : “IndexScan”,
“index” : “type_index”,
“index_id” : “8408eff485a764df”,
“keyspace” : “MyBucket “,
“namespace” : “default”,
“spans” : [{
“Range” : {
“High” : [
”“Transaction””
],
“Inclusion” : 3,
“Low” : [
"“Transaction”"
]
}
}
],
“using” : “gsi”
}, {
"#operator" : “Parallel”,
"~child" : {
"#operator" : “Sequence”,
"~children" : [{
"#operator" : “Fetch”,
“as” : “T”,
“keyspace” : “MyBucket “,
“namespace” : “default”
}, {
#operator” : “Join”,
“as” : “Product”,
“keyspace” : “MyBucket “,
“namespace” : “default”,
“on_keys” : “(T.Product_ref)”
}, {
#operator” : “Join”,
“as” : “Account”,
“keyspace” : “MyBucket “,
“namespace” : “default”,
“on_keys” : “(T.account_ref)”
}, {
#operator” : “Filter”,
“condition” : “(((((T._type) = “Transaction”) and ((Product._type) = “Product”)) and ((Account._type) = “Account”)) and (not ((T.Transaction_date) = “0”)))”
}, {
"#operator" : “InitialProject”,
“result_terms” : [{
“expr” : “T”,
“star” : true
}, {
“expr” : “Product”,
“star” : true
}, {
“expr” : “Account”,
“star” : true
}
]
}, {
"#operator" : “FinalProject”
}
]
}
}
]
},
“text” : “SELECT T., Product., Account.* FROM MyBucket T \r\n JOIN MyBucket Product ON KEYS T.Product_ref\r\n JOIN MyBucket Account ON KEYS T.account_ref \r\nwhere T._type=‘Transaction’ and Product._type = ‘Product’ and Account._type=‘Account’ and T.Transaction_date != ‘0’”
}
]

for the index, I have only the secondary index:

CREATE INDEX type_index ON MyBucket(_type)


#4

Try the following index and set pretty=false. See how long it takes in cbq shell

CREATE INDEX transactiondate_idx ON MyBucket(Transaction_date) WHERE _type = 'Transaction'


#5

Hi vsr1,

I have created the index and set pretty=false.
The result is:

“status”: “success”,
“metrics”: {“elapsedTime”: “42.107263743s”,“executionTime”: “42.107222927s”,“resultCount”: 109197,“resultSize”: 80271289}
}


#6

With the index query plan is optimized. As you have 3 way join and r results size is 80MB it taking time to process and transport it.