Issue while fetching data from couchbase

himanshu.mps · April 11, 2018, 7:38am

Hi,

I am facing issues while fetching data from couchbase. The SQL is running within milliseconds but the data retrieval is taking time. The size of the data is 30446 KB.

Here are the statistics:

$ curl -o out.txt -w “@curl-format.txt” -s -k -X GET “http://localhost:8080/getProductsObservable?limit=1000”
time_namelookup: 0.016000
time_connect: 0.016000
time_appconnect: 0.000000
time_pretransfer: 0.016000
time_redirect: 0.000000
time_starttransfer: 11.813000
----------
time_total: 14.109000

So ideally it is taking 11.81 seconds to get the data which is huge for the size of the data that is being retrieved. Is there any settings/port that needs to be configured so that it takes less time in transfer of data.

This is how my bucket looks like:

Thanks,
Himanshu

himanshu.mps · April 11, 2018, 9:29am

When I run the similar test using Python, it is really fast. Is Java SDK slow in retrieving the data ?

daschl · April 11, 2018, 9:52am

@himanshu.mps can you show us the code you use for python and java to compare?

himanshu.mps · April 11, 2018, 10:14am

Reactive Java Code:

@Repository
public interface ProductRepo extends ReactiveCouchbaseRepository<Product, String>{
}

@Autowired
ProductRepo productRepo;

@GetMapping("/getProductsObservable")
public Observable<Product> getProductsObservable(@RequestParam int limit) {
	return productRepo.getCouchbaseOperations().findByN1QL(N1qlQuery.simple("select META(pmc).id as _ID, META(pmc).cas as _CAS, pmc.* from pmc USE KEYS (select raw meta().id from pmc where meta().id like \"product_%\" order by meta().id LIMIT " + limit + ") order by meta().id"), Product.class);
}

Non-Reactive Java Code:

@GetMapping("/getRecords")
@ResponseBody
public String getDataByKey(@RequestParam String keyPrefix, @RequestParam(required=false) Integer limit, @RequestParam(required=false) Integer skip) {
	if(skip==null) {
		skip=0;
	}
	if(limit==null) {
		limit=10;
	}
	StringBuffer stringBuffer = new StringBuffer();
	StringBuffer query = new StringBuffer();
	try {
		N1qlParams n1qlParams = N1qlParams.build().pretty(false).disableMetrics(false);
		query.append("select * from pmc USE KEYS ");
		query.append("(select raw meta().id from pmc where meta().id like '").append(keyPrefix).append("%'").append(" order by meta().id LIMIT ").append(limit).append(" OFFSET " ).append(skip).append(") order by META().id");
		
		Instant startTime = Instant.now();
		stringBuffer.append("Query: Start time: " + startTime).append("\n<br/>\n");
		System.out.println("Query: " + query.toString());
		N1qlQueryResult result = template.getCouchbaseBucket().query(N1qlQuery.simple(query.toString(), n1qlParams));
		Instant endTime = Instant.now();
		stringBuffer.append("Query: End time: " + endTime).append("\n<br/>\n");
		stringBuffer.append("Query: Total Time Taken: " + (endTime.toEpochMilli() - startTime.toEpochMilli())).append("\n<br/>\n");
		stringBuffer.append("N1qlMetrics: " + result.info()).append("\n<br/>\n");
		startTime = Instant.now();
		stringBuffer.append("Print Data: Start time: " + startTime).append("\n<br/>\n");
		stringBuffer.append("Count: " + result.allRows().size()).append("\n<br/>\n");
		for (N1qlQueryRow row : result) {
			stringBuffer.append(row.value().toString().substring(0, 60)).append(".....").append("\n<br/>\n");
		}
		endTime = Instant.now();
		stringBuffer.append("Print Data: End time: " + endTime).append("\n<br/>\n");
		stringBuffer.append("Print Data: Total Time Taken: " + (endTime.toEpochMilli() - startTime.toEpochMilli())).append("\n<br/>\n");
		System.out.println(stringBuffer.toString());
		return stringBuffer.toString();
	}catch(Exception e) {
		e.printStackTrace();
	}
	return null;
}

Python code:

from couchbase.cluster import Cluster
from couchbase.cluster import PasswordAuthenticator
from couchbase.n1ql import N1QLQuery
from datetime import datetime
cluster = Cluster('couchbase://localhost')
authenticator = PasswordAuthenticator('pmc', '*******')
cluster.authenticate(authenticator)
cb = cluster.open_bucket('pmc')
print "Query: start time: " + str(datetime.now())
row_iter = cb.n1ql_query(N1QLQuery('select pmc.* from pmc USE KEYS (select raw meta().id from pmc where meta().id like "product_%" order by meta().id LIMIT 1000 OFFSET 100) order by meta().id'))
print "Query: end time: " + str(datetime.now())
print "Print Data: Start time: " + str(datetime.now())	
for row in row_iter: 
  print 'Data: {0}'.format(row['product_attribute1'])
print "Print Data: End time: " + str(datetime.now())

himanshu.mps · April 11, 2018, 10:16am

Couchbase configuration Java:

DefaultCouchbaseEnvironment.builder().queryTimeout(10000000).kvTimeout(10000000).searchTimeout(1000000).socketConnectTimeout(1000000000).managementTimeout(1000000000).connectTimeout(10000000).viewTimeout(10000000).autoreleaseAfter(1000000000).build();

himanshu.mps · April 11, 2018, 10:29am

If you want me to run any code to get the timings to troubleshoot the issue, I can run that as well.

daschl · April 11, 2018, 10:54am

@himanshu.mps thanks for sharing - and what is the time difference you are observing? as in the python code and the java code per query. and how long does it take when executed in the UI?

himanshu.mps · April 11, 2018, 11:31am

This is the result from Pyth0n

Query: start time: 2018-04-11 16:59:29.606000
Query: end time: 2018-04-11 16:59:29.606000
Print Data: Start time: 2018-04-11 16:59:29.606000
Data: 5e3805c4-a7d2-4e42-b361-843d15a6cbf3
Data: d658420f-003b-4103-b23a-bee3dd1fcdf0
Data: 4ab87f63-2782-4aa3-acd1-322c874282d5
Data: fd7152a1-e1e5-496a-813c-70f35dd9f116
Data: 943d57b7-5172-4f14-beb7-877e343d08a6
…
Print Data: End time: 2018-04-11 16:59:39.235000

The query execution is fast in both Java and Python but it makes a lot of difference in time taken to get the data in Java