REQUEST_PLUS does not work as expected

Hi,

I have an integration test for my small app.

Whenever a test runs, it flushes a bucket, drops the primary index, creates a new primary index, and then seeds some test documents.

When I set the REQUEST_PLUS flag on my query, most of the time my tests wait on Couchbase and then Couchbase returns an empty array.

Is there a way to get around this issue? It looks like my Couchbase server can't keep up with dropping and creating a primary index so frequently.
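
To make the sequence concrete, here is a rough sketch of the per-test setup using the Java SDK 2.x API (just for illustration; the bucket name, seed document, and helper name are placeholders, and flush has to be enabled on the bucket):

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.bucket.BucketManager;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;

private static void resetBucket(Bucket bucket) {
    BucketManager manager = bucket.bucketManager();
    manager.flush();                              // 1. flush the bucket (flush must be enabled)
    manager.dropN1qlPrimaryIndex(true);           // 2. drop the primary index (ignore if it is missing)
    manager.createN1qlPrimaryIndex(true, false);  // 3. create a new primary index
    bucket.upsert(JsonDocument.create("seed-1",   // 4. seed a test document
            JsonObject.create().put("type", "fixture")));
}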

Which server version are you on? If not 4.1, just to see if it works, can you try with 4.1?

Thank you @daschl.

I’m on 4.0 at the moment. I will test it out with 4.1 over the weekend.

Hi daschl

Just tested with 4.1. 4.1 does not have the same problem as 4.0, but it still has a problem with REQUEST_PLUS.

This is what my test does (a rough Java SDK sketch of steps 2-4 follows the list):

  1. Flush
  2. Create a new document with “test” as its key
  3. Query with N1QL and verify that there is a document created in step #2.
  4. Delete the document.
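
Sketched with the Java SDK for illustration (my actual test uses the PHP SDK; "mybucket" and the helper name are placeholders), steps 2-4 look roughly like this:

import java.util.List;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.query.N1qlParams;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryRow;
import com.couchbase.client.java.query.consistency.ScanConsistency;

private static void createQueryDelete(Bucket bucket) {
    // 2. create a document with "test" as its key
    bucket.upsert(JsonDocument.create("test", JsonObject.create().put("name", "test")));

    // 3. query with N1QL (REQUEST_PLUS) and verify the document from step 2 is returned
    List<N1qlQueryRow> rows = bucket.query(N1qlQuery.simple(
            "SELECT META().id FROM `mybucket`",
            N1qlParams.build().consistency(ScanConsistency.REQUEST_PLUS))).allRows();
    if (rows.isEmpty()) {
        System.out.println("step 3 failed: the query returned no rows"); // what happens on repeated runs
    }

    // 4. delete the document
    bucket.remove("test");
}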

The test runs fine the first time, but it fails at step #3 when I run the same test multiple times.

I'm using the PHP SDK with CB 4.1.

It makes testing almost impossible.

Hi there,
are there any updates on this?

I have the same problem, but I'm using the Java client (2.2.3) and testing on CB 4.0.
Writing a single document and then retrieving it by submitting an N1QL query does not work in a predictable way.

Same result using STATEMENT_PLUS consistency.
If many documents are persisted, the server hangs for a while and just returns an empty array.

My feeling is that the query times out while waiting for the index to catch up with the latest writes.
However, the response doesn't include any timeout info, which makes it impossible to attempt a retry approach (because it is not possible to distinguish a query that returned 0 rows because there are none from one that timed out while waiting for the index to update).
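
For what it's worth, here is roughly how I dump what the 2.x client reports about a query result (a sketch; the helper name and statement parameter are placeholders):

import java.util.List;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.query.N1qlParams;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;
import com.couchbase.client.java.query.consistency.ScanConsistency;

private static void queryAndInspect(Bucket bucket, String statement) {
    N1qlQueryResult result = bucket.query(N1qlQuery.simple(
            statement, N1qlParams.build().consistency(ScanConsistency.REQUEST_PLUS)));

    List<JsonObject> errors = result.errors();  // server-side errors, if any
    System.out.println("finalSuccess=" + result.finalSuccess()
            + " status=" + result.status()
            + " errors=" + errors
            + " rows=" + result.allRows().size());
    // In the failing case there is no timeout info in the errors and zero rows,
    // which looks exactly like a legitimately empty result.
}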

Thanks in advance for any input.

bump…

Still having the same issue.

@moon0326 we have some integration tests using REQUEST_PLUS in the Java client suite and they pass. Do you think it's possible to create a reproducible code/test case that I can run and see what's going on there?

@daschl

Thank you for the follow up. I will try to come up with a code sample soon.
My tests are passing now after I started using N1QL for everything (including flushing documents); a rough sketch is below. I will dig into the code sample I was using.
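
For example, clearing test data now goes through a DELETE statement instead of a bucket flush, roughly like this (a Java SDK sketch for illustration; "mybucket" and the helper name are placeholders):

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.query.N1qlParams;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.consistency.ScanConsistency;

private static void clearBucket(Bucket bucket) {
    // Remove all documents through the query service instead of flushing the bucket.
    bucket.query(N1qlQuery.simple(
            "DELETE FROM `mybucket`",
            N1qlParams.build().consistency(ScanConsistency.REQUEST_PLUS)));
}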

That would be great. N1QL should work together with KV as well, so if something doesn't behave as expected we should look into it!

@daschl

I realized that this might not be an issue with scan_consistency at all, and it is quite an edge case. It only happens when I flush, drop the primary index, and re-create the primary index. I really don't expect to do that in production. Even in the test environment, deleting documents with N1QL works just fine.

Anyway, here is the test code and a video.

Test code: https://github.com/moon0326/cb-test (take a look at tests/RequestPlusTest.php)
Video: https://www.youtube.com/watch?v=yhyGULXaoUU&feature=youtu.be

@moon0326 hm, this smells a lot like https://issues.couchbase.com/browse/MB-16957 or a variation of it. I'll try to reproduce your code locally and, if I can, open a new issue if there isn't a duplicate yet.

Couchbase version: 4.6.3
Java client version: 2.4.7

Problem:

  • Request plus scan consistency is not working as expected

Steps (multiple threads are performing the following steps at the same time):

  1. Query for a document A by N1QL with "request plus" scan consistency
  2. If it does not exist, then create the document A

Expected behaviour:

  • The document A is created once and not several times

Current result:

  • Several A documents are created despite the request plus setting.
  • If they all had the same key, only one would be created and the rest would get a "document already exists" exception, but that is not our case.

Considerations:

  • The key of the document is different in each iteration.
  • We tried to simulate the failing scenario and realized that request plus works perfectly when there is a delay of more than 40ms between the first request and the second one.

Workaround:

  • At the beginning of the process we check by key whether a marker document exists. If not, we create a document with an expiration time of 2 seconds and a unique key that identifies the user performing the action, so that further requests are ignored while that document exists. That way we keep the data in our database consistent (see the sketch below).
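
A rough sketch of that workaround with the Java SDK (the key prefix and helper name are made up; the 2-second expiry is the one we use):

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.error.DocumentAlreadyExistsException;

private static boolean tryAcquire(Bucket bucket, String userId) {
    String lockKey = "lock:" + userId;  // unique key identifying the user performing the action
    try {
        // The insert fails if the marker already exists; the 2-second expiry cleans it up automatically.
        bucket.insert(JsonDocument.create(lockKey, 2, JsonObject.create().put("user", userId)));
        return true;   // we created the marker, so this request proceeds
    } catch (DocumentAlreadyExistsException e) {
        return false;  // another request already created the marker, so this one is ignored
    }
}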

Questions:

  1. Is there any solution for the request plus issue?
  2. Is our workaround OK?

Thanks in advance!

Java code snippet to test the problem:

import java.util.List;
import java.util.stream.Collectors;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;
import com.couchbase.client.java.error.DocumentAlreadyExistsException;
import com.couchbase.client.java.query.N1qlParams;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;
import com.couchbase.client.java.query.consistency.ScanConsistency;

public class RequestPlusTest {

    public static void main(String... args) throws Exception {
        Cluster cluster = CouchbaseCluster.create(DefaultCouchbaseEnvironment.create(), "localhost");
        Bucket bucket = cluster.openBucket("requestplus");
        bucket.bucketManager().createN1qlPrimaryIndex(true, false); // create the primary index if it does not exist yet

        // Several threads race through the same check-then-insert sequence.
        new Thread(() -> queryAndCreate(bucket, 1)).start();
        //Thread.sleep(40); // a delay of more than 40ms between requests makes the problem disappear
        new Thread(() -> queryAndCreate(bucket, 2)).start();
        new Thread(() -> queryAndCreate(bucket, 3)).start();
        new Thread(() -> queryAndCreate(bucket, 4)).start();
    }

    private static void queryAndCreate(Bucket bucket, int i) {
        try {
            int docId = 1;

            // Query with REQUEST_PLUS, so the index should reflect all writes made so far.
            JsonObject placeholderValues = JsonObject.create().put("id", docId);
            N1qlParams n1qlParams = N1qlParams.build().consistency(ScanConsistency.REQUEST_PLUS);
            N1qlQueryResult result = bucket.query(N1qlQuery.parameterized(
                    "SELECT name FROM requestplus WHERE id = $id", placeholderValues, n1qlParams));

            List<String> names = result.allRows().stream()
                    .map(row -> row.value().getString("name"))
                    .collect(Collectors.toList());
            final String data = names.isEmpty() ? null : names.get(0);
            System.out.println("query " + i + " " + data);

            // Insert only if the query did not see the document.
            if (data == null) {
                JsonObject user = JsonObject.create().put("id", i).put("name", "carlos " + i);
                bucket.insert(JsonDocument.create("u:" + i, user));
                System.out.println("query " + i + " created the user");
            }
        } catch (DocumentAlreadyExistsException ex) {
            System.out.println("document already exists exception: query " + i);
        }
    }
}

Interesting… we do test for this but maybe there's an uncovered case. @subhashni can you check if you see the same thing and open a JCBC issue if so? Thanks!

Hi @charz,

Several customers use a workaround like yours. You'd have to decide the TTL depending on the edge response time for your workload.

Here are some techniques used by Cisco:



Hi Keshav,

thanks for the quick response. It seems that what we are doing right now is OK because it is similar to Cisco's lock as explained in the video, right?