Java SDK v3 deficiencies

I’ve got a wrapper library around the Couchbase client, which is fortunate due to the changing underlying implementation.

The current things I’ve identified as gaps (so far) are:

  1. Counters. Appear to be totally missing.
    ANSWER: This is in BinaryCollection. Trying to figure this out now.
  2. Documentation. For example, there’s no documentation on how to perform bulk operations, as there was with 2.x.
  3. Append. I’d swear I saw something about this, but I haven’t been able to find it…
    ANSWER: This is in BinaryCollection. Trying to figure this out now.

-H


https://docs.couchbase.com/java-sdk/current/project-docs/migrating-sdk-code-to-3.n.html

That is a reasonable place to start.


Hi unhuman,

Thanks for the feedback – we really appreciate it, especially at this early stage in the 3.x line.

Append and Counters have moved to the BinaryCollection API, as mentioned in the migration guide. Here’s a little example that demonstrates calling Collection.binary() to access the binary operations:

import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.kv.IncrementOptions;

Cluster cluster = Cluster.connect("localhost", "Administrator", "password");
Collection c = cluster.bucket("default").defaultCollection();
c.binary().increment("myCounter", IncrementOptions.incrementOptions().initial(0));
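
Append works the same way; here’s a minimal sketch (note that “myDoc” is just an example id here, and the document must already exist or the append fails):

    import java.nio.charset.StandardCharsets;

    // Append raw bytes to an existing binary document.
    c.binary().append("myDoc", "more data".getBytes(StandardCharsets.UTF_8));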

To your second point, yes, the documentation is still pretty thin. We’ve been working hard to align the SDK release with the release of Couchbase Server 6.5, and as a result the SDK documentation is not as complete as we’d like… but we’re moving in the right direction, I hope.


Hi unhuman,
I echo David’s point that yes, we are not yet where we want to be with the SDK 3 documentation. You should see much of the shortfall remedied over the next month.
Meanwhile, you can find bulk ops at:
https://docs.couchbase.com/java-sdk/3.0/howtos/concurrent-async-apis.html#batching
Perhaps not the most intuitive place - we have amalgamated the bulk ops & async docs to even out coverage across the SDKs - but it will show up if you search for “java bulk” in our search bar.
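For a rough idea of the shape, here’s a minimal sketch (assuming an existing Collection named collection, with Reactor on the classpath):

    import java.util.Arrays;
    import java.util.List;
    import com.couchbase.client.java.kv.GetResult;
    import reactor.core.publisher.Flux;

    // Fan the individual reactive operations out with flatMap,
    // then gather the results back into a list.
    List<GetResult> results = Flux
        .fromIterable(Arrays.asList("id-1", "id-2", "id-3"))
        .flatMap(id -> collection.reactive().get(id))
        .collectList()
        .block();
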
Thanks for being an early adopter of the 3.0 SDK!


The migration has been going pretty well so far, but I’ve run into a few issues that are fairly severe for us:

  1. The N1QL DSLs were removed. We’d built a query framework around them. What’s the reasoning behind their elimination? Seems like I could copy the entire DSL over into our own repo, but… that doesn’t seem right.

  2. Bulk operations are a bit tricky. They work in the success path, but I’ve got some issues with error handling. For example, here’s my logic for bulk writes:

     // parameter passed in: Map<K, T> entities

     Flux
         .fromIterable(entities.keySet())
         .flatMap(key -> reactiveCollection
             .upsert(getKeyProvider().getCouchbaseKey(key), entities.get(key), upsertOptions)
             .onErrorResume(e -> {
                 if (LOG.isDebugEnabled()) {
                     LOG.debug("Failed to upsert document with Id: " + key.toString(), e);
                 }
                 errorResults.put(key, e);
                 // Note: MutationResult can't be null, so on error we return
                 // Mono.empty() and rely on blockLast() to drain the stream.
                 return Mono.empty();
             }))
         .doOnNext(successResults::add)
         .blockLast();

I changed from the example .last().block() to .blockLast() to get past:
java.util.NoSuchElementException: Flux#last() didn't observe any onNext signal
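
For reference, the difference shows up with any empty Flux; a minimal sketch (no Couchbase involved):

    import reactor.core.publisher.Flux;

    // last() errors out when the Flux completes without emitting anything:
    // Flux.empty().last().block();  // would throw the NoSuchElementException above
    Object last = Flux.empty().blockLast();  // blockLast() just returns null instead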

When multiple documents are written and one of them is bad (in my test, the key is too long), the “good” writes fail with RequestCanceledException. This isn’t desirable: we still want the good writes to go through.

Another bad case I had: if a null key is passed in, the blockLast() blows up with:
java.lang.NullPointerException: The iterator returned a null value
I’m not so worried about this case because having a null key is pretty invalid. It worked, however, with 2.x.

One thing I’d like to try is returning my own FailedMutationResult instead of Mono.empty() (which sort of breaks things and forces the blockLast() workaround), but that’s not possible because the MutationResult constructor isn’t accessible (it’s package-private).
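
For illustration, one shape this could take; a sketch, where UpsertOutcome is a made-up wrapper (not an SDK type) and entities/reactiveCollection come from the snippet above:

    import com.couchbase.client.java.kv.MutationResult;
    import reactor.core.publisher.Flux;
    import reactor.core.publisher.Mono;

    // Made-up envelope type (not part of the SDK): every key yields a value,
    // success or failure, so the stream never completes empty.
    class UpsertOutcome<K> {
        final K key;
        final MutationResult result;  // null on failure
        final Throwable error;        // null on success

        UpsertOutcome(K key, MutationResult result, Throwable error) {
            this.key = key;
            this.result = result;
            this.error = error;
        }
    }

    // Usage: map successes and failures into the same type, so nothing
    // gets dropped and blockLast() is no longer needed to paper over gaps.
    Flux
        .fromIterable(entities.keySet())
        .flatMap(key -> reactiveCollection
            .upsert(getKeyProvider().getCouchbaseKey(key), entities.get(key), upsertOptions)
            .map(result -> new UpsertOutcome<>(key, result, null))
            .onErrorResume(e -> Mono.just(new UpsertOutcome<>(key, null, e))))
        .collectList()
        .block();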


Okay, so this is tricky. We thought about bringing the DSL forward but felt it wasn’t worth it: while it works in simple cases, it falls apart on complex queries because of the difficulty of expressing all of it in the Java language. It is also tricky to keep extending it as N1QL changes and to cover it all.

One potential remedy: I could see this evolving as a separate project somewhere, just not as part of the SDK directly. I still think there is value in it; we just haven’t seen as much value in the field as we’d hoped initially. And 3.0 felt like a chance to “clean” things up a bit.


With regard to your problems with Reactor - can you show me the 2.x equivalent that worked? I can help you come up with a 3.x equivalent with the same semantics (modulo certain things that RxJava accepted but Reactor is a little stricter about, though we might be able to work around those as well).

If you have a standalone sample I can run with what is expected and what should happen, I can work with that too.


I will create one. Should be pretty trivial.

The existing 2.x code (with our internal underlying implementations):

        // Map to collect exceptions during async inserts, use ConcurrentHashMap since it needs to be thread-safe
        Map<K, OperationStatus> upsertExceptions = new ConcurrentHashMap<>();

        // See: http://docs.couchbase.com/developer/java-2.1/documents-bulk.html
        Observable
                .from(documents.entrySet())
                .flatMap(entry -> getCouchProvider().getBucket().async()
                        .upsert(entry.getValue(), getCouchProvider().getDefaultPersistTo())
                        .doOnError(e -> {
                            if (LOG.isDebugEnabled()) {
                                LOG.debug("Could not upsert document with Id: " + entry.getKey(), (Throwable) e);
                            }
                            upsertExceptions.put(entry.getKey(), new OperationStatus((Throwable) e));
                        })
                        .onErrorResumeNext(Observable.<Document>empty()))
                .toList()
                .toBlocking()
                .single();

Still need to put together examples…

@unhuman In your example I’m not sure you even need the list creation at the end, unless you are actually doing something with the result in your real-world code. If the list goes away the stream might be empty, so you’d need to switch to singleOrDefault(null) or so.

Otherwise I think you can pretty much translate those operators to Reactor 1:1:

Flux
  .fromIterable(data.entrySet())
  .flatMap(entry -> collection
    .reactive()
    .upsert(entry.getKey(), entry.getValue())
    .onErrorResume(throwable -> {
      System.err.println("Got error: " + throwable);
      // write to your map here
      return Mono.empty();
    }))
  .blockLast();

You can combine the logging into the onErrorResume as well, I think, like in the example above.

Yep, for our batch processing, we handle each response.

This isn’t as bad as I first thought.

But… I can still highlight a problem.

Attached, please find a zip file containing 2 different versions of my test code. I don’t necessarily think this is a v2 vs. v3 problem, but… still some funny business in v3.

CouchbaseV2V3BulkComparison.zip (16.0 KB)

Now, the problem is caused by the key being too long. In 2.x one cannot create a document with a key that is too long, so the problem never actually gets to the upsert; this example really couldn’t happen with v2.

If you look at the code (as provided) for the 3.x version, it fails all 3 documents:

class com.couchbase.client.core.error.ReplicaNotConfiguredException Not enough replicas configured on the bucket: Could not upsert document with Id: U:king_arthur_jr
class com.couchbase.client.core.error.CouchbaseException UpsertRequest failed with unexpected status code: Could not upsert document with Id: U:king_arthur_jr1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
class com.couchbase.client.core.error.RequestCanceledException UpsertRequest: Could not upsert document with Id: U:king_arthur
Key: U:king_arthur_jr Success: false
Key: U:king_arthur_jr1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890 Success: false
Key: U:king_arthur Success: false

If the arthur_baby entity with the very long key is commented out (lines 48-54), then it works as expected:

class com.couchbase.client.core.error.ReplicaNotConfiguredException Not enough replicas configured on the bucket: Could not upsert document with Id: U:king_arthur_jr
Key: U:king_arthur_jr Success: false
Key: U:king_arthur Success: true

But I wouldn’t expect the failed UpsertRequest (the CouchbaseException above) to force-cancel the “good” request (the RequestCanceledException).

EDIT: one more issue… If you add a null-keyed value to the map, it totally breaks everything… Something like:

        entities.put(null, arthur);

Again, this probably falls into a similar category (nobody should ever do this), but the code just flat-out fails in that case.

Hi @unhuman,

This is a bit of a guess, but when I’ve seen this pattern happen in my own testing (a failed request followed by a RequestCanceledException on the next), it’s usually because the first failure has caused the KV node we’re connected to to drop the connection. It can do this for safety reasons, e.g. if it believes it may have corrupted data and can no longer trust the framing. It takes a little time for the client to recreate this connection, which is why the RequestCanceledException happens next.
You can check this by looking at the KV logs (/opt/couchbase/var/lib/couchbase/logs/memcached.log.0000*.txt if on Linux). The KV service is usually good at logging why it disconnected - which, from the above, looks like it’s related to the large key.
I’ve opened two tickets to investigate the large key and null key issues further.
https://issues.couchbase.com/browse/JCBC-1586
https://issues.couchbase.com/browse/JVMCBC-815

@unhuman The too-long-key validation is fixed here: http://review.couchbase.org/#/c/121860/ so that will make it into 3.0.2.

I could not reproduce the other issue in isolation; there must be something going on in your code. Can you try a collection.upsert(null, null) or collection.upsert(null, value) in isolation? It should throw an invalid argument exception (same for the reactive equivalent).
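
Something like this, for instance (a sketch; assumes a connected Collection named collection):

    import com.couchbase.client.java.json.JsonObject;

    // The null id should fail fast with an argument error rather than
    // cancelling unrelated in-flight requests.
    try {
        collection.upsert(null, JsonObject.create());
    } catch (Exception e) {
        System.out.println("blocking upsert failed with: " + e.getClass().getName());
    }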


It’s 100% reproducible for me here. I’ll update again with an example.

Here you go. Couchbase3xBulkInsert.v2.zip (8.2 KB)

All you have to do is run this… and you’ll see what the null key produces:

Exception in thread "main" java.lang.NullPointerException: The iterator returned a null value
	at java.base/java.util.Objects.requireNonNull(Objects.java:246)
	at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:230)
	at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:201)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.onSubscribe(FluxFlatMap.java:363)
	at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:139)
	at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:63)
	at reactor.core.publisher.Flux.subscribe(Flux.java:8143)
	at reactor.core.publisher.Flux.blockLast(Flux.java:2481)
	at Couchbase3xBulkInsert.main(Couchbase3xBulkInsert.java:82)
	Suppressed: java.lang.Exception: #block terminated with an error
		at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:93)
		at reactor.core.publisher.Flux.blockLast(Flux.java:2482)
		... 1 more

Again, what we’re doing is really bad - a null document key - but it was something we had tested for previously… although I think in 2.x it broke at document creation instead.
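
For what it’s worth, this looks reproducible without Couchbase at all: Reactive Streams forbids null elements, so Flux.fromIterable rejects the null key before the SDK ever sees it. A minimal sketch:

    import java.util.HashMap;
    import java.util.Map;
    import reactor.core.publisher.Flux;

    // A HashMap happily stores a null key, but Reactor rejects it as a
    // stream element the moment the keySet() is iterated.
    Map<String, String> entities = new HashMap<>();
    entities.put(null, "value");
    Flux.fromIterable(entities.keySet()).blockLast();
    // => java.lang.NullPointerException: The iterator returned a null value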
