Data loss, slow writes, and frequent timeouts after auto-failover

I don’t know your actual numbers, but keep in mind that after a failover you have one node less in the cluster to handle your requests; it should still be able to withstand the load, more or less.

Can you share some actual numbers that you are seeing?

toBlocking() should not be used together with the retry builder, since the pipeline is still async at that point. You can block downstream of the observable, at the very end, if you want to.
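To illustrate the idea of staying asynchronous through the retries and blocking only once at the very end, here is a minimal sketch in plain Java using CompletableFuture. The class and method names (`asyncGet`, `withRetry`) are hypothetical stand-ins for the SDK's async calls, not the Couchbase API itself:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockLast {
    // Hypothetical async operation that fails the first two times,
    // simulating transient errors during a failover.
    static final AtomicInteger attempts = new AtomicInteger();

    static CompletableFuture<String> asyncGet(String key) {
        return CompletableFuture.supplyAsync(() -> {
            if (attempts.incrementAndGet() < 3) {
                throw new RuntimeException("temporary failure");
            }
            return "value-for-" + key;
        });
    }

    // Retry the async call up to `max` times, staying async throughout;
    // the caller decides where (and whether) to block.
    static CompletableFuture<String> withRetry(String key, int max) {
        return asyncGet(key).handle((v, t) -> {
            if (t == null) {
                return CompletableFuture.completedFuture(v);
            }
            if (max <= 1) {
                CompletableFuture<String> failed = new CompletableFuture<>();
                failed.completeExceptionally(t);
                return failed;
            }
            return withRetry(key, max - 1);
        }).thenCompose(f -> f);
    }

    public static void main(String[] args) {
        // Block only once, at the very end of the pipeline.
        String v = withRetry("doc1", 4).join();
        System.out.println(v); // value-for-doc1 (after 2 simulated failures)
    }
}
```

The point is the shape: the retries compose on the async future, and the single blocking call (`join()`) sits at the end, which is exactly where `toBlocking()` belongs in the RxJava version.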

Now I need to bring millions of records into Couchbase, and if I hit a node failure the speed becomes intolerable.
What can I do?

So you are performing a bulk import? Or is it part of an OLTP application? This changes the retry semantics quite a bit, because in the bulk case you can mostly accept longer latency and go for maximum throughput.

no,

I need the data to complete the verification process for the files uploaded in the web app.

And the data in Couchbase Server is updated frequently.

@xiger are you able to share the full code to reproduce, along with some logs? That way we can have an accurate reproduction case here and help you move forward.

The situation is this: :smile:

We have not used Couchbase Server in the web app before.

We used a memcached server to store the data, to keep the parsing and verification of uploaded files fast. Now, because of memcached’s poor availability, we are replacing memcached with Couchbase Server.
Because the data changes over time, we also do data synchronization.
Our storage environment: SQL Server and memcached, subsequently moving to SQL Server and Couchbase Server.

I find .toBlocking() works the same as the latch.await() above.

@xiger toBlocking() is more flexible in that you can do much more with it. The latch works fine for a single result, but it gets trickier if you are iterating over the results in a blocking fashion. Also, toBlocking() is idiomatic RxJava code. You don’t need to stick to it, but it’s recommended.
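For a single result, the two blocking styles compare roughly like this. This is a sketch in plain Java using CompletableFuture rather than the SDK’s observables (`asyncGet` is a hypothetical stand-in), just to show why blocking at the end of the pipeline is less ceremony than a latch:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class LatchVsBlocking {
    // Hypothetical async operation standing in for an async SDK call.
    static CompletableFuture<String> asyncGet(String key) {
        return CompletableFuture.supplyAsync(() -> "value-for-" + key);
    }

    public static void main(String[] args) throws InterruptedException {
        // Latch style: fine for one result, but awkward to scale to many.
        CountDownLatch latch = new CountDownLatch(1);
        AtomicReference<String> result = new AtomicReference<>();
        asyncGet("doc1").thenAccept(v -> { result.set(v); latch.countDown(); });
        latch.await();
        System.out.println(result.get());

        // Blocking at the end of the pipeline: one call, same effect,
        // and it extends naturally to iterating over many results.
        String v = asyncGet("doc1").join();
        System.out.println(v);
    }
}
```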

Oh, I see. Thank you.

I want to ask a question:

When I handle the node failure exception in my code, and use .toBlocking() to retrieve the data without a timeout, I cannot use the retry builder. So how can I catch the exception and retry the CRUD operation?

hi,

.retryWhen(anyOf(Exception.class).max(4).delay(Delay.exponential(TimeUnit.SECONDS))));

I cannot find this API. Where is it?

You need to use at least 2.1.2 to have it available.

see this other topic where I tried to provide a base rewrite of the code with the RetryBuilder: The problem of bulk get
I’ll edit it to make the insert cold :cold_sweat:

You need to statically import the RetryBuilder.anyOf method (or prefix it with the class name). Also note that there should be a .build() after delay(...).
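Putting both corrections together, the line from above would look like the following. This is a fragment, not a runnable program: it assumes the Couchbase Java SDK 2.1.2+ on the classpath and an existing async observable (here called `observable`) to retry:

```java
import static com.couchbase.client.java.util.retry.RetryBuilder.anyOf;
import com.couchbase.client.core.time.Delay;
import java.util.concurrent.TimeUnit;

// retry any Exception up to 4 times, with exponential backoff in seconds
observable.retryWhen(
    anyOf(Exception.class)
        .max(4)
        .delay(Delay.exponential(TimeUnit.SECONDS))
        .build());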

Thank you very much.
I have succeeded.

but the data loss still happens.

So you are calling your add method 1000 times, and it returns true every time? What does the calling code look like? What are the keys like? Are they generated?

Do you see the count difference even before rebalancing (e.g. by looking in the web console to see how many documents are listed)?

Note that I’ve changed the code sample referred to above, to make the insert happen inside a flatMap.
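The reason for the flatMap change: an observable returned directly by an async insert is already “hot” once created, so a retry would not actually re-run the insert; creating it inside flatMap means it is re-executed on each (re)subscription. A sketch, assuming the async bucket API and the RetryBuilder from SDK 2.1.2+ discussed above (`documents` is a hypothetical iterable of documents):

```java
Observable
    .from(documents)
    .flatMap(doc ->
        bucket.async()
            .insert(doc)   // created inside flatMap, so a retry re-executes the insert
            .retryWhen(anyOf(Exception.class)
                .max(4)
                .delay(Delay.exponential(TimeUnit.SECONDS))
                .build()))
    .toBlocking()          // block only here, at the end of the pipeline
    .forEach(inserted -> System.out.println("stored " + inserted.id()));
```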

Quote from @xiger:

Yes, all 1000 return true. Before the rebalance the count is 999, and after the rebalance it is still 999.
But it is rare.

The situation:
while loading 1000 items,
I kill one of the four nodes. Auto-failover completes, but I do not rebalance and just leave it as is. At first I notice the item count is right, but in the end I find the item count is wrong, even though the record count printed in the logs is right.

Hi,
I’m back. Can you give me some ideas?