Parent cluster object has been closed

@Rajasekar I have a problem with this error. It happens many times and there is nothing wrong on my side. Maybe we have some temporary network failure.
Currently I catch the error and re-instantiate the cluster.

@brett19 I recreate the Cluster object for the “parent cluster is already closed” error.
Currently I get too many “The connection has been closed” errors, and after the error the Cluster object is useless: every subsequent operation fails with the same error and reconnection is never retried, so we must recreate the Cluster.

@AV25242 It has been more than 8 months since I reported the bug. I know there is an issue, [JSCBC-706] Client does not attempt to reconnect on bucket connect errors, that has been open for 19 months. Why does the bug still exist?

Hey @socketman2016,

Unfortunately there are a large number of underlying behaviours which need to be adjusted in order to make this possible, and some of these have compatibility concerns which make them more challenging still. Fortunately, we have made some significant progress in the past couple of releases and as of 3.2.2 the cases where this error appears should be far less common. Can you elaborate on the specific scenario where you are commonly seeing this error appear?

Cheers, Brett

@brett19 Thanks for your response

Can you elaborate on the specific scenario where you are commonly seeing this error appear?

Unfortunately there is no specific scenario. It is just my production server: my NodeJS application is configured to connect to 3 Couchbase nodes on the local network.
AFAIK there is no network issue in our local network, but after some hours the Cluster object is useless.
As I said before, I catch “parent cluster is already closed” and recreate the Cluster object, but in SDK 3.2.1 I found that “The connection has been closed” has the same issue too.
I have now upgraded the library to 3.2.2.

I just wonder why fixing this critical bug is not a priority, and why (I think) so few people are facing it.

Hi all,
I also have this problem in various situations, including the error below.
Do you have any updates on this?

at Object.translateCppError (/home/xxx/core/node_modules/couchbase/dist/bindingutilities.js:180:20)
at Object.<anonymous> (/home/xxx/core/node_modules/couchbase/dist/connection.js:147:54)

name: "TimeoutError"
cause: {"code":201}
context: undefined
pid: 2740
hostname: core1.xxx

Thanks

I have the TimeoutError too, and it behaves the same as the other errors: the SDK never tries to repair the instance, so I have to re-create it. It is not a real timeout, as the TimeoutError is thrown immediately.

After upgrading the SDK to 3.2.2, I have this error:

nodejs.TimeoutError: timeout
    at Object.translateCppError (/home/myapp/node_modules/couchbase/dist/bindingutilities.js:180:20)
    at Object.<anonymous> (/home/myapp/node_modules/couchbase/dist/connection.js:147:54)
name: "TimeoutError"
cause: {"code":201}
context: undefined

@brett19 The SDK v3 is my nightmare. The v2 is much, much better.

I restarted my NodeJS application and there is no error right now, but after some days it will appear again.

I have upgraded the SDK to 3.2.3 now.

v3.2.3 has the same bug.

Hey @socketman2016, sorry to hear you’ve been having this problem. We’ve been working hard to improve the SDK and the 3.2.3 release has been very stable. In fact, in preparing that release, we did perform several long running tests (and left them running for several days) and haven’t seen any problems like you’re reporting here. A couple other teams have done similar.

Perhaps there is something unique about the environment you’re running in or some configuration options you’re using, etc? Are you running this in a cloud instance or virtual instance that is paused and resumed in some way perhaps? Is there a network or VPN failure causing the timeout?

Unfortunately, since we’re unable to reproduce the problem ourselves at this point, we really cannot do much more to help unless you can figure out a way to share something more specific that reproduces the issue you’re seeing.

Are you running this in a cloud instance or virtual instance that is paused and resumed in some way perhaps?

We are running the NodeJS application and Couchbase server in the same datacenter but on different machines. We have 3 nodes for Couchbase server.
We are using ESXi for virtualization; the instances have been running for years, with no pauses.

Is there a network or VPN failure causing the timeout?

No, absolutely not. I have two different applications (on two different servers) connected to the same Couchbase server; one uses SDK v2 and the other SDK v3. SDK v2 works well while SDK v3 has TOO MANY problems, really TOO MANY.

unless you can figure out a way to share something more specific that reproduces the issue you’re seeing.

How? @brett19 already suggested running with DEBUG=* node index, but that is not possible for me as the error occurs without a pattern. The only pattern I know that causes the TimeoutError (not the others; I also have ConnectionClosedError and “parent cluster object has been closed”) is this: when I reboot the server hosting the NodeJS application, we get TimeoutError forever. After restarting the NodeJS application it is solved, until it happens again after hours or days.

One way I found that is pretty consistent (around 70-80% of the time in my case) to reproduce the timeout error is debugging the Node.js application with Visual Studio Code or Visual Studio Community.
It has become a nightmare for me to use the debugger; I didn’t have any issue with SDK 2.
When starting node normally (without debug) I never get the timeout error.
It would be so much easier if you returned to the old way of connecting. At least we would know what is happening behind the scenes, instead of having a black box and discovering the issues at the first operation.
I saw the Java SDK 3.2 has a bootstrap functionality, why can’t we have it?!

@00christian00 Please show us how you reproduce the timeout error.

I am not doing anything in particular. I simply start my node application with the debugger of Visual Studio Code or Visual Studio Community.
This is enough most of the time to get a timeout error when doing any Couchbase operation.
I haven’t tested whether it happens with any code or just with my specific code.
My guess is that the slower execution of the node server due to the attached debugger (it is much slower with the debugger attached) is enough to run into the timeout.
I don’t know if passing a longer timeout changes anything. It is slower, but it is definitely not exceeding the 10 seconds connection time and 2.5 seconds operation time; the startup phase is slower, but after that, unless there are breakpoints, it’s a matter of milliseconds.

Oh, and to add context: I am testing on the Windows version of Couchbase 6.5, which by itself is pretty slow. I don’t know why, but the same version running on the same machine, virtualized on VMware with Ubuntu, is a thousand times faster.

@00christian00 Do subsequent operations end with the timeout error forever, so that you have to instantiate a new Cluster object?

Forever. I cannot do anything and need to restart the server, as I haven’t implemented the new-cluster workaround in the script yet. I was hoping it would be resolved by now and I wouldn’t need to implement it. I am checking JIRA and it is always planned for the next SDK version, but every single time it gets delayed to the next minor version…
I am developing a mobile game as a one-man company, so I wear too many hats and don’t have much time on my hands.

@raycardillo @brett19 Can you please take a look?
@00christian00 confirms that once the error occurs it persists forever, as in my case. I just re-create the cluster instance as follows:

// These errors leave the Cluster object unusable, so build a new one.
if (
  e.message === 'parent cluster object has been closed' ||
  e instanceof Couchbase.ConnectionClosedError ||
  e instanceof Couchbase.TimeoutError
) {
  createNewCouchbaseCluster();
}

I saw the workaround, but I guess I need to pass the new cluster and bucket object references around if I recreate them, which is probably a nightmare in case I forget to pass the new reference to some part of the code. If it is a one-time thing, I could try to get a test document by key just to see if the cluster is operative. Still, I would hope they fix it on the SDK side; patching things on my side could create unforeseen issues.
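
One way to avoid passing new references around is to never hand out the cluster itself, only a small holder that always returns the current one. The following is a minimal sketch, not part of the Couchbase SDK: `createClusterHolder` and `connectFn` are hypothetical names, and `connectFn` is assumed to be whatever async function the application already uses to open a cluster.

```javascript
// Hypothetical helper, not an SDK API.
// connectFn: async function that opens and returns a new cluster object.
function createClusterHolder(connectFn) {
  let current = null; // promise for the cluster currently in use

  return {
    // Callers ask for the cluster on every use instead of caching it.
    get() {
      if (current === null) {
        const attempt = Promise.resolve().then(connectFn);
        current = attempt;
        // If connecting fails, forget this attempt so the next get() retries.
        attempt.catch(() => {
          if (current === attempt) current = null;
        });
      }
      return current;
    },
    // Called from the error handler once the cluster is known to be unusable.
    reset() {
      current = null;
    },
  };
}
```

With this shape, the error handler only calls `reset()`, and every other part of the code keeps calling `await holder.get()`, so there is no stale reference to forget. The same idea would apply to bucket and collection objects derived from the cluster.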

Still, I would hope they fix it on the SDK side; patching things on my side could create unforeseen issues.

Sure, I agree with you

Fortunately, I have a global error handler and a single reference to the Couchbase instance, so I could easily implement the workaround. But I still have issues with the SDK itself: after catching the error I just send an error 500 response and create the cluster again, so I have many error 500s now and no easy way to retry the failed operation in the application layer.
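
A retry in the application layer could be sketched roughly as below. This is only an illustration under assumptions: `withReconnect`, `isFatalClusterError`, `getCluster` and `resetCluster` are hypothetical names, not SDK APIs, and matching on `err.name` assumes the error names are the ones shown in the logs earlier in this thread.

```javascript
// Hypothetical check, based on the error names and message reported in this thread.
function isFatalClusterError(err) {
  return Boolean(err) && (
    err.message === 'parent cluster object has been closed' ||
    err.name === 'ConnectionClosedError' ||
    err.name === 'TimeoutError'
  );
}

// Hypothetical wrapper, not an SDK API.
// operation(cluster): async function performing the actual Couchbase call.
// getCluster():       resolves to the cluster currently in use.
// resetCluster():     discards it so the next getCluster() reconnects.
async function withReconnect(operation, { getCluster, resetCluster, isFatal, maxAttempts = 2 }) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const cluster = await getCluster();
      return await operation(cluster);
    } catch (err) {
      lastError = err;
      if (!isFatal(err)) throw err; // ordinary errors go straight to the caller
      resetCluster();               // the next attempt uses a fresh cluster
    }
  }
  throw lastError; // still failing after maxAttempts
}
```

Only operations that are safe to repeat should go through such a wrapper: if an error arrives after a write has actually been applied on the server, retrying it would apply the write a second time.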