XDCR - Fatal is not a valid log level

I'm trying to replicate to a remote cluster and I keep receiving errors from the pipeline manager. Mainly it's:

```
2020-06-23T20:39:16.657Z INFO GOXDCR.XDCRFactory: Pipeline f59ff1e35afda44461ee1e54d50aa162/bucket1/bucket2 has been constructed
2020-06-23T20:39:16.658Z INFO GOXDCR.PipelineMgr: Pipeline f59ff1e35afda44461ee1e54d50aa162/bucket1/bucket2 is constructed. Starting it.
2020-06-23T20:39:16.659+00:00 [INFO] Using plain authentication for user @goxdcr
2020-06-23T20:39:16.679Z ERRO GOXDCR.PipelineMgr: Failed to start the pipeline
2020-06-23T20:39:16.679Z ERRO GOXDCR.PipelineMgr: Failed to update pipeline f59ff1e35afda44461ee1e54d50aa162/bucket1/bucket2, err=genericPipeline.context.Start : Fatal is not a valid log level
2020-06-23T20:39:16.679Z ERRO GOXDCR.PipelineMgr: Update of pipeline f59ff1e35afda44461ee1e54d50aa162/bucket1/bucket2 failed with errors=genericPipeline.context.Start : Fatal is not a valid log level
2020-06-23T20:39:16.679Z INFO GOXDCR.PipelineMgr: Replication f59ff1e35afda44461ee1e54d50aa162/bucket1/bucket2 update experienced error(s): genericPipeline.context.Start : Fatal is not a valid log level. Scheduling a redo.
INFO GOXDCR.PipelineMgr: Replication status is updated with error(s) genericPipeline.context.Start : Fatal is not a valid log level, current status=name={f59ff1e35afda44461ee1e54d50aa162/bucket1/bucket2}, status={Pending}
```

XDCR still seems to replicate some documents, and initially moved about 90,000 of the 120,000 documents over. Not sure why, but this cluster has always been difficult when it comes to XDCR.
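
For what it's worth, an error like this usually means a log-level name was handed to a parser that has no entry for it, which is the kind of thing that can happen when components at different versions talk to each other. The sketch below is purely illustrative Go (the table and function names are hypothetical, not goxdcr's actual code); it only shows how a string-to-level lookup ends up producing "Fatal is not a valid log level".

```go
package main

import "fmt"

// Hypothetical log-level table, roughly the shape of what a logging
// package keeps internally. If one version of the software writes a
// level name that an older parser has no entry for, the parse fails
// with an error like the one in the goxdcr log above.
var levels = map[string]int{
	"Error": 0,
	"Info":  1,
	"Debug": 2,
	"Trace": 3,
	// Note: no "Fatal" entry here, so parsing "Fatal" fails.
}

func parseLogLevel(name string) (int, error) {
	if lvl, ok := levels[name]; ok {
		return lvl, nil
	}
	return 0, fmt.Errorf("%s is not a valid log level", name)
}

func main() {
	if _, err := parseLogLevel("Fatal"); err != nil {
		fmt.Println(err) // prints: Fatal is not a valid log level
	}
}
```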

UPDATE: It looks as if there is one node that is not participating in XDCR. It is the only one with outstanding mutations, and the number of outstanding mutations matches the number of outstanding documents reported in the XDCR logs.
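
In case it helps anyone repeat this check, per-node bucket stats can be pulled over REST and the XDCR backlog counter read from there. This is only a rough sketch, assuming the 6.x per-node stats endpoint and the `replication_changes_left` stat name as I remember them; host, bucket, node, and credentials are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// Print the latest XDCR backlog sample ("replication_changes_left")
// for a single node, to see which node is holding the outstanding
// mutations. Adjust host, bucket, node, and credentials for your cluster.
func main() {
	url := "http://cluster-host:8091/pools/default/buckets/bucket1/nodes/node-host%3A8091/stats"

	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		panic(err)
	}
	req.SetBasicAuth("Administrator", "password")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var body struct {
		Op struct {
			Samples map[string][]float64 `json:"samples"`
		} `json:"op"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		panic(err)
	}

	if samples := body.Op.Samples["replication_changes_left"]; len(samples) > 0 {
		fmt.Printf("replication_changes_left (latest): %.0f\n", samples[len(samples)-1])
	}
}
```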

Hi [tippy_top],

Could you please share which version of Couchbase you are on, and how many nodes you have on the source and target clusters?

In one scenario I have tried, both the source and destination clusters were running 6.0. The source has 6 nodes; the destination has 4.

In a more recent scenario, the source is the same and the destination is a single node running 6.5. The same behavior is observed in both situations: replication initially takes off but then halts. New documents that do not reside on one particular node still transfer; that node is also the source of the goxdcr.log messages.

Never mind, all good. The node in question was running 6.0.0, and the rest were 6.0.4. Doing a swap rebalance and updating the node fixed the issue.
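
For anyone else who hits this: a quick way to spot a mixed-version cluster before it affects XDCR is to list every node's reported version from the `/pools/default` REST endpoint. A minimal sketch in Go (host and credentials are placeholders):

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// List each node's hostname and Couchbase version from the cluster
// REST API, to catch nodes left behind on an older build.
func main() {
	req, err := http.NewRequest("GET", "http://cluster-host:8091/pools/default", nil)
	if err != nil {
		panic(err)
	}
	req.SetBasicAuth("Administrator", "password")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var pool struct {
		Nodes []struct {
			Hostname string `json:"hostname"`
			Version  string `json:"version"`
		} `json:"nodes"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&pool); err != nil {
		panic(err)
	}

	for _, n := range pool.Nodes {
		fmt.Printf("%s  %s\n", n.Hostname, n.Version)
	}
}
```

The same request can be made with curl against the same endpoint if you just want a one-off check.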