@jens Thanks for the reply,
Yes, it would be a great help (for debugging) to be able to recognize when replication is not working.
Currently, we need this functionality for our everyday application purpose.
We need to prevent our customers from taking certain actions when they don’t have a valid replication connection.
For example, our application allows users to create documents based on the contents of an existing document. We need to prevent users from doing this when the replication connection isn't active; otherwise, multiple users could create entirely separate documents based on the same original doc and end up with similar/duplicated documents.
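To illustrate the kind of gating we need, here is a minimal Python sketch; the function name and status strings are hypothetical stand-ins, not a Couchbase Lite API:

```python
# Hypothetical sketch: gate document creation on replication status.
# "idle" and "active" are the statuses we treat as a valid connection;
# anything else (e.g. "offline", "stopped") blocks the action.

VALID_STATUSES = {"idle", "active"}

def can_create_document(replication_status: str) -> bool:
    """Return True only when the replication connection is considered valid."""
    return replication_status in VALID_STATUSES

print(can_create_document("active"))   # True
print(can_create_document("offline"))  # False
```

The point is that the app needs a status value it can trust: if the reported status stays "idle" while the connection is actually broken, this check lets the action through incorrectly.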
This is also very important for technical support with our customers.
In our application, we have arrows indicating the replication status of the device. If the replication status isn't valid (idle, active), obviously, the customer can run into unexpected behavior. Further, if a customer does experience issues, it is incredibly helpful to know whether their replication status was valid.
It is very common for our customers' internet to drop many times throughout the day, causing replication to stop until the heartbeat makes the replicators re-fire, so being able to recognize these periods without valid replication is important.
Additionally, if we can accurately recognize when the gateway cannot be reached for replication, we can start syncing data peer-to-peer to ensure replication between devices.
As a developer, I need to know what the status of the replication is to be able to properly debug any issues.
I would say any reason the current "offline" status would be used can also apply to when the replicator cannot reach the gateway or send/receive data; they would serve the same purpose.
Any time a replicator is not working correctly (not able to pull/push data), we need to be able to recognize that and add proper logic to avoid potential issues in our app.
Let me know if you have any further questions about this.
Alternatively, revealing the status/errors of the TCP socket connection could work. Is that something I can currently access? Ultimately, we need some way to recognize this error in the current 1.4 build. Any suggestions?
As I said, our pull replicator sends “heartbeat” messages every few minutes, so when one of those fails to be delivered the replicator should discover that the connection is broken. At that point it’ll start trying to reconnect, and eventually give up.
When you say "give up," do you mean that when the heartbeat stops trying, the replicator should set the replication status to a failed state? That is never happening. From my observations, the heartbeat stops retrying after it reaches the 512-second retry interval, but it never changes the replication status.
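For reference, the retry behavior I'm observing looks like a simple doubling backoff capped at 512 seconds. A Python sketch of that schedule (the 2-second starting delay is my assumption; only the 512-second cap is something I've actually observed):

```python
# Sketch of an exponential backoff schedule capped at a maximum delay.
# Matches what I see: delays doubling until they hit 512 seconds,
# after which the heartbeat stops retrying entirely.

def retry_schedule(initial: int = 2, cap: int = 512):
    """Yield successive retry delays, doubling until the cap is reached."""
    delay = initial
    while delay <= cap:
        yield delay
        delay *= 2

print(list(retry_schedule()))  # [2, 4, 8, 16, 32, 64, 128, 256, 512]
```

My expectation was that once this schedule is exhausted, the replicator would transition to a stopped/failed status rather than staying in its last reported state.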
Thanks for the detailed reply