Couchbase Lite Delete document exception

Thank you, Jim. Has the fix for the replication over HTTPS issue been included in the latest release (Couchbase Lite 1.2.0.3 (this version))? Upgrading to the latest version might ease some of the pain of debugging… :grin:

The version you are using is the same as the current release, but this has been changed in the current master to an easier method (no external registration required).

Not sure if this is just how push replication works or an issue… I was able to reproduce part of the issue with my test project. I have two ASP.NET C# projects running side by side, each running push replication to the other.

  1. Both websites start up.
  2. Push replication happens, so both sites sync.
  3. Shut down Site B.
  4. Edit a record on Site A, save the edit, then delete the record.
  5. Start up Site B.

Neither of those two actions gets replicated to Site B, and the record remains on Site B.

How long do you wait before determining that the actions have not synced? The retry logic schedules them for 60 intervals. Also, confirm that the replication has not stopped or errored for some reason (via the Changed event, you can get various information about the replication from the arguments passed there).

Thanks for the reply Jim!

What is the default wait and retry period? Every minute?

I am going to add more logging to the Changed event handler. Could you please tell me the best way to get the reason for a status change? Ideally, it would tell me what exception caused the pull replication to stop or go offline…

I switched to running pull replication only on both nodes, since we changed the two nodes to a master-slave relationship. This helped clarify the situation a little.
You were right. The issue comes from Node A (the node where the CRUD operations happen). Push replication only makes things worse.

Our app changes each document exactly 3 times.

  1. create
  2. change one field
  3. delete

The issue we are having is that, after the app has been running for a while, we start getting heaps of deletion revisions with no parents. It looks like the db somehow stopped creating the revisions for the create and edit of some documents, or something deleted those revisions at some point…

90|24|3-91283f1ca5468d779bb681d98eebec66||1|1|{}|1|
102|27|3-2a7d680529004c4f8d32270428a7ed1f||1|1|{}|1|
107|28|3-5f6e279204528ed69f16d672faca65ac||1|1|{}|1|
111|29|3-f44dc4d2c3b0dac086cc36abff48b07f||1|1|{}|1|
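Rows like these can be picked out of a larger dump mechanically. The following is only a rough sketch: it assumes the pipe-delimited columns are sequence, doc_id, revid, parent, current, deleted (in that order, inferred from the rows in this thread), and the `RevRow` and `RevsDump` names are made up for illustration. It flags any deleted revision whose generation (the number before the `-` in the revid) is greater than 1 but which has no parent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical holder for one dumped row; not a Couchbase Lite type.
public sealed class RevRow
{
    public long Sequence;
    public long DocId;
    public string RevId;
    public long? Parent;   // null when the parent column is empty
    public bool Current;
    public bool Deleted;
}

public static class RevsDump
{
    // Assumed column order: sequence|doc_id|revid|parent|current|deleted|...
    public static RevRow Parse(string line)
    {
        var f = line.Trim().Split('|');
        if (f.Length < 6)
            throw new FormatException("Expected at least 6 fields: " + line);

        return new RevRow
        {
            Sequence = long.Parse(f[0]),
            DocId = long.Parse(f[1]),
            RevId = f[2],
            Parent = f[3].Length == 0 ? (long?)null : long.Parse(f[3]),
            Current = f[4] == "1",
            Deleted = f[5] == "1"
        };
    }

    // The generation is the integer prefix of a revid such as "3-91283f...".
    public static int Generation(string revId)
    {
        return int.Parse(revId.Substring(0, revId.IndexOf('-')));
    }

    // Deleted revisions past generation 1 that do not point at a parent row.
    public static List<RevRow> OrphanedTombstones(IEnumerable<string> lines)
    {
        return lines.Where(l => !string.IsNullOrWhiteSpace(l))
                    .Select(Parse)
                    .Where(r => r.Deleted && r.Parent == null && Generation(r.RevId) > 1)
                    .ToList();
    }
}
```

Feeding it the lines of a dump returns only the parentless deletions, which makes it easier to see how often the problem occurs and which doc_id values are affected.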

Is there any code in Couchbase Lite that deletes a revision? If not, then it must be an insert failure… Whenever this happens, the document is still marked as deleted on Node A, but Node B will not delete it. Since the document is deleted on Node A, should Node B delete the document regardless of the revision reference issue?

I meant to say “60 second intervals” not “60 intervals”, sorry!

This information is not made fully available to the consumer, but there are some things you can examine. If there was an error it will be in the LastError property of the event arguments. Otherwise, in normal operation, you will receive an update every time the replicator changes state, and when a change happens to the ChangesCount or CompletedChangesCount of the replication.
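As a rough sketch of the logging this suggests, the helper below turns the values mentioned above into a single log line. `ReplicationLog.Describe` is a made-up name, and it deliberately takes plain values so it does not depend on the library. The commented wiring at the bottom is an assumption about how it would be hooked up: `LastError`, `ChangesCount` and `CompletedChangesCount` are the members named above, while `e.Status` and `log.Info` are assumed and should be checked against the version in use.

```csharp
using System;

public static class ReplicationLog
{
    // Builds one log line from the values carried by a replication change notification.
    public static string Describe(string status, Exception lastError, int completedChanges, int totalChanges)
    {
        var progress = completedChanges + "/" + totalChanges + " changes";
        if (lastError == null)
        {
            return "Replication status=" + status + ", " + progress + ", no error";
        }

        return "Replication status=" + status + ", " + progress +
               ", error=" + lastError.GetType().Name + ": " + lastError.Message;
    }
}

// Assumed wiring (not verified against a specific Couchbase Lite build):
// replication.Changed += (sender, e) =>
//     log.Info(ReplicationLog.Describe(e.Status.ToString(), e.LastError,
//                                      e.CompletedChangesCount, e.ChangesCount));
```

Logging every notification this way, rather than only errors, also shows whether the replicator is still changing state at all when the documents stop arriving.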

This is the issue I want to get to the bottom of. Do you know what is going on in your program when this starts to happen?

Node B will not take any actions on its own unless you tell it to. Don’t think of it as deleting documents; think of it as “Node A is going to copy its revs table verbatim to Node B”. So if Node A sends over a document with no parent, Node B is going to store it that way. It doesn’t look at it or examine it to determine what it is, other than to see if it already has it or its specified ancestors.

There is actually a pattern to the revision records when it happens…
We have a document in the database that simulates a sequence in a relational database. Before the app inserts a new document, it retrieves the document that stores the sequence number, sets the sequence number to sequence + 1, and returns the value. The returned value is then set on the new document, and that document is stored in the database. It looks like the issue happens after the incremented sequence number is written back to the database, when the app tries to save the document that had that sequence number assigned to one of its properties. Code coming in the next post.

When everything is working:

14|1|5-f211779d75141b67beccd3bc69966946|11|0|0||1| ← incremented the sequence number
15|6|1-f477a5582bf9b3f85ede57ae0e805428||0|0||1| ← saved new record
16|6|2-f5055346b771705fe74b30f276548e27|15|0|0||1| ← updated new record
17|5|3-96067d1f5829eaf2cf10a0cd96c46228|13|1|1|{}|1| ← delete a record
18|6|3-378911c82b9aaac74cec6bda0bb61d50|16|1|1|{}|1| ← new record deleted
19|4|3-bfcb3d8456d69cc15fa5132181b3c51f|10|1|1|{}|1|
20|1|6-dd004986ff33028fbf508da20f272004|14|0|0||1|

When things go wrong:

86|1|23-4ce38aaa4277c7020b22f32970c9c0a5|83|0|0||1| ← incremented the sequence
← no revision for save
← no revision for update
89|23|3-64a45c81fb7b164679639f3e6a1fbd9a|85|1|1|{}|1| ← delete some record
90|24|3-91283f1ca5468d779bb681d98eebec66||1|1|{}|1| ← delete the newly created record; couldn’t find the parent revision…
91|22|3-28eae6299fb17d1a33ff1864bdd570f3|82|1|1|{}|1| ← deleting more records
92|1|24-e2fbc85bef96c810fd6854bd719c3152|86|0|0||1|

public long GetNextSequenceNumber()
{
    long nextId = 0;
    try
    {
        var doc = database.GetDocument(DocumentKey);

        doc.Update(revision =>
        {
            if (revision.Properties.ContainsKey(SequenceKey))
            {
                // Increment the stored sequence and remember it for the caller.
                revision.Properties[SequenceKey] = nextId = Convert.ToInt64(revision.Properties[SequenceKey]) + 1;
            }
            else
            {
                // First use: start at 1 and return 1 (previously 0 was returned here).
                nextId = 1;
                revision.SetProperties(new Dictionary<string, object>
                {
                    { SequenceKey, nextId }
                });
            }

            return true;
        });
    }
    catch (Exception ex)
    {
        log.Error(ex, "Failed to get next sequence.");
        throw;
    }

    return nextId;
}

Then in the app:

var message = new Message { Body = messageText, Environment = "Test", Sequence = messageService.GetNextSequenceNumber() }; // <- a revision is created every time
messageId = messageService.Save(message); // <- sometimes no revision is created

I am now starting to get the feeling that this could be a race condition…
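If it does turn out to be a race, one thing worth trying is to take the sequence number and save the document as a single step under one lock, instead of two separate calls. The sketch below only illustrates that idea with an in-memory stand-in: `InMemoryMessageStore` is a hypothetical class, not part of Couchbase Lite, and a real version would do the two database writes inside the locked section.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the sequence document plus the message documents.
public sealed class InMemoryMessageStore
{
    private readonly object _gate = new object();
    private readonly Dictionary<long, string> _bodies = new Dictionary<long, string>();
    private long _sequence;

    // Allocates the next sequence number and stores the message as one atomic step,
    // so no other thread can run between the increment and the save.
    public long SaveWithNextSequence(string body)
    {
        lock (_gate)
        {
            var next = ++_sequence;
            _bodies.Add(next, body); // Add throws if a sequence number were ever reused
            return next;
        }
    }

    public int Count
    {
        get { lock (_gate) { return _bodies.Count; } }
    }
}
```

With this shape, every allocated sequence number has a stored message by the time the lock is released, which is the invariant the revision dump above shows being broken.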

What happens inside of the messageService.Save() method to save those documents to the database?

public string Save(Message message)
{
    Document document = null;
    string docId = null;
    SavedRevision revision = null;

    if (!string.IsNullOrEmpty(message.Id))
    {
        document = database.GetExistingDocument(message.Id);
    }

    lock (_lock)
    {
        try
        {
            Dictionary<string, object> properties;
            if (document == null)
            {
                // New message: create the document with its initial properties.
                document = database.CreateDocument();
                properties = new Dictionary<string, object>
                {
                    { "Environment", message.Environment.ToLower() },
                    { "Body", message.Body },
                    { "CreatedAt", DateTime.Now },
                    { "LastUpdatedAt", DateTime.Now },
                    { "Sequence", message.Sequence }
                };
            }
            else
            {
                // Existing message: copy the current properties and update the changed fields.
                properties = new Dictionary<string, object>(document.Properties);
                properties["Body"] = message.Body;
                properties["Environment"] = message.Environment.ToLower();
                properties["LastUpdatedAt"] = DateTime.Now;
            }

            docId = document.Id;
            revision = document.PutProperties(properties);
        }
        catch (Exception e)
        {
            log.Error(e, "Error saving document.");
        }
    }

    if (revision == null) throw new ApplicationException("Failed to save a document.");
    return docId;
}

The format is all broken. Please let me know if you cannot read it.

I don’t see anything suspicious there either. Where are you at with logging? Would you be able to build a debug build (soon you won’t have to make a new build just to enable more logging anymore, but for 1.2 you still do)? Make sure it has the DEBUG, TRACE, and VERBOSE compile symbols in your project settings. This will log every single SQL statement the app makes. I want to check it for DELETE statements.

Also, out of curiosity, what happens if you stop compacting the database after each delete? Does it help the issue?

Trying both now. :smile: