Lock/CAS timeout on existing large document updates

pr3d4t0r · April 19, 2018, 7:55pm

Greetings!

Our application deals with very large documents (4-100 MB). Couchbase Server 5.0 runs on AWS servers with enhanced networking, all in the same region/DC. During an update we lock the document to get its CAS (the lockID), commit the change, and release the lock. We see timeouts on the larger objects, with an exception indicating that the CAS is no longer valid.

try:
    lockID = messagesTable.lock(messageID)
    messagesTable.put(messageID, message, lockID)
    messagesTable.unlock(messageID, lockID)
except Exception as ex0:
    # We're sure that the exception happens when we try putting the record back,
    # based on the stack info and observed behavior during interactive debugging.
    # messagesTable is our own Python API; it encapsulates the details of the
    # Couchbase API.
    #
    # Latest libcouchbase under Linux and macOS, current dev build :)
    raise

Can you please recommend a way to overcome this issue? The 30+ second saves appear to be a combination of latency in grabbing the lock, committing the change, and whatever else is going on in the Couchbase data servers. We have two nodes each for data, index, and query, and no FTS. They are large multi-core machines with lots of memory.
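To make the suspected failure mode concrete, here is a minimal sketch, not our production code. It assumes that the lock has a finite lifetime on the server and that the CAS returned by the lock stops being valid once that lifetime passes. `ToyBucket`, `CasMismatch`, `DocumentLocked`, `save_with_lock`, the simulated clock, and the 15-second lifetime are all hypothetical stand-ins for illustration, not the Couchbase API.

```python
import itertools


class CasMismatch(Exception):
    """The supplied CAS no longer matches the stored one."""


class DocumentLocked(Exception):
    """Another lock is still active on the document."""


class ToyBucket:
    """Hypothetical in-memory stand-in for a bucket with expiring locks."""

    def __init__(self, lock_ttl):
        self.lock_ttl = lock_ttl            # assumed lock lifetime, in seconds
        self.now = 0.0                      # simulated clock, in seconds
        self._cas_source = itertools.count(1)
        self._docs = {}                     # key -> [value, cas, lock_expiry]

    def insert(self, key, value):
        self._docs[key] = [value, next(self._cas_source), None]

    def get(self, key):
        return self._docs[key][0]

    def _expire_lock(self, doc):
        # Assumption: an expired lock is dropped and its CAS is invalidated.
        if doc[2] is not None and self.now >= doc[2]:
            doc[1] = next(self._cas_source)
            doc[2] = None

    def lock(self, key):
        doc = self._docs[key]
        self._expire_lock(doc)
        if doc[2] is not None:
            raise DocumentLocked(key)
        doc[1] = next(self._cas_source)
        doc[2] = self.now + self.lock_ttl
        return doc[1]

    def put(self, key, value, cas, transfer_seconds):
        # Advance the clock to simulate a slow transfer of a large document.
        self.now += transfer_seconds
        doc = self._docs[key]
        self._expire_lock(doc)
        if cas != doc[1]:
            raise CasMismatch(key)
        # Assumption: a successful write with the lock's CAS releases the lock.
        doc[0] = value
        doc[1] = next(self._cas_source)
        doc[2] = None
        return doc[1]


def save_with_lock(bucket, key, value, transfer_seconds):
    """Lock, write, and report whether the write beat the lock expiry."""
    cas = bucket.lock(key)
    try:
        bucket.put(key, value, cas, transfer_seconds)
    except CasMismatch:
        return False
    return True


if __name__ == "__main__":
    bucket = ToyBucket(lock_ttl=15)
    bucket.insert("msg-1", "v0")
    print(save_with_lock(bucket, "msg-1", "v1", transfer_seconds=2))
    print(save_with_lock(bucket, "msg-1", "v2", transfer_seconds=31))
```

In this toy model a write that finishes inside the lock lifetime succeeds, while one that takes longer fails with a stale CAS. That resembles what we see on the largest documents, though the real cause may differ.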

I’m aware that we might have a tuning situation as well – not sure about how to identify and resolve that – recommendations welcome.

Thanks in advance, and wishing you all a great day!

pr3d4t0r