The docs state that N1QL transactions can be used at any place N1QL can be used. I am using the Go SDK which doesn’t have explicit support for transactions but can use the REST API to communicate with the query service.
The docs also state that CAS value will be used to make sure updates are not lost in case of concurrent transactions that touch the same document.
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
wg.Add(1)
go func() {
results, err := cluster.Query("START TRANSACTION;", nil)
if err != nil {
panic(err)
}
var tx txid
err = results.One(&tx)
if err != nil {
panic(err)
}
opts := &gocb.QueryOptions{Raw: map[string]interface{}{"txid": tx.TxID}}
results, err = cluster.Query("UPDATE customer SET balance = balance - 1 WHERE cid=1;", opts)
if err != nil {
panic(err)
}
results, err = cluster.Query("UPDATE customer SET balance = balance + 1 WHERE cid=2;", opts)
if err != nil {
panic(err)
}
_, err = cluster.Query("COMMIT;", opts)
if err != nil {
panic(err)
}
wg.Done()
}()
}
This code tries to run the following set of queries
START TRANSACTION;
UPDATE customer SET balance = balance - 1 WHERE cid=1;
UPDATE customer SET balance = balance + 1 WHERE cid=2;
COMMIT;
Effectively trying to make a transfer of amount 1. This is done 100 times concurrently and a bunch of updates are lost. Do transactions not provide safety against lost updates? If not, is there no way to achieve safe transfer of balance from one customer to another using the Go SDK?
Hi @Muneeb_Abbas
First thing to note is that this code will only work against a single-node cluster, as all queries in the same transaction need to go to the same query node. For this you will need to use the REST interface directly, until a Go implemention of transactions is available.
When talking about Lost Updates, as in what the documentation is referring to, there is the specific database phenomenon (described well here What Is a Lost Update in Database Systems? - DZone Database), which requires a read followed by an update, which is not going on in your transaction.
Your transaction involves straightforward single UPDATEs that should certainly be detected as write write conflicts, with one transaction being rejected.
You mention that updates are being lost - what leads you to this conclusion? I would expect from this transaction to always have a constraint that the balance of cid=1 is the negative balance of cid=2 (e.g. -50 vs 50) - assuming that all statements succeed. Is this constraint being broken? Do any statements fail? (Note that N1QL has Statement Level Atomicity, e.g. if an individual statement fails then it will be rolled back, and an error reported back to the user, but the overall transaction is permitted to continue.
It is a single-node cluster but I think that shouldn’t matter as the SDK should be sending the txid extra parameter which is all that should be required to make sure the query is processed by the correct node (Note that in my testing, I have a single node cluster anyways).
The N1QL transactions feature is supported anywhere you use a N1QL query.
The main requirement is that a txid value – which is returned from the START TRANSACTION command – is used with all subsequent N1QL DMLs if they’re a part of the transaction. The same goes for the final COMMIT or ROLLBACK.
I can confirm that the Go SDK passes along the extra parameters such as txid and tximplicit which means that it doesn’t matter if I use the REST interface directly as the SDK is just calling the REST interface I think.
How do I know updates are being lost?
Instead of 100 being transferred, it’s either 1 or 2 (And changes between runs meaning it’s a race condition). What I am missing here is the write-write conflict that is mentioned in the docs.
Hi @Muneeb_Abbas
The Go SDK does use the REST interface under the hood, yes. But txid alone isn’t enough - it’s also necessary to send each query to the same query node. Otherwise you’ll get an error that the transaction id is not recognised.
I think what’s happening is that most of the transactions are actually failing due to write-write conflicts, leading to just one or two succeeding. A transactions library will automatically retry a failed transaction for you (where it is sensible to retry, such as with a write write conflict), but when using query statements directly in this way, it’s necessary to do that yourself.
Also checkout each transaction/each statement is returned error or success. You will have only number of successful transaction begin to end are transferred (not 100).
Also note N1QL transaction uses optimistic concurrency , i.e. All mutations are performed with in the transaction and during commit time it will write with cas value of first mutation of the corresponding document. There is no retry with in the N1QL. Due to optimistic concurrency, If concurrent transactions mutating same documents there will be higher rollbacks.
If you using REST API directly, Application is responsible for sending all statements with in the transaction to same query node. After successful stat transaction, on any error you must rollback instead of abending transaction. Once SDK support transaction all these will be handled by transaction frame work.
If you still have issues, Will you able to provide CB version, GOCB version, Number of CPUs on query node, complete repro.
@graham.pople
I understand that txid alone isn’t enough and we have to manually send each query to the same node. However, I am only using a single node right now for testing so we can get to handling multiple nodes later.
I also understand that the retry will have to be manual. I was not under the impression that it’s something the SDK will handle. What I was trying to look for when writing this example was the error message I will get with the write write conflict so that I can handle it manually. It seems that the SDK does not report it back, at least from what I can see.
I have tried calling Close() on the QueryResult as well but to no avail. What I would like to know is, if there should be any signal or error when using N1QL with Go SDK that will let me handle the retry behavior from the application logic. I think using the REST interface might be the only option from Go though as the SDK might need some more work before it can be used with N1QL transactions. The documentation should also be clear on this in my opinion.
package main
import (
"sync"
"github.com/couchbase/gocb/v2"
)
func main() {
cluster, err := gocb.Connect(
"localhost",
gocb.ClusterOptions{
Username: "muneebabbas",
Password: "muneeb",
})
if err != nil {
panic(err)
}
// Start a transaction and get the transaction ID
type txid struct {
TxID string `json:"txid"`
}
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
wg.Add(1)
go func() {
results, err := cluster.Query("START TRANSACTION;", nil)
if err != nil {
panic(err)
}
var tx txid
err = results.One(&tx)
if err != nil {
panic(err)
}
opts := &gocb.QueryOptions{Raw: map[string]interface{}{"txid": tx.TxID}}
_, err = cluster.Query("UPDATE customer SET balance = balance - 1 WHERE cid=1;", opts)
if err != nil {
panic(err)
}
_, err = cluster.Query("UPDATE customer SET balance = balance + 1 WHERE cid=2;", opts)
if err != nil {
panic(err)
}
results, err = cluster.Query("COMMIT;", opts)
if err != nil {
panic(err)
}
err = results.Close()
if err != nil {
panic(err)
}
wg.Done()
}()
}
wg.Wait()
}
This is the entire test file. I have explained the behavior and necessary pre conditions you would need to test this. I am not sure what more you’d need.
This is the crux of the issue I believe as the error certainly should be getting propagated back. I am not an expert on how gocb will surface this (perhaps @brett19 can help?), but at the JSON level, you will get an errors array, and if this contains a JSON blob with a cause field, then this information can be used to drive your program logic. cause will contain a rollback boolean field indicating whether you should issue a ROLLBACK statement, and a retry boolean field indicating whether you should then retry the full transaction. There is also a nested cause field that contains additional diagnostics information including a human readable description of the error.
I think after every cluster.Query() statement you need to iterate results and check error again (There might be better way)
This gives execution errors.
results, err = cluster.Query("COMMIT;", opts)
if err == nil {
for results.Next() {
}
err = results.Err()
}
if err != nil {
panic(err)
}
I think @vsr1 cracked it. The problem is that we have to iterate over the results and then check results.Err(). Looks a bit silly with the empty for loop though but doing that does return the correct underlying error.
results, err = cluster.Query("COMMIT;", opts)
if err != nil {
panic(err)
}
for results.Next() {
}
err = results.Err()
if err != nil {
panic(err)
}
When I do this, I get the following error when calling results.Err()
Just FYI, I think the error codes are not updated in the documentation. For example, there is no such error code documented on this page. This might need updating
I have one follow up question however. Is there a roadmap for the Go SDK? For example, having built in transactions support like the Java SDK would really help and is that something on the roadmap?
Glad you got to the bottom of how to process the errors. Looking through the Go SDK query documentation, I see that general pattern of reading the results and then checking the Err field is documented there.
Just FYI, I think the error codes are not updated in the documentation.
Ah yes, good point. I’ve raised a docs ticket for this. Until this has been updated, please use this ticket as a reference for the new transactions error codes.
I have one follow up question however. Is there a roadmap for the Go SDK? For example, having built in transactions support like the Java SDK would really help and is that something on the roadmap?
I’ll discuss this with the team today and get back to you.