DB transaction lock & How to handle deadlock

techschoolguru

TECH SCHOOL

Posted on August 27, 2020

DB transaction lock & How to handle deadlock

Locking is not easy as you think!

In the last lecture, we’ve learned how to implement a simple money transfer transaction. However, we haven’t updated the accounts’ balance yet because it’s more complicated and require careful handling of concurrent transactions to avoid deadlock.

So in this lecture we’re gonna implement this feature to learn more about database locking and how to handle a deadlock situation.

Here's:

Test Driven Development

Today I’m gonna use a different implementation approach, which is test driven development (or TDD). The idea is: we write tests first to make our current code breaks. Then we gradually improve the code until the tests pass.

OK, this is the test that we were working on in the previous video:



func TestTransferTx(t *testing.T) {
    store := NewStore(testDB)

    account1 := createRandomAccount(t)
    account2 := createRandomAccount(t)

    // run n concurrent transfer transactions
    n := 5
    amount := int64(10)

    errs := make(chan error)
    results := make(chan TransferTxResult)

    for i := 0; i < n; i++ {
        go func() {
            result, err := store.TransferTx(context.Background(), TransferTxParams{
                FromAccountID: account1.ID,
                ToAccountID:   account2.ID,
                Amount:        amount,
            })

            errs <- err
            results <- result
        }()
    }

    // check results
    for i := 0; i < n; i++ {
        err := <-errs
        require.NoError(t, err)

        result := <-results
        require.NotEmpty(t, result)

        // check transfer
        transfer := result.Transfer
        require.NotEmpty(t, transfer)
        require.Equal(t, account1.ID, transfer.FromAccountID)
        require.Equal(t, account2.ID, transfer.ToAccountID)
        require.Equal(t, amount, transfer.Amount)
        require.NotZero(t, transfer.ID)
        require.NotZero(t, transfer.CreatedAt)

        _, err = store.GetTransfer(context.Background(), transfer.ID)
        require.NoError(t, err)

        // check entries
        fromEntry := result.FromEntry
        require.NotEmpty(t, fromEntry)
        require.Equal(t, account1.ID, fromEntry.AccountID)
        require.Equal(t, -amount, fromEntry.Amount)
        require.NotZero(t, fromEntry.ID)
        require.NotZero(t, fromEntry.CreatedAt)

        _, err = store.GetEntry(context.Background(), fromEntry.ID)
        require.NoError(t, err)

        toEntry := result.ToEntry
        require.NotEmpty(t, toEntry)
        require.Equal(t, account2.ID, toEntry.AccountID)
        require.Equal(t, amount, toEntry.Amount)
        require.NotZero(t, toEntry.ID)
        require.NotZero(t, toEntry.CreatedAt)

        _, err = store.GetEntry(context.Background(), toEntry.ID)
        require.NoError(t, err)

        // TODO: check accounts' balance
    }
}


Enter fullscreen mode Exit fullscreen mode

It creates 5 go routines to execute 5 concurrent transfer transactions, where each of them will transfer the same amount of money from account 1 to account 2. Then it iterates through the list of results to check the created transfer and entry objects.

Now to finish this test, we need to check the output accounts and their balances.

Let’s start with the accounts. First the fromAccount, where money is going out. We check it should not be empty. And its ID should equal to account1.ID.

Similar for the toAccount, where money is going in. The account object should not be empty. And its ID should equal to account2.ID.



func TestTransferTx(t *testing.T) {
    ...

    // check results
    for i := 0; i < n; i++ {
        ...

        // check accounts
        fromAccount := result.FromAccount
        require.NotEmpty(t, fromAccount)
        require.Equal(t, account1.ID, fromAccount.ID)

        toAccount := result.ToAccount
        require.NotEmpty(t, toAccount)
        require.Equal(t, account2.ID, toAccount.ID)

        // TODO: check accounts' balance
    }
}


Enter fullscreen mode Exit fullscreen mode

Next we will check the accounts’ balance. We calculate the difference diff1 between the input account1.Balance and the output fromAccount.Balance. This diff1 is the amount of money that’s going out of account1.

Similarly, we calculate the difference diff2 between the output toAccount.Balance and the input account2.Balance. This diff2 is the amount of money that’s going in to account2.



func TestTransferTx(t *testing.T) {
    ...

    // check results
    for i := 0; i < n; i++ {
        ...

        // check accounts' balance
        diff1 := account1.Balance - fromAccount.Balance
        diff2 := toAccount.Balance - account2.Balance
        require.Equal(t, diff1, diff2)
        require.True(t, diff1 > 0)
        require.True(t, diff1%amount == 0) // 1 * amount, 2 * amount, 3 * amount, ..., n * amount
    }
}


Enter fullscreen mode Exit fullscreen mode

If the transaction works correctly then diff1 and diff2 should be the same, and they should be a positive number.

Also, this difference should be divisible by the amount of money that moves in each transaction. The reason is, the balance of account 1 will be decreased by 1 times amount after the 1st transaction, then 2 times amount after the 2nd transaction, 3 times amount after the 3rd transaction, and so on and so forth.

Because of this, If we compute k = diff1 / amount, then k must be an integer between 1 and n, where n is the number of executed transactions.



func TestTransferTx(t *testing.T) {
    ...

    // check results
    existed := make(map[int]bool)

    for i := 0; i < n; i++ {
        ...

        // check accounts' balance
        ...

        k := int(diff1 / amount)
        require.True(t, k >= 1 && k <= n)

        require.NotContains(t, existed, k)
        existed[k] = true
    }
}


Enter fullscreen mode Exit fullscreen mode

Moreover, k must be unique for each transaction, which means k should be 1 for the 1st transaction, 2 for the second, 3 for the 3rd and so on until k equals to n.

In order to check this, we need to declare a new variable called existed of type map[int]bool. Then in the loop, check that the existed map should not contain k. Then we set existed[k] to true after that.

Eventually, after the for loop, we should check the final updated balances of the 2 accounts.

First we get the updated account 1 from the database by calling store.GetAccount() with a background context and the ID of account 1. This query should not return an error. We get the updated account 2 from the database in the same manner.



func TestTransferTx(t *testing.T) {
    ...

    // check results
    existed := make(map[int]bool)
    for i := 0; i < n; i++ {
        ...
    }

    // check the final updated balance
    updatedAccount1, err := store.GetAccount(context.Background(), account1.ID)
    require.NoError(t, err)

    updatedAccount2, err := store.GetAccount(context.Background(), account2.ID)
    require.NoError(t, err)

    require.Equal(t, account1.Balance-int64(n)*amount, updatedAccount1.Balance)
    require.Equal(t, account2.Balance+int64(n)*amount, updatedAccount2.Balance)
}


Enter fullscreen mode Exit fullscreen mode

Now after n transactions, the balance of account 1 must decrease by n * amount. So we require the updatedAccount1.Balance to equal to that value. amount is of type int64, so we need to convert n to int64 before doing the multiplication.

We do the same for the updatedAccount2.Balance, except that its value should be increasing by n * amount istead of decreasing.

And that’s it! We’re done with the test. But before running it, I’m gonna write some logs to see the results more clearly.

First, let’s print out the balance of the accounts before the transaction. Then print out their updated balances after all the transactions are executed. I also want to see the result balances after each transaction so let’s add a log in the for loop as well.

OK, this is our final test:



func TestTransferTx(t *testing.T) {
    store := NewStore(testDB)

    account1 := createRandomAccount(t)
    account2 := createRandomAccount(t)
    fmt.Println(">> before:", account1.Balance, account2.Balance)

    n := 5
    amount := int64(10)

    errs := make(chan error)
    results := make(chan TransferTxResult)

    // run n concurrent transfer transaction
    for i := 0; i < n; i++ {
        go func() {
            result, err := store.TransferTx(context.Background(), TransferTxParams{
                FromAccountID: account1.ID,
                ToAccountID:   account2.ID,
                Amount:        amount,
            })

            errs <- err
            results <- result
        }()
    }

    // check results
    existed := make(map[int]bool)

    for i := 0; i < n; i++ {
        err := <-errs
        require.NoError(t, err)

        result := <-results
        require.NotEmpty(t, result)

        // check transfer
        transfer := result.Transfer
        require.NotEmpty(t, transfer)
        require.Equal(t, account1.ID, transfer.FromAccountID)
        require.Equal(t, account2.ID, transfer.ToAccountID)
        require.Equal(t, amount, transfer.Amount)
        require.NotZero(t, transfer.ID)
        require.NotZero(t, transfer.CreatedAt)

        _, err = store.GetTransfer(context.Background(), transfer.ID)
        require.NoError(t, err)

        // check entries
        fromEntry := result.FromEntry
        require.NotEmpty(t, fromEntry)
        require.Equal(t, account1.ID, fromEntry.AccountID)
        require.Equal(t, -amount, fromEntry.Amount)
        require.NotZero(t, fromEntry.ID)
        require.NotZero(t, fromEntry.CreatedAt)

        _, err = store.GetEntry(context.Background(), fromEntry.ID)
        require.NoError(t, err)

        toEntry := result.ToEntry
        require.NotEmpty(t, toEntry)
        require.Equal(t, account2.ID, toEntry.AccountID)
        require.Equal(t, amount, toEntry.Amount)
        require.NotZero(t, toEntry.ID)
        require.NotZero(t, toEntry.CreatedAt)

        _, err = store.GetEntry(context.Background(), toEntry.ID)
        require.NoError(t, err)

        // check accounts
        fromAccount := result.FromAccount
        require.NotEmpty(t, fromAccount)
        require.Equal(t, account1.ID, fromAccount.ID)

        toAccount := result.ToAccount
        require.NotEmpty(t, toAccount)
        require.Equal(t, account2.ID, toAccount.ID)

        // check balances
        fmt.Println(">> tx:", fromAccount.Balance, toAccount.Balance)

        diff1 := account1.Balance - fromAccount.Balance
        diff2 := toAccount.Balance - account2.Balance
        require.Equal(t, diff1, diff2)
        require.True(t, diff1 > 0)
        require.True(t, diff1%amount == 0) // 1 * amount, 2 * amount, 3 * amount, ..., n * amount

        k := int(diff1 / amount)
        require.True(t, k >= 1 && k <= n)
        require.NotContains(t, existed, k)
        existed[k] = true
    }

    // check the final updated balance
    updatedAccount1, err := store.GetAccount(context.Background(), account1.ID)
    require.NoError(t, err)

    updatedAccount2, err := store.GetAccount(context.Background(), account2.ID)
    require.NoError(t, err)

    fmt.Println(">> after:", updatedAccount1.Balance, updatedAccount2.Balance)

    require.Equal(t, account1.Balance-int64(n)*amount, updatedAccount1.Balance)
    require.Equal(t, account2.Balance+int64(n)*amount, updatedAccount2.Balance)
}


Enter fullscreen mode Exit fullscreen mode

Let's run it!

Alt Text

It fails at line 83, where we expect the fromAccount to be not empty. But of course it is empty at the moment, because we haven’t implemented the feature yet.

So let’s go back to the store.go file to implement it!

Update account balances [the wrong way]

One easy and intuitive way to change an account’s balance is to first get that account from the database, then add or subtract some amount of money from its balance, and update it back to the database.

However, this is often done incorrectly without a proper locking mechanism. I'm gonna show you how!

First we call q.GetAccount() to get the fromAccount record and assign it to account1 variable. If err is not nil, we return it.



func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    var result TransferTxResult

    err := store.execTx(ctx, func(q *Queries) error {
        ...

        // move money out of account1
        account1, err := q.GetAccount(ctx, arg.FromAccountID)
        if err != nil {
            return err
        }

        result.FromAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
            ID:      arg.FromAccountID,
            Balance: account1.Balance - arg.Amount,
        })
        if err != nil {
            return err
        }
    }

    return result, err
}


Enter fullscreen mode Exit fullscreen mode

Else, we call q.UpdateAccount() to update this account’s balance. The ID should be arg.FromAccountID, and the balance will be changed to account1.Balance - arg.Amount because money is going out of this acount.

The updated account record will be saved to result.FromAccount. And if we get an error, just return it.

After this, we have moved money out of the fromAccount. Now we can do similar thing to move those money into the toAccount.



func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    var result TransferTxResult

    err := store.execTx(ctx, func(q *Queries) error {
        ...

        // move money out of account1
        ...

        // move money into account2
        account2, err := q.GetAccount(ctx, arg.ToAccountID)
        if err != nil {
            return err
        }

        result.ToAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
            ID:      arg.ToAccountID,
            Balance: account2.Balance + arg.Amount,
        })
        if err != nil {
            return err
        }
    }

    return result, err
}


Enter fullscreen mode Exit fullscreen mode

Here, the account ID should be arg.ToAccountID. The result will be stored in result.ToAccount. And the new balance should be account2.Balance + arg.Amount because money is going into this account.

OK so the implementation is done. However, I’m telling you it’s incorrect. Let’s rerun our test to see how it goes!

Alt Text

The test still fails. But this time the error is on line 94, where we compare the amount of money that goes out of account 1 with those that goes into account 2.

In the log, we can see that the first transaction is correct. The balance of account 1 decreases by 10, from 380 to 370. And the balance of account 2 increases by the same amount, from 390 to 400.

But it doesn’t work correctly in the second transaction. The balance of account 2 increases by 10 more, to 410. While the balance of account 1 stays the same, at 370.

To understand why, let’s look at the GetAccount query:



-- name: GetAccount :one
SELECT * FROM accounts
WHERE id = $1 LIMIT 1;


Enter fullscreen mode Exit fullscreen mode

It’s just a normal SELECT, so it doesn’t block other transactions from reading the same Account record.

Therefore, 2 concurrent transactions can get the same value of the account 1, with original balance of 380. Thus it explains why both of them have the updated balance of 370 after execution.

Query without lock

To demonstrate this scenario, let’s start the psql console in 2 different terminal tabs and run 2 parallel transactions.

In the 1st transaction, let’s run a normal SELECT query to get the account record with ID = 1.



SELECT * FROM accounts WHERE id = 1;


Enter fullscreen mode Exit fullscreen mode

Alt Text

This account has balance of 748 USD.

Now I’m gonna run this query in the other transaction.

Alt Text

As you can see, the same account record is returned immediately without being blocked. This is not what we want. So let’s rollback both transactions and learn how to fix it.

Query with lock

I will start 2 new transactions. But this time, we will add FOR UPDATE clause at the end of the SELECT statement.



SELECT * FROM accounts WHERE id = 1 FOR UPDATE;


Enter fullscreen mode Exit fullscreen mode

Alt Text

Now the first transaction still gets the record immediately. But when we run this query on the second transaction:

Alt Text

It is blocked and has to wait for the first transaction to COMMIT or ROLLBACK.

Let’s go back to that transaction and update the account balance to 500:



UPDATE accounts SET balance = 500 WHERE id = 1;


Enter fullscreen mode Exit fullscreen mode

Alt Text

After this update, the second transaction is still blocked. However, as soon as we COMMIT the first transaction:

Alt Text

We can see that the second transaction is unblocked right away, and it gets the newly updated account with balance of 500 EUR. That’s exactly what we want to achieve!

Update account balance with lock

Let’s go back to the account.sql file, and add a new query to get account for update:



-- name: GetAccountForUpdate :one
SELECT * FROM accounts
WHERE id = $1 LIMIT 1
FOR UPDATE;


Enter fullscreen mode Exit fullscreen mode

Then we open the terminal and run make sqlc to regenerate the code. Now in the account.sql.go file, a new GetAccountForUpdate() function is generated.



const getAccountForUpdate = `-- name: GetAccountForUpdate :one
SELECT id, owner, balance, currency, created_at FROM accounts
WHERE id = $1 LIMIT 1
FOR UPDATE
`

func (q *Queries) GetAccountForUpdate(ctx context.Context, id int64) (Account, error) {
    row := q.db.QueryRowContext(ctx, getAccountForUpdate, id)
    var i Account
    err := row.Scan(
        &i.ID,
        &i.Owner,
        &i.Balance,
        &i.Currency,
        &i.CreatedAt,
    )
    return i, err
}


Enter fullscreen mode Exit fullscreen mode

We can use it in our money transfer transaction. Here, to get the first account, we call q.GetAccountForUpdate() instead of q.GetAccount(). We do the same thing to get the second account.



func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    var result TransferTxResult

    err := store.execTx(ctx, func(q *Queries) error {
        ...

        // move money out of account1
        account1, err := q.GetAccountForUpdate(ctx, arg.FromAccountID)
        if err != nil {
            return err
        }

        result.FromAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
            ID:      arg.FromAccountID,
            Balance: account1.Balance - arg.Amount,
        })
        if err != nil {
            return err
        }

        // move money into account2
        account2, err := q.GetAccountForUpdate(ctx, arg.ToAccountID)
        if err != nil {
            return err
        }

        result.ToAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
            ID:      arg.ToAccountID,
            Balance: account2.Balance + arg.Amount,
        })
        if err != nil {
            return err
        }
    }

    return result, err
}


Enter fullscreen mode Exit fullscreen mode

Alright, now we expect this to work. Let’s rerun our test.

Alt Text

Unfortunately, it still fails. This time the error is deadlock detected. So what can we do?

Don’t worry! I’m gonna show you how to debug this deadlock situation.

Debug a deadlock

In order to figure out why deadlock occured, we need to print out some logs to see which transaction is calling which query and in which order.

For that, we have to assign a name for each transaction and pass it into the TransferTx() function via the context argument.

Now inside this for loop of the test, I’m gonna create a txName variable to store the name of the transaction. We use the fmt.Sprintf() function and the counter i to create different names: tx 1, tx 2, tx 3, and so on.

Then inside the go routine, instead of passing in the background context, we will pass in a new context with the transaction name.



func TestTransferTx(t *testing.T) {
    ...

    // run n concurrent transfer transaction
    for i := 0; i < n; i++ {
        txName := fmt.Sprintf("tx %d", i+1)

        go func() {
            ctx := context.WithValue(context.Background(), txKey, txName)

            result, err := store.TransferTx(ctx, TransferTxParams{
                FromAccountID: account1.ID,
                ToAccountID:   account2.ID,
                Amount:        amount,
            })

            errs <- err
            results <- result
        }()
    }

    // check results
    ...
}


Enter fullscreen mode Exit fullscreen mode

To add the transaction name to the context, we call context.WithValue(), pass in the background context as its parent, and a pair of key value, where value is the transaction name.

In the documentation, it says the context key should not be of type string or any built-in type to avoid collisions between packages. Normally we should define a variable of type struct{} for the context key.

So I’m gonna add a new txKey variable in the store.go file, because later we will have to use this key to get the transaction name from the input context of the TransferTx() function.



var txKey = struct{}{}

func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    ...
}

...


Enter fullscreen mode Exit fullscreen mode

Here, the 2nd bracket in struct{}{} means that we’re creating a new empty object of type struct{}.

Now in the TransferTx() function, the context will hold the transaction name. We can get it back by calling ctx.Value() to get the value of the txKey from the context.

Now we have the transaction name, we can write some logs with it. Let’s print out this transaction name and the first operation: create transfer. Then do the same for the rest of the operations:



func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    var result TransferTxResult

    err := store.execTx(ctx, func(q *Queries) error {
        var err error

        txName := ctx.Value(txKey)

        fmt.Println(txName, "create transfer")
        result.Transfer, err = q.CreateTransfer(ctx, CreateTransferParams{
            FromAccountID: arg.FromAccountID,
            ToAccountID:   arg.ToAccountID,
            Amount:        arg.Amount,
        })
        if err != nil {
            return err
        }

        fmt.Println(txName, "create entry 1")
        result.FromEntry, err = q.CreateEntry(ctx, CreateEntryParams{
            AccountID: arg.FromAccountID,
            Amount:    -arg.Amount,
        })
        if err != nil {
            return err
        }

        fmt.Println(txName, "create entry 2")
        result.ToEntry, err = q.CreateEntry(ctx, CreateEntryParams{
            AccountID: arg.ToAccountID,
            Amount:    arg.Amount,
        })
        if err != nil {
            return err
        }

        // move money out of account1
        fmt.Println(txName, "get account 1")
        account1, err := q.GetAccountForUpdate(ctx, arg.FromAccountID)
        if err != nil {
            return err
        }

        fmt.Println(txName, "update account 1")
        result.FromAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
            ID:      arg.FromAccountID,
            Balance: account1.Balance - arg.Amount,
        })
        if err != nil {
            return err
        }

        // move money into account2
        fmt.Println(txName, "get account 2")
        account2, err := q.GetAccountForUpdate(ctx, arg.ToAccountID)
        if err != nil {
            return err
        }

        fmt.Println(txName, "update account 2")
        result.ToAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
            ID:      arg.ToAccountID,
            Balance: account2.Balance + arg.Amount,
        })
        if err != nil {
            return err
        }
    })

    return result, err
}


Enter fullscreen mode Exit fullscreen mode

Alright, now the logs are added, we can rerun the test to see how it goes.

But to make it easier to debug, we should not run too many concurrent transactions. So I’m gonna change this n to 2 instead of 5.



func TestTransferTx(t *testing.T) {
    ...

    n := 2
    amount := int64(10)

    errs := make(chan error)
    results := make(chan TransferTxResult)

    // run n concurrent transfer transaction
    ...
}


Enter fullscreen mode Exit fullscreen mode

Then let’s run the test!

Alt Text

And voila, we still got the deadlock. But this time, we have a detailed logs of what happened.

As you can see here:

  • Transaction 2 ran its first 2 operations: create transfer and create entry 1.
  • Then transaction 1 jumped in to run its create transfer operation.
  • Transaction 2 came back and continued running its next 2 operations: create entry 2 and get account 1.
  • Finally the transaction 1 took turn and ran its next 4 operations: create entry 1, create entry 2, get account 1, and update account 1.
  • At this point, we got a deadlock.

So now we know exactly what happened. What we have to do is to find out the reason why it happened.

Replicate deadlock in psql console

Here I have opened the simple_bank database in TablePlus. At the moment, it has 2 accounts with the same original balance of 100 USD.

Alt Text

I also prepared the money transfer transaction with the list of SQL queries that should be run exactly as we implemented in our Golang code:



BEGIN;

SELECT * FROM accounts WHERE id = 1;

INSERT INTO transfers (from_account_id, to_account_id, amount) VALUES (1, 2, 10) RETURNING *;

INSERT INTO entries (account_id, amount) VALUES (1, -10) RETURNING *;
INSERT INTO entries (account_id, amount) VALUES (2, 10) RETURNING *;

SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = 90 WHERE id = 1 RETURNING *;

SELECT * FROM accounts WHERE id = 2 FOR UPDATE;
UPDATE accounts SET balance = 110 WHERE id = 2 RETURNING *;

ROLLBACK;


Enter fullscreen mode Exit fullscreen mode
  • The transaction starts with the BEGIN statement.
  • First we INSERT a new transfer record from account 1 to account 2 with amount of 10.
  • Then we INSERT a new entry record for account 1 with amount of -10.
  • And INSERT another entry record for account 2 with amount of +10.
  • Next we SELECT account 1 for update.
  • And we UPDATE its balance to 100-10, which is 90 USD.
  • Similarly, we SELECT account 2 for update.
  • And we UPDATE its balance to 100+10, which equals to 110 USD.
  • Finally we do a ROLLBACK when a deadlock occurs.

Now just like what we did before, I’m gonna open the terminal and run 2 psql console in order to execute 2 transactions in parallel.

Let’s start the first transaction with BEGIN. Then open another tab and access the psql console. Start the second transaction with BEGIN.

Now, we should follow the steps in the logs. First, transaction 2 should run its 2 first queries to create the transfer and entry 1 records:

Alt Text

Inserted successfully! Now we have to move to transaction 1 and run the 1st query to create transfer record.

Alt Text

Now back to transaction 2 and run its 3rd query to create entry 2 and the 4th query to get account 1 for update.

Alt Text

Now we see that this query is blocked. It is waiting for the transaction 2 to commit or rollback before continue.

It sounds strange because transaction 2 only creates a record in transfers table while we’re getting a record from accounts table. Why a INSERT into 1 table can block a SELECT from other table?

To confirm this, let’s open this Postgres Wiki page about lock monitoring.



SELECT blocked_locks.pid     AS blocked_pid,
        blocked_activity.usename  AS blocked_user,
        blocking_locks.pid     AS blocking_pid,
        blocking_activity.usename AS blocking_user,
        blocked_activity.query    AS blocked_statement,
        blocking_activity.query   AS current_statement_in_blocking_process
FROM  pg_catalog.pg_locks         blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity  ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks         blocking_locks 
    ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
    AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
    AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
    AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
    AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
    AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
    AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
    AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
    AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
    AND blocking_locks.pid != blocked_locks.pid

JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;


Enter fullscreen mode Exit fullscreen mode

This long and complex query allows us to look for blocked queries and what is blocking them. So let’s copy and run it in TablePlus.

Alt Text

As you can see, the blocked statement is SELECT FROM accounts FOR UPDATE. And the one that’s blocking it is INSERT INTO transfers. So it’s true that queries on these 2 different tables can block each other.

Let’s dig deeper to understand why the SELECT query has to wait for the INSERT query.

If we go back to the Postgres Wiki and scroll down a bit, we will see another query that will allow us to list all the locks in our database.

I’m gonna modify this query a bit because I want to see more information:



SELECT
    a.datname,
    a.application_name,
    l.relation::regclass,
    l.transactionid,
    l.mode,
    l.locktype,
    l.GRANTED,
    a.usename,
    a.query,
    a.pid
FROM
    pg_stat_activity a
    JOIN pg_locks l ON
    l.pid = a.pid
ORDER BY
    a.pid;


Enter fullscreen mode Exit fullscreen mode
  • The a.datname field will show us the database name.
  • Let’s add a.application_name to see which application the lock comes from.
  • The l.relation regclass is actually the name of the table,
  • L.transactionid is the ID of the transaction that the lock come from.
  • L.mod is the mod of the lock.
  • Let’s also add l.lock_type to see the type of the lock.
  • L.granted tells us whether the lock is granted or not.
  • a.usename is the username who run the query.
  • a.query is the query that’s holding or trying to acquire the lock.
  • The time when that query started a.query_start or its age are not very important, so I’m gonna remove them.
  • The last field is a.pid, which is the process ID that's running the transaction.

As you can see, we’re selecting from the pg_state_activity table, alias as a, and join with the pg_locks table, alias as l, on the process ID column.

It’s ordering by query start time, but actually I think order by process ID is better because we have 2 different processes that are running 2 psql consoles with 2 parallel transactions. So it will be easier to see which lock belong to which transaction.

Alright, let’s run it!

Alt Text

Here we can see some locks from TablePlus application, which are not relevant. What we care about is only the locks that came from psql consoles.

So I’m gonna add a WHERE clause to get only the locks with application name equals psql.

The database name is also not important because it’s always simple_bank in our case. So I will remove a.datname as well.

OK let’s run this query again:



SELECT
    a.application_name,
    l.relation::regclass,
    l.transactionid,
    l.mode,
    l.locktype,
    l.GRANTED,
    a.usename,
    a.query,
    a.pid
FROM
    pg_stat_activity a
    JOIN pg_locks l ON
    l.pid = a.pid
WHERE
    a.application_name = 'psql'
ORDER BY
    a.pid;


Enter fullscreen mode Exit fullscreen mode

Alt Text

Now we can see, there is only 1 lock that hasn’t been granted yet. It comes from the SELECT FROM accounts query of the process ID 3053.

The reason it’s not granted is because it is trying to acquire a ShareLock of type transactionid, where the transaction ID is 2442. While this transaction ID lock is being held exclusively by the other process ID 3047 with the INSERT INTO transfers query.

But why a SELECT FROM accounts table needs to get a lock from other transaction that runs INSERT INTO transfers table?

Well, if we look at the database schema, we can see that the only connection between accounts and transfers table is the foreign key constraint:



ALTER TABLE "entries" ADD FOREIGN KEY ("account_id") REFERENCES "accounts" ("id");

ALTER TABLE "transfers" ADD FOREIGN KEY ("from_account_id") REFERENCES "accounts" ("id");

ALTER TABLE "transfers" ADD FOREIGN KEY ("to_account_id") REFERENCES "accounts" ("id");


Enter fullscreen mode Exit fullscreen mode

The from_account_id and to_account_id columns of transfers table are referencing the id column of accounts table. So any UPDATE on the account ID will affect this foreign key constraint.

That’s why when we select an account for update, it needs to acquire a lock to prevent conflicts and ensure the consistency of the data.

Having said that, now if we continue running the rest of the queries on transaction 1 to create entry 1, create entry 2, and select account 1 for update:

Alt Text

We will get a deadlock because this query also has to wait for a lock from transaction 2, while transaction 2 is also waiting for a lock from this transaction 1.

And that clearly explains how the deadlock happens. But how to fix it?

Fix deadlock [the bad way]

As we know, the deadlock is caused by foreign key constraints, so one simple way to avoid it is to remove those constraints.

Let’s try comment out these statements in the init_schema.up.sql file:



-- ALTER TABLE "entries" ADD FOREIGN KEY ("account_id") REFERENCES "accounts" ("id");

-- ALTER TABLE "transfers" ADD FOREIGN KEY ("from_account_id") REFERENCES "accounts" ("id");

-- ALTER TABLE "transfers" ADD FOREIGN KEY ("to_account_id") REFERENCES "accounts" ("id");


Enter fullscreen mode Exit fullscreen mode

Then run make migratedown in the terminal to delete the database schema. And run make migrateup to recreate the new db schema without foreign key constraints.

Alt Text

Alright, now if we run the test again, it will pass because the constraints are gone, so no lock is required when select accounts for update. And no lock means no deadlock.

Alt Text

However, this is not the best solution, because we don’t want to loose our nice constraints that keep our data consistent.

So let’s revert these changes, run make migratedown, then make migrateup again to have those constraints back. Now the test will fail because of deadlock again.

Let’s learn a better way to fix this issue.

Fix dead lock [the better way]

As we already know, the transaction lock is only required because Postgres worries that transaction 1 will update the account ID, which would affect the foreign key constraints of transfers table.

However, if we look at the UpdateAccount query, we can see that it only change the account balance.



-- name: UpdateAccount :one
UPDATE accounts
SET balance = $2
WHERE id = $1
RETURNING *;


Enter fullscreen mode Exit fullscreen mode

The account ID will never be changed because it’s the primary key of accounts table.

So if we can tell Postgres that I’m selecting this account for update, but its primary key won’t be touched, then Postgres will not need to acquire the transaction lock, and thus no deadlock.

Fortunately, it’s super easy to do so. In the GetAccountForUpdate query, instead of just SELECT FOR UPDATE, we just need to say more clearly: SELECT FOR NO KEY UPDATE



-- name: GetAccountForUpdate :one
SELECT * FROM accounts
WHERE id = $1 LIMIT 1
FOR NO KEY UPDATE;


Enter fullscreen mode Exit fullscreen mode

This will tell Postgres that we don’t update the key, or ID column of accounts table.

Now let’s run make sqlc in the terminal to regenerate golang code for this query.



const getAccountForUpdate = `-- name: GetAccountForUpdate :one
SELECT id, owner, balance, currency, created_at FROM accounts
WHERE id = $1 LIMIT 1
FOR NO KEY UPDATE
`

func (q *Queries) GetAccountForUpdate(ctx context.Context, id int64) (Account, error) {
    row := q.db.QueryRowContext(ctx, getAccountForUpdate, id)
    var i Account
    err := row.Scan(
        &i.ID,
        &i.Owner,
        &i.Balance,
        &i.Currency,
        &i.CreatedAt,
    )
    return i, err
}


Enter fullscreen mode Exit fullscreen mode

OK the code is updated. Let’s run our test again!

Alt Text

It passed! Excellent! So our debugging and fixing is done.

Update account balance [the better way]

Now before we finish, I’m gonna show you a much better way to implement this update account balance operation.

Currently, we have to perform 2 queries to get the account and update its balance:



func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    var result TransferTxResult

    err := store.execTx(ctx, func(q *Queries) error {
        ...

        // move money out of account1
        account1, err := q.GetAccountForUpdate(ctx, arg.FromAccountID)
        if err != nil {
            return err
        }

        result.FromAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
            ID:      arg.FromAccountID,
            Balance: account1.Balance - arg.Amount,
        })
        if err != nil {
            return err
        }

        // move money into account2
        ...
    })

    return result, err
}


Enter fullscreen mode Exit fullscreen mode

We can improve this by using only 1 single query to add some amount of money to the account balance directly.

For that, I’m gonna add a new SQL query called AddAccountBalance to the query/account.sql file.



-- name: AddAccountBalance :one
UPDATE accounts
SET balance = balance + $1
WHERE id = $2
RETURNING *;



Enter fullscreen mode Exit fullscreen mode

It’s similar to the UpdateAccount query, except that, here we set balance = balance + $2.

Let’s run make sqlc to generate the code. A new function is successfully added to the Queries struct:



const addAccountBalance = `-- name: AddAccountBalance :one
UPDATE accounts
SET balance = balance + $1
WHERE id = $2
RETURNING id, owner, balance, currency, created_at
`

type AddAccountBalanceParams struct {
    Balance int64 `json:"balance"`
    ID     int64 `json:"id"`
}

func (q *Queries) AddAccountBalance(ctx context.Context, arg AddAccountBalanceParams) (Account, error) {
    row := q.db.QueryRowContext(ctx, addAccountBalance, arg.Balance, arg.ID)
    var i Account
    err := row.Scan(
        &i.ID,
        &i.Owner,
        &i.Balance,
        &i.Currency,
        &i.CreatedAt,
    )
    return i, err
}


Enter fullscreen mode Exit fullscreen mode

However the balance parameter inside AddAccountBalanceParams struct looks a bit confusing, because we’re just adding some amount of money to the balance, not changing the account balance to this value.

So this parameter’s name should be Amount instead. Can we tell sqlc to do that for us?

Yes, we can! In the SQL query, instead of $2, we can say sqlc.arg(amount), and instead of $1, we should say sqlc.arg(id)



-- name: AddAccountBalance :one
UPDATE accounts
SET balance = balance + sqlc.arg(amount)
WHERE id = sqlc.arg(id)
RETURNING *;


Enter fullscreen mode Exit fullscreen mode

This amount and id will be the name of the generated parameters. Let’s run make sqlc in the terminal to regenerate the code.



const addAccountBalance = `-- name: AddAccountBalance :one
UPDATE accounts
SET balance = balance + $1
WHERE id = $2
RETURNING id, owner, balance, currency, created_at
`

type AddAccountBalanceParams struct {
    Amount int64 `json:"amount"`
    ID     int64 `json:"id"`
}

func (q *Queries) AddAccountBalance(ctx context.Context, arg AddAccountBalanceParams) (Account, error) {
    row := q.db.QueryRowContext(ctx, addAccountBalance, arg.Amount, arg.ID)
    var i Account
    err := row.Scan(
        &i.ID,
        &i.Owner,
        &i.Balance,
        &i.Currency,
        &i.CreatedAt,
    )
    return i, err
}


Enter fullscreen mode Exit fullscreen mode

This time, we can see the parameters’ name have changed to what we want. Cool!

Now come back to the store.go file, I’m gonna remove the GetAccountForUpdate call, and change UpdateAccount() to AddAccountBalance():



func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
    var result TransferTxResult

    err := store.execTx(ctx, func(q *Queries) error {
        ...

        // move money out of account1
        result.FromAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
            ID:     arg.FromAccountID,
            Amount: -arg.Amount,
        })
        if err != nil {
            return err
        }

        // move money into account2
        result.ToAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
            ID:     arg.ToAccountID,
            Amount: arg.Amount,
        })
        if err != nil {
            return err
        }

        return nil
    })

    return result, err
    }


Enter fullscreen mode Exit fullscreen mode

Note that the Amount to add to account1 should be -amount because money is moving out.

And we’re done! Let’s rerun the test.

Alt Text

Yee! It passed! Let’s run the whole package test.

Alt Text

All passed!

And that’s it for today’s lecture about locking in db transaction and how to debug a deadlock. I hope you enjoy it.

Stay tuned for the next lecture, because I’m telling you the deadlock issue is not completely resolved yet. There are much more to learn about it.

In the mean time, happy coding and I’ll see you very soon!


If you like the article, please subscribe to our Youtube channel and follow us on Twitter for more tutorials in the future.


If you want to join me on my current amazing team at Voodoo, check out our job openings here. Remote or onsite in Paris/Amsterdam/London/Berlin/Barcelona with visa sponsorship.

💖 💪 🙅 🚩
techschoolguru
TECH SCHOOL

Posted on August 27, 2020

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related