TECH SCHOOL
Posted on August 27, 2020
Locking is not easy as you think!
In the last lecture, we’ve learned how to implement a simple money transfer transaction. However, we haven’t updated the accounts’ balance yet because it’s more complicated and require careful handling of concurrent transactions to avoid deadlock.
So in this lecture we’re gonna implement this feature to learn more about database locking and how to handle a deadlock situation.
Here's:
- Link to the full series playlist on Youtube
- And its Github repository
Test Driven Development
Today I’m gonna use a different implementation approach, which is test driven development (or TDD). The idea is: we write tests first to make our current code breaks. Then we gradually improve the code until the tests pass.
OK, this is the test that we were working on in the previous video:
func TestTransferTx(t *testing.T) {
store := NewStore(testDB)
account1 := createRandomAccount(t)
account2 := createRandomAccount(t)
// run n concurrent transfer transactions
n := 5
amount := int64(10)
errs := make(chan error)
results := make(chan TransferTxResult)
for i := 0; i < n; i++ {
go func() {
result, err := store.TransferTx(context.Background(), TransferTxParams{
FromAccountID: account1.ID,
ToAccountID: account2.ID,
Amount: amount,
})
errs <- err
results <- result
}()
}
// check results
for i := 0; i < n; i++ {
err := <-errs
require.NoError(t, err)
result := <-results
require.NotEmpty(t, result)
// check transfer
transfer := result.Transfer
require.NotEmpty(t, transfer)
require.Equal(t, account1.ID, transfer.FromAccountID)
require.Equal(t, account2.ID, transfer.ToAccountID)
require.Equal(t, amount, transfer.Amount)
require.NotZero(t, transfer.ID)
require.NotZero(t, transfer.CreatedAt)
_, err = store.GetTransfer(context.Background(), transfer.ID)
require.NoError(t, err)
// check entries
fromEntry := result.FromEntry
require.NotEmpty(t, fromEntry)
require.Equal(t, account1.ID, fromEntry.AccountID)
require.Equal(t, -amount, fromEntry.Amount)
require.NotZero(t, fromEntry.ID)
require.NotZero(t, fromEntry.CreatedAt)
_, err = store.GetEntry(context.Background(), fromEntry.ID)
require.NoError(t, err)
toEntry := result.ToEntry
require.NotEmpty(t, toEntry)
require.Equal(t, account2.ID, toEntry.AccountID)
require.Equal(t, amount, toEntry.Amount)
require.NotZero(t, toEntry.ID)
require.NotZero(t, toEntry.CreatedAt)
_, err = store.GetEntry(context.Background(), toEntry.ID)
require.NoError(t, err)
// TODO: check accounts' balance
}
}
It creates 5 go routines to execute 5 concurrent transfer transactions, where each of them will transfer the same amount of money from account 1 to account 2. Then it iterates through the list of results to check the created transfer and entry objects.
Now to finish this test, we need to check the output accounts and their balances.
Let’s start with the accounts. First the fromAccount
, where money is going out. We check it should not be empty. And its ID
should equal to account1.ID
.
Similar for the toAccount
, where money is going in. The account object should not be empty. And its ID
should equal to account2.ID
.
func TestTransferTx(t *testing.T) {
...
// check results
for i := 0; i < n; i++ {
...
// check accounts
fromAccount := result.FromAccount
require.NotEmpty(t, fromAccount)
require.Equal(t, account1.ID, fromAccount.ID)
toAccount := result.ToAccount
require.NotEmpty(t, toAccount)
require.Equal(t, account2.ID, toAccount.ID)
// TODO: check accounts' balance
}
}
Next we will check the accounts’ balance. We calculate the difference diff1
between the input account1.Balance
and the output fromAccount.Balance
. This diff1
is the amount of money that’s going out of account1.
Similarly, we calculate the difference diff2
between the output toAccount.Balance
and the input account2.Balance
. This diff2
is the amount of money that’s going in to account2.
func TestTransferTx(t *testing.T) {
...
// check results
for i := 0; i < n; i++ {
...
// check accounts' balance
diff1 := account1.Balance - fromAccount.Balance
diff2 := toAccount.Balance - account2.Balance
require.Equal(t, diff1, diff2)
require.True(t, diff1 > 0)
require.True(t, diff1%amount == 0) // 1 * amount, 2 * amount, 3 * amount, ..., n * amount
}
}
If the transaction works correctly then diff1
and diff2
should be the same, and they should be a positive number.
Also, this difference should be divisible by the amount
of money that moves in each transaction. The reason is, the balance of account 1 will be decreased by 1 times amount after the 1st transaction, then 2 times amount after the 2nd transaction, 3 times amount after the 3rd transaction, and so on and so forth.
Because of this, If we compute k = diff1 / amount
, then k
must be an integer between 1 and n
, where n
is the number of executed transactions.
func TestTransferTx(t *testing.T) {
...
// check results
existed := make(map[int]bool)
for i := 0; i < n; i++ {
...
// check accounts' balance
...
k := int(diff1 / amount)
require.True(t, k >= 1 && k <= n)
require.NotContains(t, existed, k)
existed[k] = true
}
}
Moreover, k
must be unique for each transaction, which means k
should be 1 for the 1st transaction, 2 for the second, 3 for the 3rd and so on until k
equals to n
.
In order to check this, we need to declare a new variable called existed
of type map[int]bool
. Then in the loop, check that the existed
map should not contain k
. Then we set existed[k]
to true
after that.
Eventually, after the for loop, we should check the final updated balances of the 2 accounts.
First we get the updated account 1 from the database by calling store.GetAccount()
with a background context and the ID
of account 1. This query should not return an error. We get the updated account 2 from the database in the same manner.
func TestTransferTx(t *testing.T) {
...
// check results
existed := make(map[int]bool)
for i := 0; i < n; i++ {
...
}
// check the final updated balance
updatedAccount1, err := store.GetAccount(context.Background(), account1.ID)
require.NoError(t, err)
updatedAccount2, err := store.GetAccount(context.Background(), account2.ID)
require.NoError(t, err)
require.Equal(t, account1.Balance-int64(n)*amount, updatedAccount1.Balance)
require.Equal(t, account2.Balance+int64(n)*amount, updatedAccount2.Balance)
}
Now after n
transactions, the balance of account 1 must decrease by n * amount
. So we require the updatedAccount1.Balance
to equal to that value. amount
is of type int64
, so we need to convert n
to int64
before doing the multiplication.
We do the same for the updatedAccount2.Balance
, except that its value should be increasing by n * amount
istead of decreasing.
And that’s it! We’re done with the test. But before running it, I’m gonna write some logs to see the results more clearly.
First, let’s print out the balance of the accounts before the transaction. Then print out their updated balances after all the transactions are executed. I also want to see the result balances after each transaction so let’s add a log in the for loop as well.
OK, this is our final test:
func TestTransferTx(t *testing.T) {
store := NewStore(testDB)
account1 := createRandomAccount(t)
account2 := createRandomAccount(t)
fmt.Println(">> before:", account1.Balance, account2.Balance)
n := 5
amount := int64(10)
errs := make(chan error)
results := make(chan TransferTxResult)
// run n concurrent transfer transaction
for i := 0; i < n; i++ {
go func() {
result, err := store.TransferTx(context.Background(), TransferTxParams{
FromAccountID: account1.ID,
ToAccountID: account2.ID,
Amount: amount,
})
errs <- err
results <- result
}()
}
// check results
existed := make(map[int]bool)
for i := 0; i < n; i++ {
err := <-errs
require.NoError(t, err)
result := <-results
require.NotEmpty(t, result)
// check transfer
transfer := result.Transfer
require.NotEmpty(t, transfer)
require.Equal(t, account1.ID, transfer.FromAccountID)
require.Equal(t, account2.ID, transfer.ToAccountID)
require.Equal(t, amount, transfer.Amount)
require.NotZero(t, transfer.ID)
require.NotZero(t, transfer.CreatedAt)
_, err = store.GetTransfer(context.Background(), transfer.ID)
require.NoError(t, err)
// check entries
fromEntry := result.FromEntry
require.NotEmpty(t, fromEntry)
require.Equal(t, account1.ID, fromEntry.AccountID)
require.Equal(t, -amount, fromEntry.Amount)
require.NotZero(t, fromEntry.ID)
require.NotZero(t, fromEntry.CreatedAt)
_, err = store.GetEntry(context.Background(), fromEntry.ID)
require.NoError(t, err)
toEntry := result.ToEntry
require.NotEmpty(t, toEntry)
require.Equal(t, account2.ID, toEntry.AccountID)
require.Equal(t, amount, toEntry.Amount)
require.NotZero(t, toEntry.ID)
require.NotZero(t, toEntry.CreatedAt)
_, err = store.GetEntry(context.Background(), toEntry.ID)
require.NoError(t, err)
// check accounts
fromAccount := result.FromAccount
require.NotEmpty(t, fromAccount)
require.Equal(t, account1.ID, fromAccount.ID)
toAccount := result.ToAccount
require.NotEmpty(t, toAccount)
require.Equal(t, account2.ID, toAccount.ID)
// check balances
fmt.Println(">> tx:", fromAccount.Balance, toAccount.Balance)
diff1 := account1.Balance - fromAccount.Balance
diff2 := toAccount.Balance - account2.Balance
require.Equal(t, diff1, diff2)
require.True(t, diff1 > 0)
require.True(t, diff1%amount == 0) // 1 * amount, 2 * amount, 3 * amount, ..., n * amount
k := int(diff1 / amount)
require.True(t, k >= 1 && k <= n)
require.NotContains(t, existed, k)
existed[k] = true
}
// check the final updated balance
updatedAccount1, err := store.GetAccount(context.Background(), account1.ID)
require.NoError(t, err)
updatedAccount2, err := store.GetAccount(context.Background(), account2.ID)
require.NoError(t, err)
fmt.Println(">> after:", updatedAccount1.Balance, updatedAccount2.Balance)
require.Equal(t, account1.Balance-int64(n)*amount, updatedAccount1.Balance)
require.Equal(t, account2.Balance+int64(n)*amount, updatedAccount2.Balance)
}
Let's run it!
It fails at line 83
, where we expect the fromAccount
to be not empty. But of course it is empty at the moment, because we haven’t implemented the feature yet.
So let’s go back to the store.go
file to implement it!
Update account balances [the wrong way]
One easy and intuitive way to change an account’s balance is to first get that account from the database, then add or subtract some amount of money from its balance, and update it back to the database.
However, this is often done incorrectly without a proper locking mechanism. I'm gonna show you how!
First we call q.GetAccount()
to get the fromAccount
record and assign it to account1
variable. If err
is not nil
, we return it.
func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
var result TransferTxResult
err := store.execTx(ctx, func(q *Queries) error {
...
// move money out of account1
account1, err := q.GetAccount(ctx, arg.FromAccountID)
if err != nil {
return err
}
result.FromAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
ID: arg.FromAccountID,
Balance: account1.Balance - arg.Amount,
})
if err != nil {
return err
}
}
return result, err
}
Else, we call q.UpdateAccount()
to update this account’s balance. The ID should be arg.FromAccountID
, and the balance will be changed to account1.Balance - arg.Amount
because money is going out of this acount.
The updated account record will be saved to result.FromAccount
. And if we get an error, just return it.
After this, we have moved money out of the fromAccount
. Now we can do similar thing to move those money into the toAccount
.
func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
var result TransferTxResult
err := store.execTx(ctx, func(q *Queries) error {
...
// move money out of account1
...
// move money into account2
account2, err := q.GetAccount(ctx, arg.ToAccountID)
if err != nil {
return err
}
result.ToAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
ID: arg.ToAccountID,
Balance: account2.Balance + arg.Amount,
})
if err != nil {
return err
}
}
return result, err
}
Here, the account ID should be arg.ToAccountID
. The result will be stored in result.ToAccount
. And the new balance should be account2.Balance + arg.Amount
because money is going into this account.
OK so the implementation is done. However, I’m telling you it’s incorrect. Let’s rerun our test to see how it goes!
The test still fails. But this time the error is on line 94, where we compare the amount of money that goes out of account 1 with those that goes into account 2.
In the log, we can see that the first transaction is correct. The balance of account 1 decreases by 10
, from 380
to 370
. And the balance of account 2 increases by the same amount, from 390
to 400
.
But it doesn’t work correctly in the second transaction. The balance of account 2 increases by 10
more, to 410
. While the balance of account 1 stays the same, at 370
.
To understand why, let’s look at the GetAccount
query:
-- name: GetAccount :one
SELECT * FROM accounts
WHERE id = $1 LIMIT 1;
It’s just a normal SELECT
, so it doesn’t block other transactions from reading the same Account
record.
Therefore, 2 concurrent transactions can get the same value of the account 1, with original balance of 380
. Thus it explains why both of them have the updated balance of 370
after execution.
Query without lock
To demonstrate this scenario, let’s start the psql
console in 2 different terminal tabs and run 2 parallel transactions.
In the 1st transaction, let’s run a normal SELECT
query to get the account record with ID = 1
.
SELECT * FROM accounts WHERE id = 1;
This account has balance of 748 USD.
Now I’m gonna run this query in the other transaction.
As you can see, the same account record is returned immediately without being blocked. This is not what we want. So let’s rollback both transactions and learn how to fix it.
Query with lock
I will start 2 new transactions. But this time, we will add FOR UPDATE
clause at the end of the SELECT
statement.
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
Now the first transaction still gets the record immediately. But when we run this query on the second transaction:
It is blocked and has to wait for the first transaction to COMMIT or ROLLBACK.
Let’s go back to that transaction and update the account balance to 500:
UPDATE accounts SET balance = 500 WHERE id = 1;
After this update, the second transaction is still blocked. However, as soon as we COMMIT the first transaction:
We can see that the second transaction is unblocked right away, and it gets the newly updated account with balance of 500 EUR. That’s exactly what we want to achieve!
Update account balance with lock
Let’s go back to the account.sql
file, and add a new query to get account for update:
-- name: GetAccountForUpdate :one
SELECT * FROM accounts
WHERE id = $1 LIMIT 1
FOR UPDATE;
Then we open the terminal and run make sqlc
to regenerate the code. Now in the account.sql.go
file, a new GetAccountForUpdate()
function is generated.
const getAccountForUpdate = `-- name: GetAccountForUpdate :one
SELECT id, owner, balance, currency, created_at FROM accounts
WHERE id = $1 LIMIT 1
FOR UPDATE
`
func (q *Queries) GetAccountForUpdate(ctx context.Context, id int64) (Account, error) {
row := q.db.QueryRowContext(ctx, getAccountForUpdate, id)
var i Account
err := row.Scan(
&i.ID,
&i.Owner,
&i.Balance,
&i.Currency,
&i.CreatedAt,
)
return i, err
}
We can use it in our money transfer transaction. Here, to get the first account, we call q.GetAccountForUpdate()
instead of q.GetAccount()
. We do the same thing to get the second account.
func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
var result TransferTxResult
err := store.execTx(ctx, func(q *Queries) error {
...
// move money out of account1
account1, err := q.GetAccountForUpdate(ctx, arg.FromAccountID)
if err != nil {
return err
}
result.FromAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
ID: arg.FromAccountID,
Balance: account1.Balance - arg.Amount,
})
if err != nil {
return err
}
// move money into account2
account2, err := q.GetAccountForUpdate(ctx, arg.ToAccountID)
if err != nil {
return err
}
result.ToAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
ID: arg.ToAccountID,
Balance: account2.Balance + arg.Amount,
})
if err != nil {
return err
}
}
return result, err
}
Alright, now we expect this to work. Let’s rerun our test.
Unfortunately, it still fails. This time the error is deadlock detected
. So what can we do?
Don’t worry! I’m gonna show you how to debug this deadlock situation.
Debug a deadlock
In order to figure out why deadlock occured, we need to print out some logs to see which transaction is calling which query and in which order.
For that, we have to assign a name for each transaction and pass it into the TransferTx()
function via the context argument.
Now inside this for loop of the test, I’m gonna create a txName
variable to store the name of the transaction. We use the fmt.Sprintf()
function and the counter i
to create different names: tx 1
, tx 2
, tx 3
, and so on.
Then inside the go routine, instead of passing in the background context, we will pass in a new context with the transaction name.
func TestTransferTx(t *testing.T) {
...
// run n concurrent transfer transaction
for i := 0; i < n; i++ {
txName := fmt.Sprintf("tx %d", i+1)
go func() {
ctx := context.WithValue(context.Background(), txKey, txName)
result, err := store.TransferTx(ctx, TransferTxParams{
FromAccountID: account1.ID,
ToAccountID: account2.ID,
Amount: amount,
})
errs <- err
results <- result
}()
}
// check results
...
}
To add the transaction name to the context, we call context.WithValue()
, pass in the background context as its parent, and a pair of key value, where value is the transaction name.
In the documentation, it says the context key should not be of type string or any built-in type to avoid collisions between packages. Normally we should define a variable of type struct{}
for the context key.
So I’m gonna add a new txKey
variable in the store.go
file, because later we will have to use this key to get the transaction name from the input context of the TransferTx()
function.
var txKey = struct{}{}
func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
...
}
...
Here, the 2nd bracket in struct{}{}
means that we’re creating a new empty object of type struct{}
.
Now in the TransferTx()
function, the context will hold the transaction name. We can get it back by calling ctx.Value()
to get the value of the txKey
from the context.
Now we have the transaction name, we can write some logs with it. Let’s print out this transaction name and the first operation: create transfer
. Then do the same for the rest of the operations:
func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
var result TransferTxResult
err := store.execTx(ctx, func(q *Queries) error {
var err error
txName := ctx.Value(txKey)
fmt.Println(txName, "create transfer")
result.Transfer, err = q.CreateTransfer(ctx, CreateTransferParams{
FromAccountID: arg.FromAccountID,
ToAccountID: arg.ToAccountID,
Amount: arg.Amount,
})
if err != nil {
return err
}
fmt.Println(txName, "create entry 1")
result.FromEntry, err = q.CreateEntry(ctx, CreateEntryParams{
AccountID: arg.FromAccountID,
Amount: -arg.Amount,
})
if err != nil {
return err
}
fmt.Println(txName, "create entry 2")
result.ToEntry, err = q.CreateEntry(ctx, CreateEntryParams{
AccountID: arg.ToAccountID,
Amount: arg.Amount,
})
if err != nil {
return err
}
// move money out of account1
fmt.Println(txName, "get account 1")
account1, err := q.GetAccountForUpdate(ctx, arg.FromAccountID)
if err != nil {
return err
}
fmt.Println(txName, "update account 1")
result.FromAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
ID: arg.FromAccountID,
Balance: account1.Balance - arg.Amount,
})
if err != nil {
return err
}
// move money into account2
fmt.Println(txName, "get account 2")
account2, err := q.GetAccountForUpdate(ctx, arg.ToAccountID)
if err != nil {
return err
}
fmt.Println(txName, "update account 2")
result.ToAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
ID: arg.ToAccountID,
Balance: account2.Balance + arg.Amount,
})
if err != nil {
return err
}
})
return result, err
}
Alright, now the logs are added, we can rerun the test to see how it goes.
But to make it easier to debug, we should not run too many concurrent transactions. So I’m gonna change this n
to 2
instead of 5
.
func TestTransferTx(t *testing.T) {
...
n := 2
amount := int64(10)
errs := make(chan error)
results := make(chan TransferTxResult)
// run n concurrent transfer transaction
...
}
Then let’s run the test!
And voila, we still got the deadlock. But this time, we have a detailed logs of what happened.
As you can see here:
- Transaction 2 ran its first 2 operations:
create transfer
andcreate entry 1
. - Then transaction 1 jumped in to run its
create transfer
operation. - Transaction 2 came back and continued running its next 2 operations:
create entry 2
andget account 1
. - Finally the transaction 1 took turn and ran its next 4 operations:
create entry 1
,create entry 2
,get account 1
, andupdate account 1
. - At this point, we got a deadlock.
So now we know exactly what happened. What we have to do is to find out the reason why it happened.
Replicate deadlock in psql console
Here I have opened the simple_bank
database in TablePlus. At the moment, it has 2 accounts with the same original balance of 100 USD
.
I also prepared the money transfer transaction with the list of SQL queries that should be run exactly as we implemented in our Golang code:
BEGIN;
SELECT * FROM accounts WHERE id = 1;
INSERT INTO transfers (from_account_id, to_account_id, amount) VALUES (1, 2, 10) RETURNING *;
INSERT INTO entries (account_id, amount) VALUES (1, -10) RETURNING *;
INSERT INTO entries (account_id, amount) VALUES (2, 10) RETURNING *;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = 90 WHERE id = 1 RETURNING *;
SELECT * FROM accounts WHERE id = 2 FOR UPDATE;
UPDATE accounts SET balance = 110 WHERE id = 2 RETURNING *;
ROLLBACK;
- The transaction starts with the
BEGIN
statement. - First we
INSERT
a newtransfer
record fromaccount 1
toaccount 2
withamount
of10
. - Then we
INSERT
a newentry
record foraccount 1
withamount
of-10
. - And
INSERT
anotherentry
record foraccount 2
withamount
of+10
. - Next we
SELECT
account 1
for update. - And we
UPDATE
itsbalance
to100-10
, which is90
USD. - Similarly, we
SELECT
account 2
for update. - And we
UPDATE
its balance to100+10
, which equals to110
USD. - Finally we do a
ROLLBACK
when a deadlock occurs.
Now just like what we did before, I’m gonna open the terminal and run 2 psql console in order to execute 2 transactions in parallel.
Let’s start the first transaction with BEGIN
. Then open another tab and access the psql console. Start the second transaction with BEGIN
.
Now, we should follow the steps in the logs. First, transaction 2
should run its 2 first queries to create the transfer
and entry 1
records:
Inserted successfully! Now we have to move to transaction 1
and run the 1st query to create transfer
record.
Now back to transaction 2
and run its 3rd query to create entry 2
and the 4th query to get account 1
for update.
Now we see that this query is blocked. It is waiting for the transaction 2
to commit or rollback before continue.
It sounds strange because transaction 2 only creates a record in transfers
table while we’re getting a record from accounts
table. Why a INSERT
into 1 table can block a SELECT
from other table?
To confirm this, let’s open this Postgres Wiki page about lock monitoring.
SELECT blocked_locks.pid AS blocked_pid,
blocked_activity.usename AS blocked_user,
blocking_locks.pid AS blocking_pid,
blocking_activity.usename AS blocking_user,
blocked_activity.query AS blocked_statement,
blocking_activity.query AS current_statement_in_blocking_process
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
ON blocking_locks.locktype = blocked_locks.locktype
AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;
This long and complex query allows us to look for blocked queries and what is blocking them. So let’s copy and run it in TablePlus.
As you can see, the blocked statement is SELECT FROM accounts FOR UPDATE
. And the one that’s blocking it is INSERT INTO transfers
. So it’s true that queries on these 2 different tables can block each other.
Let’s dig deeper to understand why the SELECT
query has to wait for the INSERT
query.
If we go back to the Postgres Wiki and scroll down a bit, we will see another query that will allow us to list all the locks in our database.
I’m gonna modify this query a bit because I want to see more information:
SELECT
a.datname,
a.application_name,
l.relation::regclass,
l.transactionid,
l.mode,
l.locktype,
l.GRANTED,
a.usename,
a.query,
a.pid
FROM
pg_stat_activity a
JOIN pg_locks l ON
l.pid = a.pid
ORDER BY
a.pid;
- The
a.datname
field will show us the database name. - Let’s add
a.application_name
to see which application the lock comes from. - The
l.relation
regclass is actually the name of the table, -
L.transactionid
is the ID of the transaction that the lock come from. -
L.mod
is the mod of the lock. - Let’s also add
l.lock_type
to see the type of the lock. -
L.granted
tells us whether the lock is granted or not. -
a.usename
is the username who run the query. -
a.query
is the query that’s holding or trying to acquire the lock. - The time when that query started
a.query_start
or itsage
are not very important, so I’m gonna remove them. - The last field is
a.pid
, which is the process ID that's running the transaction.
As you can see, we’re selecting from the pg_state_activity
table, alias as a
, and join with the pg_locks
table, alias as l
, on the process ID column.
It’s ordering by query start time, but actually I think order by process ID is better because we have 2 different processes that are running 2 psql consoles with 2 parallel transactions. So it will be easier to see which lock belong to which transaction.
Alright, let’s run it!
Here we can see some locks from TablePlus
application, which are not relevant. What we care about is only the locks that came from psql
consoles.
So I’m gonna add a WHERE clause to get only the locks with application name equals psql
.
The database name is also not important because it’s always simple_bank
in our case. So I will remove a.datname
as well.
OK let’s run this query again:
SELECT
a.application_name,
l.relation::regclass,
l.transactionid,
l.mode,
l.locktype,
l.GRANTED,
a.usename,
a.query,
a.pid
FROM
pg_stat_activity a
JOIN pg_locks l ON
l.pid = a.pid
WHERE
a.application_name = 'psql'
ORDER BY
a.pid;
Now we can see, there is only 1 lock that hasn’t been granted yet. It comes from the SELECT FROM accounts
query of the process ID 3053
.
The reason it’s not granted is because it is trying to acquire a ShareLock
of type transactionid
, where the transaction ID is 2442
. While this transaction ID lock is being held exclusively
by the other process ID 3047
with the INSERT INTO transfers
query.
But why a SELECT FROM accounts
table needs to get a lock from other transaction that runs INSERT INTO transfers
table?
Well, if we look at the database schema, we can see that the only connection between accounts and transfers table is the foreign key constraint:
ALTER TABLE "entries" ADD FOREIGN KEY ("account_id") REFERENCES "accounts" ("id");
ALTER TABLE "transfers" ADD FOREIGN KEY ("from_account_id") REFERENCES "accounts" ("id");
ALTER TABLE "transfers" ADD FOREIGN KEY ("to_account_id") REFERENCES "accounts" ("id");
The from_account_id
and to_account_id
columns of transfers
table are referencing the id
column of accounts
table. So any UPDATE
on the account ID will affect this foreign key constraint.
That’s why when we select an account for update, it needs to acquire a lock to prevent conflicts and ensure the consistency of the data.
Having said that, now if we continue running the rest of the queries on transaction 1 to create entry 1
, create entry 2
, and select account 1
for update:
We will get a deadlock because this query also has to wait for a lock from transaction 2, while transaction 2 is also waiting for a lock from this transaction 1.
And that clearly explains how the deadlock happens. But how to fix it?
Fix deadlock [the bad way]
As we know, the deadlock is caused by foreign key constraints, so one simple way to avoid it is to remove those constraints.
Let’s try comment out these statements in the init_schema.up.sql
file:
-- ALTER TABLE "entries" ADD FOREIGN KEY ("account_id") REFERENCES "accounts" ("id");
-- ALTER TABLE "transfers" ADD FOREIGN KEY ("from_account_id") REFERENCES "accounts" ("id");
-- ALTER TABLE "transfers" ADD FOREIGN KEY ("to_account_id") REFERENCES "accounts" ("id");
Then run make migratedown
in the terminal to delete the database schema. And run make migrateup
to recreate the new db schema without foreign key constraints.
Alright, now if we run the test again, it will pass because the constraints are gone, so no lock is required when select accounts for update. And no lock means no deadlock.
However, this is not the best solution, because we don’t want to loose our nice constraints that keep our data consistent.
So let’s revert these changes, run make migratedown
, then make migrateup
again to have those constraints back. Now the test will fail because of deadlock again.
Let’s learn a better way to fix this issue.
Fix dead lock [the better way]
As we already know, the transaction lock is only required because Postgres worries that transaction 1 will update the account ID
, which would affect the foreign key constraints of transfers
table.
However, if we look at the UpdateAccount
query, we can see that it only change the account balance.
-- name: UpdateAccount :one
UPDATE accounts
SET balance = $2
WHERE id = $1
RETURNING *;
The account ID will never be changed because it’s the primary key of accounts table.
So if we can tell Postgres that I’m selecting this account for update, but its primary key won’t be touched, then Postgres will not need to acquire the transaction lock, and thus no deadlock.
Fortunately, it’s super easy to do so. In the GetAccountForUpdate
query, instead of just SELECT FOR UPDATE
, we just need to say more clearly: SELECT FOR NO KEY UPDATE
-- name: GetAccountForUpdate :one
SELECT * FROM accounts
WHERE id = $1 LIMIT 1
FOR NO KEY UPDATE;
This will tell Postgres that we don’t update the key, or ID
column of accounts table.
Now let’s run make sqlc
in the terminal to regenerate golang code for this query.
const getAccountForUpdate = `-- name: GetAccountForUpdate :one
SELECT id, owner, balance, currency, created_at FROM accounts
WHERE id = $1 LIMIT 1
FOR NO KEY UPDATE
`
func (q *Queries) GetAccountForUpdate(ctx context.Context, id int64) (Account, error) {
row := q.db.QueryRowContext(ctx, getAccountForUpdate, id)
var i Account
err := row.Scan(
&i.ID,
&i.Owner,
&i.Balance,
&i.Currency,
&i.CreatedAt,
)
return i, err
}
OK the code is updated. Let’s run our test again!
It passed! Excellent! So our debugging and fixing is done.
Update account balance [the better way]
Now before we finish, I’m gonna show you a much better way to implement this update account balance operation.
Currently, we have to perform 2 queries to get the account and update its balance:
func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
var result TransferTxResult
err := store.execTx(ctx, func(q *Queries) error {
...
// move money out of account1
account1, err := q.GetAccountForUpdate(ctx, arg.FromAccountID)
if err != nil {
return err
}
result.FromAccount, err = q.UpdateAccount(ctx, UpdateAccountParams{
ID: arg.FromAccountID,
Balance: account1.Balance - arg.Amount,
})
if err != nil {
return err
}
// move money into account2
...
})
return result, err
}
We can improve this by using only 1 single query to add some amount of money to the account balance directly.
For that, I’m gonna add a new SQL query called AddAccountBalance
to the query/account.sql
file.
-- name: AddAccountBalance :one
UPDATE accounts
SET balance = balance + $1
WHERE id = $2
RETURNING *;
It’s similar to the UpdateAccount query, except that, here we set balance = balance + $2
.
Let’s run make sqlc
to generate the code. A new function is successfully added to the Queries
struct:
const addAccountBalance = `-- name: AddAccountBalance :one
UPDATE accounts
SET balance = balance + $1
WHERE id = $2
RETURNING id, owner, balance, currency, created_at
`
type AddAccountBalanceParams struct {
Balance int64 `json:"balance"`
ID int64 `json:"id"`
}
func (q *Queries) AddAccountBalance(ctx context.Context, arg AddAccountBalanceParams) (Account, error) {
row := q.db.QueryRowContext(ctx, addAccountBalance, arg.Balance, arg.ID)
var i Account
err := row.Scan(
&i.ID,
&i.Owner,
&i.Balance,
&i.Currency,
&i.CreatedAt,
)
return i, err
}
However the balance
parameter inside AddAccountBalanceParams
struct looks a bit confusing, because we’re just adding some amount of money to the balance, not changing the account balance to this value.
So this parameter’s name should be Amount
instead. Can we tell sqlc to do that for us?
Yes, we can! In the SQL query, instead of $2
, we can say sqlc.arg(amount)
, and instead of $1
, we should say sqlc.arg(id)
-- name: AddAccountBalance :one
UPDATE accounts
SET balance = balance + sqlc.arg(amount)
WHERE id = sqlc.arg(id)
RETURNING *;
This amount
and id
will be the name of the generated parameters. Let’s run make sqlc in the terminal to regenerate the code.
const addAccountBalance = `-- name: AddAccountBalance :one
UPDATE accounts
SET balance = balance + $1
WHERE id = $2
RETURNING id, owner, balance, currency, created_at
`
type AddAccountBalanceParams struct {
Amount int64 `json:"amount"`
ID int64 `json:"id"`
}
func (q *Queries) AddAccountBalance(ctx context.Context, arg AddAccountBalanceParams) (Account, error) {
row := q.db.QueryRowContext(ctx, addAccountBalance, arg.Amount, arg.ID)
var i Account
err := row.Scan(
&i.ID,
&i.Owner,
&i.Balance,
&i.Currency,
&i.CreatedAt,
)
return i, err
}
This time, we can see the parameters’ name have changed to what we want. Cool!
Now come back to the store.go
file, I’m gonna remove the GetAccountForUpdate
call, and change UpdateAccount()
to AddAccountBalance()
:
func (store *Store) TransferTx(ctx context.Context, arg TransferTxParams) (TransferTxResult, error) {
var result TransferTxResult
err := store.execTx(ctx, func(q *Queries) error {
...
// move money out of account1
result.FromAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
ID: arg.FromAccountID,
Amount: -arg.Amount,
})
if err != nil {
return err
}
// move money into account2
result.ToAccount, err = q.AddAccountBalance(ctx, AddAccountBalanceParams{
ID: arg.ToAccountID,
Amount: arg.Amount,
})
if err != nil {
return err
}
return nil
})
return result, err
}
Note that the Amount
to add to account1
should be -amount
because money is moving out.
And we’re done! Let’s rerun the test.
Yee! It passed! Let’s run the whole package test.
All passed!
And that’s it for today’s lecture about locking in db transaction and how to debug a deadlock. I hope you enjoy it.
Stay tuned for the next lecture, because I’m telling you the deadlock issue is not completely resolved yet. There are much more to learn about it.
In the mean time, happy coding and I’ll see you very soon!
If you like the article, please subscribe to our Youtube channel and follow us on Twitter for more tutorials in the future.
If you want to join me on my current amazing team at Voodoo, check out our job openings here. Remote or onsite in Paris/Amsterdam/London/Berlin/Barcelona with visa sponsorship.
Posted on August 27, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.