SELECT FOR UPDATE and its behavior with foreign keys in PostgreSQL.
Ankita
Posted on October 22, 2022
The general approach followed by developers is to pre-acquire the lock(FOR UPDATE)on the data sets that are being updated/deleted concurrently.
This approach may be considered to solve concurrency problems occurs due to concurrently running transactions. if you are confused that what is concurrency problems? then check out my blog concurrency in dbms.
To start with a simple scenario, let us consider the following example with two tables and let us assume that an application is running concurrent update operations on the parent table.
CREATE TABLE parent(id INT PRIMARY KEY, balance INT);
CREATE TABLE child(id INT REFERENCES parent(id) ON DELETE CASCADE, trx_timestamp TIMESTAMP);
A foreign key with a cascade delete means that if a record in the parent table is deleted, then the corresponding records in the child table will automatically be deleted. This is called a cascade delete.The principal advantage to the cascading-deletes feature is that it allows you to reduce the quantity of SQL statements you need to perform delete actions.
ON DELETE CASCADE option is to specify whether you want rows deleted in a child table when corresponding rows are deleted in the parent table. If you do not specify cascading deletes, the default behaviour of the database server prevents you from deleting data in a table if other tables reference it.
Let's Insert some data in both table.
INSERT INTO parent VALUES (1,1000),(2,2000),(3,3000) ;
INSERT INTO child VALUES( 1 ,now()),( 2 ,now());
The following is an example transaction on the parent table which is using the FOR UPDATE clause to avoid any deadlock problems for this specific transaction.
Now, consider that I have an another concurrent transaction which tries to insert the data into the child table.
Following is the behaviour observed, when we run these two transactions concurrently.
As you see in the above output, the insert operation on the child table is in waiting state.
Even though the FOR UPDATE clause does not exist in the SESSION 2, the transaction would still remain in a waiting state. The reason for this waiting state is due to the PostgreSQL's FOREIGN KEY data validation process between the child and parent tables.
By default, when we use FOR UPDATE, it means that it is going to return the rows which are retrieved by the select statement by acquiring the lock on key columns.
In the above example we are actually updating the non key column(balance), by acquiring the lock on id column.
As we hold the lock on primary key column id, when we try to insert any value into the child table, then it will
try to acquire the SHARE lock on the parent table's id column. In this case, the SESSION 2 transaction will wait until the SESSION 1 transaction is complete.
In majority of the cases, we actually do not need to update any key columns,
rather we update only the non key columns like salary, phone number, address, etc. While updating the non key column values, we could force the behavior to not hold any lock on the key columns. To achieve this with SELECT FOR UPDATE in PostgreSQL, we could use the NO KEY option in the FOR UPDATE clause.
Now, let try the same example with the NO KEY option.
successfully inserted row in child table.
Posted on October 22, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 29, 2024