šš¶šæšš šæš²š®š±, š”š¼š»-šæš²š½š²š®šš®šÆš¹š² šæš²š®š±, š®š»š± š£šµš®š»šš¼šŗ šæš²š®š± šš¶ššµ Real š¹š¶š³š² š²š š®šŗš½š¹š²š
jairajsahgal
Posted on December 30, 2023
In the field of database systems, concurrency control is a crucial aspect to ensure that multiple transactions can execute simultaneously without causing inconsistencies or data corruption. To manage this, databases use various isolation levels and techniques like locking mechanisms to protect data integrity. However, these mechanisms may introduce certain issues known as Dirty Read, Non-repeatable Read, and Phantom Read.
š. šš¶šæšš š„š²š®š±:
A dirty read occurs when a transaction reads uncommitted data from the database. In other words, it reads data that has been modified by another
transaction but not yet committed. This can lead to incorrect results being displayed to the user. To prevent this issue, databases generally use the Two-Phase Locking (2PL) protocol, which ensures that a transaction holds locks only for the duration of the transaction.
# Example using SQL and Python's psycopg2 library:
import psycopg2
conn = psycopg2.connect(database="your_db", user="your_user", password="your_password")
cursor = conn.cursor()
def transaction1():
cursor.execute("UPDATE table SET column=column+1 WHERE condition")
conn.commit() # commits the changes
def transaction2():
cursor.execute("SELECT * FROM table") # reads uncommitted data
print(cursor.fetchall())
š®. š”š¼š»-šæš²š½š²š®šš®šÆš¹š² š„š²š®š±:
A non-repeatable read occurs when a transaction re-reads data that has been modified by another concurrent transaction. This issue arises because databases do not guarantee that the data will remain unchanged between two reads of the same transaction, even if no other transactions are modifying it. To address this, 2PL is also used to prevent non-repeatable reads.
# Example using SQL and Python's psycopg2 library:
import psycopg2
conn = psycopg2.connect(database="your_db", user="your_user", password="your_password")
cursor = conn.cursor()
def transaction1():
cursor.execute("SELECT * FROM table WHERE condition") # reads data
rows = cursor.fetchall()
cursor.execute("UPDATE table SET column=column+1 WHERE condition") # modifies data
conn.commit() # commits the changes
cursor.execute(rows) # reads data again, but results can be different due to non-repeatable read
šÆ. š£šµš®š»šš¼šŗ š„š²š®š±:
A phantom read occurs when a transaction retrieves multiple rows from a query based on certain conditions, but later, when it re-executes the
same query with the same conditions, additional rows appear that were inserted by other concurrent transactions in the meantime. To address this
issue, databases use multiversion concurrency control (MVCC) and version store, which allow multiple versions of data to coexist until a commit
or rollback is performed.
# Example using SQL and Python's psycopg2 library:
import psycopg2
conn = psycopg2.connect(database="your_db", user="your_user", password="your_password")
cursor = conn.cursor()
def transaction1():
cursor.execute("SELECT * FROM table WHERE condition") # reads data
rows = cursor.fetchall()
cursor.execute("INSERT INTO table (column) VALUES ('some value')") # inserts new row
conn.commit() # commits the changes
cursor.execute("SELECT * FROM table WHERE condition") # re-reads data, but results can be different due to phantom read
To avoid these issues, databases use various techniques like 2PL, MVCC, and version store, depending on the isolation level chosen for the transaction. By ensuring data integrity, these mechanisms prevent dirty reads, non-repeatable reads, and phantom reads, allowing transactions to run concurrently without causing inconsistencies or data corruption.
š„š²š®š¹ šš¶š³š² šš š®šŗš½š¹š²
Imagine you are withdrawing money from your bank account. You have two options for checking the balance: Option A and Option B.
Option A: Dirty Read, Non-repeatable Read
- You check the balance of your account, showing $10,000.
- Another transaction transfers $5,000 from your account to another account without committing yet (dirty read).
- You again check the balance of your account and see $5,000.
- The transfer transaction gets rolled back due to an error.
- If you try to re-check the balance using Option A, it will show $10,000, but it should ideally be $5,000 (non-repeatable read).
Option B: Phantom Read
- You check all transactions in your account from last month, showing a total of $10,000.
- Another transaction makes a deposit of $2,000 into your account without committing yet.
- You try to re-check the transactions using Option B, but it will now show an additional $2,000 deposit (phantom read).
- If you try to re-execute the query, it will show both the original transactions and the new $2,000 deposit (phantom read persists).
In this example, Option A demonstrates dirty reads and non-repeatable reads, while Option B illustrates a phantom read. Databases employ
isolation levels, locking mechanisms, or multiversion concurrency control to prevent these issues and maintain data integrity during concurrent
transactions.
Chapter on Concurrency Controls : https://www.cs.uct.ac.za/mit_notes/database/pdfs/chp13.pdf
Posted on December 30, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.