What makes a database ACID compliant?🧪

adhirajk

Adhiraj Kinlekar

Posted on November 20, 2023

What makes a database ACID compliant?🧪

Before diving into ACID, let's first establish a fundamental concept in database management: transactions. A database transaction is a logical unit of work that consists of one or more operations executed against a database. It's worth noting that, in most databases, a transaction is always initiated, with each individual database operation treated as a transaction. Users can explicitly start one, and if a transaction is not explicitly started, the database system will automatically start it and then promptly conclude the transaction.

Now, turning our attention to ACID, which stands for Atomicity, Consistency, Isolation, and Durability, it represents a set of properties that ensure the reliability and integrity of data within a database system.

1. Atomicity

Atomicity is a fundamental attribute of transactions, emphasizing their indivisible nature. Transactions, consisting of a set of operations, either successfully complete in entirety or undergo a complete rollback in the event of failures, such as power outages, exceptions, or crashes. This ensures that the database remains in a consistent state despite unforeseen challenges.

Let's consider a simple example of an atomic transaction in a banking context:

Suppose you want to transfer money from one account to another.

Debit:
Deduct $100 from Account A.

Credit:
Add $100 to Account B.

In an atomic transaction:

Both the debit and credit operations must successfully complete, resulting in Account A being debited by $100 and Account B being credited by $100.

Or, if any part of the transaction fails (e.g., Account B is inactive), the entire transaction is rolled back. In this case, no money is deducted from Account A. This ensures that the bank account system remains in a consistent state, and the money transfer is treated as an indivisible unit. If something goes wrong, the system won't end up in an inconsistent state where money is deducted from one account but not added to the other.

Dwight meme

2. Isolation

It refers to the ability to concurrently process multiple transactions in a way that one does not affect another. Changes occurring in a particular transaction will not be visible to any other transaction until that specific change in that transaction has been committed.

Isolation levels define the degree to which a transaction must be isolated from the data modifications made by any other transaction in the database system.

A transaction isolation level is defined by the following phenomena –

Dirty read - A dirty read is a situation when a transaction reads data that has not yet been committed.

Non repeatable read - A non-repeatable read occurs when a transaction needs to read the same row twice, and there is a mismatch between the values in these rows.

Phantom read - A phantom read occurs when a transaction reads a set of rows satisfying a specific condition in a read query, and another transaction inserts new rows that would meet the condition specified in the aforementioned read query. As a result, upon subsequent reads, the result will include new rows that were not present in the initial read.

Lost update - It occurs when a same record is updated by two different transactions at the same time.


Isolation levels in database transactions vary depending on these phenomena, and the following are some examples -

Read Uncommitted :
Read Uncommitted is the lowest isolation level. In this level, a transaction can read changes made by other transactions, regardless of whether those changes have been committed or not. At this level, transactions are not isolated from each other. This isolation level does not prevent dirty reads, non-repeatable reads, or phantom reads and it is susceptible to lost updates.

Read Committed :
This isolation level guarantees that any data that is being read in a transaction is committed. The transaction secures a read lock (for read-only operations) or a write lock (when updating or deleting the row) on the current row to block other transactions from modifying or deleting it. The read locks are released as the transaction moves away from the current row. It holds write locks until it is committed or rolled back.

Repeatable Read :
This is one of the most restrictive isolation level. Beyond the assurance of encountering only committed data, any rows read by a transaction remains constant throughout subsequent reads within the same transaction. The transaction enforces read locks on all rows it accesses and write locks on any rows subject to insertion, updating, or deletion. The transaction releases its locks when it is committed or rolled back. This effectively prevents issues like dirty reads, non-repeatable reads, and helps mitigate the risk of lost updates. However, it's important to note that phantom reads remain a possibility.

Serializable :
It is the highest level of isolation. Serializable isolation ensures that transactions are executed in a manner that maintains serializability. This is the strongest of all the isolation levels and guarantees pure isolation. The transaction holds a read lock (if it only reads rows) or a write lock on the range of rows it affects. For example, if the transaction includes the SQL statement DELETE FROM Orders WHERE Status = 'PLACED,' the range is all rows with a Status of 'PLACED'; the transaction write-locks all rows in the Orders table with a Status of 'PLACED' and does not allow any rows to be inserted or updated that match this condition. Serializable isolation prevents dirty reads, non-repeatable reads, phantom reads, and prevents lost updates.

Snapshot :
In Snapshot Isolation, each query in a transaction only sees changes that have been committed up to the start of the transaction. It is considered an optimistic form of concurrency control because it allows transactions to proceed with the optimistic assumption that conflicts are rare. The transaction itself will only succeed if none of the updates it has made conflict with any concurrent updates made since that snapshot. Snapshot isolation prevents dirty reads, non-repeatable reads, phantom reads, and prevents lost updates.

It's worth noting that the implementation of the isolation levels can vary across different database management systems. The details mentioned here are tailored with Microsoft SQL Server in mind.


Let's consider a scenario where isolation is not maintained, leading to a "dirty read" in a banking context:

Transaction 1 (Jane):
Jane initiates a transaction to transfer $100 from her account to John's account.
Transaction 1 has not been committed yet.

Transaction 2 (John):
John initiates a transaction to check his account balance.
John's transaction reads the account balance, which includes Jane's uncommitted transfer.

Now, due to the lack of isolation:

John sees an updated balance in his account, reflecting Jane's transfer, even though Jane's transaction has not been committed.
The problem: John has read uncommitted data, leading to a "dirty read" where he observes changes made by Jane that are not finalized. In the event that transactions do not reach completion, it can result in significant confusion and potential discrepancies.
This example illustrates the potential issues that can arise when transactions are not properly isolated, allowing one transaction to see uncommitted changes made by another transaction.

3. Consistency
Consistency refers to the correctness and integrity of a database's data during and after a transaction. when transactions are being processed or executed within the database system the data should not be inconsistent and should always be in valid state. The “valid state” refers to the rules specified such as integrity, referential, not null, data types or other SQL constraints. To ensure consistency, database operations must be atomic, and transactions must be isolated from one another and if data at any stage goes against the specified rules or constraints, the whole transaction should fail.

We should not mistake strong/eventual consistency with the consistency guarantee of ACID databases. Strong/eventual consistency refers to the degree of consistency across multiple nodes of a database system. However, when replications are performed asynchronously (changes are not applied to all nodes immediately, and the primary database commits transactions without waiting for acknowledgment from replicas), it can introduce a delay between the time a transaction is committed on one node and when it is replicated to other nodes. During this lag, there is a window of time when the data on the replicated nodes may not be consistent with the source node.

4. Durability 

Durability ensures that once a transaction is committed, its effects are permanent and survive any subsequent failures, such as power outages, system crashes, or errors. The changes made during a transaction are written to durable storage, such as a solid-state drive, ensuring permanence even in the case of failures.

This durability is commonly achieved through mechanisms like transaction logging, where changes are initially recorded in a log stored on durable storage before being applied to the database. The transaction log serves as a record for recovery in case of failure. When the system is restarted or recovered, the committed changes are still present, and the DBMS uses it to recover the database to a consistent state by applying all committed changes.

Song of the day: Dream Theater - Pull Me Under

💖 💪 🙅 🚩
adhirajk
Adhiraj Kinlekar

Posted on November 20, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

What makes a database ACID compliant?🧪