Franck Pachot
Posted on December 10, 2023
The lowest level of MVCC databases is Read Committed, which is commonly used as the default setting. However, it is also possibly the least understood and the least database-agnostic. As the name suggests, it only reads committed data but allows for all types of anomalies except dirty reads.
So, does using Read Committed corrupt your database? Not if you understand it and manage race conditions yourself. MVCC databases typically allow concurrent reads and writes without locking the data for reads by default. However, in certain scenarios, it may be necessary to use explicit locking to ensure data consistency. For example, if you are concerned about lost updates, you can use the SELECT FOR SHARE or SELECT FOR UPDATE commands to lock the rows you've read. This approach provides protection that is similar to the Cursor Stability or Repeatable Read isolation levels, as it prevents UPDATE or DELETE operations on the read set but with a reduced scope on a statement-by-statement basis. To prevent other anomalies, such as phantom reads, you can use LOCK TABLE to prevent new insertions from altering the read state, since you cannot lock a row that doesn't exist yet. Some databases also provide an API for custom locks, like PostgreSQL Advisory lock.
What is the advantage of Read Committed over Repeatable Read? A MVCC database allows the database to roll back and restart a statement at the statement level, avoiding the need for the application to handle serialization errors.
Every database is unique when it comes to a transparent restart and explicit locking.
Oracle doesn't offer a LOCK FOR SHARE option that blocks writers while allowing other readers to access the data. Instead, it uses LOCK FOR UPDATE, which has a lower level of concurrency as readers can block each other. On the other hand, PostgreSQL and YugabyteDB provide shared and exclusive row locks, which enable more efficient data access and better concurrency control.
In case of a conflict between the read state (using MVCC) and the write state (the current state), when Oracle or YugabyteDB encounters such a situation, it can roll back the statement to an implicit savepoint and restart it to ensure a consistent result based on a more recent read time, all of which is done seamlessly and transparently.
In the same condition, SQL Server with READ_COMMITTED_SNAPSHOT implements MVCC for Read Committed. It locks the read state instead of restarting. More details can be found at https://www.dbi-services.com/blog/how-sql-server-mvcc-compares-to-oracle-and-postgresql/, which means that readers still block writers.
When using Read Committed in PostgreSQL, inconsistencies can arise when there is a conflict during a write operation. In such cases, if a row has been modified since it was last read, PostgreSQL will re-read the row to avoid corrupting it. However, this re-read is based on a new time, which can be inconsistent with the previous reads. I think the main reason why it doesn't rollback and restart is that it requires savepoint before each statements, and those do not scale in PostgreSQL.
To ensure result consistency, YugabyteDB and Oracle follow a different approach. Instead of re-reading the row, they rollback and restart the entire statement. This ensures that the entire dataset reflects the same state from the new read time.
YugabyteDB implements a read restart to ensure statement-level consistency without blocking writes, and SELECT FOR SHARE/UPDATE for explicit locking, providing a powerful Read Committed isolation level.
The main difference between Read Committed and Repeatable Read in MVCC databases lies in the read time. In Read Committed, the read time is the start of the statement, while in Repeatable Reads and higher levels, it is the same for the whole transaction. Having a different read time for each statement doesn't protect against anomalies in complex transactions, but it allows more transparent statement restarts, which means that the database can roll back a statement (to an implicit savepoint taken before) and restart it transparently with a different read time.
In higher levels, when the read time must be the beginning of the transaction, the entire transaction must be rolled back and restarted. The database cannot perform this action on its own as it lacks knowledge of what else the application has done during the transaction. Therefore, to protect against anomalies with higher isolation levels, an MVCC database must raise a serializable error when a conflict is detected. This allows the application to retry the transaction itself.
This provides a clue for optimizing Read Committed transactions: run the entire business transaction as a single statement with WITH and RETURNING clauses instead of multiple statements.
Here are the characteristics of Read Committed isolation level in YugabyteDB (when --yb_enable_read_committed_isolation=true
)
- Read time: the start of the statement
- Possible anomalies: all (except dirty reads)
- Performance overhead: none except when using explicit locking
- Development constraint: explicit locking when repeatable reads is necessary Default in: PostgreSQL, Oracle, YugabyteDB
Posted on December 10, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 29, 2024