System Design: CAP Theorem
Priyank Sevak
Posted on September 18, 2024
Imagine you and a friend are watching the same live football match but on different devices. One device shows the final whistle has blown, while the other still displays a tense final minute on the clock. Frustrating, right?
This seemingly simple scenario highlights a fundamental concept in distributed systems: the CAP theorem.
Before we dive deeper into the trade-offs and debates around the CAP theorem, let's first understand what CAP stands for:
Consistency: In the CAP theorem, consistency (which is different from the consistency in ACID) is the guarantee that a read receives the most recent write.
Availability: The guarantee that a read returns data within a reasonable amount of time, whether or not that data is the most recent. ("Within a reasonable amount of time" is the key phrase here: your website can load a page after 2-5 minutes, but frustrated users wouldn't call it an "available" system.)
Partition Tolerance: The guarantee that the system continues to operate even when a network failure prevents some nodes from communicating with each other.
As you might have guessed, the network is the least reliable part of this picture. Is it possible to build a network that never goes down? I believe not! Since partitions cannot be ruled out, that leaves us with only two options when one occurs: prioritize Consistency and Partition Tolerance (CP), or Availability and Partition Tolerance (AP).
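To make that choice concrete, here is a minimal sketch of a toy two-replica store. Everything in it (the `TwoNodeStore` class, the `mode` and `partitioned` fields) is a made-up illustration rather than how any real database works; it only shows what each option gives up while the network is split.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None


class TwoNodeStore:
    """Toy two-replica store. `mode` is "CP" or "AP" (hypothetical labels)."""

    def __init__(self, mode):
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True while the replicas cannot talk

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability: refuse rather than let replicas diverge.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # AP: accept the write locally; the other replica is now stale.
            replica.value = value
            return
        # No partition: both replicas are updated together.
        replica.value = value
        other.value = value

    def read(self, replica):
        if self.partitioned and self.mode == "CP":
            # Give up availability: we cannot confirm this is the latest value.
            raise RuntimeError("unavailable: cannot confirm the latest value")
        return replica.value


ap = TwoNodeStore("AP")
ap.write(ap.a, "1-0")
ap.partitioned = True
ap.write(ap.a, "2-0")
print(ap.read(ap.a), ap.read(ap.b))  # the two replicas now disagree
```

In "AP" mode both replicas keep answering but can disagree; in "CP" mode the same calls raise an error until the partition heals.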
What does Consistency mean?
Let's go back to the example of two friends watching the same live telecast of a football match. If we change the scenario to "two neighbors watching the same football match" in separate houses, we can afford to compromise on consistency. Consistency in the CAP sense, which is also called "linearizability," means the following:
Once a write operation has completed, every node (every viewer's device in our scenario) returns that same data on subsequent reads.
To put it simply: if there is no communication between you and your neighbor, you have no way of knowing whether the match has already finished on their screen. Thus, an inconsistency goes unnoticed only as long as there is no other channel of communication (being in the same room or texting your friend, in our scenario).
If you already know that the match is over, yet your device is still not up to date even after multiple refreshes, the "Consistency" in CAP has been violated. "Atomic consistency" (another name for linearizability) ensures that what the system shows never contradicts what users have learned through such external communication.
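The stale-refresh situation can be sketched as a tiny check over a history of events. This is a deliberately simplified illustration: it assumes operations never overlap and are listed in real-time order, whereas a real linearizability checker also has to handle concurrent operations. The `stale_reads` function and the event tuples are hypothetical names invented for this example.

```python
def stale_reads(events):
    """Return reads that saw an older version than the latest completed write.

    `events` is a list, in real-time order, of either
    ("write", version) or ("read", device, version_seen).
    """
    latest = 0
    stale = []
    for event in events:
        if event[0] == "write":
            latest = max(latest, event[1])
        else:
            _, device, seen = event
            if seen < latest:
                stale.append((device, seen, latest))
    return stale


history = [
    ("write", 1),          # score update 1 is published
    ("read", "phone", 1),
    ("write", 2),          # final whistle is published
    ("read", "tv", 2),     # the TV shows the final whistle
    ("read", "phone", 1),  # the phone still shows the old state after a refresh
]
print(stale_reads(history))  # [('phone', 1, 2)]
```

The last read is the violation: the final whistle was already published and visible on another device, yet the phone still returned the older version.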
Availability and Partition Tolerance (AP)
As explained above, for a live telecast we can realistically provide only Availability and Partition Tolerance (AP): the telecast stays available to all users without too much lag or buffering, instead of pausing to wait for the latest updates.
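As a rough sketch of that AP behavior, a client can keep answering from its last known value whenever the upstream source does not respond in time. The `ScoreFeed` class and the `fetch_latest` callable are assumptions made up for this example; the only contract assumed is that `fetch_latest` raises `TimeoutError` when the source cannot be reached.

```python
class ScoreFeed:
    """Always answers, even if the answer may be stale."""

    def __init__(self, fetch_latest):
        # Hypothetical callable that returns the latest score
        # or raises TimeoutError when the source is unreachable.
        self.fetch_latest = fetch_latest
        self.cached = None

    def current_score(self):
        """Return (score, is_fresh)."""
        try:
            self.cached = self.fetch_latest()
            return self.cached, True
        except TimeoutError:
            # Favor availability: serve the last value we saw.
            return self.cached, False
```

The `is_fresh` flag is a design choice for the sketch: it lets the viewer's app decide whether to show a "reconnecting" hint while still displaying something.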
Consistency and Partition Tolerance (CP)
Let's take another example: a banking transaction. You visit an ATM and attempt to withdraw some cash. The system needs to ensure consistency as explained above. If the ATM dispensed the cash without checking your balance, the withdrawal could overdraw your account and you might face overdraft fees.
The system ensures that your account balance is updated together with the withdrawal. So, whether you're checking your balance online or using another ATM, you'll always see the same amount.
Even if the ATM loses its connection to the central banking system, it might still allow you to withdraw a limited amount of money. This is a small, bounded concession to availability; the system still prioritizes ensuring that your account balance is updated correctly once the connection is restored.
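The ATM behavior described above could look roughly like the sketch below. The `Atm` class, the `OFFLINE_LIMIT` value, and the dictionary standing in for the central banking system are all assumptions for illustration; real banking systems are far more involved.

```python
class Atm:
    OFFLINE_LIMIT = 100  # assumed per-account cap while disconnected

    def __init__(self, bank_balances):
        self.bank = bank_balances  # dict standing in for the central system
        self.online = True
        self.pending = []          # withdrawals made while offline

    def withdraw(self, account, amount):
        if self.online:
            # Connected: check the real balance before dispensing.
            if self.bank[account] < amount:
                return False
            self.bank[account] -= amount
            return True
        # Disconnected: allow only a limited total per account.
        already = sum(a for acc, a in self.pending if acc == account)
        if already + amount > self.OFFLINE_LIMIT:
            return False
        self.pending.append((account, amount))
        return True

    def reconnect(self):
        """Replay offline withdrawals against the central balances."""
        self.online = True
        for account, amount in self.pending:
            self.bank[account] -= amount  # may go negative, i.e. an overdraft
        self.pending.clear()


balances = {"alice": 50}
atm = Atm(balances)
atm.online = False
print(atm.withdraw("alice", 80))  # True: within the offline limit
print(atm.withdraw("alice", 30))  # False: would exceed the offline limit
atm.reconnect()
print(balances["alice"])          # -30
```

The offline limit bounds how wrong the balance can get during a partition, and `reconnect` is where the system catches up, which can surface an overdraft after the fact.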
Conclusion
While the CAP theorem has long been a cornerstone of distributed systems design, its relevance has been questioned in recent years due to advancements in technology and the emergence of new paradigms. Systems like ZooKeeper and Google Docs are often described as providing strong consistency, high availability, and partition tolerance at the same time, which appears to challenge the traditional CAP trade-offs. This is explained more accurately in Martin Kleppmann's post "Please stop calling databases CP or AP".
However, it's important to note that these systems often achieve this by making certain trade-offs or assumptions.
In future posts, we will delve deeper into consensus algorithms and operational transformation, two key techniques that play a vital role in modern distributed systems.
Corrections and comments are welcome.