Notes on Kafka: Distributed Systems
Eden Jose
Posted on June 15, 2021
Now that we've learned what Kafka is and it's capability to scale out massively, we'll go over what a distributed system is and how Kafka is considered as one.
A system is a collection of resources that have instructions to achieve a specific goal. A distributed system is a system that consist of multiple workers or nodes, or sometimes as worker nodes. These are the Kafka brokers.
These nodes must coordinate with each other to ensure that the goal is achieved. Without coordination, it'll be hard to divide the work, and even to track who's working on what.
Controller Election
Like any organization, each workers have their own roles and responsibilities. There's also a general hierarchy where there's a supervisor that acts as the head. In Kafka, the supervisor is the controller.
The controller is a worker node that is elected to serve as the admin, and it is usually the node that's been around the longest. The controller's responsibility includes:
- inventory of the available workers to take on the load
- maintain the item list of work assigned to the workers
- ensure active status of each worker
Getting Work Done
The controller decides which worker will take on a task that comes in. To do this, the controller must know:
- Who's available to take on the task?
- What risk policy should be considered in its decision?
- How much redundancy should be implemented in case the assigned worker fails?
If the controller determines that it needs more than one worker for a task, it will promote the chosen worker as the leader. The promoted leaders will then each recruit two more peers to participate in the replication. The number of recruited peers is called the replication factor
A quorum is then formed when the committed peers take on a new role as a follower.
In the event that the assigned worker becomes unavailable, the work already done should not be lost. The controller must ensure the task given to a worker must at least be given to another.
In case one leader is not able to establish a quorum with the available peers, the controller reassigns the task to another worker which will become a leader and the new leader will retry to form a quorum with two peers again.
The succeeding notes will discuss what's the role of the Apache Zookeeper. If you'd like to know more, please proceed to the next note in the series.
Similarly, you can check out the following resources:
Getting Started with Apache Kafka by Ryan Plant
Apache Kafka Series - Learn Apache Kafka for Beginners v2 by Stephane Maarek
Apache Kafka A-Z with Hands on Learning by Learnkart Technology Private Limited
The Complete Apache Kafka Practical Guide by Bogdan Stashchuk
If you've enjoyed this short but concise article, I'll be glad to connect with you on Twitter!. You can also hit the Follow below to stay updated when there's new awesome contents! 😃
Posted on June 15, 2021
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.