Dummy Guide on How to Scale your Application

This is Giorno 👱‍♂️
Giorno finally finished building and deploying his side project.

He showed it to his auntie, and the next day she made a TikTok about it.

Boom! Thousands of concurrent users are trying to reach Giorno's application and the poor server in Giorno's garage is getting absolutely destroyed.

Giorno decided to scale his application so it can handle his auntie's next marketing campaign on TikTok. They expect millions of users.

Server Scaling

The first thing Giorno did was run to the nearest hardware store, buy the latest CPU, and swap it in for the old CPU in the single server in his garage. This is called Vertical Scaling.

However, upgrading the server with new hardware is not enough, and if that single server in the garage goes down, the whole application goes down with it!

Giorno then decides to buy many new servers, so that together they can handle the expected number of users and so that, if one of the servers goes down, the application stays up and running. This is called Horizontal Scaling.

However, Giorno runs into another problem: the client doesn't know which server to send its request to. There are many of them!

A load balancer solves this problem. The load balancer gets a public IP address, which is the one the application's hostname points to, while the servers behind it are given private IPs. When the client makes a request to the load balancer, it picks one of the healthy servers and forwards the request to it.
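To make the picking step concrete, here is a minimal sketch of a load balancer that hands out servers in round-robin order and skips the ones marked as down. Round-robin is only one possible strategy (the text above doesn't say which one Giorno used), and the `LoadBalancer` class and the `10.0.0.x` addresses are made up for illustration.

```python
from itertools import cycle


class LoadBalancer:
    """Toy round-robin load balancer that skips unhealthy servers."""

    def __init__(self, servers):
        self.servers = list(servers)      # private addresses of the app servers
        self.healthy = set(self.servers)  # assumption: everyone starts healthy
        self._ring = cycle(self.servers)  # endless round-robin iterator

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server in round-robin order."""
        # Look at each server at most once per call, so we never loop forever.
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = LoadBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
lb.mark_down("10.0.0.2")
picks = [lb.pick() for _ in range(4)]
print(picks)
```

With `10.0.0.2` marked as down, requests alternate between the two remaining servers. A real load balancer would also forward the request and relay the response, which this sketch leaves out.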

The load balancer knows which servers are healthy through a Heartbeat Protocol: every short interval, each server sends a message to the load balancer to say that it is available to take requests. A server that stops sending heartbeats is treated as down.
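The bookkeeping on the load balancer's side can be sketched roughly like this: remember when each server last checked in, and count a server as healthy only if its last heartbeat is recent enough. The `HeartbeatMonitor` name, the 3-second timeout, and the injectable `clock` (used here so the demo doesn't have to sleep) are all assumptions for the example, not details from Giorno's setup.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat of each server and reports who is healthy."""

    def __init__(self, timeout_seconds=3.0, clock=time.monotonic):
        self.timeout = timeout_seconds  # silence longer than this means "down"
        self.clock = clock              # replaceable so the demo can fake time
        self.last_seen = {}             # server -> time of its last heartbeat

    def heartbeat(self, server):
        """Called whenever a server checks in."""
        self.last_seen[server] = self.clock()

    def healthy_servers(self):
        """Servers whose last heartbeat is within the timeout window."""
        now = self.clock()
        return sorted(
            server
            for server, seen in self.last_seen.items()
            if now - seen <= self.timeout
        )


# Demo with a fake clock: both servers check in at t=0,
# only 10.0.0.1 checks in again at t=2, and we ask at t=4.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_seconds=3.0, clock=lambda: fake_now[0])
monitor.heartbeat("10.0.0.1")
monitor.heartbeat("10.0.0.2")
fake_now[0] = 2.0
monitor.heartbeat("10.0.0.1")
fake_now[0] = 4.0
healthy = monitor.healthy_servers()
print(healthy)
```

At t=4 the second server has been silent for 4 seconds, longer than the timeout, so only `10.0.0.1` is reported as healthy.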

Database Scaling

The application can now serve many concurrent requests, and if a server goes down, the application keeps running. However, with the expected number of users and only a single database, queries are expected to get slower.

Giorno needs to scale the database the same way he scaled the servers: by creating more instances of it, called Replicas. This process is called Replication.

The main problem Giorno ran into with this approach is how to keep the multiple replicas up to date. If a piece of data is inserted into database 1 and the same data then needs to be read from database 2, how does DB 2 learn about the new writes to DB 1?

Giorno made one database the master: it accepts all writes and then updates all the other databases.

Each server sends the write requests to the master database and the read requests to one of the many other databases. If the master database goes down, one of the other databases becomes the new master. This is called Leader-based Replication.

This approach works well with the nature of Giorno's application, which has many reads and very few writes.
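Here is a rough sketch of that routing, with plain dictionaries standing in for real databases. The `ReplicatedDatabase` class is hypothetical, and it copies every write to the replicas immediately to keep things simple; real systems often replicate asynchronously, so a replica can briefly lag behind the master.

```python
import random


class ReplicatedDatabase:
    """Toy leader-based replication: one master for writes, replicas for reads."""

    def __init__(self, replica_count=2):
        self.leader = {}  # the master: accepts every write
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value):
        """All writes go to the master, which then updates every replica."""
        self.leader[key] = value
        for replica in self.replicas:
            replica[key] = value  # simplification: copied synchronously

    def read(self, key):
        """Reads are spread across the replicas, never sent to the master."""
        return random.choice(self.replicas).get(key)

    def fail_over(self):
        """Master went down: promote one of the replicas to be the new master."""
        if not self.replicas:
            raise RuntimeError("no replica left to promote")
        self.leader = self.replicas.pop(0)


db = ReplicatedDatabase(replica_count=2)
db.write("user:1", "Giorno")
print(db.read("user:1"))

db.fail_over()  # pretend the master just died
db.write("user:2", "Auntie")
print(db.read("user:2"))
```

After the failover, the promoted replica already holds everything the old master had, so writes and reads carry on with one replica fewer.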

Another thing Giorno did was add in-memory storage. Since memory is much faster than disk, Giorno cached the data that is used often.
// Get the data from the cache (Cache Hit)
// If it doesn't exist, get the data from the DB, then cache it (Cache Miss)
// Make sure the cached data is updated if its DB data receives a new write
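Those three steps could look something like the sketch below, where a dictionary plays the role of the in-memory cache and another one plays the database. The `CachedStore` class is made up for the example, and on a write it simply drops the stale cache entry so the next read reloads it from the DB; updating the cached value in place would be another reasonable choice.

```python
class CachedStore:
    """Toy cache in front of a database (both are plain dicts here)."""

    def __init__(self, database):
        self.database = database  # stand-in for the real DB
        self.cache = {}           # stand-in for the in-memory store
        self.hits = 0
        self.misses = 0

    def get(self, key):
        # Cache hit: serve straight from memory.
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # Cache miss: fall back to the DB, then cache the result.
        self.misses += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def set(self, key, value):
        # Write to the DB, then drop the stale cache entry
        # so the next read picks up the new value.
        self.database[key] = value
        self.cache.pop(key, None)


store = CachedStore({"post:1": "Hello"})
first = store.get("post:1")   # miss: loaded from the DB
second = store.get("post:1")  # hit: served from the cache
store.set("post:1", "Hello again")
third = store.get("post:1")   # miss again: the old entry was invalidated
print(first, second, third, store.hits, store.misses)
```

The last read returns the new value because the write removed the old cached copy; without that step, the cache would keep serving "Hello".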

CDNs

To make sure the client receives images and static files quickly, Giorno used a CDN to cache the static files, which are served to the client from the nearest server.

Giorno's application is now ready to receive the millions of expected requests after his auntie's TikTok marketing campaign.

Abdelrahman Mostafa
