Load Balancers: Do We Actually Need Them?

What is a load balancer? What problems does it actually solve?

Performance, availability, and economy are three significant issues that load balancers were primarily designed to address.

Performance is constrained by how much work a server can physically accomplish in a given length of time. But as the volume of user data grows and users expect ever more sophisticated software, demand can outgrow what a single server can handle. This can quickly lead to the failure of the server.

The likelihood of a server failing under a heavy load of requests leads to the availability issue. When you build your application on a single server whose capacity can be exceeded, that server becomes a single point of failure. Duplicating your server is one way to address this problem. Running copies of your server ensures that, if one server fails, your complete application does not go offline. Customers should notice only minimal disruption.

This is the solution: running multiple servers and being able to re-route traffic away from a failing server in real time.

To keep up with growing demand, we have two options for scaling: vertically or horizontally.

We can opt for vertical scaling, that is, acquiring and upgrading server capacity as needed to meet the growing demand of our consumers. This may be a viable choice in some cases, but it is not a cost-effective option for the vast majority of application workloads. Horizontal scaling, that is, adding more servers, is the alternative, but it only works if something spreads the traffic across those servers.

And this leads to the third problem: how can we as a business build a high-performance, reliable product with little to no downtime while still optimizing our return on investment? Load balancers are the unambiguous solution.

A load balancer is a networking device or software component that distributes incoming network traffic among multiple servers or resources. It helps ensure that no single server or resource is overburdened with requests at any given moment.

A load balancer works as an intermediary between client devices and a collection of servers, spreading incoming network traffic across those servers according to its configured algorithm.

Load Balancer: How It Works

The load balancer presents a virtual IP address (VIP) to the client. A VIP does not correspond to a real network interface; it represents the application servers behind the load balancer.

When a client connects to the VIP, the load balancer uses its algorithms and parameters to decide whether to route the connection and, if so, to which application server.

If the load balancer is configured with criteria such as authentication, validation, and authorization, a request that fails them is not forwarded between the two parties. For the connections it does forward, the load balancer maintains and monitors the connection between both parties for as long as it lasts.

Take the professional football business model as an analogy: imagine a sports agent like Roberto Calenda negotiating a new contract for his star athlete, Victor Osimhen.

Calenda (the load balancer) takes a request from Osimhen (the client), who wants to leave Napoli, and sends it to football teams that have shown interest, like Manchester United (the server).

Manchester United responds with an offer, which Calenda then passes back to Osimhen, and this goes back and forth until both sides agree on a suitable offer.

In the process of back-and-forth communication between the server and the client, Calenda can also provide additional functionality. He can decide to allow or deny some details (security) from one party to the other. He can validate whether the representative of Manchester United is authentic or a scammer (authentication).

Load Balancer: Use Case

Content-based Routing

Content-based routing is the application of predefined rules that decide how network traffic between users and web applications is directed, depending on the content being sent.

If your application is made up of many services, a load balancer can route a request to one of them based on the request's information, such as the host field, URL path, HTTP header, HTTP method, query string, or source IP address. Let’s talk briefly about some of them:

Host-based routing: You can route a client request based on the Host header of the HTTP request, allowing you to route to multiple domains from the same load balancer.
Path-based routing: You can route a client request based on the URL path of the HTTP request.
HTTP header-based routing: You can route a client request based on the value of any standard or custom HTTP header.
HTTP method-based routing: You can route a client request based on any standard or custom HTTP method.
Query string parameter-based routing: You can route a client request based on a query string or query parameters.
Source IP address-based routing: You can direct traffic to a specific destination based on the source IP address or a combination of the source and destination IP addresses.
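
The routing rules listed above can be pictured as an ordered list of predicates, where the first match wins. The following is a minimal sketch under that assumption, not how any particular load balancer is implemented. The `Request` class, the helper functions, and the pool names (`api-pool`, `static-pool`, and so on) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple
from urllib.parse import parse_qs, urlsplit


@dataclass
class Request:
    method: str
    url: str  # path plus optional query string
    headers: Dict[str, str] = field(default_factory=dict)
    source_ip: str = "0.0.0.0"


# Each rule pairs a predicate with the name of a (hypothetical) target pool.
Rule = Tuple[Callable[[Request], bool], str]


def host_is(host: str) -> Callable[[Request], bool]:
    return lambda r: r.headers.get("Host", "").lower() == host


def path_starts_with(prefix: str) -> Callable[[Request], bool]:
    return lambda r: urlsplit(r.url).path.startswith(prefix)


def method_is(method: str) -> Callable[[Request], bool]:
    return lambda r: r.method.upper() == method


def query_has(key: str, value: str) -> Callable[[Request], bool]:
    return lambda r: value in parse_qs(urlsplit(r.url).query).get(key, [])


def route(request: Request, rules: List[Rule], default: str) -> str:
    """Return the pool of the first matching rule, or the default pool."""
    for matches, pool in rules:
        if matches(request):
            return pool
    return default


# Rules are evaluated top to bottom, so more specific rules should come first.
RULES: List[Rule] = [
    (host_is("api.example.com"), "api-pool"),
    (path_starts_with("/images/"), "static-pool"),
    (query_has("beta", "1"), "beta-pool"),
    (method_is("POST"), "write-pool"),
]

target = route(Request("GET", "/images/logo.png", {"Host": "www.example.com"}), RULES, "web-pool")
```

Here `target` is `"static-pool"`, because the host rule does not match and the path rule is the first one that does. A request that matches no rule falls through to the default pool.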

User Authentication

You can offload the application authentication functionality to a load balancer. It will securely authenticate users as they access your cloud applications. For example, you can use an AWS Application Load Balancer to handle authentication for your application.

This is possible through the Amazon Cognito service, which allows end users to authenticate with social identity providers such as Google, Facebook, and Amazon, and also with enterprise identity providers such as Microsoft Active Directory via SAML or any OpenID Connect-compliant identity provider (IdP).

Allowing the load balancer to handle some salient components of your application makes it easier to focus on features that will improve the core value of your application.

TLS/SSL Offloading

You can offload the extra burden that SSL/TLS adds to your application (the handshake and encryption/decryption) by delegating the task to your load balancer. This frees your application servers to focus on their primary functions. Depending on the load balancer, it may also help with HTTPS inspection, reverse proxying, cookie persistence, traffic control, and so on.

When you consider the fact that attackers might hide in your encrypted communication and enter your application, being able to analyze HTTPS traffic becomes virtually mandatory.

So, TLS/SSL offloading extends beyond encryption and decryption; some of the extra features listed above also play a significant role in determining the security of your application.

Load Balancer: Algorithms

Web applications are diverse, and the optimal method for load-balancing connections to them will depend on the infrastructure in place and the organization’s requirements.

Here are five algorithms used to load balance connections to application servers:

Round Robin

Round-robin is the simplest load-balancing algorithm. Client requests are routed to application servers in a simple cycle. If you have three application servers, the first client request goes to the first application server in the list, the second client request goes to the second application server, the third client request goes to the third application server, and the fourth request goes back to the first server, and so on.

It is best suited for predictable client requests that are distributed throughout a server pool with approximately equal processing capabilities and available resources (such as network bandwidth and storage).
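
The cycle described above fits in a few lines. This is a minimal sketch; the server names are placeholders.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in list order, wrapping around after the last one."""
    return cycle(servers)


picker = round_robin(["app-1", "app-2", "app-3"])

# Four consecutive requests: the fourth wraps around to the first server.
first_four = [next(picker) for _ in range(4)]
```

`first_four` is `["app-1", "app-2", "app-3", "app-1"]`. Note that the picker knows nothing about how busy each server is, which is why the algorithm works best when servers and requests are roughly uniform.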

Least Connections

Least connections load balancing is a dynamic load balancing mechanism that distributes client requests to the application server with the fewest active connections at the time the client request is received.

The algorithm places a special emphasis on dynamic connection loads. Even when application servers have comparable specs, one server may get overwhelmed as a result of longer-lived connections.

The least-connections algorithm is a good fit when incoming requests have different connection durations and the servers have approximately equivalent processing power and resources. If clients can hold connections open for a long time, a simple rotation could leave a single server with all of its capacity used by many such long-lived connections.
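
One way to picture least connections is a counter per server that goes up when a connection opens and down when it closes. The sketch below assumes exactly that bookkeeping; the class and method names are hypothetical.

```python
from typing import Dict, List


class LeastConnections:
    def __init__(self, servers: List[str]) -> None:
        # Number of currently open connections per server.
        self.active: Dict[str, int] = {server: 0 for server in servers}

    def acquire(self) -> str:
        """Pick the server with the fewest active connections."""
        # min() returns the first server with the lowest count,
        # so ties are broken by the order the servers were listed in.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Record that a connection to the server has closed."""
        if self.active[server] > 0:
            self.active[server] -= 1


lb = LeastConnections(["app-1", "app-2"])
a = lb.acquire()   # both idle, so the first server is chosen
b = lb.acquire()   # app-1 now has one connection, so app-2 is chosen
lb.release(b)      # the connection to app-2 finishes quickly
c = lb.acquire()   # app-2 is idle again while app-1 is still busy
```

After this sequence, `a`, `b`, and `c` are `"app-1"`, `"app-2"`, and `"app-2"`. Round-robin would have sent the third request to `app-1` even though it was still holding a long-lived connection.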

Source IP Hash

The source IP hash load balancing technique generates a hash key from the client's source and destination IP addresses and uses it to attach the client to a specific server.

Because the key can be regenerated if the session disconnects, reconnection requests are sent to the same server as before. This is referred to as server affinity.

This load-balancing strategy is best suited when a client must always connect to the same server on each subsequent connection, such as in a shopping cart situation, where products placed in a cart on one server should still be available when the user returns later.
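
As a rough sketch of the idea, the hash key can be reduced to an index into the server list. The choice of SHA-256 and of a simple modulo are assumptions made for illustration; real load balancers may use different hash functions and mapping schemes.

```python
import hashlib
from typing import List


def pick_by_ip(source_ip: str, dest_ip: str, servers: List[str]) -> str:
    """Map a (source, destination) address pair to one server, deterministically."""
    key = f"{source_ip}|{dest_ip}".encode()
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes of the digest as an integer and wrap it
    # into the range of valid server indexes.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


SERVERS = ["app-1", "app-2", "app-3"]
first_visit = pick_by_ip("203.0.113.7", "198.51.100.10", SERVERS)
reconnect = pick_by_ip("203.0.113.7", "198.51.100.10", SERVERS)
```

Because the key is recomputed from the same addresses, `first_visit` and `reconnect` are always the same server. One design caveat: with a plain modulo, adding or removing a server changes the mapping for most clients, which is why consistent hashing is a common refinement.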

URL Hash

This load balancing mechanism is similar to source IP hashing, except that the hash is derived from the client request's URL. This guarantees that all client requests to the same URL are sent to the same back-end server.

A common use case would be to route traffic to an optimized media server capable of playing video or an optimized server for a certain purpose.
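
A minimal sketch of URL hashing follows. It assumes that only the path part of the URL is hashed, so that the same resource requested with different query strings still lands on the same server; whether the query string is included is a configuration choice, and the function name is hypothetical.

```python
import hashlib
from typing import List
from urllib.parse import urlsplit


def pick_by_url(url: str, servers: List[str]) -> str:
    """Map a request URL to one server based on its path."""
    # Assumption: the query string is ignored when building the hash key.
    path = urlsplit(url).path
    digest = hashlib.sha256(path.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


MEDIA_SERVERS = ["media-1", "media-2", "media-3"]
plain = pick_by_url("/videos/intro.mp4", MEDIA_SERVERS)
with_query = pick_by_url("/videos/intro.mp4?t=10", MEDIA_SERVERS)
```

Both calls return the same server, so that server's cache for `/videos/intro.mp4` is reused regardless of which client asks for it.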

DNS

This method is widely used for simple load balancing as well as traffic distribution over numerous data centers, perhaps in various geographic locations. Instead of relying on separate hardware to decide which server should handle the traffic, the DNS name server itself distributes traffic across a group of server machines.

The name server keeps a list of IP addresses for the different servers to which requests can be directed. In essence, whenever someone queries the name server for a certain domain, the name server returns this list of IP addresses and rotates the order of the addresses for the next response.
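
The rotation can be sketched with a double-ended queue per domain. This is a toy model of the behaviour described above, not a DNS server; the class name is hypothetical and the addresses come from the documentation ranges.

```python
from collections import deque
from typing import Deque, Dict, List


class RoundRobinNameServer:
    def __init__(self, records: Dict[str, List[str]]) -> None:
        self.records: Dict[str, Deque[str]] = {
            name: deque(ips) for name, ips in records.items()
        }

    def resolve(self, name: str) -> List[str]:
        """Return all addresses for the name, then rotate for the next query."""
        ips = self.records[name]
        answer = list(ips)
        ips.rotate(-1)  # move the first address to the end of the list
        return answer


ns = RoundRobinNameServer({"example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
first_answer = ns.resolve("example.com")
second_answer = ns.resolve("example.com")
```

`first_answer` starts with `192.0.2.1` and `second_answer` starts with `192.0.2.2`. Since clients typically try the first address in the list, successive lookups are spread across the servers.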

Load Balancer: Types

In this part, we will take a brief look at how each load balancer type has evolved and what difficulties it seeks to solve.

Network Server Load Balancers

Load balancers hit the market in the mid-1990s, with the basic capability of managing connections based on packet headers such as source IP, destination IP, source port, destination port, and IP protocol.

This was the starting point for the network server load balancer, also known as the Layer 4 load balancer. It operates similarly to a firewall, routing connections between servers and clients based on simple IP address and port information, as well as health checks.

Layer 4 is the transport layer, which also provides flow control: it regulates how fast and how much data is sent, ensuring that senders with faster connections do not overwhelm receivers with slower connections.

The most significant aspect of Layer 4 load balancing is that the application servers handle all of the work and build a direct relationship with the clients that is quick, transparent, and simple to comprehend.
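
To make the Layer 4 decision concrete, here is a minimal sketch that uses only the packet header fields listed earlier plus a health status per back end. The `FiveTuple` type, the `pick_backend` function, and the use of a hash to choose among healthy back ends are assumptions for illustration.

```python
import hashlib
from typing import Dict, NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def pick_backend(flow: FiveTuple, health: Dict[str, bool]) -> str:
    """Choose a healthy back end using header fields only, never the payload."""
    healthy = sorted(backend for backend, ok in health.items() if ok)
    if not healthy:
        raise RuntimeError("no healthy back ends available")
    key = "|".join(str(part) for part in flow).encode()
    index = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(healthy)
    return healthy[index]


flow = FiveTuple("203.0.113.7", 51544, "198.51.100.10", 443, "tcp")
# Hypothetical result of the most recent health checks.
health = {"10.0.0.1:443": False, "10.0.0.2:443": True}
chosen = pick_backend(flow, health)
```

With one back end marked unhealthy, `chosen` is `"10.0.0.2:443"`. Nothing in the function looks at HTTP headers or bodies, which is the key difference from the Layer 7 load balancers discussed next.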

Application Load Balancers

Next came the Layer 7 load balancer, which runs at the highest layer of the network stack and has more context on application layer protocols like HTTP.

Because it operates at the application layer, the Layer 7 load balancer can make more complex and informed decisions based on the payload's content and can apply optimizations and changes to the content (such as HTTP header manipulation, compression, and encryption) while monitoring the health of applications.

A Layer 7 load balancer is sometimes known as a reverse proxy because it maintains two TCP connections (one with the server and one with the client).
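
One consequence of the two separate TCP connections is that the application server sees the load balancer's address rather than the client's. Reverse proxies commonly address this with HTTP header manipulation, adding the `X-Forwarded-For` and `X-Forwarded-Proto` headers. The sketch below shows that step only; the function name is hypothetical.

```python
from typing import Dict


def headers_for_upstream(
    client_headers: Dict[str, str], client_ip: str, scheme: str
) -> Dict[str, str]:
    """Copy the client's headers and record where the request came from."""
    headers = dict(client_headers)  # leave the original mapping untouched
    prior = headers.get("X-Forwarded-For")
    # Append to an existing chain if another proxy already added the header.
    headers["X-Forwarded-For"] = f"{prior}, {client_ip}" if prior else client_ip
    headers["X-Forwarded-Proto"] = scheme
    return headers


upstream = headers_for_upstream({"Host": "www.example.com"}, "203.0.113.7", "https")
```

The application server can then read `upstream["X-Forwarded-For"]` to learn the original client address, which here is `203.0.113.7`.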

Global Server Load Balancer (GSLB)

GSLB is a dynamic DNS solution that manages and monitors many sites via configuration and health checks.

It allows multi-data center resiliency by utilizing service resource awareness and DNS to direct traffic across geographically distributed pools based on previously specified business logic.

Client traffic is routed to the location with the greatest application performance and client experience, based on the client's location and the observed availability of each site.
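
A simplified picture of that decision: drop the sites that fail their health checks, then answer the DNS query with the site that performs best for the client's region. The `Site` fields, the latency figures, and the region names below are hypothetical, and real GSLB products apply richer business logic.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Site:
    name: str
    vip: str
    healthy: bool
    latency_ms: Dict[str, float]  # hypothetical measured latency per client region


def resolve_site(client_region: str, sites: List[Site]) -> str:
    """Return the VIP of the healthy site with the lowest latency for the region."""
    candidates = [site for site in sites if site.healthy]
    if not candidates:
        raise RuntimeError("no healthy site available")
    best = min(
        candidates,
        key=lambda site: site.latency_ms.get(client_region, float("inf")),
    )
    return best.vip


sites = [
    Site("frankfurt", "192.0.2.10", True, {"eu": 15.0, "us": 95.0}),
    Site("virginia", "192.0.2.20", True, {"eu": 90.0, "us": 12.0}),
]
normal = resolve_site("eu", sites)

sites[0].healthy = False  # the Frankfurt site fails its health check
failover = resolve_site("eu", sites)
```

`normal` is the Frankfurt VIP `192.0.2.10`. Once that site is marked unhealthy, `failover` becomes `192.0.2.20`, so European clients are sent to Virginia instead of to a site that is down.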

Hardware vs Software vs Virtual Load Balancing

Load balancers began as hardware solutions. They were efficient at the time because they supplied a basic appliance that combined functionality with a focus on performance.

While hardware-based load balancers are intended for installation within data centers, software-based load balancing solutions provide flexibility and the ability to interface with virtualization orchestration systems. Teams that rely on software-based load balancing commonly use DevOps and CI/CD practices, which also contributes to its ease of integration.

Elastic Load Balancer

Elastic load balancer (ELB) systems go a step further, providing cloud computing operators with capacity that scales with traffic demand at any given time. An ELB automatically adjusts how traffic is distributed to an application as demand fluctuates over time.

It achieves this by using request routing algorithms to spread incoming application traffic over several instances, or by adding instances as needed, which improves the fault tolerance of your applications.
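
The scaling side can be illustrated with a simple capacity calculation. This is only a sketch of the idea, not how any particular cloud provider sizes its fleet; the function name, the per-instance capacity, and the minimum and maximum bounds are all assumptions.

```python
import math


def desired_instances(
    requests_per_second: float,
    capacity_per_instance: float,
    min_instances: int = 2,
    max_instances: int = 20,
) -> int:
    """Return how many instances are needed for the current traffic level."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Keep at least the minimum for fault tolerance, and cap costs at the maximum.
    return max(min_instances, min(max_instances, needed))


quiet_night = desired_instances(50, capacity_per_instance=100)
busy_day = desired_instances(950, capacity_per_instance=100)
```

With these numbers, `quiet_night` is 2 and `busy_day` is 10. Keeping a minimum of two instances even when one would be enough preserves the redundancy that motivated load balancing in the first place.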

Hopefully, you found the article useful!