What Is Load Balancing?

Load balancing is the process of distributing network traffic efficiently among multiple servers to optimize application availability and ensure a positive end-user experience.

Because high-traffic websites and cloud computing applications receive millions of user requests each day, load balancing is an essential capability for modern application delivery. For example, e-commerce sites rely on load balancing to ensure that web applications are able to deliver data, images, video, and pricing from web servers to consumers without delay or downtime.

How load balancing works

Load balancing can be implemented in hardware or in software. Hardware load balancers are physical appliances that are installed and maintained on premises. Software load balancers are applications that are either installed on privately owned servers or delivered as a managed cloud service (cloud load balancing).

In either case, load balancers work by mediating incoming client requests in real time and determining which backend servers are best able to process those requests. In order to prevent a single server from becoming overloaded, the load balancer routes requests to any number of available servers on premises or hosted in server farms or cloud data centers.

Once the assigned server receives the request, it responds to the client by way of the load balancer. The load balancer then completes the server-to-client connection by matching the IP address of the client with that of the selected server. The client and server are then able to communicate and carry out the requested tasks until the session is complete.

If there is a spike in network traffic, a load balancer may bring extra servers online to keep up with demand. Or, if there is a lull in network activity, the load balancer may reduce the pool of available servers. It can also assist with network caching by routing traffic to cache servers where previous user requests are temporarily stored.
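The scaling decision described above can be sketched as a small sizing calculation. This is only an illustration under stated assumptions: the function name `desired_pool_size`, the per-server capacity of 500 requests per second, and the pool limits are all hypothetical values, not figures from any product.

```python
import math


def desired_pool_size(
    requests_per_second: float,
    capacity_per_server: float = 500.0,  # assumed capacity, for illustration only
    min_servers: int = 2,                # assumed floor so the pool never empties
    max_servers: int = 10,               # assumed ceiling on the pool size
) -> int:
    """Return how many servers the pool should hold for the current traffic."""
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Clamp the result between the configured floor and ceiling.
    return max(min_servers, min(max_servers, needed))


spike = desired_pool_size(4200)  # traffic spike: more servers come online
lull = desired_pool_size(300)    # quiet period: the pool shrinks to its floor
print(spike, lull)
```

Keeping a minimum pool size means a sudden spike after a lull still finds at least a few servers ready to accept requests.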

Benefits of load balancing

Availability

Load balancers perform health checks on servers before routing requests to them. If one server is about to fail, or is offline for maintenance or upgrades, load balancing automatically reroutes the workload to a working server to avoid service interruptions and maintain high availability.
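As a rough sketch of this behavior, the Python below skips any server that has failed its health check. The `Server` class, its `healthy` flag, and the `route` function are hypothetical names chosen for illustration; a real load balancer would probe each server over the network instead of reading a stored flag.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    healthy: bool = True  # hypothetical flag; real checks probe over the network


def route(servers: list[Server], request_number: int) -> Server:
    """Pick a server for a request, skipping any that failed the health check."""
    available = [s for s in servers if s.healthy]
    if not available:
        raise RuntimeError("no healthy servers available")
    # Rotate through the healthy servers only.
    return available[request_number % len(available)]


# app-2 is offline for maintenance, so it never receives a request.
pool = [Server("app-1"), Server("app-2", healthy=False), Server("app-3")]
chosen = [route(pool, n).name for n in range(4)]
print(chosen)
```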

Scalability

Load balancing enables an on-demand, high-performance infrastructure that can handle the heaviest or lightest network traffic loads. Physical or virtual servers can be added or removed as needed, making scalability simple and automated.

Security

Load balancers can include security features such as SSL/TLS encryption, web application firewalls (WAF) and multi-factor authentication (MFA). They can also be incorporated into application delivery controllers (ADC) to improve application security. By safely routing or offloading network traffic, load balancing can help defend against security risks such as distributed denial-of-service (DDoS) attacks.

Load balancing algorithms

The method for routing a request to a particular server is defined by a load balancing algorithm. Load balancing algorithms provide different capabilities and benefits to satisfy different use cases.

Round robin

This algorithm sequentially assigns requests to each server in a continuous rotation. A common form, round-robin DNS, uses the Domain Name System (DNS) to rotate through the server addresses registered for a name. It is the most basic load balancing method, as it uses only each server's place in the rotation, and not its current load, to determine which one receives the next incoming request.
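A minimal sketch of the rotation, leaving DNS out of the picture, might look like the following. The server names and the `next_server` helper are hypothetical.

```python
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]  # hypothetical server names
rotation = cycle(servers)              # endless iterator over the list, in order


def next_server() -> str:
    """Return the next server in the rotation, ignoring current load."""
    return next(rotation)


assigned = [next_server() for _ in range(5)]
print(assigned)
```

After the last server in the list is used, the rotation wraps around to the first one.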

Weighted round robin

In addition to its place in the rotation, each server in this algorithm is also assigned a ‘weight.’ The weight determines what share of incoming requests a server receives relative to the others, so servers with higher weights handle more requests. An administrator decides how each server is weighted based upon its capacity and the needs of the network.
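One simple way to sketch weighting is to repeat each server in the rotation once per unit of weight. The weights below are hypothetical, and this naive expansion is only an illustration; production balancers often interleave the picks more evenly.

```python
from collections import Counter
from itertools import cycle

# Hypothetical weights: app-1 is assumed to have three times the capacity of app-2.
weights = {"app-1": 3, "app-2": 1}

# Repeat each server name once per unit of weight, then rotate through the list.
schedule = [name for name, weight in weights.items() for _ in range(weight)]
rotation = cycle(schedule)

assigned = [next(rotation) for _ in range(8)]
share = Counter(assigned)  # how many of the 8 requests each server received
print(share)
```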

IP hash

In this algorithm, a computation simplifies (or hashes) the IP address of the incoming request into a smaller value called a hash key. This hash key (which is derived from the user’s IP address) is then used as the basis to decide how to route the request to a specific server. Because the same address always produces the same key, requests from a given client are consistently routed to the same server.
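The idea can be sketched as follows. The choice of SHA-256, the use of the first eight bytes of the digest, and the modulo step are assumptions made for illustration; the names `pick_server` and `servers` are hypothetical, and `203.0.113.7` is a documentation-only example address.

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]  # hypothetical server names


def pick_server(client_ip: str, servers: list[str]) -> str:
    """Map a client IP address to a server through a hash key."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest[:8], "big")  # shorten the digest to an integer key
    return servers[hash_key % len(servers)]


first = pick_server("203.0.113.7", servers)
second = pick_server("203.0.113.7", servers)
print(first == second)  # the same client address maps to the same server
```

One consequence of the modulo step is that adding or removing a server changes the mapping for many clients; consistent hashing is a common way to reduce that churn.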

Least connections

As the name indicates, this algorithm gives priority to the server with the fewest active connections when a new client request is received. This method helps to prevent servers from becoming overloaded with connections, and to keep the load evenly spread across servers.
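A small sketch of the selection step is shown below. The connection counts and the function names are hypothetical, and the sketch only opens connections; a real balancer would also decrement a server's count when a connection closes.

```python
# Hypothetical snapshot of active connections per server.
connections = {"app-1": 2, "app-2": 1, "app-3": 2}


def pick_least_connections(active: dict[str, int]) -> str:
    """Return the server with the fewest active connections (first one wins ties)."""
    return min(active, key=lambda name: active[name])


def handle_request(active: dict[str, int]) -> str:
    """Assign a new request and record the connection it opens."""
    server = pick_least_connections(active)
    active[server] += 1
    return server


picks = [handle_request(connections) for _ in range(4)]
print(picks, connections)
```

After four requests, every server in this example holds the same number of connections, which is the evening-out effect the algorithm aims for.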

Least response time

This algorithm combines the least connections method with the shortest average server response time. It evaluates both the number of active connections and the time it takes for a server to process requests and send a response. The fastest server with the fewest active connections will receive the incoming request.
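There is more than one way to combine the two measurements. The sketch below multiplies them into a single score, which is an assumption made for illustration rather than a standard formula; the statistics and names are hypothetical.

```python
# Hypothetical stats: server -> (active connections, average response time in seconds).
stats = {
    "app-1": (3, 0.120),
    "app-2": (3, 0.045),
    "app-3": (8, 0.030),
}


def pick_least_response_time(stats: dict[str, tuple[int, float]]) -> str:
    """Return the server with the lowest combined score (lower is better)."""

    def score(name: str) -> float:
        active, avg_seconds = stats[name]
        # Assumed scoring rule: connections (plus the new one) times response time.
        return (active + 1) * avg_seconds

    return min(stats, key=score)


winner = pick_least_response_time(stats)
print(winner)
```

Here app-3 answers fastest but is already busy, so the request goes to app-2, which is nearly as fast and has far fewer connections.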

Types of load balancers

While the primary purpose of any load balancer is to distribute traffic, there are several types of load balancers that serve specific functions.

Network load balancers

Network load balancers optimize traffic and reduce latency across local and wide area networks. They use network information such as IP addresses and destination ports, along with TCP and UDP protocols, to route network traffic and provide enough throughput to satisfy user demand.

Application load balancers

These load balancers use application content such as URLs, SSL sessions and HTTP headers to route API request traffic. Because duplicate functions exist across multiple application servers, examining application-level content helps determine which servers can fulfill specific requests quickly and reliably.
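Content-based routing of this kind can be sketched as a set of rules checked against each request. The path prefixes, pool names, and the `choose_pool` function below are hypothetical examples, not the configuration of any particular product.

```python
# Hypothetical routing rules: URL path prefix -> name of a server pool.
ROUTES = [
    ("/api/", "api-pool"),
    ("/images/", "static-pool"),
]
DEFAULT_POOL = "web-pool"


def choose_pool(path: str, headers: dict[str, str]) -> str:
    """Choose a server pool from the request's URL path and HTTP headers."""
    # Header-based rule: send WebSocket upgrade requests to a dedicated pool.
    if headers.get("Upgrade", "").lower() == "websocket":
        return "realtime-pool"
    # Path-based rules: the first matching prefix wins.
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


print(choose_pool("/api/orders/42", {}))
print(choose_pool("/index.html", {}))
```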

Virtual load balancers

With the rise of virtualization and VMware technology, virtual load balancers are now being used to optimize traffic across servers, virtual machines, and containers. Open-source container orchestration tools like Kubernetes offer virtual load balancing capabilities to route requests among the containers running on the nodes of a cluster.

Global server load balancers

This type of load balancer routes traffic to servers across multiple geographic locations to ensure application availability. User requests can be assigned to the closest available server, or if there is a server failure, to another location with an available server. This failover capability makes global server load balancing a valuable component of disaster recovery.
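The closest-site selection and failover described above can be sketched as follows. The `Site` class, the site names, and the latency figures are hypothetical, and measured latency is used here as a stand-in for geographic closeness, which is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    latency_ms: float       # assumed measured latency from the user to the site
    available: bool = True  # hypothetical availability flag


def pick_site(sites: list[Site]) -> Site:
    """Return the lowest-latency site that is currently available."""
    candidates = [s for s in sites if s.available]
    if not candidates:
        raise RuntimeError("no site is available")
    return min(candidates, key=lambda s: s.latency_ms)


sites = [Site("eu-west", 20.0), Site("us-east", 95.0), Site("ap-south", 180.0)]
first = pick_site(sites).name   # normal operation: the closest site is chosen

sites[0].available = False      # simulate a failure at the closest site
second = pick_site(sites).name  # failover: the next-closest site takes over
print(first, second)
```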
