Load balancing is the process of distributing network traffic efficiently among multiple servers to optimize application availability and ensure a positive end-user experience.
Because high-traffic websites and cloud computing applications receive millions of user requests each day, load balancing is an essential capability for modern application delivery. For example, e-commerce sites rely on load balancing to ensure that web applications are able to deliver data, images, video, and pricing from web servers to consumers without delay or downtime.
Load balancing can be implemented in a couple of ways. Hardware load balancers are physical appliances that are installed and maintained on premises. Software load balancers are applications installed on privately owned servers, or delivered as a managed cloud service (cloud load balancing).
In either case, load balancers work by mediating incoming client requests in real time and determining which backend servers are best able to process those requests. In order to prevent a single server from becoming overloaded, the load balancer routes requests to any number of available servers on premises or hosted in server farms or cloud data centers.
Once the assigned server receives the request, it responds to the client by way of the load balancer. The load balancer then completes the server-to-client connection by matching the IP address of the client with that of the selected server. The client and server can then communicate and carry out the requested tasks until the session is complete.
If there is a spike in network traffic, a load balancer may bring extra servers online to keep up with demand. Or, if there is a lull in network activity, the load balancer may reduce the pool of available servers. It can also assist with network caching by routing traffic to cache servers where previous user requests are temporarily stored.
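This grow-and-shrink behavior can be sketched as a simple threshold rule. The function below is a minimal illustration, not a production autoscaler; the server names, load fraction, and thresholds are all assumptions chosen for the example.

```python
def resize_pool(pool, standby, current_load, high=0.8, low=0.3):
    """Grow or shrink the active pool based on load (thresholds are illustrative).

    `current_load` is the fraction of total capacity in use, from 0.0 to 1.0.
    """
    if current_load > high and standby:
        pool.append(standby.pop())       # traffic spike: bring a standby server online
    elif current_load < low and len(pool) > 1:
        standby.append(pool.pop())       # lull: return a server to standby
    return pool

pool, standby = ["web-1"], ["web-2"]
resize_pool(pool, standby, 0.9)          # spike: pool becomes ["web-1", "web-2"]
```

Real load balancers and autoscalers use smoothed metrics and cooldown periods rather than a single instantaneous reading, so that brief bursts do not cause the pool to thrash.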
Load balancers perform health checks on servers before routing requests to them. If one server is about to fail, or is offline for maintenance or upgrades, load balancing automatically reroutes the workload to a working server to avoid service interruptions and maintain high availability.
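A health check can be as simple as attempting a TCP connection to each server. The sketch below assumes servers are identified by `(host, port)` pairs; production balancers typically also probe HTTP endpoints and track response codes and latency.

```python
import socket

def is_healthy(host, port, timeout=1.0):
    """Minimal TCP health check: a server counts as 'up' if the port
    accepts a connection within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def healthy_pool(servers):
    """Filter a pool of (host, port) pairs down to servers that pass the check."""
    return [s for s in servers if is_healthy(*s)]
```

Running checks on a schedule, rather than per request, lets the balancer remove a failing server from rotation before any client traffic reaches it.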
Load balancing enables an on-demand, high-performance infrastructure that can handle the heaviest or lightest network traffic loads. Physical or virtual servers can be added or removed as needed, making scalability simple and automated.
Load balancers can include security features such as SSL encryption, web application firewalls (WAF) and multi-factor authentication (MFA). They can also be incorporated into application delivery controllers (ADC) to improve application security. By safely routing or offloading network traffic, load balancing can help defend against security risks such as distributed denial-of-service (DDoS) attacks.
The method for routing a request to a particular server is defined by a load balancing algorithm. Load balancing algorithms provide different capabilities and benefits to satisfy different use cases.
Round robin
This algorithm uses the Domain Name System (DNS) to assign requests to each server sequentially in a continuous rotation. It is the most basic load balancing method: the rotation follows only the fixed order of the servers, without considering their current load or capacity.
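As a minimal sketch (the server names are illustrative), round robin can be modeled as a simple cycle through the pool:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Assigns each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)

lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])
# Six requests are spread evenly: web-1, web-2, web-3, web-1, web-2, web-3
```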
Weighted round robin
In addition to its DNS name, each server in this algorithm is also assigned a ‘weight.’ The weight determines which servers should have priority over others to handle incoming requests. An administrator decides how each server is weighted based upon its capacity and the needs of the network.
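A weighted rotation can be sketched as follows; the weights are illustrative, standing in for whatever capacity values an administrator assigns:

```python
def weighted_rotation(weights):
    """Yield servers in proportion to their weights: a weight-3 server
    receives three requests for every one sent to a weight-1 server."""
    while True:
        for server, weight in weights.items():
            for _ in range(weight):
                yield server

rotation = weighted_rotation({"web-1": 3, "web-2": 1})
# First four assignments: web-1, web-1, web-1, web-2
```

Production implementations usually interleave the weighted turns (so a heavy server is not hit three times in a row), but the proportions are the same.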
IP hash
In this algorithm, a computation simplifies (or hashes) the IP address of the incoming request into a smaller value called a hash key. This unique hash key (which represents the user’s IP address) is then used as the basis to decide how to route the request to a specific server.
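The hashing step can be sketched like this (the hash function and server pool are illustrative choices, not a standard):

```python
import hashlib

def pick_by_ip(client_ip, servers):
    """Hash the client IP into a key, then map the key onto the server pool.
    The same IP always lands on the same server, which preserves sessions."""
    key = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[key % len(servers)]
```

Because the mapping is deterministic, IP hashing gives "sticky" sessions without the balancer having to store any per-client state.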
Least connections
As the name indicates, this algorithm gives priority to the server with the fewest active connections when a new client request is received. This method helps to prevent servers from becoming overloaded with connections, and to maintain a consistent load across servers at all times.
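The selection step reduces to a minimum over the connection counts; the snapshot below is illustrative:

```python
def least_connections(active):
    """Select the server currently holding the fewest active connections."""
    return min(active, key=active.get)

active = {"web-1": 12, "web-2": 4, "web-3": 9}
least_connections(active)  # -> "web-2"
```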
Least response time
This algorithm combines the least connection method with the shortest average server response time. Both the number of connections, and the time it takes for a server to perform requests and send a response, are evaluated. The fastest server with the fewest active connections will receive the incoming request.
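One way to combine the two signals is sketched below; multiplying connection count by average response time is an assumption made for illustration, and real balancers weight the factors differently:

```python
def least_response_time(stats):
    """Pick the server with the best combination of active connections and
    average response time. `stats` maps each server to a tuple of
    (active connections, average response time in ms); values are illustrative."""
    return min(stats, key=lambda server: stats[server][0] * stats[server][1])

stats = {"web-1": (4, 120.0), "web-2": (4, 80.0), "web-3": (9, 60.0)}
least_response_time(stats)  # -> "web-2" (score 320 beats 480 and 540)
```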
While the primary purpose of any load balancer is to distribute traffic, there are several types of load balancers that serve specific functions.
Network load balancers
Network load balancers optimize traffic and reduce latency across local and wide area networks. They use network information such as IP addresses and destination ports, along with TCP and UDP protocols, to route network traffic and provide enough throughput to satisfy user demand.
Application load balancers
These load balancers use application content such as URLs, SSL sessions and HTTP headers to route API request traffic. Because the same application functions are often duplicated across multiple servers, examining application-level content helps determine which servers can fulfill specific requests quickly and reliably.
Virtual load balancers
With the rise of virtualization and VMware technology, virtual load balancers are now being used to optimize traffic across servers, virtual machines, and containers. Open-source container orchestration tools like Kubernetes offer virtual load balancing capabilities to route requests between the containers running on nodes in a cluster.
Global server load balancers
This type of load balancer routes traffic to servers across multiple geographic locations to ensure application availability. User requests can be assigned to the closest available server, or if there is a server failure, to another location with an available server. This failover capability makes global server load balancing a valuable component of disaster recovery.