As a website grows, a single server eventually becomes insufficient, and upgrading to a more powerful machine only goes so far. This is where load balancing comes in: by distributing traffic across multiple servers, it increases both capacity and reliability. This guide explains how load balancing works.
What Is Load Balancing?
Load balancing is the process of distributing incoming traffic across multiple servers (a backend pool). The component that does this distribution is called a load balancer. The visitor connects to a single address; the load balancer routes the request to a suitable server behind it.
This brings two core benefits: scalability (capacity grows as you add servers) and high availability (the site stays up even if one server crashes).
Distribution Algorithms
The load balancer decides which server to send a request to based on an algorithm:
| Algorithm | How It Works |
|---|---|
| Round Robin | Distributes requests to servers in turn, cyclically |
| Least Connections | Sends to the server with the fewest active connections at the moment |
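| IP Hash | Hashes the client's IP address to pick a server, so the same visitor keeps reaching the same server while the pool is unchanged |
| Weighted | Assigns each server a weight so that more powerful servers receive a proportionally larger share of requests |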
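
The four strategies in the table can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular load balancer implements them: the `SERVERS` addresses, the function names, and the use of SHA-256 for hashing are all assumptions made for the example.

```python
import hashlib
import itertools

# Hypothetical backend pool; the addresses are placeholders.
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def round_robin(servers):
    """Return a picker that hands out servers in turn, cyclically."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)


def least_connections(active):
    """Pick the server with the fewest active connections.

    `active` maps server -> current connection count.
    """
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    """Map a client IP to a server with a stable hash.

    The same IP lands on the same server as long as `servers` is unchanged.
    """
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def weighted_round_robin(weights):
    """Return a picker that visits each server in proportion to its weight.

    `weights` maps server -> positive integer weight. The list is expanded
    naively, so a server with weight 2 receives two requests in a row.
    """
    expanded = [s for s, w in weights.items() for _ in range(w)]
    cycle = itertools.cycle(expanded)
    return lambda: next(cycle)
```

For example, `round_robin(SERVERS)` returns a function that yields the three addresses in order and then starts over. The naive weighted version sends consecutive requests to the same server; a real balancer would likely interleave them more evenly.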
Health Checks
A critical part of load balancing is the health check. The load balancer checks each server behind it at regular intervals — for example by sending a request to a specific page. If a server is not responding, the load balancer stops sending traffic to it and routes requests to healthy servers. When the server becomes healthy again, it is brought back into the pool.
Thanks to this mechanism, a server crash does not mean downtime for the visitor: traffic shifts transparently to the remaining servers.
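The health-check loop described above might look roughly like the following sketch. The `Pool` class, the `fail_threshold` parameter (how many consecutive failed probes take a server out of rotation), and the choice of "any 2xx response counts as healthy" are assumptions for illustration; real load balancers expose their own settings for these.

```python
import http.client
import urllib.request


def is_healthy(url, timeout=2.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError (4xx/5xx) and timeouts are OSError subclasses.
        return False


class Pool:
    """Tracks consecutive probe failures per server (hypothetical helper)."""

    def __init__(self, servers, probe=is_healthy, fail_threshold=3):
        self.servers = list(servers)
        self.probe = probe
        self.fail_threshold = fail_threshold
        self.failures = {s: 0 for s in self.servers}

    def run_checks(self):
        """Probe every server once; call this at a regular interval."""
        for server in self.servers:
            if self.probe(server):
                self.failures[server] = 0  # recovered: back in the pool
            else:
                self.failures[server] += 1

    def healthy(self):
        """Servers that should currently receive traffic."""
        return [s for s in self.servers
                if self.failures[s] < self.fail_threshold]
```

A scheduler would call `run_checks()` every few seconds, and the distribution algorithm would choose only from `healthy()`. Requiring several consecutive failures is one way to avoid removing a server because of a single slow response.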
L4 and L7 Load Balancing
Load balancing can work at two levels. L4 (Layer 4, transport layer) load balancing forwards connections based only on IP address and port; it is fast but cannot inspect the content. L7 (Layer 7, application layer) load balancing inspects the HTTP request itself (URL, headers, cookies); for example, it can route /api requests to one server group and requests for static files to another. Tools such as Nginx can do L7 load balancing.
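To illustrate the L7 case, the sketch below picks a server group from the request path, which an L4 balancer could not do because it never sees the URL. The pool names, addresses, and the `/static` prefix are hypothetical; only the `/api` example comes from the text above.

```python
# Hypothetical server groups; names and addresses are placeholders.
POOLS = {
    "api": ["10.0.1.1", "10.0.1.2"],
    "static": ["10.0.2.1"],
    "default": ["10.0.3.1", "10.0.3.2"],
}

# Path prefix -> pool name, checked in order.
ROUTES = [("/api", "api"), ("/static", "static")]


def choose_pool(path):
    """Return the name of the pool that should handle this request path."""
    for prefix, pool in ROUTES:
        # Match the prefix itself or anything below it, but not "/apiary".
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    return "default"
```

After the pool is chosen, any of the distribution algorithms can select a server inside it, for instance from `POOLS[choose_pool("/api/users")]`.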
Session Management
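One point to watch in load balancing is user sessions. If session data is stored only on the server that created it and the user's next request lands on a different server, the session is lost. Common remedies are to keep each visitor on the same server (sticky sessions, for example via IP Hash) or to keep session data in a store that all servers share.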
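The problem can be made concrete with a toy model. In this sketch the `Server` class and its methods are hypothetical, and a plain dictionary stands in for whatever shared session store (a database or cache, say) a real deployment would use.

```python
class Server:
    """Toy backend that remembers which user owns which session ID."""

    def __init__(self, name, session_store=None):
        self.name = name
        # Without a shared store, each server keeps its own private dict.
        self.sessions = session_store if session_store is not None else {}

    def login(self, session_id, user):
        self.sessions[session_id] = user

    def whoami(self, session_id):
        """Return the user for this session, or None if it is unknown."""
        return self.sessions.get(session_id)


# Local sessions: the login on server A is invisible to server B.
a, b = Server("A"), Server("B")
a.login("s1", "alice")
lost = b.whoami("s1")

# Shared store: both servers read and write the same session data.
shared = {}
c, d = Server("C", shared), Server("D", shared)
c.login("s1", "alice")
kept = d.whoami("s1")
```

Here `lost` ends up as `None` because server B never saw the login, while `kept` is `"alice"` because both servers consult the same store.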
Frequently Asked Questions
How many servers does load balancing need?
You need at least two backend servers, so that if one crashes the other can take over. The actual number depends on your traffic and your availability target. The load balancer itself can also be deployed redundantly so that it does not become a single point of failure.
Are load balancing and a CDN the same thing?
No, but they complement each other. A CDN distributes static content geographically; load balancing distributes dynamic traffic across multiple origin servers. Large systems use both together.
Is load balancing necessary for a small site?
Usually not. A single adequately sized server is enough for most small and medium sites. Load balancing becomes worthwhile when traffic is high or when uninterrupted service (high availability) is a critical requirement.