A Network Load Balancer (NLB) is a type of load balancer that operates at Layer 4 (the Transport Layer) of the OSI model. This means it makes its traffic distribution decisions based on network-level information, primarily IP addresses and TCP/UDP ports.
Think of a Network Load Balancer as a highly efficient “traffic cop” that directs incoming network connections to a group of healthy backend servers, without inspecting the actual content of the application-layer data (like HTTP headers or URLs).
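To make the Layer 4 idea concrete, here is a minimal sketch (in Go) of a passthrough-style TCP forwarder: it accepts a connection, picks a backend using only connection-level information, and copies bytes in both directions without ever parsing the application payload. The backend addresses and port here are hypothetical, and a real NLB implements this in dedicated network infrastructure rather than a small user-space loop.

```go
package main

import (
	"io"
	"log"
	"net"
)

// Hypothetical backend pool; a real deployment registers these dynamically.
var backends = []string{"10.0.0.11:8080", "10.0.0.12:8080"}
var next int

func main() {
	// Listen for client connections on a frontend IP:port (Layer 4 only).
	ln, err := net.Listen("tcp", ":8443")
	if err != nil {
		log.Fatal(err)
	}
	for {
		client, err := ln.Accept()
		if err != nil {
			continue
		}
		// Choose a backend using nothing but connection-level state
		// (simple rotation here); the payload is never inspected.
		backend := backends[next%len(backends)]
		next++
		go forward(client, backend)
	}
}

func forward(client net.Conn, backendAddr string) {
	defer client.Close()
	server, err := net.Dial("tcp", backendAddr)
	if err != nil {
		return // a real balancer would mark this backend unhealthy
	}
	defer server.Close()
	// Shuttle raw bytes both ways; HTTP headers, URLs, etc. are opaque here.
	go io.Copy(server, client)
	io.Copy(client, server)
}
```

Note that, unlike a true passthrough NLB, this user-space sketch terminates the TCP connection itself, so the backend would see the forwarder's IP rather than the client's; real NLBs avoid that, as discussed below.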
Why Do We Need Load Balancers?
Before diving into NLBs specifically, it’s important to understand the general purpose of load balancing:
- High Availability (HA): If one server fails, the load balancer automatically stops sending traffic to it and redirects requests to other healthy servers, ensuring continuous service. This eliminates a single point of failure.
- Scalability: As user traffic grows, you can easily add more servers (backend instances) to your application. The load balancer then distributes the load across these new servers, allowing your application to handle increased demand without performance degradation.
- Performance Optimization: By distributing traffic evenly, load balancers prevent any single server from becoming overwhelmed, leading to faster response times and a better user experience.
- Maintainability: Servers can be taken offline for maintenance, upgrades, or troubleshooting without impacting the availability of the application, as the load balancer simply routes traffic to the remaining healthy servers.
How a Network Load Balancer Works (Layer 4 Focus):
When an NLB receives an incoming connection, it performs the following steps:
- Listens for Connections: The NLB listens for incoming client connections on specific IP addresses and ports.
- Inspects Network Information: It primarily examines the source IP address, source port, destination IP address, destination port, and protocol (TCP, UDP, etc.) of the incoming connection.
- Selects a Backend Server: Based on a configured load balancing algorithm and the health status of the backend servers, the NLB chooses a server to handle the connection. Common algorithms include (see the sketch after this list):
  - Round Robin: Distributes connections sequentially across the servers in a rotating fashion.
  - Least Connection: Directs new connections to the server with the fewest active connections.
  - IP Hash: Uses a hash of the client’s source IP address (and sometimes the destination IP) to ensure that connections from the same client are consistently routed to the same backend server (session stickiness).
- Routes the Connection (Directly): Unlike Application Load Balancers (which act as proxies), many Network Load Balancers operate in a “passthrough” or “direct server return” (DSR) mode. This means:
  - The NLB forwards the client’s connection directly to the selected backend server.
  - Crucially, the client’s original source IP address is preserved and visible to the backend server.
  - The backend server then sends its response directly back to the client, bypassing the load balancer on the return path. (Some NLBs can also proxy the return traffic, but direct return is common for performance.)
- Health Checks: NLBs continuously monitor the health of their registered backend servers using Layer 4 health checks (e.g., performing a simple TCP handshake on a specific port). If a server fails its health check, the NLB stops sending new connections to it until it recovers.
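As referenced above, here is a small sketch of the three selection strategies, assuming backends are tracked as simple structs and that active connection counts are maintained elsewhere by the connection-handling code; the function names (pickRoundRobin and so on) are illustrative, not a real load balancer API.

```go
package lb

import (
	"hash/fnv"
	"sync/atomic"
)

// Backend is a minimal stand-in for a registered target.
type Backend struct {
	Addr        string
	ActiveConns int64 // incremented/decremented by the connection handler
}

var rrCounter uint64

// Round Robin: rotate through the backends in order.
func pickRoundRobin(backends []*Backend) *Backend {
	n := atomic.AddUint64(&rrCounter, 1)
	return backends[int(n%uint64(len(backends)))]
}

// Least Connection: choose the backend with the fewest active connections.
func pickLeastConn(backends []*Backend) *Backend {
	best := backends[0]
	for _, b := range backends[1:] {
		if atomic.LoadInt64(&b.ActiveConns) < atomic.LoadInt64(&best.ActiveConns) {
			best = b
		}
	}
	return best
}

// IP Hash: hash the client's source IP so the same client consistently
// lands on the same backend (session stickiness).
func pickIPHash(backends []*Backend, clientIP string) *Backend {
	h := fnv.New32a()
	h.Write([]byte(clientIP))
	return backends[int(h.Sum32()%uint32(len(backends)))]
}
```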
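And a minimal sketch of the Layer 4 health check itself: periodically attempt a TCP handshake against each backend and record the result for the selection logic to consult. The 5-second interval and 2-second timeout are arbitrary illustrative values, not fixed NLB settings.

```go
package lb

import (
	"net"
	"sync/atomic"
	"time"
)

// Target pairs a backend address with its current health status.
type Target struct {
	Addr    string
	Healthy atomic.Bool // read by the backend-selection code
}

// healthCheck probes every target on a fixed interval.
func healthCheck(targets []*Target) {
	for range time.Tick(5 * time.Second) {
		for _, t := range targets {
			// A successful DialTimeout means the TCP three-way handshake completed.
			conn, err := net.DialTimeout("tcp", t.Addr, 2*time.Second)
			if err != nil {
				t.Healthy.Store(false) // stop routing new connections here
				continue
			}
			conn.Close()
			t.Healthy.Store(true) // back in rotation once it recovers
		}
	}
}
```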
Key Characteristics of Network Load Balancers:
- Layer 4 Operation: Focused on the Transport Layer (TCP, UDP).
- High Performance & Low Latency: Because they don’t inspect application-layer content, NLBs are extremely fast and efficient, capable of handling millions of requests per second with very low latency. This makes them ideal for high-throughput, real-time applications.
- Protocol Agnostic: Can load balance any TCP or UDP traffic, not just HTTP/S, including DNS, FTP, SSH, gaming servers, VoIP, and streaming. Many can also handle ESP, GRE, ICMP, and ICMPv6.
- Source IP Preservation: Often preserves the client’s original source IP address, which is important for logging, security, and applications that rely on the client IP for specific functionality.
- Static IP Addresses: Many cloud NLBs provide static IP addresses, which is useful for DNS records, firewall whitelisting, and predictable endpoints.
- TLS Termination (Optional): Some modern NLBs (like AWS NLB) can perform TLS termination, offloading SSL/TLS decryption from the backend servers.
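As a rough illustration of that last point, here is a sketch of TLS termination at the load balancing tier in Go, assuming a certificate pair on disk (cert.pem/key.pem) and a hypothetical plaintext backend at 10.0.0.11:8080. Cloud NLBs provide this as a managed feature, so the code is only meant to show what “offloading TLS from the backends” means in practice.

```go
package main

import (
	"crypto/tls"
	"io"
	"log"
	"net"
)

func main() {
	// Load the server certificate once; backends never see it.
	cert, err := tls.LoadX509KeyPair("cert.pem", "key.pem")
	if err != nil {
		log.Fatal(err)
	}
	cfg := &tls.Config{Certificates: []tls.Certificate{cert}}

	// Terminate TLS at the frontend listener.
	ln, err := tls.Listen("tcp", ":443", cfg)
	if err != nil {
		log.Fatal(err)
	}
	for {
		client, err := ln.Accept()
		if err != nil {
			continue
		}
		go func(client net.Conn) {
			defer client.Close()
			log.Printf("accepted %s", client.RemoteAddr()) // client IP is known at this hop

			// Forward the already-decrypted stream to a plaintext backend.
			backend, err := net.Dial("tcp", "10.0.0.11:8080")
			if err != nil {
				return
			}
			defer backend.Close()
			go io.Copy(backend, client)
			io.Copy(client, backend)
		}(client)
	}
}
```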
Network Load Balancer vs. Application Load Balancer (ALB):
This is a crucial distinction in modern load balancing:

| | Network Load Balancer (NLB) | Application Load Balancer (ALB) |
| --- | --- | --- |
| OSI layer | Layer 4 (Transport) | Layer 7 (Application) |
| Traffic | Any TCP or UDP protocol | HTTP/S |
| Routing decisions based on | IP addresses, ports, and protocol | Application content (HTTP headers, URLs, paths) |
| Proxy behavior | Often passthrough/DSR, preserving the client’s source IP | Full proxy |
| Typical strength | Ultra-low latency and very high throughput | Content-based routing for web applications |
Benefits of Network Load Balancers:
- Exceptional Performance: Ideal for applications demanding ultra-low latency and very high throughput.
- Protocol Flexibility: Can handle virtually any TCP or UDP-based application, making them versatile.
- Simple and Efficient: Less processing overhead than ALBs because they don’t inspect application content.
- Enhanced Security (with proper configuration): Preserving the source IP helps with logging and security analysis on backend servers. Static IPs simplify firewall rules.
- Scalability for Core Services: Excellent for scaling backend database services, DNS servers, gaming servers, or other non-HTTP/S applications.
In summary, a Network Load Balancer is your go-to solution when you need blazing fast, low-latency load distribution for any TCP or UDP traffic, where the decision to route a connection is based purely on network-level information, and preserving the client’s source IP is important. For HTTP/S web applications with complex routing or content-based needs, an Application Load Balancer is usually the better choice.