Load Balancers in Microsoft Azure

Load Balancer Overview

The simplest definition of a load balancer is a system that allows a group of machines to appear as a single machine to service a user request. The load balancer's job is to accept a request, decide which machine in the group can handle it, and then forward the request to that machine. There are multiple algorithms, and different types of load balancers process requests at different layers in the TCP/IP stack.

Microsoft Azure supports load balancers for virtual machines and cloud services, allowing your applications to scale out and preventing a single system failure from causing a loss of service availability. There are three load balancers in Azure: Azure Load Balancer, Internal Load Balancer (ILB), and Traffic Manager.

Azure Load Balancer

The Azure Load Balancer is a layer 4 (transport layer) load balancer that uses a hash function over a 5-tuple (source IP, source port, destination IP, destination port, protocol type) to distribute traffic across the virtual machines in a load balanced set. While the hash computation distributes the load, traffic for the same 5-tuple flows to the same endpoint for the duration of a session. If a client closes a session and then opens a new one, the new session typically uses a new source port, so it may be directed to a different endpoint than the previous session; all traffic within the new session, however, is directed to a single endpoint. The hash distribution results in a fairly random endpoint selection and over time produces a fairly even distribution of traffic for both TCP- and UDP-based sessions.
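The stickiness property above can be sketched in a few lines of Python. The hash shown is purely illustrative (Microsoft does not publish Azure's internal hash function), but it demonstrates why a given flow always lands on the same backend while a new session, with a new source port, may land elsewhere:

```python
import hashlib

def pick_backend(src_ip, src_port, dst_ip, dst_port, proto, backends):
    """Hash the 5-tuple and map it onto one of the backends.

    Illustrative sketch only, not Azure's actual hash function. The key
    property: the same 5-tuple always maps to the same backend.
    """
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]

backends = ["10.0.0.4", "10.0.0.5", "10.0.0.6"]

# Same 5-tuple -> same backend, every time.
a = pick_backend("203.0.113.7", 50123, "198.51.100.1", 80, "TCP", backends)
b = pick_backend("203.0.113.7", 50123, "198.51.100.1", 80, "TCP", backends)
assert a == b

# A new session from the same client usually has a new source port,
# so it may (or may not) hash to a different backend.
c = pick_backend("203.0.113.7", 50124, "198.51.100.1", 80, "TCP", backends)
```

The IP addresses and ports are, of course, hypothetical.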

The Azure Load Balancer is also referred to as the Cloud Service Load Balancer because it is automatically created when you create a cloud service. The Azure Load Balancer has a public-facing Virtual IP (VIP) and external endpoints that accept connections from the Internet.

By default, a single VIP can support multiple input endpoints. Each input endpoint can be assigned to a different group of virtual machines in a load balanced set, or to a cloud service that has multiple defined instances. However, because every group shares the same VIP, each input endpoint must listen on a different external port. This means that you cannot have multiple groups of machines listening on port 80: the first group can listen on port 80, but each additional group must use a different port (e.g., port 8080 for group 2, port 8081 for group 3, and so on).

Recently, Microsoft released support for multiple VIPs, allowing a single cloud service to have multiple Internet-facing VIPs (default maximum of 5), each of which can support multiple input endpoints. This means that you can have different groups of machines, each with its own VIP, and each VIP can listen on the same external-facing port. Within a single VIP, however, you still cannot have multiple input endpoints listening on the same port.
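The port constraint described above can be captured as a simple validation rule: external ports must be unique per VIP, but may be reused across VIPs. The sketch below uses a hypothetical data model (a list of dicts) purely to make the rule concrete:

```python
def validate_endpoints(endpoints):
    """Enforce the rule above: within a single VIP, no two input endpoints
    may listen on the same external port. Endpoints on different VIPs may
    reuse a port. The dict shape here is a hypothetical model, not an
    Azure API structure.
    """
    seen = set()
    for ep in endpoints:
        key = (ep["vip"], ep["external_port"])
        if key in seen:
            raise ValueError(
                f"duplicate external port {ep['external_port']} on VIP {ep['vip']}"
            )
        seen.add(key)
    return True

# Two groups on different VIPs can both listen on port 80 ...
validate_endpoints([
    {"vip": "vip1", "external_port": 80},
    {"vip": "vip2", "external_port": 80},
])

# ... but two endpoints on the same VIP cannot.
try:
    validate_endpoints([
        {"vip": "vip1", "external_port": 80},
        {"vip": "vip1", "external_port": 80},
    ])
except ValueError:
    pass  # expected: duplicate port on the same VIP
```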

Internal Load Balancer

The Internal Load Balancer (ILB) is an implementation of the Azure Load Balancer that has only an internal-facing Virtual IP. This means that you cannot use an ILB to load balance traffic from the Internet to internal input endpoints. The ILB only provides load balancing between virtual machines that are on an internal Azure virtual network or in a cloud service. If used within a cloud service, all the load balanced nodes must be members of that cloud service. If used within a virtual network, all the load balanced nodes must be attached to the same virtual network.

For example, you may want to build a multi-tier application with a public-facing web tier and a private internal database tier that contains sensitive information, where only the web tier has access to the database tier. If both tiers require load balancing, you would use the Azure Load Balancer for the public-facing web tier, and the Internal Load Balancer to load balance web tier requests to the internal database tier that you do not want exposed to the Internet. With the ILB, you can provide a load balanced internal endpoint that uses a private address space and is not exposed externally. Another common scenario is building a SQL Always On Availability Group cluster using the ILB as the listener.

Load Balancer Probes

The Azure Load Balancer supports the ability to probe the nodes of a load balanced set, or the instances of a web role, to determine whether they are responding to requests. The probe supports HTTP and TCP as probing protocols; HTTPS is not supported. You must specify a port for the probe to use during probing attempts.

The default protocol for a probe is TCP. Every 15 seconds, the probe attempts to connect to each instance on the defined probe port. If the probe does not receive a TCP ACK (acknowledgement) for two consecutive attempts, it considers the node or instance offline and stops sending traffic to it.
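The TCP probe behavior can be sketched in Python. A plain connect succeeds only when the node's SYN/ACK comes back, which is exactly what the probe checks; the function name, timeout, and attempt count below are illustrative stand-ins for the load balancer's 15-second cycle:

```python
import socket

def tcp_probe(host, port, timeout=2.0, attempts=2):
    """Mimic the TCP probe described above: try to open a connection,
    which succeeds only if the node's TCP/IP stack answers the SYN.
    The node is considered healthy if any attempt succeeds; after
    `attempts` consecutive failures it would be taken out of rotation.
    Defaults are illustrative, not Azure's actual values.
    """
    for _ in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            continue
    return False

# Usage (hypothetical internal address):
#   healthy = tcp_probe("10.0.0.4", 80)
```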

A successful TCP probe tells you that the node's TCP/IP stack is operational, but it gives you no other information. For most workloads, such as a web server, knowing only that the TCP/IP stack is up is probably not enough.

For web servers, the better probe protocol is HTTP. Using HTTP allows you to specify a path on the web server that the probe will attempt to access. The load balancer considers the node healthy only if a request to that path returns an HTTP 200 response; if the node does not respond, or returns any status other than HTTP 200, the load balancer stops sending traffic to that node. This approach allows you to write your own code behind the probe path to perform additional checks that verify the state of the web site.
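The HTTP probe's "HTTP 200 or out of rotation" rule can be sketched the same way; the probe path below is a hypothetical example, and the timeout is illustrative:

```python
import http.client

def http_probe(host, port, path="/healthcheck.aspx"):
    """Mimic the HTTP probe described above: the node is healthy only if
    a GET on the probe path returns exactly HTTP 200. Any other status,
    or no response at all, marks the node down. The path is a
    hypothetical example of a custom health-check page.
    """
    try:
        conn = http.client.HTTPConnection(host, port, timeout=2.0)
        conn.request("GET", path)
        status = conn.getresponse().status
        conn.close()
        return status == 200
    except OSError:
        return False
```

A health-check page behind that path could run its own tests (database connectivity, disk space, and so on) and return a non-200 status to pull the node out of rotation.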

Some other characteristics of which you should be aware include:

  • The system that performs the probe is external to the load balancer but on the internal Azure network, so it probes using internal addresses only
  • The probe continues to check a node even after that node has been taken out of traffic rotation
  • If a node is taken out of traffic rotation, it is automatically put back in as soon as a probe succeeds
  • The HTTP probe path cannot require authentication

Traffic Manager

Traffic Manager is an Internet-facing solution for load balancing traffic between multiple endpoints. It uses DNS queries and a policy engine to direct traffic to Internet resources, which can be located in a single datacenter or spread across the globe. Traffic Manager is not a typical load balancer because it is involved only in the initial endpoint selection; it does not process or redirect individual packets.

Traffic Manager supports three types of load balancing algorithms:

  • Performance – Directs the client to the closest load balanced node based on latency
  • Failover – Directs the client to the primary node unless the primary node is down, in which case it redirects to a backup node
  • Round Robin – Directs the client to a node based on a distributed approach using weights assigned to the nodes

Traffic Manager implements the load balancing algorithm defined through a set of intelligent policies, and each Traffic Manager URL resource is associated with a set of policies.

Traffic Manager can be thought of as the virtual DNS entry for the URL to which you are trying to connect. For example, to use Traffic Manager to load balance traffic to www.contoso.com, you create a CNAME record that points www.contoso.com to a Traffic Manager URL. When a user queries www.contoso.com, the DNS query is directed to Traffic Manager, and the policy engine analyzes the rules for that particular request. The policy engine then applies the configured load balancing algorithm to select an endpoint to return to the client. The client actually receives a CNAME record for the selected endpoint, performs a DNS query for that endpoint name, and connects to the endpoint using its IP address.

Once a client receives a redirection to an endpoint, it will continue to use that endpoint until the Time-To-Live (TTL) of the cached DNS entry expires or a refresh is forced. If the endpoint becomes unavailable, the client could experience delays or an inability to connect to the site.
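The client-side TTL behavior described above can be sketched as follows. The class, hostnames, and TTL value are hypothetical, and the `resolve` callable stands in for a real DNS query; the point is that the client keeps using the cached endpoint until the TTL expires, even if that endpoint has since gone down:

```python
import time

class ClientDnsCache:
    """Sketch of the client behavior above: the endpoint returned by the
    Traffic Manager DNS lookup is reused until its TTL expires, after
    which the client resolves again.
    """
    def __init__(self, resolve, ttl_seconds, clock=time.monotonic):
        self.resolve = resolve          # stand-in for a real DNS query
        self.ttl = ttl_seconds
        self.clock = clock              # injectable for testing
        self.cached = None
        self.expires_at = 0.0

    def endpoint(self):
        now = self.clock()
        if self.cached is None or now >= self.expires_at:
            self.cached = self.resolve()        # fresh DNS query
            self.expires_at = now + self.ttl
        return self.cached

# Simulate two successive DNS answers with a fake clock (hypothetical names).
t = [0.0]
answers = iter(["east.cloudapp.net", "west.cloudapp.net"])
cache = ClientDnsCache(lambda: next(answers), ttl_seconds=30, clock=lambda: t[0])
assert cache.endpoint() == "east.cloudapp.net"
t[0] = 10
assert cache.endpoint() == "east.cloudapp.net"   # still cached
t[0] = 31
assert cache.endpoint() == "west.cloudapp.net"   # TTL expired, re-resolved
```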

Traffic Manager also supports the ability to enable or disable endpoints in the policy engine without taking the endpoint down. This provides the ability to add or remove endpoint locations as needed or to perform maintenance on endpoints easily.

Virtual Appliance Load Balancers

The Microsoft Azure Marketplace offers 3rd party virtual appliances from Kemp and Barracuda that provide load balancing functionality, and new appliances from F5, A10, and Citrix were announced at Microsoft Ignite 2015. These appliances can be used as both internal and external load balancers.

Appliances can work in different modes, namely Single Arm and Dual Arm modes. Single Arm appliances have a single NIC and all traffic flows into the load balancer. The traffic is then redirected back out to the target systems over the same NIC. This typically also requires that the systems that are behind the load balancer have a client installed on them to control the routing of packets through the appliance. Dual Arm appliances have multiple NICs and typically support routing of traffic at layer 3 through the appliances. This is a much simpler configuration and typically performs better than a Single Arm configuration.


The load balancers available natively in Microsoft Azure and the 3rd party appliance offerings allow you to configure the best solution based on your particular requirements. For a multi-tier application that needs to be globally accessible, you can leverage the Traffic Manager performance load balancing method to redirect clients to the closest endpoint. The endpoint could be an Azure Load Balancer VIP that is load balancing a user interface tier across multiple nodes. Each user interface tier node could use an Internal Load Balancer that load balances a middleware tier across multiple nodes. Each middleware tier could be connecting to an SQL Always On Availability Group data tier that is using a 3rd party appliance for the implementation of the SQL listener. Whether you need to load balance a single tier or multi-tier cloud service or virtual machines, Microsoft Azure offers you the flexibility to tailor the solution that works for your specific environment.
