CARP and Load Balanced Domains
One thing about this business is that if you think you understand something, you don't. This fact was highlighted for me last night when I viewed a Webcast done by the MS IT team on how Microsoft uses CARP to load balance outbound Web connections through ISA Firewall arrays.
I gave a short overview of how CARP works in ISA Firewall arrays over at http://blogs.isaserver.org/shinder/2006/06/03/can-carp-and-nlb-support-each-other/ and at that time I thought I understood how the CARP algorithm worked to balance outbound connections through the ISA Firewall.
However, I think in the back of my head I was not entirely comfortable with that description, because there were "rumors" of some big changes in the CARP algorithm. These changes took place with ISA 2004 SP2, and some of them were described in the ISA 2004 SP2 white paper at http://www.microsoft.com/technet/isa/2004/plan/sp2.mspx
The major change to the CARP algorithm was that instead of the entire URL being used in the hash, only the FQDN is used in the hash. That makes sure that all outbound requests to a particular host are handled by the same array member, which keeps the context of a session with the same external host on one server.
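To make the difference between hashing the whole URL and hashing only the FQDN concrete, here is a minimal sketch. It is not the ISA Firewall's actual hash function: the member names, the use of SHA-256, and the helper names (`score`, `pick_member`, `route_by_url`, `route_by_fqdn`) are all assumptions made for illustration. It only borrows the general CARP idea of scoring each array member against the request key and picking the highest score.

```python
import hashlib
from urllib.parse import urlsplit

# Hypothetical array member names, for illustration only.
MEMBERS = ["isa1", "isa2", "isa3", "isa4"]


def score(key: str, member: str) -> int:
    """Combine the request key with a member name into a numeric score.

    SHA-256 is a stand-in here; the real CARP hash is different.
    """
    digest = hashlib.sha256(f"{member}|{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_member(key: str, members=MEMBERS) -> str:
    """Return the member with the highest score for this key."""
    return max(members, key=lambda m: score(key, m))


def route_by_url(url: str) -> str:
    """Pre-SP2 style: the entire URL goes into the hash."""
    return pick_member(url)


def route_by_fqdn(url: str) -> str:
    """SP2 style as described above: only the FQDN goes into the hash."""
    return pick_member(urlsplit(url).hostname)


if __name__ == "__main__":
    urls = [
        "http://www.microsoft.com/downloads/secfix.asp",
        "http://www.microsoft.com/default.asp",
    ]
    for url in urls:
        print(url, "| by URL:", route_by_url(url), "| by FQDN:", route_by_fqdn(url))
```

With the whole URL in the hash, two pages on the same site may land on different array members. With only the FQDN in the hash, every URL on www.microsoft.com maps to the same member.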
However, as pointed out during the Webcast, that still left a problem: a single array member could be hammered with too many requests. For example, suppose 10K clients behind the ISA Firewall array want to go to www.microsoft.com/downloads/secfix.asp. All those requests would use www.microsoft.com in the hash, and the server responsible for that FQDN would be server 3 in the array, as seen in the figure below. This would lead to server 3 being overwhelmed with traffic while the other array members are left relatively unaffected. (This assumes client side CARP. Server side CARP would have its typical effect on all members of the array, but it would not change the load issue for server 3 in this scenario; it would just increase the load on all servers, as it normally does.)
So what did I find out during the Webcast? I found out that more changes were made to the CARP algorithm than were shared in the SP2 white paper, and that they represent a very clever and very powerful change in the CARP load balancing algorithm. The change introduced with SP2 not only takes the destination FQDN into account, but also uses the source IP address.
When the source IP address is added to the hash calculation, connections to www.microsoft.com/downloads/secfix.asp can be load balanced among multiple array members, as seen in the figure above. This is a much more efficient way to load balance outbound Web connections through the ISA Firewall array, and it prevents any one server in the array from being overwhelmed with traffic, even when thousands of clients are trying to reach the same destination FQDN.
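The effect is easy to see in a small simulation. This is a sketch under stated assumptions, not the ISA Firewall's real algorithm: the Webcast did not spell out how the source IP and FQDN are combined, so the key format, the SHA-256 scoring, the member names, and the 10.0.0.x client addresses below are all made up for illustration.

```python
import hashlib
import ipaddress
from collections import Counter

# Hypothetical array member names, for illustration only.
MEMBERS = ["isa1", "isa2", "isa3", "isa4"]


def score(key: str, member: str) -> int:
    """Stand-in scoring function (SHA-256); the real CARP hash is different."""
    digest = hashlib.sha256(f"{member}|{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_member(key: str, members=MEMBERS) -> str:
    """Return the member with the highest score for this key."""
    return max(members, key=lambda m: score(key, m))


def simulate(fqdn: str, client_count: int, use_source_ip: bool) -> Counter:
    """Count how many clients each member serves for a single destination."""
    base = ipaddress.ip_address("10.0.0.1")  # assumed internal client range
    counts = Counter()
    for i in range(client_count):
        source_ip = str(base + i)
        # Assumed key layout: FQDN alone, or FQDN plus the client's source IP.
        key = f"{fqdn}|{source_ip}" if use_source_ip else fqdn
        counts[pick_member(key)] += 1
    return counts


if __name__ == "__main__":
    fqdn_only = simulate("www.microsoft.com", 10_000, use_source_ip=False)
    with_source_ip = simulate("www.microsoft.com", 10_000, use_source_ip=True)
    print("FQDN only:       ", dict(fqdn_only))
    print("FQDN + source IP:", dict(with_source_ip))
```

With the FQDN alone as the key, all 10K clients pile onto a single member. With the source IP mixed in, the same 10K clients spread across the array, while any one client still lands on the same member every time it goes to that FQDN.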
This is really significant, because it shows that the ISA Firewall array is a much more robust solution than Blue Coat, since Blue Coat has very limited support for redundancy or load balancing. In contrast, Microsoft has been able to get five 9's uptime for their 70K clients using ISA Firewall arrays at the Microsoft campus and all over the world. Another significant comment made during the Webcast is that Microsoft, one of the most attacked Web sites in the world, does not use "hardware" firewalls in front of the ISA Firewall arrays. Why? Because they're not required -- the ISA Firewall is as secure and robust as any hardware firewall.
Lesson learned: don't buy into the "hardware" sales guy's BS when he disses the ISA Firewall.