Firewall Fault Tolerance: Windows 2000 NLB versus RainWall
By Thomas W Shinder, M.D.
Uptime is the clarion call of the network admin. File servers need to be up, mail servers need to be up, Web servers need to be up and database servers need to be up. All these servers need to be up and doing their jobs around the clock. These days the life’s blood of your business is your Internet connection, which means you also need your firewall to be fault tolerant.
This is where the Windows 2000 Network Load Balancing (NLB) service and Rainfinity’s RainWall come into the picture. NLB is an IP load balancing and fail over solution that comes with Windows 2000 Advanced Server and Datacenter Server. RainWall is an IP load balancing and fault tolerance product that can be installed on Windows 2000 Server, Windows 2000 Advanced Server and Windows 2000 Datacenter Server.
I covered NLB in great detail in a series of articles earlier this year, all of which you can find at www.isaserver.org/shinder. NLB has the advantage of being packaged with Windows and it does a fairly decent job at what it does. So, why would you want to purchase RainWall instead of using NLB? Good question! There are a number of reasons:
RainWall supports NLB on the internal and external interfaces
RainWall supports symmetric routing
RainWall never requires static MAC entries
RainWall is easy to set up and configure with no-brainer Wizards
RainWall is configured once and replicated to all array members automatically
RainWall is service and adapter aware
RainWall is designed to work with ISA Server and can be installed on ISA Standard and Enterprise Servers
RainWall is fully integrated with RainConnect
Let’s look at each of these advantages in more detail.
RainWall Supports NLB on the Internal and External Interfaces
One of the major problems we have when using NLB with ISA Server is that you can only bind NLB to a single interface. That means you can configure the external or internal interface for fault tolerance and load balancing, but not both. You have to make a decision on what’s more important – fail over for your incoming connections or fail over for your outgoing connections. It’s a difficult decision to make if you are using ISA Server for both inbound and outbound access.
RainWall gets around this problem completely by providing load balancing and fault tolerance for both the internal and external interfaces. If one of the servers in the RainWall array goes down, internal network clients are automatically redirected to an available RainWall ISA Server array member. External network clients are also automatically directed to an active server. Internal and external network clients are never aware of a problem and access continues normally.
RainWall Supports Symmetric Routing
I spent a lot of time on this issue in my NLB articles. One of the major problems with NLB is that it really wasn’t designed to be used on a firewall. There are situations where an NLB array member accepts a connection request, only to have the response go to another member in the NLB array. This is asymmetric routing of requests and responses, and it leads to broken connections for either inbound or outbound access. The figure below shows an example of what can happen with asymmetric routing.
For example, think about what happens when you publish an SMTP server on the internal network and you have NLB enabled on the external interface of the ISA Servers in the NLB array. ISA Server #1 receives the connection request. However, the default gateway for the SMTP server is ISA Server #2. So, even though the incoming connection was correctly forwarded by ISA Server #1, the response goes to ISA Server #2, because the SMTP server must be a SecureNAT client and it’s configured to use the internal IP address of ISA Server #2 as its default gateway. NLB array members don’t share connection state information, so ISA Server #2 knows nothing about the initial connection request and simply drops the packets. This leads to broken connections.
In contrast, RainWall nodes share connection state information with each other using their “RAIN” technology (Reliable Array of Independent Nodes). A big part of this RAIN technology is sharing connection information among the array members, and it does this without needing a dedicated segment for intra-array communications. Array members are aware of which RainWall array member received the connection request, and the array ensures that the same member that received the request also handles the response. In addition, the SMTP server can remain a SecureNAT client, because it uses the Virtual IP Address of the RainWall array as its default gateway. Sweet!
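To make the difference concrete, here is a minimal Python sketch of the published SMTP server scenario. It is a toy model and not RainWall's or NLB's actual protocol: the `Node` class, the connection table and the connection ID string are all hypothetical names made up for illustration. The only idea it demonstrates is that a reply arriving at a node with no record of the connection gets dropped, while a node that can see shared state knows which member owns the connection.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy array member (hypothetical). `table` maps a connection ID to its owner."""
    name: str
    table: dict = field(default_factory=dict)

    def accept_request(self, conn_id: str) -> None:
        # Record this node as the owner of the inbound connection.
        self.table[conn_id] = self.name

    def handle_reply(self, conn_id: str) -> str:
        # A reply for a connection that is not in the table is dropped.
        owner = self.table.get(conn_id)
        if owner is None:
            return "dropped"
        return f"handled by {owner}"


def private_state_reply(conn_id: str) -> str:
    """Each node keeps its own table, as in the NLB scenario described above."""
    isa1, isa2 = Node("ISA Server #1"), Node("ISA Server #2")
    isa1.accept_request(conn_id)        # request arrives at ISA Server #1
    return isa2.handle_reply(conn_id)   # reply is routed to ISA Server #2


def shared_state_reply(conn_id: str) -> str:
    """Both nodes consult one shared table, so the owner is always known."""
    shared = {}
    isa1 = Node("ISA Server #1", shared)
    isa2 = Node("ISA Server #2", shared)
    isa1.accept_request(conn_id)
    return isa2.handle_reply(conn_id)


if __name__ == "__main__":
    conn = "client:1025->smtp:25"  # made-up connection ID
    print("private state:", private_state_reply(conn))
    print("shared state: ", shared_state_reply(conn))
```

In the private-state case the reply is dropped, which is the broken connection described above. In the shared-state case the reply is matched back to the member that took the original request.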
RainWall Never Requires Static MAC Entries
You may remember from your reading of the NLB articles about the issues with unicast and multicast modes. NLB uses these different modes to support different network environments. Multicast mode allows each NIC in the NLB array to keep its own MAC address while also answering on a shared multicast address. The problem with this method is Cisco switches don’t like to associate unicast IP addresses with multicast MAC addresses. You can get around that problem by programming a static ARP entry on the Cisco device, but who wants to deal with Cisco device admins? :-)
Unicast mode gets us around the problem of dealing with grumpy Cisco types, but suffers from the fact that the actual MAC address of each NIC participating in the array is replaced with a shared array MAC address. This works OK, except there’s a problem with switch flooding, since switches don’t allow the same MAC address to listen on multiple ports. You can sort of get around this problem, but you never get close to the “wire speed” you paid for when you purchased the interfaces and interposed devices.
RainWall never changes the MAC address on any of the array NICs. The adapters participating in the array trade off who accepts incoming packets for the VIP. In a method that works sort of like the token ring network architecture, the RainWall array members rotate who’s responsible for taking the request. Each machine also keeps its own IP and MAC addresses, so you never run into the problems you see with NLB’s unicast and multicast modes.
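The rotation idea can be pictured with a few lines of Python. This is only a sketch of the "take turns" concept described above, assuming a simple round-robin order; how RainWall actually decides whose turn it is is not covered in this article, and the member names and MAC addresses below are invented.

```python
from itertools import cycle


def assign_requests(members, requests):
    """Pair each incoming request with the member whose turn it is.

    `members` is a list of (name, mac) tuples. Every member keeps its own
    MAC address; only responsibility for the VIP rotates (assumed round-robin).
    """
    if not members:
        raise ValueError("the array needs at least one member")
    turn = cycle(members)
    return [(request, next(turn)) for request in requests]


if __name__ == "__main__":
    # Hypothetical array members with their own, unchanged MAC addresses.
    array = [("node1", "00-50-56-00-00-01"), ("node2", "00-50-56-00-00-02")]
    for request, (name, mac) in assign_requests(array, ["req-a", "req-b", "req-c"]):
        print(f"{request} -> {name} ({mac})")
```

Because no shared or multicast MAC address is ever introduced, the switch sees nothing unusual: each port still has exactly one ordinary unicast MAC behind it.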
RainWall is Easy to Set Up and Configure with No-Brainer Wizards
RainWall is Configured Once and Replicated to All Array Members Automatically
If you’ve ever set up an NLB array, you know that it’s no walk in the park. The multicast mode and unicast mode concepts can fry anyone’s brain, and then you have to go to each machine and repeat the brain-fry. You also need to work with port rules and affinity settings. Then you have to think about the exceptions to the port rules and affinity settings and try to figure out how they relate to using them on a multihomed firewall (since they weren’t really designed for this kind of environment).
You won’t have to deal with any of those issues with RainWall. I guarantee that by the end of the configuration Wizard you’ll say to yourself “is that all there is?” You’ll keep thinking that to yourself as you wait for the bumps on your forehead to heal from the times you’ve beaten your head against the monitor when configuring Windows 2000 NLB arrays.
How about adding a new member to the NLB array? You have to remember what the port rules are on the other array members, make sure you have the correct Host ID, and set the affinity settings correctly. Type in just one wrong number and BANG! you spend the rest of the day trying to troubleshoot the problem.
You won’t have to suffer that way with RainWall. Configure the first member of the array. Then install RainWall on your second ISA Server to create the second RainWall array member. Configuration information is automatically passed to the second array member. You don’t have to mess with arcane settings that are bollixed up with a single “fat fingered” typo. Subsequent array members are just as easy to enter into the array. Just install and the RAIN algorithm takes care of the rest.
RainWall is Service and Adapter Aware
One of the major drawbacks of the Windows 2000 NLB service is that it’s completely oblivious to service and NIC status. For example, say you want to use NLB for outbound access for your Firewall, SecureNAT and Web Proxy clients. Everything is working well until the Firewall service on one of the machines stops working. What do you think happens? Nothing. NLB continues to forward packets to the ISA Server with the downed Firewall service, with the result that any connections sent to that server fail. These failed connections lead to many calls to your desk and increased “up time” for you, meaning up and out of your chair :-)
RainWall watches the ISA Server services right out of the box. If one of the ISA Server services becomes unavailable, the RainWall service will stop packets from being directed to that machine and move them to machines with functioning services. RainWall will then “fail back” and bring the defective ISA Server back into the array when the service becomes available again.
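The behavior described above, taking a node out of rotation while a service is down and bringing it back when the service returns, can be sketched in a few lines of Python. This is an illustration of the idea only, not RainWall's implementation. The `eligible_nodes` function and the status dictionaries are hypothetical; the service names "Firewall" and "W3Proxy" are borrowed from this article.

```python
def eligible_nodes(service_status, required_services):
    """Return the nodes on which every required service is running.

    `service_status` maps a node name to a dict of {service name: is_running}.
    A service missing from a node's dict is treated as not running.
    """
    return sorted(
        node
        for node, services in service_status.items()
        if all(services.get(service, False) for service in required_services)
    )


if __name__ == "__main__":
    required = ["Firewall", "W3Proxy"]

    # The Firewall service has stopped on node1, so only node2 takes traffic.
    status = {
        "node1": {"Firewall": False, "W3Proxy": True},
        "node2": {"Firewall": True, "W3Proxy": True},
    }
    print(eligible_nodes(status, required))

    # The service comes back: node1 "fails back" into the array.
    status["node1"]["Firewall"] = True
    print(eligible_nodes(status, required))
```

The point of the sketch is the contrast with NLB, which in this model would keep returning both nodes no matter what the service status says.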
The figure below shows that the Web Proxy service is up and running on both nodes in Cluster 1. The green SM icon to the right of the W3Proxy entry indicates that the Web Proxy service is running on both Node 1 and Node 2 in the array.
What does the Windows 2000 NLB service do if a NIC in the array stops working? Nothing! It keeps the machine in the array. Wouldn’t it be nice to have the machine removed from the array when a NIC goes bad? You bet! That’s what RainWall does with its NIC monitor. You can see in the figure below that you can configure the NIC monitors for the internal and external interfaces. A custom Hold Count works like the typical “hold down timer” and prevents false alarms from taking the server out of the array prematurely.
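Here is a small Python sketch of how a hold count can filter out false alarms. It assumes the Hold Count means "number of consecutive failed checks before the node is pulled from the array"; the article does not spell out RainWall's exact semantics, so treat that as an assumption. The `NicMonitor` class and its method names are hypothetical.

```python
class NicMonitor:
    """Toy NIC monitor: remove the node only after `hold_count` failures in a row."""

    def __init__(self, hold_count: int):
        if hold_count < 1:
            raise ValueError("hold_count must be at least 1")
        self.hold_count = hold_count
        self.misses = 0        # consecutive failed checks so far
        self.in_array = True   # whether the node currently takes traffic

    def record_probe(self, link_ok: bool) -> bool:
        """Record one check result and return whether the node stays in the array."""
        if link_ok:
            # Any successful check clears the counter and restores the node.
            self.misses = 0
            self.in_array = True
        else:
            self.misses += 1
            if self.misses >= self.hold_count:
                self.in_array = False
        return self.in_array


if __name__ == "__main__":
    monitor = NicMonitor(hold_count=3)
    # Two blips, a recovery, then three failures in a row.
    for ok in [False, False, True, False, False, False]:
        print(ok, "->", "in array" if monitor.record_probe(ok) else "removed")
```

With a hold count of 3, the two isolated failures at the start do not remove the node. Only the run of three consecutive failures at the end does.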
RainWall is designed to work with ISA Server and can be installed on ISA Standard and Enterprise Servers
NLB was really designed for unihomed application servers and its configuration interface and options are geared to using NLB in those kinds of environments. While you can force NLB to work on the internal or external interface of a multihomed ISA Server, you probably can tell from the NLB articles I’ve done that NLB really doesn’t work easily in a multihomed firewall environment.
RainWall for ISA Server was built from the ground up to work with ISA Server. It integrates seamlessly with ISA Server and supports inbound and outbound access transparently. There are no special tweaks, no “config files” that need careful massaging and finger crossing and hoping that you’ve configured it correctly. Symmetric routing is enabled out of the box and it supports all protocols supported by ISA Server because it handles packets before the ISA Server sees them; all members of the array have current state information.
A great feature of RainWall is that it provides fault tolerance and load balancing on all versions of Windows 2000 and ISA Server. NLB is only available on Windows 2000 Advanced Server and Datacenter Server, and CARP is only available in ISA Server Enterprise Edition. Imagine how much money you’ll save running three Windows 2000 Servers with ISA Server Standard Edition and RainWall compared to running NLB on Windows 2000 Advanced Server with ISA Server Enterprise Edition. Whoa!
RainWall is Fully Integrated with RainConnect
If all you’re looking for is a fault tolerance solution for your ISA Servers, then RainWall is all you need. But what about your Internet connection? Your ISA Servers can be humming along working the way they’re supposed to work, but if your single Internet connection goes down, you’re out of luck! For true fault tolerance, you need redundancy for your Internet connections. That’s what RainConnect is all about.
For example, suppose you have a DSL and a cable connection. You could use BGP (good luck; with those cheap connections you’ll never talk to anyone live at your ISP) or you could try to hack a script that will fail over your connections to the up link when one of them goes down (good luck on that one too!). Or you can use something that I’ve seen work over and over again: RainConnect. I won’t go into RainConnect in detail in this article, but it’s worth noting that RainConnect partners perfectly with RainWall. The RainConnect/RainWall combo is a powerful fault tolerance one-two punch for both your ISA Servers and Internet links.
Both NLB and RainWall provide fault tolerance and load balancing for your ISA Servers. While NLB provides very basic fail over and load balancing, it suffers from supporting only one interface, from asymmetric routing, and from being unaware of ISA Server service and NIC status. In contrast, RainWall provides symmetric routing, is aware of the state of ISA Server and other Windows services, and watches the current status of the NICs participating in the array, as well as upstream and downstream routers. Best of all, RainWall is a more cost effective solution. You get more for less! That doesn’t happen too often. For this reason we give RainWall an ISAServer.org Rating of 5 stars. It does not get better than that!
Head on over to www.rainfinity.com and check out the info on RainWall. Send them a note letting them know you’re interested in testing out their software. Next week I’ll share with you a cool lab study you can do with RainWall using VMware, and you can even leverage your existing ISA Server connection to the Internet. You’re going to like it, so stay tuned and I’ll see you next week.
I hope you enjoyed this article and found something in it that you can apply to your own network. If you have any questions on anything I discussed in this article, head on over to http://forums.isaserver.org/ultimatebb.cgi?ubb=get_topic;f=2;t=008899 and post a message. I’ll be informed of your post and will answer your questions ASAP. Thanks! –Tom