Why Do I Need High-Availability?
High-availability solutions provide ways to manage both planned and unplanned downtime. Examples of planned downtime include the installation of operating system or application upgrades that require taking a server offline. Unplanned downtime can result from a simple hardware component failure or from the loss of physical servers in a natural disaster. On a large scale, building a high-availability strategy involves an end-to-end examination of all the interlinked components that give users access to services, and it may require more than one solution to meet availability targets.
Server virtualization lets organizations implement a dynamic, flexible core infrastructure that minimizes the number of deployed physical servers, increases utilization of physical resources, and reduces long-term operating costs. However, migrating multiple physical servers onto a common virtualization host requires a broader high-availability strategy than a traditional infrastructure does. If one or more virtualization hosts experience downtime, large numbers of users can lose access to services and applications, which translates into lost productivity and a financial impact on the organization. At the hardware level, deploying virtualization hosts on platforms that incorporate redundant or hot-swappable components (e.g., power supplies, processors, and memory) reduces the risk of unplanned downtime. With Windows Server 2008, you can also use the integrated failover clustering feature to manage both unplanned and planned downtime of virtualization hosts and guests.
Windows Server 2008 Failover Clustering
Failover clustering has been a component of Microsoft Windows server products since Windows NT 4.0. Since those early days, the failover cluster component has evolved considerably, especially in ease of configuration and in the range of supported applications. If you use Windows Server 2008 with Hyper-V as your virtualization platform, you can integrate failover clustering into the high-availability strategy for your virtualized infrastructure. A Windows Server 2008 failover cluster consists of at least two servers (nodes) that are connected through multiple network links, one of which is used to monitor the status of each node. Each failover cluster node is connected to a common storage array such as a Storage Area Network (SAN), and only one node in a cluster can own the set of network and disk resources associated with an application or service at any one time. In terms of scale, a Windows Server 2008 failover cluster can contain up to 16 nodes. The nodes monitor each other using a network heartbeat to determine whether each node is responsive. If a node becomes unresponsive, another cluster node takes ownership of the resources and then restarts the application or service that was running on the failed node. Beginning with Windows Server 2008, geographically dispersed (or stretch) clusters can also be implemented without custom or specialized hardware. This lets you implement a failover cluster that manages unplanned downtime by failing over to another local node when a single server fails, or to a node in another geographical region when a more severe local disruption occurs, such as an extended power outage, a natural disaster, or another large-scale problem.
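The heartbeat and ownership rules described above can be illustrated with a small simulation. This is only a sketch of the idea, not the Windows cluster service or its API: the Cluster class, the 5-second HEARTBEAT_TIMEOUT, the node and resource group names, and the "fewest groups owned" placement rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

HEARTBEAT_TIMEOUT = 5.0  # seconds; hypothetical value, not a Windows default


@dataclass
class Cluster:
    # node name -> time the last heartbeat was received from that node
    last_heartbeat: dict = field(default_factory=dict)
    # resource group -> name of the single node that currently owns it
    owners: dict = field(default_factory=dict)

    def heartbeat(self, node, now):
        self.last_heartbeat[node] = now

    def responsive_nodes(self, now):
        return [node for node, seen in self.last_heartbeat.items()
                if now - seen <= HEARTBEAT_TIMEOUT]

    def fail_over(self, now):
        """Reassign groups owned by unresponsive nodes; return what moved."""
        alive = self.responsive_nodes(now)
        moved = {}
        if not alive:
            return moved  # no surviving node can take ownership
        for group, owner in list(self.owners.items()):
            if owner in alive:
                continue
            # Assumed placement rule: pick the responsive node that
            # currently owns the fewest resource groups.
            target = min(alive, key=lambda n: list(self.owners.values()).count(n))
            self.owners[group] = target  # the new owner would restart the service
            moved[group] = target
        return moved


cluster = Cluster()
for name in ("node1", "node2", "node3"):
    cluster.heartbeat(name, now=0.0)
cluster.owners = {"sql": "node1", "fileshare": "node1", "print": "node2"}

# node1 goes silent; node2 and node3 keep sending heartbeats.
cluster.heartbeat("node2", now=8.0)
cluster.heartbeat("node3", now=8.0)
moved = cluster.fail_over(now=10.0)
print(moved)
```

In this run, the two groups owned by the silent node1 are spread across node2 and node3, and each group still has exactly one owner afterward.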
Windows Server 2008 Failover Clustering With Hyper-V
Using failover clustering with Windows Server 2008 and Hyper-V lets you implement a high-availability strategy that manages both unplanned and planned downtime in a virtualized infrastructure. You can implement a failover cluster in a Hyper-V environment at two levels: the virtualization host level and the guest operating system level.
Option 1: A Guest Operating System Failover Cluster
As shown in Figure 1, a guest operating system failover cluster is implemented between two or more virtual machines that run on separate Hyper-V hosts and are connected to a shared storage system. To implement this option, the virtual machine must run an operating system that supports failover clustering, such as Windows Server 2003 R2 (up to 8 nodes) or the Enterprise or Datacenter edition of Windows Server 2008 (up to 16 nodes). In addition, the application that you intend to make highly available must be “cluster-aware”. This means that the application has been developed with specific features that allow it to interact with the cluster service, so that it can fail over and restart with all required resources on a different cluster node.
Figure 1: A Guest Operating System Failover Cluster
If you are planning a guest operating system failover cluster, iSCSI is the only shared storage access protocol that is supported for this configuration. The iSCSI protocol transmits and receives block storage data over a TCP/IP network. To use iSCSI and maximize performance, you must dedicate a virtual network adapter in each virtual machine to iSCSI communication. You should also dedicate one or more physical network cards and configure individual virtual networks on each Hyper-V host for iSCSI storage access. You must install an iSCSI initiator in each virtual machine to access the iSCSI-based targets on the shared storage. An iSCSI initiator is a software component that connects a Windows host to an external iSCSI storage array over a TCP/IP network. Note that this configuration does not support directly attaching an iSCSI target to the virtual machine as a boot device. The Microsoft iSCSI initiator is included in Windows Server 2008, but must be downloaded for Windows Server 2003 and earlier versions.
A guest operating system failover cluster supports planned and unplanned downtime for cluster-aware applications. This configuration manages unplanned downtime caused by a failure or crash within the virtual machine, as well as by a failure or crash at the Hyper-V host platform level.
Option 2: A Hyper-V Host Failover Cluster
The second failover cluster option consists of two or more Windows Server 2008 servers with the Hyper-V role installed, each configured as a cluster node and connected to a shared storage system. A Hyper-V host failover cluster is illustrated in Figure 2. This cluster configuration lets you achieve high availability for non-cluster-aware applications running in virtual machines, and it supports planned and unplanned downtime for Hyper-V hosts. However, a failure or crash of either the guest operating system or the application will not result in a failover event.
Figure 2: A Hyper-V Host Failover Cluster
One of the advantages of a Hyper-V host failover cluster is that you are not limited to using iSCSI to connect to the shared storage system. In this configuration, you can use either iSCSI or Fibre Channel connected shared storage, or even a file server using the CIFS or SMB protocols. You can choose among quite a few storage configurations in this scenario, depending on the requirements of your environment. These configurations will be addressed in a future article.
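The differences between the two options can be condensed into a small lookup. The sketch below only restates the constraints given in this article; the ClusterOption type, the candidates function, and the string labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterOption:
    name: str
    needs_cluster_aware_app: bool
    storage: frozenset      # shared storage access supported by the option
    failover_on: frozenset  # failure types that trigger a failover


GUEST = ClusterOption(
    name="guest operating system failover cluster",
    needs_cluster_aware_app=True,
    storage=frozenset({"iscsi"}),
    failover_on=frozenset({"host_failure", "guest_failure"}),
)

HOST = ClusterOption(
    name="Hyper-V host failover cluster",
    needs_cluster_aware_app=False,
    storage=frozenset({"iscsi", "fibre_channel", "smb_file_server"}),
    failover_on=frozenset({"host_failure"}),
)


def candidates(cluster_aware, storage, must_survive):
    """Return the names of the options that satisfy all three requirements."""
    return [
        option.name
        for option in (GUEST, HOST)
        if (cluster_aware or not option.needs_cluster_aware_app)
        and storage in option.storage
        and set(must_survive) <= option.failover_on
    ]


# A non-cluster-aware application on Fibre Channel storage that only
# needs protection against host failures.
print(candidates(False, "fibre_channel", {"host_failure"}))
```

Here "guest_failure" stands for a failure or crash inside the virtual machine, which only the guest operating system failover cluster handles.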
Planned Versus Unplanned Failover Process
In a planned failover scenario, whether you need to perform maintenance on a Hyper-V host or to rebalance the load across Hyper-V hosts through quick migration of a virtual machine, the migration can occur without data loss and with minimal service interruption. To accomplish this, the virtual machine is placed in a saved state: the active memory and processor state are captured to disk, and processing is suspended. Ownership of the storage resources is then transferred to the target cluster node, the active memory and processor state are loaded, and processing resumes. Depending on the underlying storage and the size of the state data that is reloaded, the entire process can take place in a few seconds.
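The three steps of a quick migration (save state, transfer ownership, restore state) can be sketched as follows. This is a toy model and not the Hyper-V API: the VirtualMachine class, its fields, and the quick_migrate function are hypothetical, and a plain dictionary stands in for memory and processor state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VirtualMachine:
    name: str
    owner: str                                  # node that owns the VM's storage
    memory: dict = field(default_factory=dict)  # stand-in for RAM and CPU state
    running: bool = True
    saved_state: Optional[dict] = None          # stand-in for state saved to disk


def quick_migrate(vm, target_node):
    # 1. Save state: capture memory and processor state to disk, then suspend.
    vm.saved_state = dict(vm.memory)
    vm.memory = {}
    vm.running = False

    # 2. Transfer ownership of the storage resources to the target node.
    vm.owner = target_node

    # 3. Restore: load the saved state on the target node and resume.
    vm.memory = vm.saved_state
    vm.saved_state = None
    vm.running = True
    return vm


vm = VirtualMachine(name="web01", owner="node1", memory={"open_sessions": 42})
quick_migrate(vm, "node2")
print(vm.owner, vm.memory, vm.running)
```

Because the state is written out before ownership moves, nothing held in memory is lost; the virtual machine is simply paused for the duration of the three steps.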
In an unplanned failover scenario, caused by a hardware problem or other unforeseen issue, the Hyper-V host crashes along with all of its virtual machines. Because the virtual machines crash before their state can be saved, any data held only in active memory is lost. However, because the Hyper-V host is part of a failover cluster, ownership of the storage resources is transferred to another cluster node, and the virtual machines are restarted on that Hyper-V host.
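For contrast with a planned migration, the following toy model shows what survives an unplanned failover. It is an illustration only: the VirtualMachine class and the host_crash function are hypothetical, and dictionaries stand in for the virtual disk on shared storage and for active memory.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    owner: str                                  # node that owns the VM's storage
    disk: dict = field(default_factory=dict)    # persisted on shared storage
    memory: dict = field(default_factory=dict)  # held only by the running host
    running: bool = True


def host_crash(vms, failed_node, surviving_node):
    """Simulate a host crash and restart its VMs on a surviving node."""
    restarted = []
    for vm in vms:
        if vm.owner != failed_node:
            continue  # VMs on other hosts are unaffected
        # The host went down before any state could be saved.
        vm.memory = {}
        vm.running = False
        # Ownership of the storage resources moves to the surviving node,
        # which restarts the VM from its disk (a cold boot, not a resume).
        vm.owner = surviving_node
        vm.running = True
        restarted.append(vm.name)
    return restarted


vms = [
    VirtualMachine("web01", "node1", disk={"site": "v2"}, memory={"sessions": 42}),
    VirtualMachine("db01", "node2", disk={"rows": 1000}, memory={"cache": 5}),
]
print(host_crash(vms, failed_node="node1", surviving_node="node2"))
```

After the call, web01 is running on node2 with its disk contents intact but an empty memory dictionary, while db01, which was never on the failed host, keeps both.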
In this article, you learned why it is important to reconsider your high-availability strategy as you migrate from a physical to a virtualized infrastructure. More specifically, you learned how Hyper-V can be used with the failover cluster technology in Windows Server 2008 to create high-availability solutions for both cluster-aware and non-cluster-aware applications that run in virtual machines. These solutions manage both planned and unplanned downtime, so that you can perform Hyper-V host maintenance, rebalance load across Hyper-V hosts, and address physical server failures while minimizing disruption to critical applications and services.