A vSphere Distributed Resource Scheduler Conceptual Primer (Part 1)

If you would like to be notified when Scott Lowe releases the next part in this article series please sign up to our Real-Time Article Update newsletter.

Introduction

Perhaps one of the most important innovations to come from hypervisor vendors in the past decade is workload migration. The ability to shift workloads from one host to another has helped organizations increase overall availability, become much more flexible, and find new ways to operate mission-critical software.

While this article won’t discuss the cloud, it’s important to note that emerging functionality in both vSphere and Hyper-V is enabling organizations to migrate running workloads both between hypervisors and from on-premises hardware to cloud-based services, such as Microsoft’s Windows Azure and VMware’s vCloud Hybrid Service.

The focus of this article, though, is on functionality offered by the vSphere/vCenter combination in which workload migration is a key element: the Distributed Resource Scheduler (DRS). DRS is a service that allows vCenter, based on rules set by the administrator, to migrate workloads automatically without administrator intervention. DRS works by aggregating the resources of all hosts in a cluster into a single pool, which it then manages on behalf of the administrator.

DRS, which is available in the Enterprise and Enterprise Plus editions of vSphere, provides a number of services for administrators:

Initial virtual machine placement. When an administrator creates a new virtual machine, DRS scans the cluster to determine the most suitable host for the new virtual machine.
Seamless maintenance. DRS enables an administrator to evacuate all virtual machines from a host so that maintenance can be performed.
Ongoing resource management. Depending on the sensitivity level set by the administrator, DRS can take advantage of new performance opportunities as they arise by automatically moving workloads.
Load balancing. DRS keeps host servers as evenly balanced as possible, taking into consideration the spare capacity that must be held in reserve for use in the event of an outage.
Adherence to constraints. Administrators can create rules, called affinity and anti-affinity rules, that constrain what DRS can do.
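
To make the placement and load balancing ideas above more concrete, here is a minimal Python sketch. It is not how DRS is implemented: the Host class, the place_vm and imbalance functions, and the use of CPU megahertz as the only resource are all assumptions made for illustration. The sketch picks the host that would be least loaded after taking a new virtual machine, and measures how uneven the cluster is as the standard deviation of host loads.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import List, Optional


@dataclass
class Host:
    name: str
    capacity_mhz: int  # total CPU capacity of the host (hypothetical unit)
    used_mhz: int      # CPU currently consumed by running VMs

    def load_with(self, extra_mhz: int = 0) -> float:
        """Fraction of capacity in use if extra_mhz more were placed here."""
        return (self.used_mhz + extra_mhz) / self.capacity_mhz


def place_vm(hosts: List[Host], demand_mhz: int) -> Optional[Host]:
    """Pick the host that would be least loaded after taking the new VM."""
    fits = [h for h in hosts if h.load_with(demand_mhz) <= 1.0]
    if not fits:
        return None  # no host has room for this VM
    return min(fits, key=lambda h: h.load_with(demand_mhz))


def imbalance(hosts: List[Host]) -> float:
    """Standard deviation of host loads; 0.0 means perfectly even."""
    return pstdev([h.load_with() for h in hosts])


cluster = [Host("esx01", 10000, 8000), Host("esx02", 10000, 2000)]
target = place_vm(cluster, 1000)
print(target.name if target else "no host fits")
print(round(imbalance(cluster), 3))
```

A real scheduler would weigh memory, reservations, and the constraints described later in this article; the point here is only that placement and balancing both come down to comparing per-host load.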

DRS requirements

As you may imagine, a service like DRS carries with it some requirements:

First and foremost, there must be some kind of shared storage – such as a SAN or NAS – in use by the hosts participating in the DRS cluster.
Make sure that all VMFS volumes are accessible by all hosts in the cluster and that there is sufficient space in the VMFS volumes to store the necessary virtual machines.
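
The shared storage requirement can be checked with simple set arithmetic. The sketch below assumes you have already gathered, by whatever means, a mapping from host name to the set of datastore names that host can see; the function name, host names, and datastore names are hypothetical.

```python
from typing import Dict, Set


def missing_datastores(visible: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each host to the datastores it cannot see but another host can.

    An empty result means every datastore is visible to every host.
    """
    all_datastores: Set[str] = set().union(*visible.values())
    gaps = {}
    for host, seen in visible.items():
        unseen = all_datastores - seen
        if unseen:
            gaps[host] = unseen
    return gaps


# Hypothetical inventory: esx02 has not been presented the second volume.
inventory = {
    "esx01": {"vmfs-01", "vmfs-02"},
    "esx02": {"vmfs-01"},
}
print(missing_datastores(inventory))
```

Any host that appears in the result is a host to which some virtual machines in the cluster could not be migrated.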

The prerequisite that is generally the most difficult to meet, however, is processor compatibility: administrators must take steps to ensure that the processors in all of the hosts in the cluster are compatible. Here’s the challenge: When a workload is migrated to another host, the running state of that virtual machine goes along with it. In order for the process to be successful, the destination host’s processors must be able to resume execution as if the workload were still running on the original host. This means that the processors must present the same set of features to the virtual machine. They don’t need to run at the same speed or have the same amount of cache.

For this reason, it’s not possible to migrate running workloads between processors from different vendors, so you can’t use DRS with a cluster made up of a mix of AMD and Intel servers. However, when all of the servers in a cluster have processors from the same vendor, there are ways to make DRS work across processor families and generations.

The easiest way to ensure ongoing processor compatibility in a cluster is to enable Enhanced vMotion Compatibility (EVC) for that cluster. EVC takes a “lowest common denominator” approach to compatibility. EVC identifies the processor family that is supported by all processors in the cluster and, for processors that are newer or have additional features, EVC masks those features from use so that workloads can be migrated between all processors. Using this method, all of the processors in the cluster are compatible for DRS’ purposes.
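
The “lowest common denominator” idea can be sketched as a set intersection. This is a simplification under stated assumptions: the function names are hypothetical, the feature names are only examples, and real EVC works with predefined per-generation baselines rather than an arbitrary intersection of feature flags. The sketch also enforces the single-vendor requirement described earlier.

```python
from typing import Dict, Set, Tuple

# host name -> (CPU vendor, set of CPU feature names); illustrative data model
HostCpus = Dict[str, Tuple[str, Set[str]]]


def common_baseline(hosts: HostCpus) -> Set[str]:
    """Return the features that every host in the cluster supports."""
    if not hosts:
        raise ValueError("cluster has no hosts")
    vendors = {vendor for vendor, _ in hosts.values()}
    if len(vendors) > 1:
        raise ValueError(f"mixed CPU vendors: {sorted(vendors)}")
    feature_sets = [set(features) for _, features in hosts.values()]
    return set.intersection(*feature_sets)


def masked_features(hosts: HostCpus) -> Dict[str, Set[str]]:
    """Per host, the features that would be hidden from virtual machines."""
    baseline = common_baseline(hosts)
    return {name: set(features) - baseline
            for name, (_, features) in hosts.items()}


cluster_cpus = {
    "esx01": ("intel", {"sse4.2", "avx"}),
    "esx02": ("intel", {"sse4.2", "avx", "avx2"}),
}
print(sorted(common_baseline(cluster_cpus)))
print(masked_features(cluster_cpus))
```

In this example the newer host gives up one feature so that a virtual machine started on it can still resume on the older host.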

Affinity and anti-affinity explained

In vSphere, the Distributed Resource Scheduler is a great service to use to automate vSphere’s ability to balance workloads in ways that make sense from a resource perspective. However, sometimes, workload management requires additional thought beyond just “are there enough resources to support it” scenarios.

Affinity rules – VM/VM

At times, you need to ensure that multiple virtual machines always run on the same host. If one of the virtual machines is vMotioned to a different host, the associated virtual machines must be moved as well. This scenario is common with, for example, application and database servers, where keeping communication between the VMs on the same host is preferable to having it traverse a network link.

These kinds of needs are addressed through the creation of affinity rules.

Affinity rules – Host/VM

In other cases, it’s not important to maintain VM to VM communication, but you need to make sure that certain workloads always run on the same host. Many companies, for example, want to know on which host vCenter is running, or they have an application running inside a virtual machine that is tied via licensing rules to a particular vSphere host. Administrators can create virtual machine to host affinity rules to make sure that these virtual machines are never migrated to other hosts. Of course, the downside here is that the failure of the host will result in the workload going down as well.

Anti-affinity rules – VM/VM

Finally, there are times during which certain virtual machines should not run on the same host. For example, most organizations want to make sure that at least one domain controller remains available at all times, so those organizations will create VM to VM anti-affinity rules which state that these virtual machines are to run on different hosts, even if performance would be better with them on the same host.
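
The three rule types described in this section can be expressed as checks against a proposed placement. The following Python sketch is illustrative only: the function name, the parameter names (keep_together, keep_apart, pinned), and the VM and host names are hypothetical and do not correspond to vSphere object names.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Violation = Tuple[str, List[str]]  # (rule type, VMs involved)


def rule_violations(placement: Dict[str, str],
                    keep_together: Iterable[Set[str]] = (),
                    keep_apart: Iterable[Set[str]] = (),
                    pinned: Optional[Dict[str, Set[str]]] = None) -> List[Violation]:
    """Check a VM -> host placement against affinity-style rules."""
    violations: List[Violation] = []
    # VM/VM affinity: every VM in the group must share one host.
    for group in keep_together:
        if len({placement[vm] for vm in group}) > 1:
            violations.append(("vm-vm affinity", sorted(group)))
    # VM/VM anti-affinity: no two VMs in the group may share a host.
    for group in keep_apart:
        hosts = [placement[vm] for vm in group]
        if len(set(hosts)) < len(hosts):
            violations.append(("vm-vm anti-affinity", sorted(group)))
    # Host/VM affinity: the VM must run on one of its allowed hosts.
    for vm, allowed in (pinned or {}).items():
        if placement[vm] not in allowed:
            violations.append(("host-vm affinity", [vm]))
    return violations


proposed = {"app01": "esx01", "db01": "esx02",
            "dc01": "esx01", "dc02": "esx01", "vcenter": "esx03"}
problems = rule_violations(
    proposed,
    keep_together=[{"app01", "db01"}],
    keep_apart=[{"dc01", "dc02"}],
    pinned={"vcenter": {"esx01"}},
)
for kind, vms in problems:
    print(kind, vms)
```

A scheduler that honors these rules would reject, or never generate, any migration whose resulting placement produces a non-empty list.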

vSphere DRS automation levels

When an administrator decides to take the plunge and implement vSphere’s Distributed Resource Scheduler tool, a decision needs to be made regarding the level of automation that will be afforded to DRS. There are three levels from which to choose.

Manual

The DRS cluster will make recommendations to an administrator, but no automated actions will be taken. The administrator must manually carry out any recommendations. This is a good setting if you just want to see what impact DRS might have on your environment.

Partially automated

Partially automated DRS clusters are pretty common. Clusters configured for partial automation will automatically place new virtual machines on existing hosts based on performance and resource criteria. However, after the initial placement event, which may result in recommendations to move other workloads to accommodate the new virtual machine, DRS operates the same way that it does when Manual DRS is in use.

Fully automated

Many administrators are loath to allow DRS to simply work its will through the fully automated option. When this option is selected, DRS will provide initial placement services as described earlier, but it will also move workloads around the cluster without administrator intervention if there is a chance to better balance workloads running inside the cluster. The administrator is able to specify the level of sensitivity using what is called the Migration Threshold. You can configure DRS to initiate a migration when there is any associated performance improvement or you can choose to be a bit more conservative and wait until DRS finds that an operation will have a significant positive impact.
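
The difference between the three automation levels can be summarized as a small decision function. This Python sketch is a simplification of the behavior described above, not vCenter’s logic. The names are hypothetical, and the sketch assumes that each proposed migration carries a priority from 1 (most compelling) to 5 and that the Migration Threshold is a number on the same scale.

```python
from enum import Enum
from typing import Optional


class Automation(Enum):
    MANUAL = "manual"
    PARTIAL = "partially automated"
    FULL = "fully automated"


def drs_action(level: Automation, event: str,
               priority: Optional[int] = None, threshold: int = 3) -> str:
    """Return 'apply', 'recommend', or 'skip' for a DRS decision.

    event is 'initial_placement' or 'migration'. priority and threshold
    use an assumed 1-5 scale where 1 is the most compelling move.
    """
    if event not in ("initial_placement", "migration"):
        raise ValueError(f"unknown event: {event}")
    if level is Automation.MANUAL:
        return "recommend"   # the administrator carries out everything
    if event == "initial_placement":
        return "apply"       # partial and full both place new VMs
    if level is Automation.PARTIAL:
        return "recommend"   # later moves are suggestions only
    if priority is None:
        raise ValueError("a migration needs a priority")
    # Fully automated: act only on moves that clear the threshold.
    return "apply" if priority <= threshold else "skip"


print(drs_action(Automation.PARTIAL, "initial_placement"))
print(drs_action(Automation.FULL, "migration", priority=5, threshold=3))
```

Under these assumptions, a conservative setting corresponds to a low threshold (only the most compelling moves are applied) and an aggressive setting to a high one.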

Summary

This has been a look at a number of vSphere DRS concepts. In the next part of this series, we will go into more depth on DRS and I will show you how to configure the service.

About The Author

Scott D. Lowe
