Maximizing Your Virtual Machine Density in Hyper-V (Part 6)

Introduction

Throughout this article series, I have been discussing features and techniques that you can use to maximize virtual machine density in Windows Server 2012 Hyper-V. In this article, I want to talk about how you might be able to further increase virtual machine density in Windows Server 2012 R2.

Deduplication

Microsoft first introduced native deduplication capabilities in Windows Server 2012. Ever since the introduction of this feature, there has been at least some level of confusion (as evidenced by misinformation found on the Internet) as to how this feature can and cannot be used.

In Windows Server 2012, deduplication was not supported for use in virtual environments. It was fine to deduplicate volumes containing template files or virtual hard disks that were not actively in use, but problems could result if you deduplicated a volume containing live virtual hard disks.
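For the sake of illustration, here is a minimal sketch of that supported scenario: enabling deduplication on a volume that holds only templates or dormant virtual hard disks. The drive letter is a placeholder, and the commands assume the file server role components are available.

```powershell
# Install the Data Deduplication role service.
Add-WindowsFeature FS-Data-Deduplication

# Enable deduplication on a volume holding only templates / inactive VHDX
# files. "E:" is a placeholder drive letter.
Enable-DedupVolume -Volume "E:"

# Optionally, kick off an optimization job right away instead of waiting
# for the built-in schedule.
Start-DedupJob -Volume "E:" -Type Optimization
```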

It has long been known that Windows Server 2012 R2 will support deduplication for virtual hard disks. However, the scope of that support is often misunderstood.

The big problem with deduplicating a volume that contains actively used virtual hard disk files is the performance impact of the deduplication process itself. To put it simply, the act of deduplicating a volume consumes CPU and memory resources and generates disk I/O. This is a big problem if your goal is to achieve the highest possible virtual machine density. After all, the deduplication engine is consuming the very resources that your virtual machines need. In fact, while the deduplication process is running, it may actually reduce the number of virtual machines that the server can handle.
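That being said, you can at least rein in the deduplication engine's resource appetite. As a rough sketch (the drive letter and schedule name here are illustrative, so check Get-DedupSchedule on your own system first), the optimization job accepts a memory cap and a priority, and the built-in schedules can be adjusted so that jobs run outside of peak hours:

```powershell
# Cap the optimization job at roughly 25% of system memory and run it
# at low priority. "E:" is a placeholder volume.
Start-DedupJob -Volume "E:" -Type Optimization -Memory 25 -Priority Low

# Review the built-in schedules and, if need be, disable background
# optimization in favor of a maintenance window that you define yourself.
# The schedule name may differ on your system.
Get-DedupSchedule
Set-DedupSchedule -Name "BackgroundOptimization" -Enabled $false
```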

Another issue that must be considered is performance. Today there are plenty of organizations that use high-performance virtual machines. If these virtual machines are in use on a twenty-four seven basis, then there isn’t going to be a period of inactivity when the deduplication process can run without the contents of a virtual hard disks changing before the deduplication process completes.

This isn’t to say that the deduplication of volumes containing running virtual hard disks is impossible. As previously mentioned, Microsoft is providing this capability in Windows Server 2012 R2. However, there are two major caveats that apply when using this capability.

The first caveat is that Microsoft only supports the deduplication of virtual hard disks that are being used by virtual desktops in a VDI environment. The main reason for this is that VDI environments tend to have a more predictable I/O pattern than virtual servers.
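This VDI-only support is reflected in the tooling. In Windows Server 2012 R2, the Enable-DedupVolume cmdlet gained a UsageType parameter, and the HyperV usage type tunes deduplication for volumes holding running VDI virtual hard disks. A minimal sketch, again with a placeholder drive letter:

```powershell
# On the file server that stores the VDI virtual hard disks, enable
# deduplication with the Hyper-V (VDI) usage type added in 2012 R2.
Enable-DedupVolume -Volume "E:" -UsageType HyperV
```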

Some might be quick to point out that Hyper-V uses exactly the same mechanisms whether it is hosting virtual servers or virtual desktops. This being the case, Windows Server 2012 R2 does nothing to prevent you from deduplicating a volume containing virtual hard disks that belong to virtual servers. However, Microsoft simply does not support the deduplication of volumes containing these types of virtual hard disks. I have seen posts on the Internet from those who claim to have used Windows Server 2012 R2 to deduplicate these types of volumes without experiencing any negative consequences, but I have to confess that I have not tried it myself.

The second caveat for deduplicating volumes that contain live virtual hard disks goes right along with the theme of this article series. This series, as you know, is all about maximizing your virtual machine density. As previously mentioned, the deduplication process consumes resources that the host server could otherwise use to run virtual machines, and may in some cases actually lower your virtual machine density. This is where the second caveat comes into play.

Windows Server 2012 R2 supports deduplication of volumes containing the virtual hard disks used by a VDI infrastructure, so long as the storage and compute nodes are separate and connected remotely. In other words, Microsoft wants you to avoid using deduplication in any situation in which the host server relies upon direct-attached storage. By using remote storage, the performance overhead associated with the deduplication process is offloaded from the host server to the remote system, so that it does not directly impact the host's ability to efficiently run virtual machines.
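In practice, this usually means storing the VDI virtual hard disks on a Scale-Out File Server and pointing the Hyper-V hosts at an SMB share. As a sketch (the path, share name, and computer accounts below are all hypothetical), the key point is that deduplication is enabled and runs on the file server, not on the Hyper-V host:

```powershell
# Run this on the file server node, not the Hyper-V host. For Hyper-V
# over SMB, the hosts' computer accounts need full control of the share.
New-SmbShare -Name "VDIStore" -Path "C:\ClusterStorage\Volume1\VDI" `
    -FullAccess "CONTOSO\HyperVHost1$", "CONTOSO\HyperVHost2$"
```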

This brings up another point: what kind of benefit can be derived from deduplicating a volume containing live virtual hard disks? There are two main benefits. The obvious one is reduced storage costs, but in many cases there is also a performance benefit.

This performance boost takes a little bit of explaining. When you deduplicate a volume, you essentially remove redundant storage blocks. In the case of a volume containing virtual hard disks used by a series of identical virtual desktops, there are a lot of redundant storage blocks: all of the virtual desktops typically run the same operating system, have the same patches, and run the same applications. By some estimates, deduplicating a volume containing a series of virtual desktops can decrease the storage requirements by over 95%.
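You don't have to take such estimates on faith, either; once a volume has been optimized, the savings are easy to measure. A quick check, once more using a placeholder drive letter:

```powershell
# Report the space reclaimed on a deduplicated volume. On a volume full
# of near-identical VDI desktops, SavingsRate is often dramatic.
Get-DedupVolume -Volume "E:" | Format-List Volume, SavedSpace, SavingsRate

# Show the progress and overall health of deduplication on the volume.
Get-DedupStatus -Volume "E:"
```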

When you remove redundant storage blocks, you replace many duplicate storage blocks with a single copy of each unique block (there may be additional copies maintained solely for fault tolerance, but those are irrelevant to this discussion).

The reason why I am telling you all of this is that replacing many duplicate storage blocks with a single copy of each block means that the one remaining copy will probably be read on a frequent basis. With that said, consider the way that storage pools work in Windows Server 2012 R2.

Windows Server 2012 R2 allows you to create storage pools containing a mixture of standard hard disks and solid-state disks. When you create a virtual hard disk on top of a Windows Server 2012 R2 storage pool you have the option of creating a storage tier structure. When tiered storage is used, Windows keeps track of what are known as hot blocks. Hot blocks are storage blocks that are frequently read. Windows automatically and dynamically moves hot blocks to solid-state storage so that they can be read with maximum efficiency. Furthermore, Windows even allows you to use PowerShell to pin frequently used files to the fast storage tier.
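As a sketch of what that looks like in PowerShell (the pool, tier, and file names here are all hypothetical, and the pool is assumed to already exist), you might build a tiered virtual disk and then pin a heavily shared file to the SSD tier:

```powershell
# Define an SSD tier and an HDD tier on an existing pool named "VMPool".
$ssdTier = New-StorageTier -StoragePoolFriendlyName "VMPool" `
    -FriendlyName "SSDTier" -MediaType SSD
$hddTier = New-StorageTier -StoragePoolFriendlyName "VMPool" `
    -FriendlyName "HDDTier" -MediaType HDD

# Create a virtual disk that spans both tiers.
New-VirtualDisk -StoragePoolFriendlyName "VMPool" -FriendlyName "TieredDisk" `
    -StorageTiers $ssdTier, $hddTier -StorageTierSizes 100GB, 900GB `
    -ResiliencySettingName Simple

# Pin a frequently read file (for example, a gold-image VHDX) to the fast
# tier, then trigger tier optimization rather than waiting for the
# scheduled task. The path assumes the tiered disk is mounted as D:.
Set-FileStorageTier -FilePath "D:\VMs\GoldImage.vhdx" `
    -DesiredStorageTierFriendlyName "SSDTier"
Optimize-Volume -DriveLetter D -TierOptimize
```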

In other words, when you deduplicate a volume containing virtual hard disk files, you may receive a performance boost because Windows automatically moves the most frequently used remaining storage blocks to solid-state storage. Even if your server does not use solid-state storage, however, it may be possible to cache the most frequently read storage blocks in memory as a way of gaining a similar boost.
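For example, in a failover cluster, the CSV block cache in Windows Server 2012 R2 can dedicate a slice of each node's RAM to caching read-mostly blocks. A one-liner sketch (the 512 MB figure is just an example value):

```powershell
# Allocate 512 MB of RAM per node to the CSV block cache. BlockCacheSize
# is the Windows Server 2012 R2 cluster property name; the value is in MB.
(Get-Cluster).BlockCacheSize = 512
```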

Conclusion

In this article, I have explained that when used properly, deduplication can serve as a mechanism for helping you to improve your overall virtual machine density. However, there are a number of guidelines that you will have to adhere to when doing so. In the next article in this series, I plan to conclude the discussion by talking about another new Windows Server 2012 R2 feature known as Storage QoS.
