Taking Control of VM Sprawl (Part 3)

If you would like to read the other parts in this article series please go to:

In my previous article, I explained that the most difficult part of reducing virtual machine sprawl is identifying virtual machines that are no longer being used. As an administrator, you probably already have a relatively good idea of which virtual machines are no longer being used. The really tough part, however, is proving that you are correct. After all, you can’t just haphazardly delete virtual machines that you believe to be unused, because if you guess incorrectly there are very real consequences to removing the virtual machine.

Some have suggested the approach of powering off virtual machines that seem to be unnecessary. The idea behind this approach is that if anyone is using the virtual machine, then someone will eventually speak up when they are unable to access it.

This approach certainly seems logical enough, and it is a lot safer than deleting virtual machines. However, there are some problems with using this approach.

The first problem is that if you power down a virtual machine that is believed to be unnecessary and someone is actually using the virtual machine then you have just caused that person to experience an outage. This could potentially disrupt important business processes.

The other problem with powering down virtual machines that are believed to be unused is that it is entirely possible for a virtual machine to be critically important, and yet seldom used. I know that this sounds like a contradiction, but let me give you an example. Imagine for a moment that someone in the accounting department needs to run a quarterly financial report. This report is generated from, you guessed it, a virtual machine.

Now let’s suppose that because the virtual machine is so seldom used, someone in the IT department mistakes it for an abandoned machine and powers it down. After about sixty days, the IT department assumes that nobody needs the virtual machine because no one has noticed that it has been offline for the last two months. The virtual machine is deleted, and IT goes on about its business. A few weeks later it is time for the quarterly report to be run. That’s when the finance department notices that they cannot access the virtual machine. They place a call to IT, and you can imagine the drama that unfolds from there. Hopefully there is a backup of the virtual machine.

My point is that the old technique of powering down a virtual machine and waiting for someone to report its unavailability is unreliable at best and catastrophic at worst. The IT department therefore needs a much better method of establishing a virtual machine’s ownership and its purpose. Like so many other things in IT, however, this is much easier said than done.

So how can you identify virtual machines that may or may not have been abandoned? This is the million-dollar question. My answer to this question might surprise you. My answer is that this isn’t something that you should do right away. Let me explain myself.

Suppose for a moment that an organization has roughly 500 virtual machines online. As an administrator, you have been tasked with cleaning up virtual machine sprawl. You’re pretty sure that at least twenty or thirty virtual machines have been abandoned, but you don’t know which ones just yet. It’s tempting to begin trying to identify those virtual machines right off the bat, but doing so could be a huge mistake.

The problem is that as you work to identify virtual machines that have been abandoned, brand-new virtual machines are probably being created. In the previous article I talked about the importance of setting up a system for documenting newly created virtual machines. If you don’t have such a system in place, then the task of identifying virtual machines that have been abandoned will be made nearly impossible by the added clutter of new virtual machines. My advice is to begin by putting into place a system for documenting every newly created virtual machine. Once you have that system in place then you can begin documenting existing virtual machines.

Your immediate goal should not be to identify virtual machines that have been abandoned, but rather to document all existing virtual machines. That way, you don’t have to worry about losing track of the existing virtual machines along the way. As you document each virtual machine, it will become more apparent which virtual machines require some additional scrutiny. The idea is to create good documentation for each VM so that abandoned VMs stand out from the rest.

So what types of things should you document for existing virtual machines and for newly created virtual machines? Obviously, every organization’s needs are different, but at a minimum you need to know who created the virtual machine and what department they work in. If possible, it would be really nice to document why the virtual machine was created.
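As a rough illustration of that minimum documentation, here is a small Python sketch of a per-VM record and a check for gaps. The field names, the sample VM names, and the rule that a missing creator or department warrants extra scrutiny are all assumptions made for the example, not part of any particular product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VMRecord:
    """Minimum documentation for one virtual machine (field names are illustrative)."""
    name: str
    created_by: Optional[str] = None   # who created the VM
    department: Optional[str] = None   # department the creator works in
    purpose: Optional[str] = None      # why the VM was created (nice to have)


def missing_fields(record: VMRecord) -> List[str]:
    """Return the names of documentation fields that are still empty."""
    fields = ("created_by", "department", "purpose")
    return [f for f in fields if not getattr(record, f)]


def needs_scrutiny(inventory: List[VMRecord]) -> List[str]:
    """Names of VMs whose creator or department is unknown (assumed rule)."""
    return [
        vm.name
        for vm in inventory
        if {"created_by", "department"} & set(missing_fields(vm))
    ]


# Hypothetical inventory used only to exercise the functions above.
inventory = [
    VMRecord("fin-report-01", "jsmith", "Accounting", "Quarterly financial report"),
    VMRecord("test-web-07", "adavis", "Engineering"),
    VMRecord("unknown-vm-12"),
]

for vm in inventory:
    print(vm.name, missing_fields(vm))
print("Needs scrutiny:", needs_scrutiny(inventory))
```

A VM that lands on the scrutiny list is not necessarily abandoned. It is simply one whose documentation is too thin to tell either way.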

In my opinion, it is best to use an online system that is tied directly to the hypervisor so that documentation can be created at the same time that the virtual machine is created. Believe it or not, this is easier said than done.

In one of the previous releases of Microsoft System Center, for example, Microsoft included an awesome feature that allowed you to set a virtual machine expiration date. Just before a virtual machine was set to expire, an email message could be automatically sent to the owner, and the owner would have the option of either renewing the virtual machine or letting it expire. Unfortunately, this functionality no longer exists in the current version of System Center. You can still achieve this functionality in a roundabout way by taking advantage of custom properties and runbooks within System Center Orchestrator. The idea is that you can use a custom property to set an expiration date for every virtual machine, and then create a runbook that checks each day to see which virtual machines are about to expire and then takes action based on that expiration date.
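The daily check itself is simple logic. The Python sketch below shows only that logic and does not call System Center or Orchestrator at all. The dictionary stands in for the expiration dates stored in a custom property, and the 14-day warning window, the VM names, and the dates are assumptions chosen for the example.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

WARNING_DAYS = 14  # assumed notice period before expiration


def classify_by_expiration(
    expirations: Dict[str, date],
    today: date,
    warning_days: int = WARNING_DAYS,
) -> Tuple[List[str], List[str]]:
    """Split VMs into (expiring_soon, expired) based on their expiration date."""
    expiring_soon: List[str] = []
    expired: List[str] = []
    for name, expires_on in sorted(expirations.items()):
        if expires_on < today:
            expired.append(name)
        elif expires_on - today <= timedelta(days=warning_days):
            expiring_soon.append(name)
    return expiring_soon, expired


# Hypothetical expiration dates, as they might be read from a custom property.
expirations = {
    "fin-report-01": date(2024, 12, 31),
    "test-web-07": date(2024, 6, 10),
    "demo-03": date(2024, 5, 15),
}

soon, expired = classify_by_expiration(expirations, today=date(2024, 6, 1))
print("Notify owners of:", soon)
print("Already expired:", expired)
```

In a real runbook, the two lists would drive the actions described above: an email to the owner for anything expiring soon, and whatever expiration handling the organization has agreed on for the rest.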

A better option might be to transition toward a private cloud environment in which authorized users are given self-service capabilities. In these types of environments, authorized users are given blocks of resources and are able to create their own virtual machines to take advantage of these resources.

On the surface, this concept might seem totally removed from that of controlling virtual machine sprawl. However, transitioning to a self-service model does a couple of things for you. First, it lets you know in no uncertain terms which tenant created each virtual machine. Second, many of the self-service solutions include a built-in chargeback engine. This engine is able to report on the resources that each tenant consumed so that those tenants can be billed for the resources that they used. Even if you don’t actually charge the individual departments for their usage, the chargeback reports can be extremely valuable for identifying virtual machine owners and virtual machine usage.
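To make the value of such a report concrete, here is a minimal Python sketch that summarizes usage rows by tenant and lists VMs with very little recorded activity. The row layout (tenant, VM name, CPU hours), the sample figures, and the one-hour threshold are assumptions for illustration; real chargeback engines export their own formats.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed row layout: (tenant, vm_name, cpu_hours) for one reporting period.
UsageRow = Tuple[str, str, float]


def usage_by_tenant(rows: Iterable[UsageRow]) -> Dict[str, float]:
    """Total CPU hours consumed by each tenant."""
    totals: Dict[str, float] = defaultdict(float)
    for tenant, _vm, cpu_hours in rows:
        totals[tenant] += cpu_hours
    return dict(totals)


def low_usage_vms(rows: Iterable[UsageRow], threshold: float = 1.0) -> List[Tuple[str, str]]:
    """(tenant, vm) pairs whose total CPU hours fall below the threshold."""
    per_vm: Dict[Tuple[str, str], float] = defaultdict(float)
    for tenant, vm, cpu_hours in rows:
        per_vm[(tenant, vm)] += cpu_hours
    return sorted(key for key, total in per_vm.items() if total < threshold)


# Hypothetical report rows used only to exercise the functions above.
rows = [
    ("Accounting", "fin-report-01", 0.5),
    ("Engineering", "test-web-07", 120.0),
    ("Engineering", "build-02", 300.0),
    ("Accounting", "fin-report-01", 0.0),
]

print(usage_by_tenant(rows))
print(low_usage_vms(rows))
```

Note that the low-usage list is a list of owners to ask, not a list of machines to delete. A seldom-used VM such as the quarterly reporting machine described earlier would show up here, and the tenant column tells you exactly which department to contact about it.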

Conclusion

In this article, I have continued discussing some strategies for getting a handle on your virtual machine inventory. In the next article in this series, I will share some techniques for gathering information about virtual machines that may have been abandoned.
