Troubleshooting Slow VM Performance in Hyper-V (Part 1)

If you would like to read the other parts in this article series please go to:

Introduction

One of the most common problems that virtualization administrators encounter is slow virtual machine performance. Unfortunately, these performance problems can be surprisingly difficult to diagnose. In this article series, I want to share with you some real-world techniques that I have used to diagnose and correct virtual machine performance problems.

Slow is a Subjective Term

The first thing that you must understand about virtual machine performance troubleshooting is that the word slow is a subjective term. It can mean different things to different people. Whenever someone tells me that a virtual machine is running slowly, the first thing that I ask them is “slow compared to what?” Is the virtual machine running slower than it would run if it were running on dedicated hardware? Is it running slower than it was yesterday? Is it running slower than the 386 that I had back in college?

My point is that it is very difficult to address performance problems unless you have a basis of comparison. You might be able to observe that a particular virtual machine is suffering from poor response time, but to truly address the problem you need objective, statistical performance data. Otherwise it may be difficult to tell whether or not your corrective actions are having the desired effect.

How Widespread is the Problem?

Before you begin gathering performance data, it is a good idea to take a moment and check the scope of the problem. If the performance issues are isolated to an individual virtual machine or to a few specific virtual machines then you have got your work cut out for you. However, if every virtual machine on a given host server is suffering from performance issues then it is a good bet that the host is either overloaded or that it isn’t properly configured. In those situations you might be able to correct the performance issue by doing something as simple as moving a couple of virtual machines to a different host.

Making a Baseline Comparison

As previously stated, the best way to diagnose virtual machine performance problems is to have an objective basis of comparison. Generally this means taking performance benchmark readings on the virtual machine whose performance is suffering and on a comparable but healthy virtual machine. You can then compare the data from the two virtual machines and use it to look for clues as to the source of the problem.
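As a rough illustration of that comparison, the sketch below takes averaged readings from a healthy virtual machine and from the suspect one and reports how far each reading has drifted. The metric names (cpu_pct, disk_latency_ms) and the numbers are hypothetical placeholders rather than actual Performance Monitor counters, and the function is just one simple way to line the two data sets up.

```python
def compare_baselines(healthy, suspect):
    """Return the relative change (suspect vs. healthy) for each shared metric.

    Both arguments are dicts of averaged readings. The metric names are
    whatever you chose to collect; nothing here is tied to real counters.
    """
    report = {}
    for name in sorted(healthy.keys() & suspect.keys()):
        base = healthy[name]
        if base == 0:
            continue  # relative change is undefined against a zero baseline
        report[name] = (suspect[name] - base) / base
    return report


# Hypothetical averaged readings from two similarly configured VMs.
healthy_vm = {"cpu_pct": 20.0, "disk_latency_ms": 5.0}
suspect_vm = {"cpu_pct": 30.0, "disk_latency_ms": 20.0}

for metric, change in compare_baselines(healthy_vm, suspect_vm).items():
    print(f"{metric}: {change:+.0%} relative to the healthy VM")
```

A large relative change in one metric and not the others is the kind of clue you are looking for. In this made-up data the disk latency stands out far more than the CPU reading.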

Collecting performance benchmark data in a virtual environment is something of an art form. Sure, you need to know which performance counters to check, but there is more to it than that.

The first thing that you need to understand about virtual machine performance monitoring is that the performance data is skewed by the hypervisor. To show you how this works, let’s pretend that a host server contains four CPU cores and that a virtual machine has been provisioned with one virtual CPU. Now suppose that you ran Performance Monitor inside the virtual machine and it reported 100% CPU utilization. This doesn’t mean that the host server’s CPU resources have been exhausted. It only means that the virtual machine is using all of the CPU resources that have been allocated to it. In this simplistic example, the virtual machine is using 100% of one virtual CPU. Since the host has four CPU cores, the virtual machine is using one core out of four, or 25% of the host’s total CPU resources.
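The arithmetic in this example can be written out as a small helper. This is only a sketch, and it assumes that each virtual CPU maps to one dedicated physical core; the function name is my own.

```python
def host_cpu_share(guest_utilization_pct, vcpus, host_cores):
    """Convert guest-reported CPU utilization into a share of total host CPU.

    Assumption: every virtual CPU is backed by one dedicated physical core.
    """
    if vcpus <= 0 or host_cores <= 0:
        raise ValueError("vcpus and host_cores must be positive")
    return guest_utilization_pct * vcpus / host_cores


# The example from the text: one virtual CPU at 100% on a four-core host.
print(host_cpu_share(100.0, vcpus=1, host_cores=4))  # 25.0
```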

Of course this example assumes that a physical CPU core has been dedicated to the virtual machine. More often than not, cores are shared among multiple virtual machines. In these situations the hypervisor takes a round-robin approach to processing items in each virtual machine’s CPU queue. As such, a virtual machine that is reporting 100% CPU utilization might not actually be consuming much physical CPU time at all. It could be that the cores are being shared among so many virtual machines that this one isn’t receiving enough CPU time to keep up with demand.
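To make the effect of sharing concrete, here is a toy model of a single core handing out time slices in turn. It is a deliberately simplified sketch and not the actual Hyper-V scheduler; the slice counts and the function name are assumptions made purely for illustration.

```python
from collections import deque


def round_robin_share(demands, total_slices):
    """Hand out time slices one at a time, in turn, to VMs with work queued.

    demands: slices of CPU work each VM wants during the interval.
    Returns the number of slices each VM actually received.
    """
    remaining = list(demands)
    received = [0] * len(demands)
    queue = deque(i for i, d in enumerate(remaining) if d > 0)
    for _ in range(total_slices):
        if not queue:
            break  # no VM has work left; the core sits idle
        vm = queue.popleft()
        received[vm] += 1
        remaining[vm] -= 1
        if remaining[vm] > 0:
            queue.append(vm)  # still busy, so go to the back of the line
    return received


# Four VMs that each want the whole core for the entire interval.
print(round_robin_share([100, 100, 100, 100], total_slices=100))  # [25, 25, 25, 25]
```

In this model every virtual machine has more work than it can finish, so each one looks saturated from the inside even though it received only a quarter of the core.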

Another thing that you must understand about performance monitoring is that Performance Monitor counters reflect a machine’s resource utilization at a specific point in time. That being the case, taking a quick look at the machine’s performance can be very misleading. Even healthy virtual machines have resource utilization spikes of up to 100%. If you only look at a virtual machine’s performance for a moment then it will be impossible to tell the difference between a routine spike and a serious performance problem. It is far more effective to measure virtual machine performance over the course of several hours or even days.
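The difference between a momentary spike and a sustained problem shows up once you summarize many samples instead of glancing at one. The sketch below assumes that you have already collected a list of utilization percentages at regular intervals; the 90% threshold and the sample values are arbitrary choices for illustration.

```python
def summarize_samples(samples, threshold=90.0):
    """Summarize utilization samples collected at regular intervals.

    Reports the average, the peak, the fraction of samples at or above the
    threshold, and the longest consecutive run at or above the threshold.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    above = 0
    current_run = 0
    longest_run = 0
    for value in samples:
        if value >= threshold:
            above += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
        else:
            current_run = 0
    return {
        "average": sum(samples) / len(samples),
        "peak": max(samples),
        "fraction_above": above / len(samples),
        "longest_run": longest_run,
    }


# Both series peak at 100%, but only the second one stays there.
routine_spike = [20, 25, 100, 22, 18, 24]
sustained_load = [95, 98, 100, 97, 99, 96]
print(summarize_samples(routine_spike))
print(summarize_samples(sustained_load))
```

Both series have the same peak, so a single glance at the wrong moment would make them look identical. The longest run and the fraction of samples above the threshold are what tell them apart.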

So with these considerations in mind, you may be wondering about the most effective method for establishing a basis of comparison for virtual machine performance. Rule number one is to never try to compare benchmarking data from a virtual machine to data from a physical server. While it is true that a properly configured virtual machine should perform as well as a physical machine, the way that the hypervisor skews performance monitoring data prevents physical-to-virtual comparisons from being reliable (at least that has been my experience).

Ideally you should attempt to compare two virtual machines that are as similar as possible. At the very least, the two virtual machines should be running the same operating system and service pack. They should also be provisioned with identical CPU, memory, disk, and network resources.
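Before comparing numbers, it can help to confirm that the two virtual machines really are configured alike. The following is a minimal sketch of such a check. The VMConfig fields and the values are hypothetical and would need to be filled in by hand or from whatever inventory you keep; nothing here queries Hyper-V.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VMConfig:
    """Hypothetical record of the settings that should match between VMs."""
    os_version: str
    service_pack: str
    vcpus: int
    memory_mb: int
    disk_gb: int
    network_mbps: int


def config_differences(a, b):
    """Return the names of the settings that differ between two VMs."""
    return [f.name for f in fields(VMConfig)
            if getattr(a, f.name) != getattr(b, f.name)]


# Illustrative values only.
healthy = VMConfig("Windows Server 2012", "none", 2, 4096, 127, 1000)
suspect = VMConfig("Windows Server 2012", "none", 2, 2048, 127, 1000)
print(config_differences(healthy, suspect))  # ['memory_mb']
```

An empty list means the two machines are a fair basis of comparison. Anything else is worth resolving, or at least noting, before you read too much into the benchmark data.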

I recently ran into a situation in which I had installed Windows 8 onto a virtual machine and found that the operating system ran very slowly in spite of the fact that there was no user workload. In an effort to diagnose the problem, I performed a full backup of the virtual machine and then restored the backup to a comparably equipped (but isolated) lab server for comparison.

When I performed a benchmark comparison I found that the original virtual machine and the restored copy were delivering a nearly identical (and equally poor) level of performance. Since the two virtual machines were running on separate hardware, the test allowed me to rule out the host server as a possible cause of the performance problem. I knew at that point that the problem was related to the way that the virtual machine was configured. I was ultimately able to improve the virtual machine’s performance by allocating a little bit more memory to it.

Conclusion

The first step in addressing virtual machine performance problems is to establish baseline readings for the virtual machine that is performing poorly and for a similarly configured virtual machine that is running normally. These baseline readings allow you to quantify the two virtual machines’ performance. They not only provide clues as to the nature of the performance problems, they also help you to measure the effects of any actions that you take in an effort to improve performance.

In Part 2 of this series, I will continue the discussion by talking about the specific performance counters that you should be tracking when establishing a performance baseline. From there I will go on to discuss some things that you can do to improve the virtual machine’s performance.
