Diagnosing Live Migration Failures (Part 1)

Introduction

For the most part, Hyper-V live migrations tend to be relatively painless. Even so, things can and sometimes do go wrong. Surprisingly, there isn’t much information on TechNet to help administrators cope with live migration failures. That being the case, I decided to write this article to give administrators a few things to look for.

Before I Begin

The causes of Hyper-V live migration problems generally fall into three categories – communications problems, configuration problems, and resource problems. As I work through this article, I will try to demonstrate examples of all three types of problems. Some of the solutions will be relatively simple, while others require a bit more thought.

An Error Occurred While Attempting to Contact the Virtual Machine Management Service on the Destination Computer

One of the most common Hyper-V live migration errors that you are likely to encounter states that an error occurred while attempting to contact the Virtual Machine Management Service on the destination computer. You can see an example of this error in Figure A.

Figure A: This is one of the most common Hyper-V live migration errors.

This seemingly simple error message can be surprisingly tough to troubleshoot because there are a number of different conditions that can trigger the error. The error message occurs when Hyper-V is not able to establish an RPC session with the destination host within a specific length of time. That being the case, you should approach the troubleshooting process from the standpoint of figuring out what might cause a communications failure.

Although it might sound silly, the first thing that I recommend is making sure that the destination host is still online. After all, I triggered the error message shown in the screen capture above simply by powering down the destination host server prior to attempting a live migration.

If you are able to confirm that the destination host is online, then it’s a good idea to make sure that a couple of critical system services are running. One such service is the Hyper-V Virtual Machine Management service. To check this service’s status, enter the Services.msc command at the destination host’s Run prompt to open the Services console. Then make sure that the Hyper-V Virtual Machine Management service is running and that its startup type is set to Automatic, as shown in Figure B.

Figure B: The Hyper-V Virtual Machine Management service must be running on the destination host.

While you have the Services console open, it is also a good idea to make sure that the Remote Procedure Call (RPC) service is running and that its startup type is set to Automatic. Incidentally, the Remote Procedure Call (RPC) Locator service is not required by the live migration process, and it is normal for this service to be stopped, as shown in Figure C.

Figure C: The Remote Procedure Call (RPC) service must be running, but the Remote Procedure Call (RPC) Locator service is not required.
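If you would rather check these services from a script than from the console, something along the following lines might work. This is only a rough sketch: it assumes that it runs locally on the destination host, that the built-in sc.exe tool produces English-language output, and that the two services are registered under the short names vmms and RpcSs. The helper function names are my own and are purely illustrative.

```python
import re
import subprocess
from typing import Optional, Tuple

# Assumed short service names (the display names are shown in the console).
SERVICES = {
    "vmms": "Hyper-V Virtual Machine Management",
    "RpcSs": "Remote Procedure Call (RPC)",
}


def parse_state(sc_query_output: str) -> Optional[str]:
    """Extract the state word (e.g. 'RUNNING') from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_query_output)
    return match.group(1) if match else None


def parse_start_type(sc_qc_output: str) -> Optional[str]:
    """Extract the start type (e.g. 'AUTO_START') from `sc qc` output."""
    match = re.search(r"START_TYPE\s*:\s*\d+\s+(\w+)", sc_qc_output)
    return match.group(1) if match else None


def check_service(name: str) -> Tuple[Optional[str], Optional[str]]:
    """Run sc.exe locally (Windows only) and return (state, start_type)."""
    query = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    config = subprocess.run(["sc", "qc", name], capture_output=True, text=True)
    return parse_state(query.stdout), parse_start_type(config.stdout)
```

On the destination host, looping over SERVICES and calling check_service(name) for each entry should report RUNNING and AUTO_START for a healthy service. Anything else is worth a closer look in the console.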

If you still have not been able to resolve the error shown in Figure A, then it’s a good idea to make sure that your firewall is not blocking live migrations. Live migration traffic is carried over TCP port 6600. If you are using the Windows Firewall, then exceptions should be enabled for Hyper-V and for the Hyper-V Management Clients, as shown in Figure D.

Figure D: Make sure that the Windows Firewall is configured to allow Hyper-V and Hyper-V Management Clients to pass through the firewall.
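A quick way to test the network path is to attempt a TCP connection to port 6600 from the source host. The following is a minimal sketch rather than a definitive test. The function name and the timeout value are my own choices. Keep in mind that a failed connection does not necessarily point to the firewall, because the destination may simply not be listening on that port (for instance, if live migration has not been enabled there).

```python
import socket

LIVE_MIGRATION_PORT = 6600  # TCP port mentioned in the article


def can_connect(host: str, port: int = LIVE_MIGRATION_PORT,
                timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_connect("hv-dest01") would probe a destination host with the hypothetical name hv-dest01. The same function can be pointed at other ports by passing a different port argument.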

The Destination Computer is Not Configured to Send or Receive Live Migrations of Virtual Machines

Another common live migration problem is an error message stating that the destination computer is not configured to send or receive live migrations of virtual machines. You can see what this particular error looks like in Figure E.

Figure E: The destination computer is not configured to send or receive live migrations of virtual machines.

This is an especially easy problem to fix. In Windows Server 2012 and in Windows Server 2012 R2, you must give Hyper-V permission to accept inbound live migrations. This is Microsoft’s way of preventing rogue virtual machines from showing up on your Hyper-V host.

To correct this problem, open the Hyper-V Manager on the destination host. Next, right-click on the container representing the host server and choose the Hyper-V Settings command from the resulting shortcut menu. When the Hyper-V Settings dialog box appears, select the Live Migration option. As you can see in Figure F, there is a check box that you must select in order to enable incoming and outgoing live migrations on the host.

Figure F: The Enable Incoming and Outgoing Live Migrations checkbox must be selected on both the source and the destination hosts.
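The same setting can also be inspected from a script. The sketch below assumes that the Hyper-V PowerShell module is installed on the host, that the VirtualMachineMigrationEnabled property of Get-VMHost reflects the check box, and that the script runs with administrative rights. The Python wrapper functions are illustrative names of my own, not part of any Microsoft tooling.

```python
import subprocess
from typing import List

# PowerShell snippets from the Hyper-V module (assumed to be installed).
CHECK_SCRIPT = "(Get-VMHost).VirtualMachineMigrationEnabled"
ENABLE_SCRIPT = "Enable-VMMigration"


def powershell_command(script: str) -> List[str]:
    """Build the argument list for running a PowerShell snippet."""
    return ["powershell.exe", "-NoProfile", "-NonInteractive",
            "-Command", script]


def parse_bool(stdout: str) -> bool:
    """Interpret PowerShell's 'True' / 'False' text output."""
    return stdout.strip().lower() == "true"


def migration_enabled() -> bool:
    """Ask the local Hyper-V host whether live migration is enabled."""
    result = subprocess.run(powershell_command(CHECK_SCRIPT),
                            capture_output=True, text=True, check=True)
    return parse_bool(result.stdout)
```

If migration_enabled() returns False, running powershell_command(ENABLE_SCRIPT) through subprocess.run should have the same effect as selecting the check box, although I would verify the result in the Hyper-V Settings dialog box afterward.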

There Was an Error During the Move Operation

Another common live migration problem is a generic message indicating that there was an error during the move operation. Typically, however, this error message will contain some extra text that tells you more about the specific condition that caused the problem. For example, if you look at Figure G, you can see some explanatory text indicating that there is not enough memory in the system to start the virtual machine.

Figure G: Live migrations can fail due to memory shortages.

Don’t let the term “virtual machine gateway” in the error message confuse you. In this case, Gateway is the name of the virtual machine.

Obviously, this problem occurred as a result of a memory shortage on the destination host. But here is where things get tricky. The destination host that I am using has 32 GB of memory. If you look at Figure H, you can see that the running virtual machines are consuming only about 22 GB of memory.

Figure H: The host server is equipped with 32 GB of memory, but the running VMs are consuming only about 22 GB.

A quick look at the destination host’s Resource Monitor seems to confirm these estimates, as shown in Figure I.

Figure I: The destination host has nearly 10 GB free.

So why can’t the VM be migrated? The problem is related to the way that the host operating system manages the server’s memory. I captured the screenshots above shortly after the server was booted, when a lot of demand spikes were occurring. In fact, if you look back at Figure G, you will notice that three virtual machines are not running. These VMs weren’t running because Hyper-V reported that there was insufficient memory to start them. However, after waiting a bit for the system to stabilize, I got some of the memory back (as indicated by the Resource Monitor screen capture shown above) and was able to start the remaining VMs, as shown in Figure J.

Figure J: The system eventually released some memory and I was able to boot the remaining VMs.
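The lesson here is that a single snapshot of free memory (32 GB installed minus roughly 22 GB in use, or about 10 GB) can be misleading while demand is still spiking. One way to account for this is to sample free memory several times and judge against the lowest reading. The sketch below illustrates the idea. The sample values, the VM size, and the 1 GB safety margin are all hypothetical numbers of my own, not measurements from my lab.

```python
from typing import Sequence


def worst_case_free_gb(free_samples_gb: Sequence[float]) -> float:
    """Return the lowest free-memory reading seen across the samples."""
    if not free_samples_gb:
        raise ValueError("at least one sample is required")
    return min(free_samples_gb)


def likely_to_fit(vm_startup_gb: float,
                  free_samples_gb: Sequence[float],
                  safety_margin_gb: float = 1.0) -> bool:
    """Judge placement against the worst sample, less a safety margin."""
    headroom = worst_case_free_gb(free_samples_gb) - safety_margin_gb
    return headroom >= vm_startup_gb


# Hypothetical readings taken shortly after boot, while demand is spiking.
after_boot = [3.5, 6.0, 9.8]
# Hypothetical readings taken once the host has settled down.
settled = [9.5, 9.8, 9.7]

print(likely_to_fit(4.0, after_boot))  # False: worst case is 3.5 GB
print(likely_to_fit(4.0, settled))     # True: worst case is 9.5 GB
```

With these made-up numbers, a 4 GB virtual machine would be rejected right after boot even though the best reading shows nearly 10 GB free, which mirrors the behavior that I observed.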

Conclusion

As you can see, memory usage on Hyper-V hosts is anything but linear. In the next article in this series, I will talk more about how host memory usage can impact live migrations.
