This is the latest of Mitch Tulloch’s “trench tales,” stories and anecdotes he’s collected from colleagues on the front lines working in IT about how they faced and solved various problems over the years.
A while back, a colleague told me about another colleague’s frustrations trying to complete the migration of an organization’s Information Rights Management (IRM) infrastructure from Windows RMS to AD RMS in Windows Server 2012 R2. For those of you who aren’t too familiar with these technologies, Windows RMS was a feature of the now-obsolete Windows Server 2003 operating system, while AD RMS, which stands for Active Directory Rights Management Services, is a server role in Windows Server 2012 R2 that organizations can use to augment their security strategy by protecting documents using IRM tools and technologies. As the story unfolds below, you’ll see that migrations of this scale can be very finicky to accomplish, with lots of tricky steps to worry about. You’ll also see the kind of innocuous little things that can mess everything up and drive even experienced IT pros up the wall. Once again, I’ve obfuscated certain details of this story that might identify the individuals and organizations involved, and I’ve also rearranged the story a bit to make it more interesting and build some suspense. But as you’ll soon see, the moral of the story really has nothing to do with RMS and is actually much broader in its application.
The consultant (we’ll call him Bob) began by deciding to follow Microsoft’s official step-by-step documentation on migrating from Windows RMS to AD RMS, which can be found here on TechNet. After all, how hard can it be if the vendor has provided you with a detailed roadmap to follow?
Anyway, Bob carefully performed all the required steps, such as updating DNS records so the client could find the new AD RMS server, importing the trusted publishing domain from the old RMS server into the new infrastructure, and so on. Once he had finished all the steps in the guide, he tried to open a Microsoft Word document that had previously been protected using Windows RMS and found he couldn’t open it. Instead, Microsoft Word displayed an error message that said “Cannot verify user information at this time. Do you want to open the document using a different set of credentials?” These are the kinds of things that IT pro nightmares are made of: imagine if all of your organization’s documents became unreadable after you completed your migration.
Attempts at finding a resolution
Bob discussed his problem with several colleagues who had experience and expertise with AD RMS, and they advised checking to make sure he had done everything necessary and tried everything possible. For example:
- Make sure that the name of the old server is a Subject Alternative Name in the SSL certificate for the new RMS server. (Check.)
- Make sure that the Certificate Revocation List (CRL) is online and available. (Check.)
- Try temporarily disabling CRL checks in Internet Explorer on the client to see if the error persists. (Done. Error still happens.)
- Make sure that the new RMS URLs have been added to the Local Intranet Zone in Internet Explorer on the client. (Check.)
- Make sure that any URLs you’ve added to the Local Intranet Zone have the correct prefix, i.e. HTTP or HTTPS. (Yes, they’re correct.)
- Run a network packet trace to verify that the client is actually attempting to contact the new RMS server. (Yes, the client is trying to retrieve the RMS license from the new server.)
- Try clearing the license store for the version of Microsoft Office you have installed on the client. (Done. No change.)
- View the IIS logs to see if the user account being used on the client is recorded there. If not, then the problem is on the client side, not the server. (The user account is present in the logs.)
And so on.
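The first item on that checklist is easy to script. What follows is only a minimal sketch in Python, not anything Bob actually ran: it assumes the new server answers on port 443 and that its certificate chains to a CA the machine running the script already trusts. The host names in the usage comment are hypothetical placeholders.

```python
import socket
import ssl


def san_dns_names(cert: dict) -> list[str]:
    """Return the DNS entries from a certificate's Subject Alternative Name.

    `cert` is the dictionary returned by SSLSocket.getpeercert().
    """
    return [value.lower()
            for kind, value in cert.get("subjectAltName", ())
            if kind == "DNS"]


def name_in_san(cert: dict, hostname: str) -> bool:
    """True if `hostname` matches a SAN DNS entry.

    Handles exact matches and a simple leftmost-label wildcard (*.example.com).
    """
    hostname = hostname.lower().rstrip(".")
    for entry in san_dns_names(cert):
        if entry == hostname:
            return True
        if entry.startswith("*.") and "." in hostname:
            if hostname.split(".", 1)[1] == entry[2:]:
                return True
    return False


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to host:port over TLS and return the validated server certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# Example usage (hypothetical host names, substitute your own):
#   cert = fetch_certificate("adrms.example.com")
#   print(name_in_san(cert, "rms.example.com"))
```

If the certificate was issued by an internal CA that the scripting machine doesn’t trust, the connection will fail validation before the SAN check ever runs, which is itself a useful thing to know.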
As Bob was trying to dig deeper into the configuration of his old and new RMS servers, he noticed that the old RMS server required HTTP instead of HTTPS for inbound connections from Microsoft Word trying to open documents on the old server. Some research online led him to this Wiki article on Microsoft TechNet, which seemed to deal specifically with the kind of situation he was facing. The article advises creating a new registry value named LicenseServerRedirection on the client, so Bob dutifully tried this out. Unfortunately, it didn’t work.
A colleague then suggested opening some of the old server’s documents using Notepad to see what URL was being used for the server. Meanwhile Bob ran another network trace and was able to confirm that Microsoft Word was indeed attempting to establish an HTTP connection to the new AD RMS server instead of using the required HTTPS type of connection. He tried creating a redirection in IIS from HTTP to HTTPS but this had no effect.
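The Notepad trick can also be done in bulk. Here is a rough sketch in Python that scans a file’s raw bytes for embedded URLs and flags the ones that use plain HTTP. It assumes the licensing URL is stored as readable ASCII or UTF-16 text somewhere in the file, which may not hold for every Office format, and the file name in the usage comment is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Matches http:// or https:// followed by characters that are legal in a URL.
URL_PATTERN = re.compile(
    r"https?://[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]+", re.IGNORECASE
)


def extract_urls(data: bytes) -> set[str]:
    """Collect URLs that appear as plain text inside raw file bytes.

    The data is scanned twice: once as-is, and once with NUL bytes removed
    so that ASCII-range UTF-16 text also becomes searchable.
    """
    urls = set()
    for candidate in (data, data.replace(b"\x00", b"")):
        text = candidate.decode("latin-1")  # latin-1 maps every byte, never fails
        urls.update(URL_PATTERN.findall(text))
    return urls


def insecure_urls(urls: set[str]) -> list[str]:
    """Return the URLs whose scheme is plain HTTP, sorted for stable output."""
    return sorted(u for u in urls if urlsplit(u).scheme.lower() == "http")


# Example usage (hypothetical file name):
#   with open("protected.docx", "rb") as handle:
#       found = extract_urls(handle.read())
#   print(insecure_urls(found))
```

Stripping NUL bytes is a blunt instrument and can occasionally glue unrelated bytes together, so treat any hits as leads to verify by hand rather than as proof.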
Doing some further research, Bob then turned up this page on TechNet, which explains various Microsoft Office registry settings and what they are used for. On this page, he found the same registry value named LicenseServerRedirection as before, but this time the value to be specified for the setting was different from what the Wiki page had specified. Bob tried this different syntax and, lo and behold, he was now able to open documents encrypted on the old RMS server with the new AD RMS server!
What did Bob learn from all this? Sometimes documentation can be wrong!
How common are documentation errors?
There aren’t any statistics available concerning how often documentation errors occur in products by Microsoft and other software vendors. But in my over 20 years of working with Microsoft products, I’ve come across several documentation errors myself on TechNet pages for Windows Server, Microsoft Exchange, and most recently Windows PowerShell. Documentation errors in PowerShell Get-Help cmdlet output seem to be the most common form of doc error these days, and this might be due to the accelerated pace of PowerShell development for Microsoft’s server platform and products. Devs are generally not the best when it comes to writing documentation, not because they don’t know their stuff (they know it inside out) but probably because they’re so busy developing that they don’t care much about getting the step-by-step procedures right, which allows documentation errors to creep in.
The bigger problem, though, is that a single documentation error can nullify hours or even days of effort spent troubleshooting an issue you’re experiencing. If you’re simply following a step-by-step procedure on TechNet or some other Microsoft site and can’t get things working, you might be wise to suspect some sort of documentation error. This could be an incorrect registry value, a missing configuration step, a typo that says Allow when it should be Deny, or some other type of mistake or omission. What should you do if you suspect this?
If you’ve opened a support ticket with Microsoft, you should work through things until your problem is resolved. If it turns out that the problem was documentation errors on their part, you should push Microsoft Support hard to get your ticket fee cancelled — talk to the support engineer’s manager, or the manager’s manager if necessary to ensure this.
If you haven’t opened a support ticket because you’re a consumer or small business that isn’t using volume-licensed software, try opening a chat session with Microsoft Support from this page or its equivalent in your own country. Tell the helpdesk person right off the bat that you suspect documentation errors because you’ve followed the steps exactly as described in their technical documentation. If this doesn’t lead anywhere then find a suitable TechNet forum relating to your issue and post a detailed description of your problem there. The first few responses you’re likely to get will probably be canned responses from Microsoft support engineers, but if you keep making noises you’re likely to eventually interest a few peers who may have more hands-on real-world experience with the product(s) than Microsoft’s own support people.
Keep this in mind — the squeaky wheel gets the grease!
One takeaway: Look before you leap into migrations
Let me conclude with some reflections relating to the technologies I talked about at the beginning of this article. Security technologies for IRM are constantly evolving, and with Microsoft’s new insistence on cloud-everything solutions, it should come as no surprise that Microsoft hopes to get organizations that use AD RMS to migrate to their new Azure Information Protection feature of Microsoft Azure. The advantages of cloud-based Azure Information Protection over traditional on-premises AD RMS solutions are clear: no server infrastructure is required, mobile devices are supported out of the box, authentication is cloud-based, and there’s a whole bunch more tweaks and goodies. But migrating anything big like this is still far from a trivial task, and the story in this article may serve as a warning that something as simple as a documentation error can lead to hours and hours of frustrating effort to complete the migration and get the new system up and running. Perhaps cloud-based solutions like Azure Information Protection are best left for organizations that haven’t yet dipped their toes into the IRM pond. After all, if your on-premises solution is working, why replace it? And remember: migrations generally aren’t very much fun.
Photo credit: Pixabay