Cloud migration is not always a smooth journey. One of the risks of putting your infrastructure into the cloud is this: What happens when the “lights go out” and you can’t contact anyone? Whenever I discuss cloud migration with a client I make sure they understand that this is indeed a risk, albeit a slight one.
And then one day it happened to one of my colleagues, whom I’ll call Bob, as he prefers to remain anonymous. Bob works for a company that provides IT support and managed services for small- and midsized businesses in the southern USA. On the day in question, that risk became reality while Bob was migrating a client’s email system to Office 365.
The client initially had their email hosted on their provider’s servers. For reasons that were not 100 percent clear, the client wanted to migrate away from this as quickly as possible. So on Thursday afternoon, Bob was told that the cloud migration had to be completed before the weekend arrived. This meant that 28 mailboxes and four domains all had to be moved without the current provider realizing anything was happening.
After a bit of discussion, the customer and Bob decided that Office 365 would be a better solution than migrating to the in-house mail service hosted by the company Bob worked for. Office 365 seemed to offer better reliability, larger mailboxes, and better portability should the client later decide to have Bob’s company support their IT needs. In particular, the administrator of managed services at Bob’s company felt that their mail servers would have been bursting at the seams had they suddenly added mailboxes as large as this client said they required.
So on Thursday afternoon Bob’s company prepped an Office 365 tenancy for the client, created the necessary accounts, and assigned permissions and email addresses. Friday was then spent double-checking everything, and the cutover to the new mail system was done at the close of business that day. By Saturday morning everything had been uploaded and seemed to be working well. Workstations had been reconfigured, all the client’s data was available, and not a single message appeared to have been lost. It was a very clean cutover, or so it seemed.
Then something unexpected popped up: The client also had a line of business (LoB) application that needed to send emails. Unfortunately, this requirement had not been brought to Bob’s attention prior to the cutover. Well, this is what happens when you don’t get enough time to plan everything out properly. So a connector was hastily configured for their Office 365 environment and once again all seemed to be well.
Then on Sunday morning Bob got a call from the client: their emails were bouncing with an NDR indicating “thresholds exceeded.” A quick call to Microsoft Support confirmed that the connector had a limit of 1,000 messages per hour and, according to their systems, that limit was being exceeded. This didn’t match what the LoB app was reporting, but Microsoft Support insisted it was correct. Their working theory was that the import of messages during the cloud migration may have triggered the connector’s limit. The proposed fix was to turn all thresholds off for seven days. The thresholds would then be reset, and within the hour all messages would be sent again.
So, where’s the problem?
Except it didn’t work. From that moment on, despite dozens of calls to Microsoft, the customer was never able to send another email externally through Office 365! They were told that the thresholds for the connector had been reset 11 times, but that the LoB app kept exceeding them. Over the next three days, the app actually attempted to send 468 messages in total, while the entire company attempted to send only some 3,000 messages according to the Office 365 reporting feature. This was nowhere near any of the thresholds, so clearly the problem was something happening on the Microsoft side of things. As a precaution, on Monday morning they removed the connector and reconfigured the LoB app to use one of the servers at Bob’s company as a mail relay. But that didn’t help either: excess usage was still being reported on the connector.
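To put those figures in perspective, here is a quick back-of-the-envelope check in Python. It is only a sketch: the numbers are the ones quoted above, the 72-hour window is my own assumption ("three days" taken as 3 × 24 hours), and the function names are made up for illustration. Averages can hide short bursts, so this doesn't prove the limit was never hit, but the LoB app's entire three-day total is less than half of a single hour's allowance.

```python
# Figures quoted in the article. The 72-hour window is an assumption:
# "the next three days" is taken to mean 3 * 24 hours.
CONNECTOR_LIMIT_PER_HOUR = 1_000
WINDOW_HOURS = 3 * 24


def average_hourly_rate(total_messages: int, hours: int = WINDOW_HOURS) -> float:
    """Return the average number of messages per hour over the window."""
    return total_messages / hours


def share_of_limit(total_messages: int, hours: int = WINDOW_HOURS) -> float:
    """Return the average hourly rate as a fraction of the connector limit."""
    return average_hourly_rate(total_messages, hours) / CONNECTOR_LIMIT_PER_HOUR


lob_app_total = 468    # messages the LoB app attempted to send
company_total = 3_000  # "some 3,000" in the article, so approximate

for label, total in [("LoB app", lob_app_total), ("Whole company", company_total)]:
    rate = average_hourly_rate(total)
    print(f"{label}: {rate:.1f} msgs/hour on average "
          f"({share_of_limit(total):.1%} of the hourly limit)")
```

On these assumptions the LoB app averaged about 6.5 messages per hour and the whole company about 42, against a limit of 1,000.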
For three days the only response they got from Microsoft Support was that the issue had been logged and was being escalated. They were promised again and again that it would be “working within the hour.” At no time, however, was Bob able to talk to anyone on the escalation team; he could only get hold of the entry-level helpdesk. He even received a call at 3 a.m. on Wednesday telling him the problem had been fixed, but it hadn’t.
On Tuesday afternoon, Bob migrated the customer once again, this time to the hosted mail system at Bob’s company. This enabled the customer to have their mail fully up and running before they opened for business on Wednesday morning. Wednesday was then spent importing messages once again and sorting out other minor issues. The whole outage seemed strange to both Bob and me because, after all, even if you did have a user or connector that was exceeding the threshold, should that disable the ability to send for the entire company?
Anyway, on Wednesday afternoon a so-called “Office 365 Ambassador” emailed Bob apologizing for the frustration his customer had been experiencing. The apology, however, included the following words: “I do think in your case, however, that there seems to have been a regional service health incident that may have been going on since yours is not the only case I have had with this issue. And it seems our backend team got a bit swamped trying to work all the issues as they came in.”
I have to wonder what I would do myself if I ever came across a similar situation. I talked with Bob afterward and reviewed his cloud migration process, and I don’t feel he made any mistakes. He had contacted Microsoft Support and had followed their instructions to the letter. No amount of pleading or begging could get the issue resolved or even looked at by a senior technician at Microsoft. The only option he had, after three days, was to switch to something that worked, namely, migrating the customer to the hosted mail service provided by Bob’s company.
Cloud migration: Always some risk
I’ll conclude by saying that I’ve known many businesses that have migrated their mail systems to Office 365, and though there are sometimes some hitches in the process, things have usually turned out OK in the end. Office 365 is a robust and reliable business-class communications and collaboration solution for small- and midsized companies, and I have no qualms recommending it to businesses that ask me about it. Bob feels the same way. He has lots of other Office 365 customers, and this is the first time he’s had to contact support for any of them. I sincerely hope it’s the last.
But this story does highlight one important fact about migrating any aspect of your company’s IT infrastructure to the cloud: There’s always some measure of risk involved.