Exchange Server 2019 disaster recovery: Rise above the chaos

Email is more than sending a short message or a long-winded document to someone. Your Exchange Server contains contacts, archived emails, user data, Skype for Business information, and the list goes on. In most organizations today, email is classed as one of the most important applications and generally has a hefty service-level agreement (SLA) assigned to it, which typically specifies how long Exchange can be down. Exchange also relies heavily on Active Directory, so if Active Directory is not online, Exchange Server 2019 won’t function. This makes your IT department’s job more difficult, and more stressful, as it needs to ensure that backups and disaster recovery plans are in place and tested in the event that disaster strikes.

If you work in a large organization with 10,000 or more users, recovering a database to test with means you need extra equipment in your on-premises environment, ideally a replica of your live environment to test against. Many organizations don’t have this because there is “no budget” for a test lab. Yet when disaster happens, they want to fire the IT department, or the person in charge gets fired, because the data cannot be recovered or because the CEO’s Exchange mailbox sat on some random store without redundancy and the disk failed or was corrupted. Not a pretty picture for the IT department.

And keep in mind that a disaster does not have to be caused by nature. It can be caused by a whole range of things. Here are a few examples:

  • Hardware failures
  • Backups not actually backing up data
  • No test restores done
  • Malware/viruses
  • Ransomware
  • Power problems
  • Human error
  • Disgruntled employees
  • Changes without a change control
  • No antivirus
  • No security policies in place

Here are a few scenarios based on the list above. Not every cause is covered, but these are the ones that stand out:

  • Scenario 1: Hardware failures
  • Scenario 2: Ransomware
  • Scenario 3: Backups not working
  • Scenario 4: Changes without a change control

Let’s take a look at these in detail:

Hardware failures

As an IT admin, I have been to clients that run pretty much everything on one server. This includes the file server, Exchange, Active Directory, SQL, and more. When that host fails, they are completely down with very few options for recovery, as no other hardware is available.

This boils down to budgets being spent on other things rather than on keeping the physical environment resilient. It means that in the event of a failure, they don’t have a backup.

One way to avoid this nightmare is to have two adequately spec’d hosts in a cluster so that, in the event of one failing, the other can take over and keep Exchange and AD running while the failed host is recovered.

Ransomware

Ransomware has been around for a while, and despite the disaster it can cause to a company, many people turn a blind eye with the attitude of “it won’t happen to me.” Exposing servers to the Internet without locking them down increases the chances that you will be hit by ransomware. For example, an IT admin uses the Exchange Server to download personal things like movies, or to surf the Internet, because it has unrestricted access. Like many, they don’t install an antivirus on the server and think they can get away with it. Unfortunately, if the server is infected, all the information on it is encrypted. We are not going to deep dive into this, but if this happens to your system, you can be hit by file-level encryption.

This means that email is going to be unavailable, as the entire machine will be down, and the company may have to pay for the decryption key. Recovering an Exchange Server in this scenario means setting up a new virtual machine and rerunning setup with the recovery switch (Setup.exe /Mode:RecoverServer) so that the server’s settings and configuration are pulled back down from Active Directory. The database disks will need to be attached to the new VM, either directly if they weren’t encrypted or after they have been decrypted, and you should look at moving mailboxes to new stores, as the old ones cannot be relied upon.

Backups not working

Many IT companies have backups and think that because the job says “completed,” everything is okay. The sad fact is that under the hood, everything may not be okay. When the time comes to restore the data, the dataset turns out to be corrupt, or nothing was backed up at all, and nobody noticed because no test restores were done to validate the data. Exchange Server 2019 works with most backup software, but you need to follow the vendor’s guidelines to ensure it is working correctly. Your disaster recovery plan should include running proper tests in a lab and ensuring that you can not only mount the Exchange databases but also work with the restored mailboxes in applications like Outlook.
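As a rough illustration of not trusting a “completed” status, here is a minimal Python sketch that checks restored files against hashes recorded when the backup was taken. Everything in it is an assumption made for illustration: the function names, the idea of a manifest, and the file names are hypothetical and not part of Exchange or any backup product. The manifest must be built from the files in the backup set itself, not from the live database, which changes constantly.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large database files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(manifest: Dict[str, str], restore_dir: Path) -> List[str]:
    """Compare restored files with a manifest of {relative file name: SHA-256}.

    Returns a list of problems; an empty list means every file matched.
    The manifest format is a hypothetical one chosen for this sketch.
    """
    problems = []
    for relative_name, expected_hash in manifest.items():
        restored = restore_dir / relative_name
        if not restored.is_file():
            problems.append(f"missing: {relative_name}")
        elif restored.stat().st_size == 0:
            problems.append(f"empty: {relative_name}")
        elif sha256_of(restored) != expected_hash:
            problems.append(f"hash mismatch: {relative_name}")
    return problems
```

A check like this only shows that the files came back intact. It does not replace mounting the restored database in a lab and opening a mailbox in Outlook.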

Changes without a change control

Larger companies have a Change Advisory Board (CAB), where each week the change controls for that week are tabled and agreed upon. This is a valuable part of a disaster recovery plan because it helps ensure business continuity if a change goes wrong. Some organizations don’t have this in place, and IT staff must make changes on the fly. Here’s a scenario: Joe sees that the latest cumulative update is available for his Exchange server. He has installed these many times, so he thinks, “What can go wrong?” He installs the cumulative update and it fails. Now the entire Exchange Server is offline, as the services are disabled and the update won’t rerun. In a panic, he tries various things and blames the system, but in fact he made an unauthorized change without notifying the correct people, and he rushed to get it done instead of planning it so that things ran smoothly.

If your company has its disaster recovery strategy in place and fully tested, Exchange should stay online and running without issue, because you will have multiple database copies spread across datacenters, which helps you meet your company’s SLA uptime percentage.
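To put an SLA uptime percentage into perspective, the short Python sketch below converts it into the amount of downtime it actually allows. The function name and the 30-day period are assumptions made for illustration; your SLA may measure uptime over a different window or exclude planned maintenance.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float = 30) -> float:
    """Minutes of downtime permitted in a period for a given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


# Example targets; these are illustrative, not taken from any specific SLA.
for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime over 30 days allows about {minutes:.1f} minutes of downtime")
```

At 99.9 percent over a 30-day month, that is roughly 43 minutes of downtime, which leaves little room for a recovery process nobody has rehearsed.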

Exchange Server disaster recovery: When all else fails

You also have a final option: installing a third-party application to revive a crashed Exchange Server, or an Active Directory server that cannot be recovered or salvaged due to unknown circumstances. There are several good products out there. One that I have used with success is Stellar Repair for Exchange. Not only can it open corrupt Exchange databases, but it can also restore the data to Office 365 or to your Exchange server, or simply export all the mailboxes to .PST. The software is very lightweight and can be run directly from the server itself.

I have seen organizations keep these third-party recovery tools in the IT department and make them part of the Exchange Server disaster recovery plan in the event that the SLA cannot be met. With these tools, the IT department can hopefully get Exchange back up and running so users can carry on working. Once everyone is able to get things done, admins can work with less strain to restore the environment and emails and get mailboxes back to where they were before disaster struck.
