A First Look at Microsoft’s Data Protection Manager, Version 2
Since its initial release, I have used Microsoft’s Data Protection Manager (DPM) on my own network to protect my data. I have always thought of DPM as an excellent application, but it has also seemed as though some things were missing. Fortunately, Microsoft is hard at work on DPM version 2, which is currently in beta testing. DPM version 2 is designed to fill some of the voids left by DPM 2006. In this article, I will talk about what you can expect from DPM version 2.
What is Data Protection Manager?
Although I tend to think that Data Protection Manager is a great product, it hasn’t really been around for very long, and isn’t as popular as many of the other Microsoft Server products. It certainly isn’t as well known as Exchange Server or SQL Server. That being the case, I want to spend a few minutes talking about what DPM is and what it does. Everything that I talk about in this section applies to both DPM 2006 and DPM version 2.
DPM is a server application that is designed to back up your data. The difference between DPM and other backup solutions such as NTBACKUP is that DPM is designed to address the fundamental shortcomings of traditional backup solutions.
There are several problems with a typical backup solution. One problem is that the amount of data that a company accumulates keeps growing. This is true even of a small network, such as my own. Just the simple fact of saving the Microsoft Word document that I am writing right now means that I will be backing up more data tonight than I did last night. Of course, the problem with backing up more data each night is that backups take longer and longer to complete as the amount of data increases.
Unfortunately, spending more time to back up data each night often is not an option. Many companies conduct business 24 hours a day, so a long backup in the middle of the night would be disruptive, since files typically have to be closed to be backed up.
These challenges are not new; they have been around for many years. Traditionally, administrators have skirted the issue of increasing data and decreasing backup windows by performing only differential or incremental backups during the week and saving full backups for the weekend. While these techniques may work for some, even an incremental or differential backup takes time to complete each night, and that often means lost productivity.
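To make the difference between the three backup types concrete, here is a minimal sketch in Python. It is only an illustration: the function name, the file names, and the use of day numbers as modification times are all my own assumptions, and real backup tools often track changes with a file's archive attribute rather than with timestamps.

```python
def files_to_back_up(files, kind, last_full, last_backup):
    """Pick the files a given backup type would copy.

    files maps a path to its modification time (any comparable value).
    last_full is the time of the last full backup; last_backup is the time
    of the most recent backup of any kind. All names are hypothetical.
    """
    if kind == "full":
        # A full backup copies everything, changed or not.
        return sorted(files)
    if kind == "differential":
        # Everything changed since the last FULL backup.
        cutoff = last_full
    elif kind == "incremental":
        # Only what changed since the most recent backup of any kind.
        cutoff = last_backup
    else:
        raise ValueError("unknown backup type: " + kind)
    return sorted(path for path, mtime in files.items() if mtime > cutoff)


# Modification times are day numbers: the full backup ran on day 0 and the
# most recent weeknight backup ran on day 4.
files = {"report.doc": 1, "budget.xls": 3, "notes.txt": 5}

print(files_to_back_up(files, "differential", last_full=0, last_backup=4))
# ['budget.xls', 'notes.txt', 'report.doc']
print(files_to_back_up(files, "incremental", last_full=0, last_backup=4))
# ['notes.txt']
```

The differential run grows all week because it always reaches back to the weekend's full backup, while the incremental run stays small but leaves you with more backup sets to restore from.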
Another problem with traditional backups is that the backup usually only runs once per day. To see why this is a problem, imagine that your company runs a backup every night at 11:00 PM and that the backup takes an hour to complete. If that’s the case, then what happens if you have a major problem at 10:00 PM, resulting in data loss? In a situation like that, you are going to lose every bit of data that has been created since the last backup was made (11:00 the night before). Nobody wants to have to explain to their boss why a full day’s worth of data has been lost.
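The arithmetic behind that scenario can be sketched in a few lines of Python. This is an assumption-laden illustration rather than anything DPM does: the function name is mine, the dates are arbitrary, and I assume each backup captures everything written up to the moment it starts.

```python
from datetime import datetime, timedelta


def data_loss_window(failure_time, first_backup, interval):
    """Return the span of work lost when a failure hits at failure_time.

    Assumes backups start at first_backup, repeat every interval, and each
    one captures everything written up to the moment it starts.
    """
    if failure_time < first_backup:
        raise ValueError("failure occurs before the first backup")
    completed_intervals = (failure_time - first_backup) // interval
    last_backup = first_backup + completed_intervals * interval
    return failure_time - last_backup


# Dates are arbitrary; only the times of day come from the scenario above.
last_night_11pm = datetime(2007, 1, 1, 23, 0)
tonight_10pm = datetime(2007, 1, 2, 22, 0)

nightly = data_loss_window(tonight_10pm, last_night_11pm, timedelta(days=1))
print(nightly)  # 23:00:00

# Hypothetical hourly schedule that happens to run on the half hour.
hourly = data_loss_window(
    tonight_10pm, datetime(2007, 1, 1, 23, 30), timedelta(hours=1)
)
print(hourly)  # 0:30:00
```

With a nightly schedule the failure costs 23 hours of work. With an hourly schedule, the loss can never exceed one hour, no matter when the failure occurs.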
Data Protection Manager solves these problems by using what is known as a disk-to-disk-to-tape backup. The basic idea is that data is backed up continuously throughout the day rather than making one large backup late at night. This technique greatly reduces the amount of data that will be lost should a system fail.
DPM’s backup interval is configurable. On my own network, DPM is configured so that it runs a backup once per hour. In doing so, DPM simply backs up anything that has changed in the last hour. One nice aspect of using this approach is that it allows you to maintain multiple versions of the files that you are backing up throughout the day. For example, if a user decides that they need to restore a document to the way that it existed four hours ago, that is now an option. The actual number of versions of a file that you can keep on hand varies depending on the amount of free disk space that your DPM server has allocated to the resource that is being protected.
So how is it that DPM can back up data throughout the day, when a normal backup application can’t? DPM relies on shadow copies to take snapshots of the data that is being backed up. This allows files to be backed up even if they are open at the time that the backup is being made. When DPM backs up a file, it also checks to see if the file has been previously backed up. If the file has been backed up before, only the bytes of the file that have changed since the last backup are backed up. This conserves disk space and network bandwidth, allowing DPM to operate with great efficiency.
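The idea of copying only what has changed can be illustrated with a short Python sketch. To be clear, this is not how DPM itself tracks changes, which I have not described here. It is a generic fixed-size block comparison, and the function names and the tiny block size are my own choices for the demonstration.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small so the example is easy to follow


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return {block index: block bytes} for blocks of new that differ from old."""
    old_hashes = block_hashes(old, block_size)
    changes = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(old_hashes) or old_hashes[index] != digest:
            changes[index] = block
    return changes


def apply_changes(old, changes, new_length, block_size=BLOCK_SIZE):
    """Rebuild the new version from the old copy plus the changed blocks."""
    rebuilt = bytearray(old[:new_length].ljust(new_length, b"\0"))
    for index, block in changes.items():
        start = index * block_size
        rebuilt[start:start + len(block)] = block
    return bytes(rebuilt)


previous = b"AAAABBBBCCCC"
current = b"AAAAbbbbCCCC"

delta = changed_blocks(previous, current)
print(delta)  # {1: b'bbbb'}
print(apply_changes(previous, delta, len(current)) == current)  # True
```

Only one of the three blocks has to cross the network, yet the backup server can still reconstruct the complete current file from its existing copy.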
What’s New in Version 2?
On the surface, DPM version 2 looks almost exactly like DPM 2006. There is, however, one very important difference between the two versions. DPM 2006 was limited to backing up file servers, but DPM version 2 will be able to back up application servers as well.
Specifically, this means that DPM version 2 will be able to back up file servers, just as the current version does, but it will also be able to back up databases associated with Exchange Server (including the soon-to-be-released Exchange Server 2007), SQL Server, and SharePoint.
These particular applications have traditionally been unable to take advantage of backup mechanisms such as DPM because they do not rely on file-level backups. Exchange Server, for example, stores users’ mail, contacts, and calendar information in a database. However, that database remains open at all times. This means that a backup application cannot simply copy the database file, because the file would likely change before the backup could complete. Instead, backup applications designed for Exchange Server make use of Exchange Server’s transaction logs.
The actual backup process is complicated, but here’s a simplified explanation. When new data (such as new mail messages) is destined for an Exchange database, it is not written directly to the database, but rather to a transaction log file. Once the transaction log file has filled up, the contents of the transaction log are committed to the database. When a traditional backup application backs up an Exchange database, it must lock the database so that no transaction logs are committed during the backup. Once the database has been backed up, the contents of the transaction logs are backed up, and then the transaction logs are finally committed.
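The simplified sequence above can be modeled with a toy Python class. This is strictly a model of my simplified explanation, not of Exchange Server’s actual storage engine or any of its APIs. The class name, the log capacity, and the `arriving` parameter (mail that shows up while the backup is running) are all hypothetical.

```python
class ToyMailStore:
    """Toy model: writes go to a log, and a full log is committed to the database."""

    def __init__(self, log_capacity=2):
        self.database = []
        self.log = []
        self.log_capacity = log_capacity
        self.locked = False  # True while a backup is in progress

    def write(self, message):
        # New data always lands in the transaction log first.
        self.log.append(message)
        # A full log is committed, unless a backup has locked the database.
        if len(self.log) >= self.log_capacity and not self.locked:
            self.commit()

    def commit(self):
        self.database.extend(self.log)
        self.log.clear()

    def backup(self, arriving=()):
        # 1. Lock the database so nothing is committed during the backup.
        self.locked = True
        database_copy = list(self.database)
        # Mail that arrives mid-backup can only go to the log.
        for message in arriving:
            self.write(message)
        # 2. Back up the transaction log.
        log_copy = list(self.log)
        # 3. Unlock and finally commit the log.
        self.locked = False
        self.commit()
        return {"database": database_copy, "log": log_copy}


store = ToyMailStore(log_capacity=2)
store.write("msg 1")
store.write("msg 2")  # log is full, so both messages are committed
store.write("msg 3")  # stays in the log

saved = store.backup(arriving=["msg 4"])
print(saved)
# {'database': ['msg 1', 'msg 2'], 'log': ['msg 3', 'msg 4']}
print(store.database)
# ['msg 1', 'msg 2', 'msg 3', 'msg 4']
```

Restoring from this backup would mean restoring the database copy and then replaying the saved log on top of it, which is why the database and its logs have to be captured together.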
As you can see, this process is very involved. In fact, a normal backup application cannot back up an Exchange Server unless the databases are taken offline. A backup application must be Exchange Server aware in order to back up an Exchange Server while the databases are mounted. It is this complexity that makes it such a big deal that DPM version 2 will be able to provide continuous protection of data stored on Exchange, SharePoint, and SQL servers.
In spite of its new capabilities, DPM version 2 does have its limitations. For example, DPM is unable to back up a server’s system state. This means that a bare-metal restore of a server using DPM is impossible. Another limitation is that DPM relies on agents, which must be installed on the servers that are being backed up. This means that DPM is limited to backing up only servers that are running Microsoft operating systems. Not only does this affect those with Unix, Linux, or NetWare servers, but it also affects those who have data stored on NAS devices that do not run a traditional operating system.
I tend to think that Data Protection Manager version 2 will set the new standard for data protection. One thing to keep in mind is that DPM will not be a replacement for traditional tape backup. It will still be important to back up data to tape so that a copy of the data can be taken off site after a backup is made, as protection against loss from fire, hurricane, etc. The difference is that traditionally, tape backups are used to back up servers directly. In an environment in which DPM is used, DPM will back up your servers, and the tape drive will only be used to back up the DPM server.