Streamlining the backup process with Microsoft’s Data Protection Server
Have you ever really stopped to think about your company’s backups? I’m sure that you and your company have invested lots of time in an elaborate disaster recovery strategy. Even so, the simple truth is that most companies’ daily backup strategy involves backing up data to a tape and then sending the tape off site for protection. While this is certainly a tried and true method for data protection, it is badly outdated. Companies have been using this particular approach to backing up data for over twenty years. Sure, the hardware and the backup procedures have become much more sophisticated over the last two decades, but they are still based on twenty-year-old technology.
Just to show how antiquated this method for backing up data really is, I want to share with you some of my earliest experiences with backups. I will admit that I was not involved in computers 20 years ago. I got involved in computers about seventeen years ago, at the age of fifteen. At that time, I managed to scrape together enough cash from cutting lawns to buy my first personal computer. Disk drives were still extremely expensive (at least compared to my budget and to today’s prices) and hard drives were almost nonexistent in the personal computer market. I got sick of losing the code that I had written every time I turned off the machine, so I invested in a tape drive that I could use for data storage.
The tape drive used audio tapes to store data, but the technology wasn’t all that different from what companies are using for backups today. It didn’t take me long to realize, however, that saving data to tape was less than reliable. I would often save multiple copies of my data on several different tapes in hopes that one of the copies would be good.
Today, things have not changed as much as you might expect. While it’s true that nobody uses a tape drive for their primary data storage anymore, tape drives are widely used for backup. What hasn’t changed much, though, is reliability. Companies face many of the same problems restoring backups today as I had with my tape drive seventeen years ago. Often the data on the tape is unreadable or corrupt. Tapes are also occasionally destroyed when hungry tape drives eat them. In fact, a recent study indicated that 42% of Microsoft’s customers who use tape backups had at least one incident in the last year in which they were unable to restore a backup because of a bad tape.
There are several other problems that plague administrators when it comes to backing up data. In a nutshell, the amount of data being backed up increases every day, yet administrators are being given less and less time to back it up. Some organizations are even trying to minimize potential data loss by doing more than one backup per day.
The Good News
The good news is that Microsoft has a possible solution to everyone’s backup woes. Rather than produce another product based on 20 years’ worth of backup technology evolution, Microsoft has created a backup system based on disks rather than on tapes. The new product is called Data Protection Server.
Data Protection Server has just now entered the beta testing phase and is expected to become available within the first quarter of 2005. There is no word yet regarding pricing or how the product will be licensed. What I can tell you is that Data Protection Server will run on top of Windows Server 2003 and can be used to back up data residing on Windows 2000 Server, Windows Server 2003, and Windows Storage Server 2003.
The product’s main goal is to remove many of the inefficiencies and complexities associated with backup and restore operations. For example, imagine that today you have ten different servers that you back up each night. Depending on how much data exists on each server, you might have a separate tape drive on each machine or you might have a tape loader on a central server that is configured to back up each server. Either way, you are spending a lot of time backing up a lot of data to a lot of tapes.
Typically in such an environment, a full backup would be done on the weekend and incremental backups would be done throughout the week in an effort to save time and tape space. This technique works well enough, but things become problematic when it’s time to perform a restore operation.
One issue with restoring data is that this configuration leaves a lot of room for data loss. For example, imagine that you run the backup at midnight each night. Now, imagine that the president of your company creates an important file at 9:00 AM and spends all day working on it. Now, imagine that at 4:00 PM there is a massive server crash and you have to restore an entire volume. You will get a lot of the data back, but you won’t be able to restore the president’s file because it hasn’t been backed up yet.
Another issue with restore operations is that they are very labor intensive, and the end users don’t care that you have better things to do. Right now, if an end user calls you and asks you to restore some really unimportant file that was accidentally deleted, you will have to get the tape from last weekend and all of the tapes that have been made since. You would then have to index the tape from the weekend, find the file, start the restore operation, and switch tapes whenever the restore operation called for it. Since tapes are sequential rather than random access, the restore operation can take a very long time to complete, even for a small file.
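To make the tape shuffle concrete, here is a minimal Python sketch of which tapes a restore needs under the weekend-full, weeknight-incremental scheme described above. The function name, the tuple layout, and the dates are all hypothetical; this illustrates the general technique and is not taken from any particular backup product.

```python
from datetime import date

def tapes_needed(backups, restore_point):
    """Return the backup sets required to restore data as of restore_point.

    backups is a list of (date, kind) tuples sorted by date, where kind is
    "full" or "incremental" (a hypothetical layout for illustration).
    """
    eligible = [b for b in backups if b[0] <= restore_point]
    full_indexes = [i for i, b in enumerate(eligible) if b[1] == "full"]
    if not full_indexes:
        raise ValueError("no full backup exists on or before the restore point")
    # The most recent full backup, plus every incremental made after it.
    return eligible[full_indexes[-1]:]

# Hypothetical week: one full backup, then an incremental on each of four nights.
week = [
    (date(2004, 10, 3), "full"),
    (date(2004, 10, 4), "incremental"),
    (date(2004, 10, 5), "incremental"),
    (date(2004, 10, 6), "incremental"),
    (date(2004, 10, 7), "incremental"),
]

needed = tapes_needed(week, date(2004, 10, 7))
print(len(needed), "backup sets to load")
```

The later in the week the request arrives, the longer the chain of tapes becomes, which is why even a trivial restore turns into a chore.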
Now, let’s look at how these situations could play out if Data Protection Server were being used. Data Protection Server takes a completely different approach to backing up the data. The software allows you to create groups of servers or volumes. You can then create a backup schedule and apply it to everything in the group. This means that you can back up ten servers just as easily as you could back up a single server.
The actual backup process is based on Distributed File System and on Volume Shadow Copy. This has several implications for the backup process. First, it means that you can back up files while they are open. Second, it means that you no longer have to wait for a convenient time to run the backup. You can back up data as often as you want (hourly if you like).
That all sounds great, but you are probably wondering about the feasibility of frequent backups if you have a huge amount of data to back up. Data Protection Server gets around this problem by taking a unique approach to incremental backups. Imagine, for instance, that someone in marketing has a 100 MB PowerPoint presentation that they are working on for a big client. Since the presentation is actively being worked on, the file will be constantly changing and will be backed up during each backup cycle. To prevent the backups from taking too long or from consuming too much disk space, the whole 100 MB file is only backed up once. After that, only the bytes that have changed are backed up. This allows you to store up to 64 versions of the file in a tiny fraction of the amount of space that it would take to store 64 full copies. Because only the changed bytes cross the wire, this byte level backup technology also saves network bandwidth. There is also a built in bandwidth throttling feature that ensures that data related to backups won’t choke out other traffic on your network.
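The idea of backing up only the changed bytes can be sketched in a few lines of Python. To be clear, this is not how Data Protection Server is implemented; Microsoft has not published those details here, and every name below (Delta, make_delta, apply_delta, VersionStore) is made up for illustration. The sketch compares two versions byte by byte, which is fine for a small demo but far too slow for a real 100 MB file.

```python
from dataclasses import dataclass

@dataclass
class Delta:
    length: int   # total length of the new version
    runs: list    # list of (offset, changed_bytes) tuples

def make_delta(old: bytes, new: bytes) -> Delta:
    """Record only the byte runs in `new` that differ from `old`."""
    runs, i, n = [], 0, len(new)
    while i < n:
        if i < len(old) and old[i] == new[i]:
            i += 1
            continue
        start = i
        # Extend the run while bytes differ or lie past the end of `old`.
        while i < n and (i >= len(old) or old[i] != new[i]):
            i += 1
        runs.append((start, new[start:i]))
    return Delta(length=n, runs=runs)

def apply_delta(old: bytes, delta: Delta) -> bytes:
    """Rebuild the new version from the old version plus a delta."""
    buf = bytearray(old[:delta.length].ljust(delta.length, b"\0"))
    for offset, data in delta.runs:
        buf[offset:offset + len(data)] = data
    return bytes(buf)

class VersionStore:
    """Keeps one full copy of a file plus a chain of deltas."""
    def __init__(self, base: bytes):
        self.base = base
        self.deltas = []
        self._latest = base

    def backup(self, current: bytes) -> int:
        """Store a new version; return how many changed bytes were saved."""
        delta = make_delta(self._latest, current)
        self.deltas.append(delta)
        self._latest = current
        return sum(len(data) for _, data in delta.runs)

    def restore(self, version: int) -> bytes:
        """Version 0 is the full copy; version k replays the first k deltas."""
        data = self.base
        for delta in self.deltas[:version]:
            data = apply_delta(data, delta)
        return data

original = b"A" * 1000
edited = original[:500] + b"XYZ" + original[503:]
store = VersionStore(original)
saved = store.backup(edited)   # only the 3 changed bytes are stored
```

In this toy version a restore replays the deltas in order from the full copy, so older versions stay recoverable without keeping a full copy of each one.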
With that said, let’s go back to the situations that I described earlier and see how Data Protection Server could help. As you will recall, one of the situations involved the president of the company losing a file that hadn’t been backed up yet. However, if you are now backing data up hourly, then no one should ever lose more than 59 minutes’ worth of work.
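The arithmetic behind that claim is easy to check. The following Python sketch uses hypothetical helper names and assumes, for simplicity, that each backup completes instantly at its scheduled time. It compares the nightly schedule from the earlier example with an hourly one.

```python
from datetime import datetime, timedelta

def last_backup_before(crash, first_backup, interval):
    """Most recent scheduled backup at or before the crash."""
    completed = (crash - first_backup) // interval  # whole intervals elapsed
    return first_backup + completed * interval

def work_lost(crash, work_started, first_backup, interval):
    """Work done after the last backup (or since the file was created)."""
    last = last_backup_before(crash, first_backup, interval)
    return max(crash - max(last, work_started), timedelta(0))

midnight = datetime(2004, 10, 4, 0, 0)       # hypothetical date
file_created = datetime(2004, 10, 4, 9, 0)   # 9:00 AM, as in the example

# Nightly backup, crash at 4:00 PM: the whole day's work is gone.
nightly_loss = work_lost(datetime(2004, 10, 4, 16, 0), file_created,
                         midnight, timedelta(days=1))

# Hourly backup, crash one minute before the next backup: the worst case.
hourly_loss = work_lost(datetime(2004, 10, 4, 15, 59), file_created,
                        midnight, timedelta(hours=1))

print(nightly_loss, hourly_loss)
```

Under these assumptions the nightly schedule loses seven hours of work, while the hourly schedule loses 59 minutes at worst.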
So what about the user that asked you to restore some insignificant file? In the previous example, the restore could take hours. With Data Protection Server, though, the entire procedure is reduced to a couple of minutes. There are two main reasons for this. First of all, disks are random access. It is possible to directly access the file within the backup rather than having to wait for a tape to wind to the correct location. Some backup programs today, such as the one that I use, will allow you to store data to a disk, but all of the data is stored within a single file, so a restoration is still sequential because the backup program must read through the entire file to get to the spot where the data is located.
The second reason why the restore operation is so much faster is that there is no tape swapping. All incremental backups are stored within a common location, so you don’t need to swap media.
There is one other thing that can make the recovery process even easier. Data Protection Server allows you to set permissions as to who is allowed to perform restore operations. This means that you can restore files, the help desk can restore files on your behalf, or you can grant users permission to restore their own files! The restore procedure is supposedly so easy that any user will be able to do it.
The Bad News
As promising as Microsoft’s Data Protection Server looks, it does have one major drawback. If your data center were to be destroyed by a nuclear bomb, runaway bulldozer, bratty kid, or whatever, the Data Protection Server would also be destroyed if it existed within the same building. Current backup schemes usually call for safeguarding data by moving the tapes off site. Data Protection Server doesn’t really give you that luxury, though.
That isn’t to say, however, that such capabilities may not exist in the future. Data Protection Server is designed to run on Windows Server 2003. As such, it relies on things like DFS and shadow copies. The Windows Server 2003 implementation of DFS supports storing replicas of data on multiple servers. By combining this replication technology with storage related technologies such as iSCSI (Internet SCSI or SCSI over TCP/IP), it may eventually be possible to back data up to an off site Data Protection Server.
To be fair, Data Protection Server is just now entering beta testing and it is possible that off site backup capabilities could even exist by the time that the product is officially released. Either way, I believe that Data Protection Server is a product worth keeping an eye on.