Exchange Backup and Restore - Streaming and Volume Shadow Copy Service (VSS) Techniques
Extensible Storage Engine (ESE)
The core storage technology of Microsoft Exchange Server and Active Directory is called the Extensible Storage Engine (ESE), also known as JET Blue. It is an Indexed Sequential Access Method (ISAM) data storage technology whose purpose is to allow applications to store and retrieve data via indexed and sequential access. Windows Mail and Desktop Search in the Windows Vista operating system also make use of ESE to store indexes and property information respectively.
The basic features of ESE are:
- Microsoft JET is an advanced 32-bit multithreaded database engine that combines speed and performance with other advanced features to enhance transaction-based processing capabilities.
- A crash recovery mechanism is provided so that data consistency is maintained even in the event of a system crash.
- Transactions in ESE are highly concurrent, making ESE suitable for server applications. ESE caches data intelligently to ensure high performance access to data.
- In addition, ESE is lightweight and optimized for fast data storage and retrieval.
The ESE runtime (ESENT.DLL) has shipped in every Windows release since Windows 2000, with the native x64 version of the ESE runtime shipping with x64 versions of Windows XP and Windows Server 2003. In Exchange, 64-bit support started with Exchange 2007; up to and including Exchange 2003, the ESE runtime shipped only in a 32-bit edition.
Exchange Information Store
The information store, which is the key component for database management in Exchange Server, is actually two separate databases. The private information store database, Priv.edb, manages data in user mailboxes. The public information store, Pub.edb, manages data in public folders.
The private store consists of .edb and .stm files. The .edb file is the main repository for the mailbox data and is accessed by ESE directly. The fundamental construct of the .edb file is the B-tree structure. The 4 KB ESE pages are arranged into tables that form a large database file containing Exchange data.
The .stm or streaming media file is used in conjunction with the .edb file to comprise the Exchange database. Both files together make up the database, and as such, they should always be treated as a single entity.
The information store works with the Messaging Application Programming Interface (MAPI) and the database engine to ensure that all user actions are recorded on the server's hard disk.
Companies, irrespective of their size, carry out careful Disaster Recovery (DR) planning for their business-critical messaging system. Microsoft Exchange Server comes with a standard API to programmatically back up and restore the Exchange stores (a.k.a. databases). As Exchange Server is transaction-based, performing a file-level or offline backup of the database files on disk can cause data inconsistency. The best way to ensure that you are preserving all data in the system, including transactions that have not yet been flushed to disk, is to perform regular online backups.
Microsoft allows four different types of backup for the entire server or a single storage group: Full, Copy, Incremental, and Differential.
Full Backup - This backup type performs a backup of all the databases, transaction log files, and checkpoint files in a storage group, and after the backup is complete, truncates the log files.
Copy Backup - A copy backup performs the same steps as a full backup, but it does not truncate the transaction log files. You can use a copy backup to create a copy of the database for testing or analysis purposes.
Incremental Backup - The incremental backup backs up the transaction logs to record changes that occurred since the last incremental or full backup, and then truncates the transaction logs.
Differential Backup - A differential backup backs up the transaction logs to record changes that occurred since the last full backup, and does not truncate the transaction logs.
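The four backup types differ mainly in what they copy and whether they truncate the transaction logs afterwards. The following Python sketch simply restates the descriptions above as a lookup table. The names BackupType, BACKUP_TYPES, and types_that_truncate are illustrative and are not part of any Exchange API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupType:
    name: str
    copies_databases: bool  # database files are written to the backup set
    truncates_logs: bool    # transaction logs are truncated afterwards
    logs_since: str         # which earlier backup the copied logs date from


# Hypothetical table summarising the four types described in the text.
BACKUP_TYPES = {
    "full": BackupType("full", True, True, "not applicable"),
    "copy": BackupType("copy", True, False, "not applicable"),
    "incremental": BackupType("incremental", False, True,
                              "last full or incremental backup"),
    "differential": BackupType("differential", False, False,
                               "last full backup"),
}


def types_that_truncate():
    """Return the names of backup types that truncate transaction logs."""
    return sorted(t.name for t in BACKUP_TYPES.values() if t.truncates_logs)


if __name__ == "__main__":
    for t in BACKUP_TYPES.values():
        print(f"{t.name:12} databases={t.copies_databases!s:5} "
              f"truncates={t.truncates_logs}")
```

Laid out this way, it is easy to see why a restore from differential backups needs only the last full backup plus the latest differential, while a restore from incremental backups needs every incremental taken since the last full backup.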
In the ensuing paragraphs we will look at the different Exchange Server backup techniques offered by Microsoft.
There are two main techniques for backing up and restoring Exchange stores:
1. Online streaming backup approach
2. Volume Shadow Copy Service (VSS) approach
1. Online Streaming Backup approach
Online backup through the ESE API enables you to back up Exchange Server databases to your backup medium without shutting down the server. When Exchange Server is performing an online backup, all services, including the information store, continue to run normally. Pages continue to be updated in memory and transferred to the database files on disk, transactions are recorded in the log files, and the checkpoint file continues to move along.
The ESE backup and restore system supports backup and restore of entire storage groups, as well as individual databases within the storage groups. Because each storage group uses a single set of log files covering all of the databases in the storage group, restore and recovery operations should be done over the entire storage group.
Streaming Backup process
Several important operations occur at the start of the Exchange storage group backup process. Here is an overview of the backup process:
When a full backup operation is initialized, ESE begins by flushing all the dirty pages in its cache to disk and halting the checkpoint. The checkpoint does not advance until the backup operation is complete. Note that when a log-only backup such as a differential or incremental backup runs, ESE lets the checkpoint advance because the backup operation does not touch the databases.
The next step in the backup process involves backing up the database files. The backup application uses backup API calls to pass to ESE a list of databases to be backed up. These databases are open files, so the backup application does not simply copy the databases to the backup set. Instead, ESE begins to send the backup application 64KB chunks of database pages (sixteen 4KB pages at a time) in sequential order. During this crucial step, ESE performs a checksum on each page; any errors cause the backup operation to terminate.
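The chunked read with per-page verification can be illustrated with a short Python sketch. This is not the ESE backup API. The function stream_database, the expected_checksums list, and the use of zlib.crc32 are all stand-ins chosen for illustration, since ESE uses its own checksum format rather than a separate list.

```python
import io
import zlib

PAGE_SIZE = 4 * 1024                      # 4 KB database page
PAGES_PER_CHUNK = 16
CHUNK_SIZE = PAGE_SIZE * PAGES_PER_CHUNK  # 64 KB handed to the backup app


class PageChecksumError(Exception):
    """Raised when a page fails verification; the backup terminates."""


def stream_database(db, expected_checksums, page_checksum=zlib.crc32):
    """Yield 64 KB chunks in sequential order, verifying every page first.

    expected_checksums is a hypothetical per-page list used here in place
    of ESE's own on-page checksum (an assumption of this sketch).
    """
    page_no = 0
    while True:
        chunk = db.read(CHUNK_SIZE)
        if not chunk:
            break
        for offset in range(0, len(chunk), PAGE_SIZE):
            page = chunk[offset:offset + PAGE_SIZE]
            if page_checksum(page) != expected_checksums[page_no]:
                raise PageChecksumError(f"checksum mismatch on page {page_no}")
            page_no += 1
        yield chunk


if __name__ == "__main__":
    # Twenty synthetic pages: one full 64 KB chunk plus four trailing pages.
    data = b"".join(bytes([i]) * PAGE_SIZE for i in range(20))
    sums = [zlib.crc32(data[i * PAGE_SIZE:(i + 1) * PAGE_SIZE])
            for i in range(20)]
    for chunk in stream_database(io.BytesIO(data), sums):
        print(len(chunk))
```

Because verification happens before a chunk is yielded, a damaged page stops the stream before the bad data reaches the backup set, which mirrors the terminate-on-error behaviour described above.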
Next, ESE must store the transaction logs to the backup set. As mentioned earlier, ESE halts the checkpoint at the beginning of the backup. Although the checkpoint is halted, ESE continues to write transactions to the log files and continues to flush dirty pages from the database cache to disk. To back up the log files, the backup application uses the appropriate API call to request a list of log files (and patch files, if applicable) from ESE. When ESE receives this call, it closes the current log file, saves the file as the next log generation in a sequential list, and opens a new E0n.log file (n refers to the storage group—SG—instance of the log file). In the case of a full backup, ESE then returns a list of log files to the backup application; this list starts with the current log generation (in which the checkpoint was halted) and ends with the log generation that ESE just closed (i.e., E0n.log minus 1). In the case of an incremental or differential backup, ESE returns a list beginning with the oldest log generation on disk and ending with the most recently closed log generation. Using this list, the backup application can open file handles to the log files and copy them to the backup set. During this operation, ESE ensures that no log generation is missing from the sequence passed to the backup application.
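The rule for choosing which log generations go into the backup set can be sketched as follows. This is a simplified model, not ESE code. Log generations are represented as plain integers, and the function name logs_for_backup and its parameters are hypothetical.

```python
def logs_for_backup(backup_type, generations_on_disk, checkpoint_gen,
                    last_closed_gen):
    """Return the log generations a backup application would be given.

    full:                      from the generation where the checkpoint was
                               halted up to the generation just closed
    incremental/differential:  from the oldest generation on disk up to the
                               generation just closed
    """
    if backup_type == "full":
        first = checkpoint_gen
    elif backup_type in ("incremental", "differential"):
        first = min(generations_on_disk)
    else:
        raise ValueError(f"unsupported backup type: {backup_type}")

    wanted = list(range(first, last_closed_gen + 1))

    # No generation may be missing from the sequence.
    missing = sorted(set(wanted) - set(generations_on_disk))
    if missing:
        raise RuntimeError(f"missing log generations: {missing}")
    return wanted


if __name__ == "__main__":
    on_disk = list(range(5, 13))  # generations 5 through 12 exist on disk
    print(logs_for_backup("full", on_disk, checkpoint_gen=9,
                          last_closed_gen=12))
    print(logs_for_backup("incremental", on_disk, checkpoint_gen=9,
                          last_closed_gen=12))
```

The gap check at the end reflects the guarantee that the sequence is contiguous: a single missing generation would make the later logs unusable during recovery.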
After the log files have been stored to the backup set, they are no longer needed on disk. During full and incremental backup operations, ESE truncates the log files on disk once the transaction logs have been backed up. The lower of the checkpoint log generation and the log generation listed in the database header for the current full backup determines which log files ESE truncates.
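As a rough illustration of the truncation rule, the sketch below picks the cutoff as the lower of the two generations and selects the logs beneath it. It assumes that only generations strictly below the cutoff are removed and that the cutoff generation itself is kept. That boundary detail is an assumption of this sketch, and the function name logs_to_truncate is hypothetical.

```python
def logs_to_truncate(generations_on_disk, checkpoint_gen, header_backup_gen):
    """Return the log generations that can be removed after a backup.

    The cutoff is the lower of the checkpoint generation and the generation
    recorded in the database header for the current full backup.
    Assumption: generations strictly below the cutoff are truncated.
    """
    cutoff = min(checkpoint_gen, header_backup_gen)
    return sorted(g for g in generations_on_disk if g < cutoff)


if __name__ == "__main__":
    on_disk = range(5, 13)  # generations 5 through 12 exist on disk
    print(logs_to_truncate(on_disk, checkpoint_gen=9, header_backup_gen=8))
```

Taking the lower of the two values is the conservative choice: a log is removed only when neither crash recovery (the checkpoint) nor restore of the current backup (the database header) still needs it.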
After the log files are truncated, the backup operation is complete and the backup application closes the backup set. At this point, ESE can return to normal database-engine operations and permit the checkpoint to advance.
2. Volume Shadow Copy Service approach
Microsoft Exchange Server 2003 uses the Volume Shadow Copy Service (VSS) that is included in the Microsoft Windows Server 2003 operating system to take volume shadow copies of Exchange Server 2003 databases and transaction log files.
What Is VSS?
VSS is a set of COM APIs that implements a framework that enables volume backups to be performed while applications on a system continue to write to the volumes. Requestors, writers, and providers communicate in the VSS framework to create and restore volume shadow copies. A shadow copy of a volume duplicates all the data held on that volume at one well-defined instant in time. By creating a read-only copy of the volume, backup programs are able to access every file (pertaining to the Exchange databases) without interfering with other programs writing to those same files.
The Exchange writer is automatically installed with Exchange Server 2003. Requestors can access the Exchange writer only if Exchange Server 2003 is installed on the Windows Server 2003 operating system. Use the vssadmin utility to display the list of writers and providers:
vssadmin list writers - displays a list of all installed writers
vssadmin list providers - displays a list of all installed providers
VSS backups are not available for Exchange Server 2003 if Exchange Server 2003 is installed on Microsoft Windows 2000 Server.
VSS operates at the block level of the file system. There are three major components in the VSS framework: the writer, the requestor (backup application), and the provider (see the VSS documentation on MSDN for more information about requestors, writers, and providers). The Volume Shadow Copy Service coordinates communication between requestors (backup applications), writers (applications and Windows services such as Exchange Server 2003), and providers (system, software, or hardware components that create the shadow copies). To use the Volume Shadow Copy Service to back up Exchange Server 2003, the backup program must include an Exchange Server 2003 aware Volume Shadow Copy Service requestor.
When instructed to do so by an Exchange aware requestor, the Exchange writer prepares Exchange databases for backup. The writer does this by suspending all disk write I/O to the databases for up to 20 seconds. This is referred to as freezing the databases. The provider must be able to complete the shadow copy within this window or the backup will be aborted. After the shadow copy is created, the writer thaws the databases and resumes regular I/O operations.
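The freeze and thaw behaviour can be modelled in a few lines of Python. This is a toy model, not the VSS COM interface. The class ExchangeWriterModel, the context manager frozen_databases, and the injectable clock are all hypothetical, and the sketch checks the window only after the work finishes rather than interrupting it.

```python
import time
from contextlib import contextmanager

FREEZE_WINDOW_SECONDS = 20.0  # window quoted in the text above


class FreezeWindowExceeded(Exception):
    """The shadow copy took longer than the freeze window; backup aborts."""


class ExchangeWriterModel:
    """Toy stand-in for a writer; tracks only the frozen state."""

    def __init__(self):
        self.frozen = False

    def freeze(self):
        self.frozen = True   # write I/O to the databases is suspended

    def thaw(self):
        self.frozen = False  # regular I/O resumes


@contextmanager
def frozen_databases(writer, window=FREEZE_WINDOW_SECONDS,
                     clock=time.monotonic):
    """Freeze on entry, always thaw on exit, abort if the window was exceeded."""
    writer.freeze()
    started = clock()
    try:
        yield
        elapsed = clock() - started
        if elapsed > window:
            raise FreezeWindowExceeded(f"shadow copy took {elapsed:.1f}s")
    finally:
        writer.thaw()


if __name__ == "__main__":
    writer = ExchangeWriterModel()
    with frozen_databases(writer):
        pass  # the provider would create the shadow copy here
    print("thawed:", not writer.frozen)
```

The finally clause is the important design point: the databases are thawed whether the shadow copy succeeds, fails, or overruns the window, so a failed backup never leaves write I/O suspended.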
VSS Backup Process
The backup process includes the following steps:
- The requestor initiates the backup process. The requestor instructs the writer to prepare a data set for backup.
- The writer prepares the data for backup. Exchange Server 2003 and other applications implement writers that prepare data according to the specific requirements of the application. After the data set is ready, the writer signals the requestor to back up the data set.
- The provider interacts with the disk system and manages shadow copies. When instructed by the requestor, the provider creates a shadow copy.
- The requestor signals backup success or failure to the writer, and completes the backup process.
By separating the functionality of requestors, writers, and providers, the VSS framework makes each component independent of the others. A single requestor can interact with different providers or with multiple writers.
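The division of labour between the three roles can be sketched with plain Python classes. This models only the order of the calls described in the steps above. It does not use the real VSS COM interfaces, and the class names, method names, and file names are invented for illustration.

```python
class Writer:
    """Application side: prepares its data set and learns the outcome."""

    def __init__(self, name, files):
        self.name = name
        self.files = list(files)
        self.outcome = None

    def prepare(self):
        # Application-specific preparation would happen here.
        return list(self.files)

    def backup_complete(self, success):
        self.outcome = success


class Provider:
    """Storage side: creates the shadow copy when asked."""

    def create_shadow_copy(self, files):
        return {f: f"shadow:{f}" for f in files}


class Requestor:
    """Backup application: drives the sequence for any writer and provider."""

    def __init__(self, provider):
        self.provider = provider

    def backup(self, writers):
        backed_up = {}
        for writer in writers:
            files = writer.prepare()                      # steps 1 and 2
            try:
                shadow = self.provider.create_shadow_copy(files)  # step 3
                backed_up.update(shadow)
                writer.backup_complete(True)              # step 4
            except Exception:
                writer.backup_complete(False)             # step 4, failure
        return backed_up


if __name__ == "__main__":
    exchange = Writer("exchange", ["priv1.edb", "priv1.stm"])
    other = Writer("other", ["data.bin"])
    print(Requestor(Provider()).backup([exchange, other]))
```

Because the requestor depends only on the prepare, create_shadow_copy, and backup_complete calls, a different provider or an additional writer can be substituted without changing the requestor, which is the independence the framework is designed to give.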
So which technology is better, streaming or VSS? Each has distinct advantages and limitations. Some of the important ones are listed below:
Both technologies allow Exchange database backups while the stores are online.
One thing is certain: Exchange Server 2010 makes the streaming approach obsolete (see Microsoft's documentation on the changes made to backup and restore in Exchange 2010 for more information). Because the ESE streaming backup API is no longer supported, backup application providers have no choice but to use VSS technology to provide DR functionality for Exchange 2010.
VSS is supported only from Exchange 2003 (on Windows Server 2003 with SP1) onwards. Hence, the only option offered for Exchange 2000 was streaming backup and restore.
When a database is backed up using the Exchange streaming backup API, each page in the database is read in turn, and the checksum integrity of each page is verified during the backup process. The checksum integrity of transaction log files is also checked before they are backed up.
During a VSS backup, there is no opportunity for Exchange to read each database file in its entirety and to verify its checksum integrity. Therefore, database and transaction log file integrity must be verified by the backup application. This can be accomplished by running Eseutil in checksum mode, eseutil /k /i (see the Eseutil documentation on MSDN for more information).
The most important benefit of a VSS-based backup solution is that it allows for very rapid restoration of data. VSS solutions are most useful for deployments that include large databases that require a shorter restoration time (less than 60 minutes). This requirement is beyond the capabilities of current streaming backup solutions.