The Exchange Server 2000 Database Structure: It’s always a rainy day when you’re doing restorations
Exchange Server is a complex beast, to say the least. It's a fact of life that we can't escape, and if you are going to work with Exchange it's best to accept it and move on. As easy as Exchange 2000 is to work with, it's still not the most pleasurable thing to do on a Saturday afternoon (or any day, for that matter). Understanding the underlying processes and construction of Exchange can go a long way towards increasing your happiness factor, which is, after all, the only thing that really matters in life!
Everything these days requires a database of some sort. The Internal Revenue Service, Active Directory, even the local supermarket: they all use a database. Databases are great, but they are not always easy to understand. Figure 1 shows an example of what the Exchange Server database looks like.
Figure 1 - The current database (from Chapter 28 of the Exchange 2000 Server Resource Kit).
As you can see from Figure 1, there are actually three files that make up a current Exchange database for a storage group, as explained below.
- The .edb file contains all the folders, tables and indexes for messaging data and MAPI messages and attachments.
- The .stm file (new to Exchange 2000) contains Internet content in its native format.
- The .log files (transaction logs) maintain a record of every message stored in a storage group and provide fault tolerance in the event that a database must be restored. Exchange 2000 log files are always 5 MB in size (5,242,880 bytes); if they are not, they are damaged. Each storage group also reserves two log files, Res1.log and Res2.log, which are placeholders for extra disk space that can be used if the service runs out of space.
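Since a healthy log file is always exactly 5 MB, a quick size check is an easy way to spot trouble. The following is a minimal sketch in Python, not a Microsoft tool: the function name find_suspect_logs and the idea of scanning a single directory are my own assumptions for illustration, and it only reads file sizes, so it never touches the logs themselves.

```python
from pathlib import Path

# 5 MB expressed in bytes: 5 * 1024 * 1024 = 5,242,880
EXPECTED_LOG_SIZE = 5 * 1024 * 1024


def find_suspect_logs(log_dir):
    """Return the names of .log files whose size is not exactly 5 MB.

    Hypothetical helper for illustration; it only inspects file sizes.
    """
    suspects = []
    for path in sorted(Path(log_dir).glob("*.log")):
        if path.stat().st_size != EXPECTED_LOG_SIZE:
            suspects.append(path.name)
    return suspects
```

Point it at the folder that holds a storage group's transaction logs; an empty list means every log file has the expected size.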
The Checkpoint File
In addition to the files previously mentioned, there is one other file of special note that plays a big role in keeping your Exchange Server database in order. The checkpoint file (edb.chk) tracks which entries in the transaction log files have already been written to the database, and thus which ones will need to be replayed during a restoration. The checkpoint file speeds up recovery by telling the Extensible Storage Engine (ESE) exactly which log file entries need to be replayed and which do not, which prevents unnecessary writes during the restoration process.
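To make the checkpoint idea concrete, here is a deliberately simplified sketch. It assumes each log file can be identified by a generation number and that the checkpoint points at a whole generation; the real checkpoint tracks a position inside a log file, and the name logs_to_replay is hypothetical.

```python
def logs_to_replay(generations, checkpoint_generation):
    """Return the log generations that still need replaying, oldest first.

    Simplified model: everything older than the checkpoint generation is
    treated as already written to the database and is skipped.
    """
    return sorted(g for g in generations if g >= checkpoint_generation)


# Five log generations on disk, checkpoint sitting at generation 3:
pending = logs_to_replay([5, 3, 4, 1, 2], checkpoint_generation=3)
print(pending)  # [3, 4, 5]
```

Without a checkpoint, the ESE would have no shortcut and would have to consider every log on disk.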
Typically, when a log file is filled, Exchange renames it and moves on to another, fresh log file. In this way, log files are not erased and thus continue to use space in 5MB increments. As the number of transactions grows, a set of log files is created. If a database fails, the transactions can be recovered by restoring the data from the log files. When circular logging is enabled, the first log file is overwritten and reused after the data that it contains has been written to the database. Circular logging is available to you, but is disabled by default. Should you enable circular logging, you cannot recover anything more recent than the last full backup. For this reason, circular logging is not normally recommended for use in a mission-critical production environment, with the possible exception of the Public folder that will house your NNTP news feeds, where log file sets are not required.
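The disk space trade-off is easy to put into numbers. This is a rough sketch only: the window parameter (how many log files circular logging keeps around before reusing the oldest) is an assumption I picked for illustration, not a documented Exchange value.

```python
LOG_SIZE_MB = 5  # each transaction log file is 5 MB


def log_disk_usage_mb(filled_logs, circular=False, window=4):
    """Estimate the disk space used by transaction logs, in MB.

    `window` is a hypothetical count of log files retained when circular
    logging is enabled; without circular logging every filled log stays.
    """
    if circular:
        return min(filled_logs, window) * LOG_SIZE_MB
    return filled_logs * LOG_SIZE_MB
```

With 1,000 filled logs and no backup to purge them, the default setting uses about 5,000 MB, while the circular model stays at 20 MB, at the cost of not being able to roll forward past the last full backup.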
The concept of a checksum is not a new one. Checksums have been used for years to determine file validity. Exchange Server makes use of checksums to verify the validity of the .edb files. Every .edb file is made up of 4-KB pages, and the integrity of each page is verified through a checksum and a 4-byte page number in the header of the database page. On each page in the database, the first 82 bytes contain the header information, which includes flags for the type of page and information about what kind of data the page contains. When pages are read out of the database, each one is checked for the correct page number and the correct checksum. The checksum is recalculated to ensure that the page being read is undamaged. If damage is detected, an error is returned, the database is stopped, and an event is written to the event log, which keeps a damaged database from being used.
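The page check described above can be sketched in a few lines. Treat this as an illustration under stated assumptions, not as the ESE's real on-disk format: the 8-byte header layout (checksum, then page number) is simplified from the 82-byte header, CRC-32 stands in for whatever checksum the ESE actually computes, and build_page and verify_page are hypothetical names.

```python
import struct
import zlib

PAGE_SIZE = 4096        # 4-KB pages, as described above
HEADER_FORMAT = "<II"   # assumed layout: 4-byte checksum, 4-byte page number
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def build_page(page_number, payload):
    """Build one page: checksum and page number, then zero-padded payload."""
    body = payload.ljust(PAGE_SIZE - HEADER_SIZE, b"\x00")
    if len(body) != PAGE_SIZE - HEADER_SIZE:
        raise ValueError("payload too large for one page")
    number = struct.pack("<I", page_number)
    # CRC-32 is a stand-in here, not the ESE's actual checksum algorithm.
    checksum = zlib.crc32(number + body)
    return struct.pack("<I", checksum) + number + body


def verify_page(page, expected_number):
    """Return True only if the page number and the checksum both match."""
    if len(page) != PAGE_SIZE:
        return False
    checksum, number = struct.unpack_from(HEADER_FORMAT, page)
    if number != expected_number:
        return False
    # Recompute over everything after the stored checksum field.
    return zlib.crc32(page[4:]) == checksum
```

A page that was written for slot 7 passes verify_page(page, 7) but fails if it is read back from the wrong slot or if any byte of it has changed, which are the two failure cases the page number and checksum are there to catch.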
Other Important Files
Although not part of the actual Exchange database, the following two additional files may also be present on an Exchange Server:
- The .srs files, which permit backwards compatibility with Exchange 5.5 Server by emulating an Exchange 5.5 directory service. These will only be present if the Exchange Active Directory Connector (ADC) is installed and you then configure a Site Replication Service.
- The .kms files, which provide security and encryption services. These will only be present on Exchange servers that have the Key Management Service (KMS) installed.
Yeah, But So What?
OK, so now we've got an idea of what makes up the Exchange Server databases and what special features they have. But who cares? What good does this do? Well, when I alluded earlier to working with Exchange on a Saturday afternoon, I had in mind the moment when the Exchange Server craters and you find yourself in the midst of restoring it so that business can go on as normal on Monday. That's why you've got that pager after all, isn't it?
I discussed Exchange recovery in another article, Disaster Recovery, but I never really got into detail about setting up the backup system or how the restore action occurs.
Before we can get to the process of performing the backup, and more specifically, how Exchange handles a backup request, we need to understand what each type of backup will do for us.
There are five basic types of backups that can be performed using ntbackup.exe, but only four of those apply to Exchange Server. They are summarized below:
- Full (normal) backups back up the entire Web Storage System and the Exchange log files. All transaction logs that contain transactions already committed to the database are deleted. Restoring from a full backup requires only the full backup media. Full backups are the preferred means of backing up the Exchange databases.
- Copy backups act the same as full backups, with the exception that the transaction log files are not deleted. You can perform a copy backup at any time without disturbing the status of any other type of backup.
- Incremental backups back up all log files prior to the checkpoint log and then delete them; in other words, the log files that contain transactions already committed to the database are purged once they have been backed up. Restoring from an incremental backup requires that you have the last full backup and each subsequent incremental backup. If one incremental backup is damaged, you cannot restore any incremental backups made after that point, as one damaged log file prevents replaying subsequent log files. It is critical that all incremental backups be restored prior to starting log file replay to prevent losing data or damaging the database.
- Differential backups back up all log files prior to the checkpoint file, but do not delete them. Because of this, each differential backup will be larger than the previous one. Restoring from a differential backup requires that you have the last full backup and the last differential backup. Differential backups are the second most preferred method of performing backups, after full backups.
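The restore requirements in the list above boil down to a small rule, sketched here in Python. The function name restore_set and the (label, kind) tuple format are assumptions made for the example, and the sketch assumes you do not mix incremental and differential backups after the same full backup; copy backups are ignored because they do not change what the other types need.

```python
def restore_set(backups):
    """Return the backups needed for a restore, in the order to apply them.

    `backups` is a chronological list of (label, kind) tuples, where kind
    is "full", "copy", "incremental", or "differential".
    """
    # Everything starts from the most recent full backup.
    last_full = None
    for index, (_, kind) in enumerate(backups):
        if kind == "full":
            last_full = index
    if last_full is None:
        raise ValueError("no full backup available")

    needed = [backups[last_full]]
    later = backups[last_full + 1:]
    incrementals = [b for b in later if b[1] == "incremental"]
    differentials = [b for b in later if b[1] == "differential"]
    if incrementals:
        # Every incremental since the full backup is required, in order.
        needed.extend(incrementals)
    elif differentials:
        # Only the most recent differential is required.
        needed.append(differentials[-1])
    return needed
```

For a week of Sunday full backups with nightly incrementals, a Wednesday failure needs Sunday, Monday, and Tuesday; with nightly differentials it needs only Sunday and Tuesday.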
The Backup Process
When the backup process is started (using ntbackup.exe), the Web Storage System informs the ESE that it is entering backup mode, and a patch file is created for each database in the backup (for a full backup only; other backup types do not create a patch file). The currently open log file is closed out and renamed, and a new log file is opened at this time as well. This marks the point at which the ESE can truncate the logs after the backup process has completed. Figure 2 illustrates the backup process.
Figure 2 - The backup process (from Chapter 28 of the Exchange 2000 Server Resource Kit).
When the backup is started, the agent requests that the database read and sequence all database pages from the ESE. As the database reads the pages, the ESE verifies them through a checksum to ensure that they are valid. If they are invalid, the backup stops to prevent the storage of damaged data. After the backup is complete and all the pages are read, the backup copies the logs and patch files to the backup set. The log files are then truncated or deleted at the point when the new generation started at the beginning of the backup. The backup set closes, the ESE enters normal mode, and the backup is complete.
The preceding description assumed that you were performing an online backup (databases online at the time of backup), which is the preferred mode since it allows the databases to remain online and usable. You can, however, perform an offline backup by taking the databases offline. Offline backups are always full backups as the databases are dismounted and therefore not available for writing by network clients.
Of course, the backup is just the first half of the solution. Being able to restore the data would be nice as well.
The Restoration Process
The restoration process pretty much mirrors the backup process, but obviously in reverse. Before you can perform a restoration, you will need to take the database (or storage group) offline by dismounting it. When the restoration process begins, the ESE enters restore mode. The backup agent copies the database from the backup media to the target location. The associated log and patch files are copied to a temporary location (as specified by the backup operator) so they aren't saved to the same location as the current files in the production database directory. Should the log and patch files happen to be placed in the same location, log files can be overwritten, which will corrupt the database. After the files are restored, a special instance of the ESE starts for the specific purpose of restoring the database. It applies the patch file and log files to bring the database up to date. After the restore is complete, the log and patch files are deleted from the temporary location and the storage group is mounted and made available for use. Figure 3 summarizes the restoration process.
Figure 3 - The restoration process (from Chapter 28 of the Exchange 2000 Server Resource Kit).
One Last Thing:
A point worth mentioning is that the version of ntbackup.exe that ships with Windows 2000 (5.0.2172.1) cannot be used to perform Exchange 2000 Server backups. You will need to have version 5.0.2195.1117 or later installed on your system. Figure 4 shows the version of ntbackup.exe that ships with Windows 2000 (unmodified) and Figure 5 shows the version that comes with Service Pack 2.
Figure 4 - ntbackup.exe original file.
Figure 5 - ntbackup.exe in Service Pack 2.
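If you have the version string of ntbackup.exe in hand (from the file's Properties dialog, for example), comparing it against the minimum is simple once the dotted string is split into numbers. This is a small sketch; parse_version and supports_exchange_backup are hypothetical helper names, and it does not read the version from the file for you.

```python
MINIMUM_VERSION = "5.0.2195.1117"  # minimum ntbackup.exe version noted above


def parse_version(text):
    """Turn a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def supports_exchange_backup(version):
    """Return True if the given ntbackup.exe version is new enough."""
    return parse_version(version) >= parse_version(MINIMUM_VERSION)


print(supports_exchange_backup("5.0.2172.1"))     # False
print(supports_exchange_backup("5.0.2195.1117"))  # True
```

Comparing tuples of integers avoids the trap of comparing the strings directly, where "5.0.2195.999" would wrongly sort after "5.0.2195.1117".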
As we've seen, the Exchange Server database arrangement is a fairly complex one, although one that has safeguards built into it to minimize damage and prevent the use of damaged databases. The backup and restore processes are quite complex, although for the most part hidden away from us. The most important thing that I can leave you with is this: do not arbitrarily delete your transaction logs or checkpoint file. Doing so may really, really screw up your weekend. Let Exchange and the backup process handle purging these files; it's just better that way.