TechTutorials - Free Computer Tutorials  

Fundamentals of Exchange 5.5 Databases and Logs 

Added: 01/22/2002
By Brian Talbert

You are probably aware of the problem of backing up data while a server is online. When applications such as Exchange open their databases, the files remain open the entire time the program is running. For Exchange this typically means that the database is inaccessible to all but Exchange from the time the server starts until the time the server shuts down. Traditionally, you would have no choice but to shut down the service that is keeping the database open, back up the necessary files, and then start the services again. This is an obvious problem when your applications require high availability.

Fortunately, Exchange ships with a replacement version of NTBACKUP. This new version installs automatically through Exchange Setup and provides some nice enhancements to the current version. Mainly, it allows Exchange databases to be backed up while the server is online! This will be described a little later on. First, let's take a look at Exchange database files and how they operate to provide fault tolerance.

The Database Files
All Exchange data is saved in Exchange Database files. They each, appropriately, have an EDB extension. The following table outlines the Databases and their associated files, along with default location.

Database            Files                        Default location
Information Store   PUB.EDB, PRIV.EDB, *.LOG     \exchsrvr\mdbdata
Directory           DIR.EDB, *.LOG               \exchsrvr\dsadata

How Transaction Logs Work
Note that the Information Store actually consists of TWO EDB files. Also notice that in addition to the various EDB files, each database also has associated LOG files. Exchange implements the idea of LOG files for both performance and recoverability reasons.

All data is first written to log files before being written to the actual databases. If you are familiar with high-end database products, such as MS SQL Server, then you are no doubt familiar with this concept. It works in much the same way with Exchange, though actual implementation varies.

A transaction is a discrete unit of work that affects one of the three databases. It could be the addition of a message to the Private Information Store, or the deletion of a message from the Public Information Store, or even the modification of an object in the Directory.

When Exchange receives such a transaction, it first writes the transaction to the LOG files. Many such transactions may accumulate in the LOG files. When the CPU has idle cycles, the transactions are copied from the LOGs to the actual database. The transactions in the LOG are then marked as "committed" and will eventually be purged (more on this later).
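
The sequence just described can be sketched in a few lines of Python. This is only a toy model of the idea, not Exchange's actual implementation; the class and method names (TinyStore, write, flush_when_idle) are invented for illustration.

```python
class TinyStore:
    """Toy write-ahead log: transactions hit the log first, the database later."""

    def __init__(self):
        self.log = []        # sequential record of transactions (the LOG files)
        self.database = {}   # stands in for the EDB file
        self.committed = 0   # how many log entries have been applied so far

    def write(self, key, value):
        # Step 1: record the transaction in the log only.
        self.log.append((key, value))

    def flush_when_idle(self):
        # Step 2: copy pending transactions from the log into the database,
        # then mark them as committed.
        for key, value in self.log[self.committed:]:
            self.database[key] = value
        self.committed = len(self.log)


store = TinyStore()
store.write("msg-1", "Hello")
store.write("msg-2", "World")
pending_before = len(store.log) - store.committed  # both still only in the log
store.flush_when_idle()
print(pending_before, store.database)
```

Until flush_when_idle runs, the two messages exist only in the log, which is why the log alone is enough to rebuild recent changes.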

This series of events has the benefit of increased performance. Because the data is first written to the LOG files, the active database does not have to handle READ and WRITE I/O operations at the same time. Instead, the WRITE operations can wait for idle CPU cycles. This is especially helpful when the database is on a RAID5 disk subsystem that must write parity information in addition to the data. Furthermore, since the LOG files are maintained in SEQUENTIAL order, the disk heads travel much less during WRITE operations. Compare this to the EDB files, which have a more randomized organization.

The more significant benefit, though, is recoverability. Because data is written to the LOG in addition to the database, a second copy is available in the event the database is lost. This benefit is greatest when the LOG files and the EDB files are kept on separate physical disks (highly recommended!). If this is the case and the database disk dies, a backup can be restored to a new disk and the current log files can be replayed against the database, providing recovery right up to the point of failure.
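
As a rough illustration of "restore the backup, then replay the logs", here is a minimal Python sketch. The record layout (operation, key, value) and the function name recover are assumptions made for the example; they do not reflect the real LOG file format.

```python
def recover(backup_snapshot, log_entries):
    """Rebuild a database from the last backup plus the logs written since."""
    database = dict(backup_snapshot)      # start from the restored backup
    for op, key, value in log_entries:    # replay in the original order
        if op == "put":
            database[key] = value
        elif op == "delete":
            database.pop(key, None)
    return database


# State captured by the last backup, then the transactions logged afterwards.
backup = {"msg-1": "Hello"}
logs = [("put", "msg-2", "World"), ("delete", "msg-1", None)]

recovered = recover(backup, logs)
print(recovered)
```

The replay only works if the logs survive the failure, which is the reason for keeping them on a different physical disk than the database.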

The Log Files
Log Files are pretty much self-maintaining but it is important to understand how they work so that you can plan your system effectively. As mentioned, log files are maintained within each Database directory. The current primary log file is always named EDB.LOG. Note that although the Information Store consists of two EDB files, there is only one primary LOG file.

Log files are ALWAYS 5MB (5,242,880 bytes) in size. This is regardless of the amount of actual data contained within the database. Once the EDB.LOG fills up, it is renamed using a sequential number scheme, EDB00001.LOG, EDB00002.LOG, EDB00003.LOG, and so on. A new EDB.LOG file is created and all current transactions go into the new EDB.LOG file.
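
The rollover can be modelled as below. This is a sketch, not Exchange code: LogDirectory and archived_name are hypothetical names, and formatting the sequence number as hexadecimal is an assumption not stated in this article (for the first few files the names match the ones shown above either way).

```python
LOG_SIZE = 5 * 1024 * 1024  # every log file is preallocated at 5,242,880 bytes


def archived_name(generation):
    # Assumption: five-digit sequence number, formatted here as hexadecimal.
    return f"EDB{generation:05X}.LOG"


class LogDirectory:
    def __init__(self):
        self.files = ["EDB.LOG"]   # the current primary log
        self.generation = 0

    def roll_over(self):
        # The full EDB.LOG is renamed to the next sequential name...
        self.generation += 1
        renamed = archived_name(self.generation)
        self.files.remove("EDB.LOG")
        self.files.append(renamed)
        # ...and a fresh EDB.LOG takes over for current transactions.
        self.files.append("EDB.LOG")
        return renamed


directory = LogDirectory()
for _ in range(3):
    directory.roll_over()
print(directory.files)
```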

The Checkpoint
So, what happens to all of these LOG files? As each transaction gets written to the database, it is marked as committed. During the next FULL backup, any logs that are ENTIRELY committed are PURGED, or deleted. This is tracked through a concept known as the "checkpoint file". Each database has a file called EDB.CHK. This file contains entries for each transaction indicating if the transaction was committed. When data is written from the LOG to the EDB file, the entry in the CHK file is updated to reflect it as being committed.

Since each of these LOG files is kept until a FULL backup is performed, together they contain all data SINCE the LAST FULL backup. As such, when an INCREMENTAL or DIFFERENTIAL backup is performed, it is actually the LOG files that are being backed up instead of the EDB files.
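
The purge rule (only logs that are ENTIRELY committed are deleted at the next FULL backup) can be expressed as a small Python sketch. The data layout, a mapping from log name to one committed flag per transaction, and the function name purge_on_full_backup are assumptions for illustration only.

```python
def purge_on_full_backup(logs):
    """Split logs into those kept and those purged after a full backup.

    logs maps a log file name to a list of booleans, one per transaction,
    where True means the transaction has been committed to the database.
    """
    kept = {name: flags for name, flags in logs.items() if not all(flags)}
    purged = sorted(name for name in logs if name not in kept)
    return kept, purged


logs = {
    "EDB00001.LOG": [True, True, True],   # entirely committed
    "EDB00002.LOG": [True, False],        # one transaction still pending
    "EDB.LOG": [False],                   # current log, nothing committed yet
}

kept, purged = purge_on_full_backup(logs)
print(purged, sorted(kept))
```

A log with even one uncommitted transaction stays on disk, so nothing that has not reached the database is ever thrown away.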

The Reserve Logs
Now consider what might happen if there were not enough room on a hard disk for all of these transaction LOGs. Planning is obviously very important. However, Exchange does take this into account. Each database (IS and DS) keeps two RESERVE LOG files, RES1.LOG and RES2.LOG. In the event that there is not enough room to create a new EDB.LOG file, one or both of these LOG files will be used to write any current transactions and then the respective service (IS or DS) will be shut down. Remember that the LOG files are preallocated at 5MB ... so there is at least room for 10MB worth of transactions.
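
The arithmetic and the decision can be written out as follows. This is a simplified sketch; the function name next_action and the returned strings are made up for the example, and the real services make this decision internally.

```python
LOG_SIZE = 5 * 1024 * 1024               # each log is preallocated at 5MB
RESERVE_LOGS = ("RES1.LOG", "RES2.LOG")  # two reserve logs per database

# Space already set aside for emergencies: 2 x 5MB.
reserve_capacity = LOG_SIZE * len(RESERVE_LOGS)


def next_action(free_bytes):
    """Decide what happens when the current EDB.LOG fills up."""
    if free_bytes >= LOG_SIZE:
        return "create new EDB.LOG"
    # Not enough room for another 5MB log: fall back to the reserve logs,
    # after which the service (IS or DS) is shut down.
    return "use reserve logs, then stop service"


print(reserve_capacity)
print(next_action(200 * 1024 * 1024))
print(next_action(1024))
```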

How Online Backup Works
So HOW does Exchange enable the EDB files to be backed up while in use and still maintain integrity? It is quite simple, actually. If a transaction is received that would alter a portion of the EDB that has NOT yet been backed up, Exchange simply writes it into the EDB file as it otherwise would. However, if a transaction is received that would modify a portion of the EDB file that HAS been backed up, it is not only written out to the EDB file but also written out to a PATCH file. After the EDB has been backed up, the .PAT files are also backed up. Later, during the restore, these PAT files will get rolled into the EDB file restored from tape. The .PAT files are only visible during the actual backup operation. Once backed up they are deleted from disk.
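
Here is a toy Python model of the patch-file idea. It treats the database as a list of "pages" copied in order and assumes writes arrive between page copies; the function names and the shape of the writes argument are hypothetical and chosen only to make the rule visible.

```python
def online_backup(live_pages, writes):
    """Copy pages in order while writes keep arriving.

    writes maps a backup position to the list of (page, value) changes that
    arrive just before the page at that position is copied.
    """
    backup_copy = []
    patch = []  # stands in for the .PAT file
    for position in range(len(live_pages)):
        for page, value in writes.get(position, []):
            live_pages[page] = value         # always written to the database
            if page < position:              # that page was already backed up,
                patch.append((page, value))  # so it also goes to the patch
        backup_copy.append(live_pages[position])
    return backup_copy, patch


def restore(backup_copy, patch):
    restored = list(backup_copy)
    for page, value in patch:                # roll the patch into the copy
        restored[page] = value
    return restored


live = ["a", "b", "c", "d"]
# While page 2 is about to be copied, pages 0 and 3 are modified.
backup_copy, patch = online_backup(live, {2: [(0, "A"), (3, "D")]})
restored = restore(backup_copy, patch)
print(backup_copy, patch, restored)
```

The change to page 3 needs no patch entry because the backup has not reached it yet, while the change to page 0 would be lost from the backup without one.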

Circular Logging
All of this LOG functionality is certainly great ... if it is enabled. Remember that CIRCULAR LOGGING IS ENABLED BY DEFAULT and, until you turn it off, the recoverability side of this logging functionality is next to useless.

When Circular Logging is enabled, transaction logs are still used, but they aren't saved. This means you get the performance benefit of logs but you don't get the recoverability features.

Exchange will attempt to maintain about 4 log files. If I/O is heavy there may be more. But essentially, when it is time to create a new log file, Exchange looks at the oldest one. If it is completely committed it gets purged. This means that your existing LOG files will not contain all data since the last backup. For this reason Incremental and Differential backup options are NOT available while Circular Logging is enabled.
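
The circular rule can be sketched like this. It is a minimal model under the description above; the function name add_log_circular and the (name, fully_committed) tuples are assumptions for illustration.

```python
def add_log_circular(logs, new_name):
    """Create a new log under circular logging.

    logs is a list of (name, fully_committed) tuples, oldest first.
    """
    if logs and logs[0][1]:
        logs = logs[1:]                  # oldest log is fully committed: purge it
    return logs + [(new_name, False)]    # then start the new log


quiet = [("log1", True), ("log2", True), ("log3", False), ("log4", False)]
busy = [("log1", False), ("log2", False), ("log3", False), ("log4", False)]

after_quiet = add_log_circular(quiet, "log5")
after_busy = add_log_circular(busy, "log5")
print(len(after_quiet), len(after_busy))
```

On the quiet server the count stays at four and the oldest log disappears without ever being backed up; on the busy server the oldest log still holds uncommitted transactions, so the count grows to five.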
