This article describes the concepts and implementation of configuring log shipping when using IBM® DB2® Universal Database™ for open systems.
As database systems become more and more critical to business success, the need to ensure 24x7 availability is greater than it has ever been. One common method for providing 99.99%, or "4-nines," availability is to implement a "warm" standby database server. The use of standby servers is not a new concept; database administrators (DBAs) have been using them for years. Typically, standby servers require the DBA or operator to manually create backups of the database and logs on the primary system and then restore these backups on a standby server on a regular basis. If the primary server fails, the downtime is limited to the amount of time required to process the log files generated since the last backup was restored on the standby server.
Standby server failover is typically not automated. Someone has to decide whether or not it would take less time to bring up the standby server than it would to repair the original failure on the primary system.
Log shipping is a method in which transaction logs are automatically backed up from a primary DB2 server and made accessible to a standby server. Once the log files arrive on the standby server, that server can stay relatively synchronized with the primary server.
What are the benefits of log shipping and why should you take the time to set this up? Log shipping provides the following benefits:
- A redundant failover system that does not require expensive software or hardware to implement. The primary and standby servers do not have to be identical from a hardware or a software perspective (see the prerequisites section below). The standby server can be used for other purposes; it does not have to sit idle. For example, a separate, independent database can run on the standby server, even though the standby database itself remains inaccessible while it processes incoming log files.
- A configuration that is relatively low cost and easy to maintain, once it is set up.
- A very reliable method of providing a redundant copy of the database.
- A warm standby: because the database is in rollforward mode, it is already started and its data cache is already primed.
- The ability to be configured to permit very little, if any, data loss should a failure occur.
- Support for both local site and disaster (remote) scenarios.
Log shipping has some limitations. It does not provide the full functionality that systems such as HACMP or Veritas Cluster do. However, it also does not require any additional hardware or software. It comes down to a matter of cost versus availability and complexity. Log shipping is a practical solution for most customers who require redundant systems but who can also accept some data loss during a failover scenario.
Log shipping can only be completely automated with the use of additional software. The DBA or operator must still manually transfer the primary server's functionality to the standby server when a failure occurs; however, it is possible to script this failover to minimize human intervention. Users are interrupted for the amount of time required to replay the log file or files and back out any incomplete transactions, in addition to the time required to reconnect the users' applications. The time required to get the standby database online can be controlled by how often the standby server processes the incoming log files, as well as by the log file size.
Once the database has been switched to the standby server, the client applications have to be changed to point to the new server as well. Alternatively, you can transfer the host name and IP address.
When an index is rebuilt on DB2, a single log record is written to the log to indicate that this action has started. As the standby database processes this log record, it does not automatically rebuild the index on the standby server. DB2 can be configured (by setting the database manager INDEXREC configuration value) to have the index rebuilt either on the first connection to the database, after it is taken out of roll-forward pending state (take-over time, for example), or when the first attempt is made to access the index. Whatever method is employed, the end user sees some performance degradation when system failover occurs. One way to prevent this is to repopulate the standby database from a backup image of the primary database, or to refresh using I/O suspend and split mirror technology whenever an index is rebuilt.
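The INDEXREC behavior described above is set at the database manager level. As a sketch (exact syntax may vary by DB2 version; check your version's command reference), the two options might be configured like this:

```shell
# Rebuild invalid indexes when the database is restarted, i.e., at takeover
# time, so the rebuild cost is paid once before users reconnect
db2 update dbm cfg using INDEXREC RESTART

# Alternatively, defer each rebuild until the index is first accessed,
# spreading the cost across the first queries that touch each index
db2 update dbm cfg using INDEXREC ACCESS
```

The trade-off is between a longer takeover window (RESTART) and slower initial queries after takeover (ACCESS).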
Running the DB2 load utility on the primary database will affect the standby database server. When the LOAD command is invoked, you can choose to have the load utility make a backup image of the table space being loaded, or defer the creation of a backup image until later. If you choose to have the load utility create the backup image, the standby server must have access to the target device used by the load utility. If you choose not to back up until later, or the backup image is not available when the load log record is replayed, the standby server places the table space being loaded in restore pending state. In either case, you should refresh the standby database once the load operations are complete to ensure that all data is accessible on the standby server in case a failover is required.
Here are the prerequisites that must be met prior to the setup and configuration of log shipping with DB2:
- Both the primary and standby systems must be running the same version of DB2. You can failover to the standby server to install a new Fix Pack of DB2 on the primary system; however, the version must be the same or higher. You cannot use this method to "back out" of a Fix Pack, as both systems must be running not only the same level of DB2 but also the same level of operating system.
- The standby system must have at least as much disk space available for the database and log files as the primary system. You must account for the possibility that when one fails over to the standby system, the primary server may not be available for several days.
- All automated processes running on the primary server for database maintenance must be configured on the standby server. DB2 only allows one user exit program to be configured per instance. If the standby server already has an active database on it, it should be using a separate DB2 instance from that of the primary system.
- A log archive target must be accessible on the standby server. After the failover, the log files must be kept so that the primary database can be brought back online.
- A full database backup must be restored on the standby system to initialize the warm standby. All log files generated on the primary system after this backup was created are also required.
There are many ways to implement log shipping with DB2. This article discusses some of the more popular methods.
In all cases, the standby server requires a scheduled job that periodically issues the db2 rollforward db <dbname> to end of logs command. The frequency at which this job runs determines how quickly the standby server can be made available in a failover scenario.
The frequency can also be used as a method to protect the database from application errors. For example, if the standby server is kept several hours behind the primary system and if an application corrupts data in the database, the database can be failed over to the standby server to "back out" the corrupted data while having minimal impact on the users.
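The scheduled rollforward job can be sketched as a small shell script driven by cron. The database name, paths, and schedule below are assumptions; adjust them for your environment:

```shell
#!/bin/sh
# rollforward.sh - replay any newly arrived log files on the standby server.
# Assumes the DB2 instance environment (e.g. ~/sqllib/db2profile) has been
# sourced and that the standby database is named SAMPLE.
DB=SAMPLE
db2 rollforward db $DB to end of logs >> /home/db2inst1/rollforward.log 2>&1
```

A crontab entry such as `0,15,30,45 * * * * /home/db2inst1/rollforward.sh` would replay logs every 15 minutes; a longer interval deliberately keeps the standby further behind, which supports the "back out application errors" scenario described above.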
All log-shipping configurations are implemented using a user exit program; this is the only method available to manage log files within DB2. The DB2 logger archives a log file when it becomes full, and the db2uext2 executable is then responsible for that log file.
There are two methods for log shipping. In the pull method, the standby server pulls the log files from a central shared location (such as the log archive target) when required. In the push method, the primary server ensures that the log files reside on the standby server when it archives the primary log files.
DB2 archives log files to the target directory specified in the user exit program, db2uext2. Samples of this user exit program are located in the DB2 instance's sqllib/samples/c directory. Examples for disk, tape, and Tivoli® Storage Manager are included (see the How do you set this up? section).
The pull method involves configuring the user exit program on the primary system to archive log files to a target device to which both the primary and standby servers have access. The standby server is not notified that a log file has been archived, and has to check the archive target path. You can do this by using the db2uext2.cadsm (renamed to db2uext2.ctsm in later versions of DB2) sample user exit program. The user exit executable must be located in the default DB2 instance path on both the primary and standby systems.
When the roll forward db command is invoked on the standby server, the DB2 logger automatically attempts to retrieve the next consecutive log file from the archive target path. The roll forward operation continues to retrieve log files until there are no more left to process.
Figure 1. Pull method
With the push method, the user exit program can be modified to copy or FTP the log file to the standby server's active log path or to an overflow log path that is accessible on the standby server. This can be accomplished by modifying the db2uext2.cdisk sample program to specify the standby server's log path as the target.
When the roll forward db command is invoked on the standby server, the DB2 logger automatically attempts to retrieve the next consecutive log file from the active or overflow log path. The roll forward operation continues to retrieve log files until there are no more left to process.
Figure 2. Push method
Regardless of whether you use the pull or the push method, most of the setup procedure is similar to the steps explained below:
- Configure the database with user exit programs and log archiving enabled. Once this is done, the database is placed in backup pending state, and a full database backup must be taken. This backup image is the initial starting point for recovery and should be kept until the next full database backup is performed.
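In DB2 V7/V8 terms, this first step might look like the following. The database name SAMPLE and the backup path are assumptions:

```shell
# Enable log retention and the user exit program for the database;
# this takes the database out of circular logging and into archive logging
db2 update db cfg for SAMPLE using LOGRETAIN ON USEREXIT ON

# The database is now in backup pending state; take the initial full
# backup that will seed the standby server
db2 backup db SAMPLE to /backups
```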
- Locate the user exit executable somewhere in the DB2 instance default search path. The sample source code modules for the DB2 user exit programs are located in the DB2 instance sqllib/samples/c directory. They are:
db2uext2.cadsm - Support for Tivoli Storage Manager, also called ADSM
db2uext2.cdisk - Support for disk
db2uext2.ctape - Support for local tape, available on UNIX® systems only
db2uext2.cxbsa - Support for an XBSA Draft 0.8 client
Each of these sample programs requires only minor modifications (such as error_log_path). Each sample program includes the exact compile statement that must be issued once the modifications are completed.
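As a rough sketch of this step on AIX (the compiler invocation and paths here are assumptions; use the exact compile statement embedded in the sample source file itself):

```shell
# Copy the disk sample and edit the target paths (archive path,
# retrieve path, error log path, and so on) for your environment
cp /home/db2inst1/sqllib/samples/c/db2uext2.cdisk db2uext2.c

# Compile it into the db2uext2 executable (compiler name is an assumption)
xlc -o db2uext2 db2uext2.c

# Place the executable where the instance can find it
cp db2uext2 /home/db2inst1/sqllib/bin
```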
There are also some third-party vendors who provide their own DB2 user exit binary code (such as Veritas, Legato, and SAP), any of which can be used to implement log shipping.
- Initialize the standby server's database. This can be done by restoring a full DB2 backup image of the primary server, taken either online or offline, or by using a split mirror copy. Details about using a split mirror copy are below. In either case, the hardware of the standby database does not have to match that of the primary database. The number and size of the processors and disks can be completely different. The only restriction is that the size of each table space on the standby database be at least as large as the corresponding table space on the primary database. This prevents your standby system from running out of space while the primary system continues to grow. If the physical disk layout is different, a redirected restore is required to initialize the standby server.
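A redirected restore for a standby with a different disk layout might be sketched as follows. The database name, backup timestamp, table space ID, and container paths are all assumptions for illustration:

```shell
# Begin a redirected restore of the primary's backup image on the standby;
# DB2 pauses so the containers can be redefined
db2 restore db SAMPLE from /backups taken at 20030101120000 redirect

# Point each table space at the standby server's own storage
# (PATH containers apply to SMS table spaces; DMS uses FILE/DEVICE with a size)
db2 "set tablespace containers for 1 using (path '/db2data/ts1')"

# Resume and complete the restore; the database is left in
# rollforward pending state, ready to receive shipped logs
db2 restore db SAMPLE continue
```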
- Configure a scheduled job on the standby system to periodically issue a db2 rollforward db <dbname> to end of logs command. This processes the log records received from the primary server and keeps the standby server current.
Your standby server is now ready to go. Congratulations!
There are ways to ensure that zero data is lost in a log-shipping configuration. However, it requires additional configuration and/or hardware. Let's take a look at some of the more popular methods of implementing a standby server with no data loss.
One method to ensure that no data is lost is to mirror the volume that is used to contain the log files. This can be implemented using the operating system's disk/volume mirroring features. With this method, each log record written to the primary database is also written to the standby database, thus ensuring no data loss. The downside of this approach is the performance cost associated with performing two disk writes, one of which may be remote.
Another method to avoid data loss is to exploit the dual logging capability of DB2. When this feature is used, DB2 writes the same log record to two locations. One of these locations can be a remotely mounted file system. DB2 attempts to write each log record to both log paths. If an error occurs on one of the paths, an error message is logged in the db2diag.log file and processing continues. If a write to one of the paths fails, DB2 does not attempt to write to that path again until the active log file is filled. DB2 also does not resynchronize the log paths once the link has been re-established. This approach is only feasible if the network link between the primary and standby systems is highly reliable.
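How dual logging is enabled depends on the DB2 version. A sketch, assuming the database is named SAMPLE and the mirror path is a remotely mounted file system visible to the standby:

```shell
# DB2 V7: dual logging is switched on through a registry variable,
# which directs log writes to a second log path
db2set DB2_NEWLOGPATH2=ON

# DB2 V8 and later: the mirror log path is named explicitly in the
# database configuration (path is an assumption)
db2 update db cfg for SAMPLE using MIRRORLOGPATH /standby_logs
```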
Many intelligent storage systems available today provide disk mirror capability to either a local or a remote storage system (such as IBM ESS, EMC, HDS). Each of these systems provides both a synchronous and an asynchronous method to mirror a file system. With intelligent storage systems, the implementation of mirroring log files between the primary and standby system is greatly simplified and highly reliable.
In conclusion, log shipping is a relatively simple and inexpensive method to provide a redundant failover system. It is easy to set up and maintain, and can be used to support both local site and remote scenarios. This method of disaster recovery will not tax your existing database administrators; once it is set up, it can run in an automated manner.
The following resources provide additional information on DB2 backup and recovery:
- IBM Redbooks: http://www.ibm.com/redbooks
- IBM ESS and IBM DB2 UDB Working Together: SG24-6262
- DB2 Warehouse Management: High Availability and Problem Determination Guide: SG24-6544
- Backing Up DB2 Using Tivoli Storage Manager: SG24-6247
- Managing VLDB Using DB2 EEE: SG24-5101
- The IBM TotalStorage™ NAS 200 and 300 Integration Guide: SG24-6505
- White papers on the DB2 Web site: http://www-3.ibm.com/software/data/pubs/papers
- Creating Hot Snapshots and Standby Databases with IBM DB2 Universal Database Version 7.2 and EMC TimeFinder™
- Using Hitachi Data Systems 7700E Open ShadowImage (HOMRCF) and Open Remote Copy (HORC) with IBM DB2 Universal Database Enterprise Edition Version 7.2
- DB2 and High Availability on Sun Cluster 3.0
- Implementing IBM DB2 Universal Database Enterprise Edition with Microsoft® Cluster Server
- IBM DB2 Universal Database Enterprise - Extended Edition for AIX® and HACMP/ES
- IBM DB2 Universal Database Enterprise Edition for AIX and HACMP/ES
- DB2 Universal Database for Windows®: High Availability Support Using Microsoft Cluster Server - Overview
- DB2 Developer Domain articles:
Dale McInnis is the Senior Technical Development Manager responsible for backup, restore and recovery of DB2 for Linux, Unix and Windows. Dale joined IBM in 1988 and has been working in the DB2 development area since 1993. He regularly presents at DB2 conferences on the topics of High Availability and Disaster Recovery. Dale has a B.S. in Computer Science from the University of New Brunswick and a Masters in Electrical Engineering from the University of Toronto. You can reach Dale at email@example.com.