This article describes the IMS disaster recovery solutions available with the IMS base product and the IBM Tools products. The following sections discuss the key concepts for each solution. You can also download demonstrations that will help you build your knowledge and understanding of each solution. The "Exploring IMS Disaster Recovery Solutions" series covers all of the non-storage-mirroring DR solutions described in this article.
Disaster recovery is the ability to restart one or more production systems at a remote site following a disaster that makes the primary site unusable. Customers must determine the amount of time and the amount of data they can afford to lose when setting disaster recovery goals.
The Recovery Time Objective (RTO) is the amount of time it takes to restore critical operations at the remote site. The Recovery Point Objective (RPO) is the amount of data the user can afford to lose in the event of a disaster. RTO and RPO objectives are often determined on an application-by-application basis. The cost of the disaster recovery solution needs to be in proportion to the business value of the IT environment.
The amount of data transferred to a remote site and the currency of the data will greatly affect the RTO and RPO values of the underlying disaster recovery solution. If the production system can simply be emergency-restarted without recovering any databases, the RTO will be less than if many databases need to be recovered.
Once RTO and RPO objectives are established, users can determine the appropriate disaster recovery methodology that best meets their needs.
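As a rough illustration of how the objectives narrow the choice of methodology, the sketch below filters candidate solutions by their estimated recovery time and data loss. The solution names echo the categories in this article, but the `Solution` class, the `meets_objectives` function, and every number are hypothetical placeholders, not measurements of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solution:
    name: str
    est_recovery_hours: float   # estimated time to restore operations (compare with RTO)
    est_data_loss_hours: float  # estimated age of the newest remote data (compare with RPO)


def meets_objectives(solutions, rto_hours, rpo_hours):
    """Return the solutions whose estimates fit within both objectives."""
    return [
        s for s in solutions
        if s.est_recovery_hours <= rto_hours and s.est_data_loss_hours <= rpo_hours
    ]


# Placeholder estimates for illustration only (assumed, not measured).
candidates = [
    Solution("IMS recovery (image copies and logs)", 12.0, 24.0),
    Solution("IMS restart from SLB", 4.0, 24.0),
    Solution("IMS recovery and restart (SLB plus logs)", 6.0, 1.0),
]

for s in meets_objectives(candidates, rto_hours=8.0, rpo_hours=2.0):
    print(s.name)
```

In practice the estimates would come from recovery tests at the remote site, and the comparison would be made application by application.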
There are specific ways for customers to recover and restart their IMS subsystems at a remote site. However, prior to any IMS restart, the customer must build the remote system environment. This may include setting up the machines, bringing up the z/OS® environment, and establishing the network. These components are the foundation of any System z® environment.
There are five categories of IMS disaster recovery solutions:
- IMS recovery solutions
- IMS restart solutions
- IMS recovery and restart solutions
- Coordinated IMS and DB2 disaster restart solutions
- Coordinated IMS and DB2 disaster recovery and restart solutions
These are the primary categories for IMS disaster recovery. The coordinated IMS and DB2 disaster solutions bring IMS and DB2 back to a consistent point in time.
With IMS disaster recovery solutions, the user needs to transmit the resources required to recover the IMS database data sets. This might include image copies, archived logs (SLDS/RLDS), change-accumulation data sets (CA), and a backup of the DBRC RECON data set that is consistent with the latest recovery resources being transmitted. These data sets are used by IMS utilities and IBM Tools to recover databases to a point in time where the databases are considered to be in a consistent state. The specific recovery type employed will determine how IMS is to be restarted (cold start or emergency restart, for example) and whether uncommitted updates need to be dynamically backed out during restart or with batch back-out. The IMS base product is capable of performing full database recovery and timestamp recovery to a recovery point.
The IMS base recovery solutions are summarized in Table 1.
Table 1. IMS base disaster recovery solutions
|Type of recovery||Point of recovery||Resources for remote site|
|Timestamp recovery||Image copy timestamp||Clean image copies, backup RECON|
|Timestamp recovery||Valid recovery point||Clean or fuzzy image copies, change accumulations, SLDS or RLDS, backup RECON|
|Full database recovery||End of last good log||Clean or fuzzy image copies, change accumulations, SLDS or RLDS, backup RECON|
There are more sophisticated IMS disaster recovery solutions available that use IBM Tools. The IMS Recovery Solution Pack — which includes the IMS Database Recovery Facility (DRF), IMS Database Recovery Facility eXtended Function (DRF/XF), IMS High Performance Image Copy (HPIC), IMS Index Builder (IIB), and the IMS High Performance Change Accumulation (HPCA) utilities, along with the IMS High Performance Pointer Checker (HPPC) product — provides the functionality needed for these solutions.
With DRF and HPIC, it is possible to create incremental image copies (IIC), which are standalone image copies. The term incremental image copy is used differently by different products. With some products, an IIC is a subset of a full image copy, and all incremental image copies are used together to recover a database data set. However, this is not the case for DRF and HPIC, where an IIC is created by taking an existing image copy and applying log updates to it to produce a new image copy. The logs are applied offline without affecting the online IMS subsystems and without needing to take the affected databases offline. When the log updates end at a valid recovery point, the IIC is registered in the RECON as a batch (clean) image copy. By creating IIC at the production site, the user does not need to use the archived log data sets for recovery purposes at the remote site.
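The idea of a standalone incremental image copy can be shown with a toy model. The sketch below is not how DRF or HPIC work internally: a Python dictionary stands in for a database data set, and `LogRecord` and `build_incremental_image_copy` are hypothetical names. It shows only the principle that an existing image copy plus the logged updates up to a recovery point yields a new, complete image copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    timestamp: int
    key: str
    value: str


def build_incremental_image_copy(base_copy, base_timestamp, log_records, recovery_point):
    """Return a new standalone copy: the base copy plus the logged updates
    made after base_timestamp and up to recovery_point (inclusive)."""
    new_copy = dict(base_copy)  # the existing image copy is left untouched
    for rec in sorted(log_records, key=lambda r: r.timestamp):
        if base_timestamp < rec.timestamp <= recovery_point:
            new_copy[rec.key] = rec.value
    return new_copy


# Toy data: an image copy taken at time 100 and three later log records.
base_copy = {"A": "1", "B": "2"}
logs = [
    LogRecord(110, "A", "10"),
    LogRecord(120, "C", "3"),
    LogRecord(140, "B", "20"),  # after the recovery point, so not applied
]

iic = build_incremental_image_copy(base_copy, 100, logs, recovery_point=130)
print(iic)
```

Because the result is a complete copy in its own right, recovering from it at the remote site needs neither the original image copy nor the logs that were merged into it.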
The DRF tool can also allow the user to do a point-in-time recovery (PITR) to any timestamp in the archive log. With PITR, DRF applies all committed updates on the log up to the provided timestamp. In doing so, there are no uncommitted updates that need to be backed out with dynamic back-out in an emergency restart.
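A minimal sketch of the point-in-time idea, under simplifying assumptions: the log is a list of update and commit records, each unit of work has a unique identifier, and a dictionary stands in for the database. The `Update` and `Commit` classes and the `point_in_time_recover` function are hypothetical and do not reflect IMS log record formats or DRF internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    timestamp: int
    unit_of_work: str
    key: str
    value: str


@dataclass(frozen=True)
class Commit:
    timestamp: int
    unit_of_work: str


def point_in_time_recover(image_copy, log, recover_to):
    """Apply only updates from units of work that committed at or before recover_to."""
    committed = {
        rec.unit_of_work
        for rec in log
        if isinstance(rec, Commit) and rec.timestamp <= recover_to
    }
    database = dict(image_copy)
    for rec in sorted(log, key=lambda r: r.timestamp):
        if isinstance(rec, Update) and rec.unit_of_work in committed:
            database[rec.key] = rec.value
    return database


image_copy = {"A": "1", "B": "2"}
log = [
    Update(10, "U1", "A", "10"),
    Update(20, "U2", "B", "20"),
    Commit(30, "U1"),
    Update(40, "U2", "C", "30"),
    Commit(60, "U2"),  # commits after the requested timestamp
]

recovered = point_in_time_recover(image_copy, log, recover_to=50)
print(recovered)
```

Unit of work U2 has not committed by timestamp 50, so none of its updates are applied and nothing is left to back out afterward.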
These IMS Tools Recovery solutions are summarized in Table 2.
Table 2. IMS Tools Disaster Recovery Solutions
|Type of recovery||Point of recovery||Resources for remote site|
|Timestamp recovery||Incremental image copy timestamp||Incremental image copies, backup RECON|
|Point-in-time recovery||Any timestamp||Clean or fuzzy image copies, change accumulations, SLDS or RLDS, backup RECON|
With all of these IMS recovery solutions, a significant task is to condition the DBRC RECON data set so it can be used for recovery purposes at the remote site. The backup RECON is created at the production site each time an image copy is created or after an online log is archived. The backup RECON is an exact replica of the production RECON. The information in the RECON reflects a healthy IMS environment. However, for it to be used at the remote site, it must be changed to show that a disaster occurred and the IMS subsystems were terminated abnormally. The Online Log Data Set (OLDS) was not transmitted to the remote site, so it must be closed and archived in the RECON. If dual image copies are used, the primary image copy needs to be invalidated if it was not transmitted to the remote site. There may also be other conditioning required on the RECON. These steps can be performed manually, or they can be done using the IBM IMS Tools products. For example, the DRF/XF tool allows the user to use the RECON Clean Up (RCU) function to automatically condition the RECON. The IMS Recovery Expert product also has the capability of conditioning the RECON data set for use at the remote site.
The IMS restart DR solutions generally involve the mirroring of IMS production volumes to the remote site. The mirroring can be done by transmitting the IMS Recovery Expert System Level Backup (SLB) to the remote site or by continuously transmitting data to the remote site using storage-mirroring techniques. The SLB is created by copying source volumes to target volumes for all IMS production volumes. The SLB includes database data sets, logs, RECON, IMS system data sets and libraries, and the associated ICF user catalogs. When the SLB is restored at the remote site, all production data and libraries are restored exactly as they were at the production site. After restoring the SLB and recreating the same IMS system environment that existed at the production site when the SLB was created, the IMS system is emergency-restarted so that uncommitted updates on the active log data sets are backed out, leaving the databases in a consistent state.
The IMS recovery and restart DR solutions combine the IMS Recovery Expert SLB with the transmission of additional archived log data sets, conditioned RECON data sets, and, optionally, CA data sets and more recent image copies. The SLB is restored first at the remote site, but prior to restarting the IMS subsystem, database data sets are recovered to a later point in time using the DRF utility. DRF performs PITR using the latest timestamp in the last good log transmitted to the remote site. The RECON was conditioned by the IMS Recovery Expert product at the production site after each log was archived. The conditioned RECON data set is transmitted to the remote site in the remote PDS data set, along with the recovery JCL and a copy of the IMS Recovery Expert repository.
The storage-mirroring solutions are not described in this series; see the white paper titled "IMS Disaster Recovery with GDPS."
The IMS restart and IMS recovery and restart disaster recovery solutions are summarized in Table 3.
Table 3. IMS Tools disaster restart and disaster recovery and restart solutions
|Restart strategy||Point of recovery||Resources for remote site|
|IMS restart from SLB||SLB timestamp||SLB, remote PDS (restart JCL)|
|IMS recovery and restart from SLB and log data sets||Timestamp of last good log||SLB, logs, ICs, CAs, remote PDS (recovery JCL)|
|Storage mirroring||Last successful data transmission||Volume consistency group|
The SLB has multiple uses, which make it a valuable recovery resource in the IMS environment. The SLB can reduce the need to create thousands of image copy data sets. Each SLB contains the database data sets at a specific point in time, which is functionally equivalent to creating an image copy. Some image copies are still needed, because IMS requires an image copy following a database reorganization, but the vast majority of image copy executions can be eliminated. Because the SLB is functionally equivalent to an image copy, it is also useful in local application recovery: individual databases or groups of databases in an application can be recovered using data from the SLB along with image copies and log data sets. The IMS Recovery Expert product can drive the IMS Database Recovery Facility (DRF), which is included in the IMS Recovery Solution Pack, to do timestamp recovery to a recovery point or PITR to any timestamp. These solutions are shown in Table 4.
Table 4. IMS Tools local application recovery solutions
|Restart strategy||Point of recovery||Recovery resources|
|IMS local application timestamp recovery using SLB||Recovery point||SLB, image copies, change accumulations|
|IMS local application PITR using SLB||Any timestamp||SLB, image copies, change accumulations|
With IMS Recovery Expert and DB2 Recovery Expert, it is possible to have coordinated IMS and DB2 disaster recovery, in which IMS and DB2 are recovered at the remote site to a consistent point in time. There are two ways to accomplish this:
- Combine the IMS and DB2 volumes into a single SLB. After restoring this combined SLB, IMS and DB2 are then restarted from the point in time when the SLB was created. Since uncommitted updates are likely on the active log data sets at the time the SLB was created, IMS must perform dynamic back-out during emergency restart to back out these uncommitted updates, and DB2 must do UNDO/REDO processing to ensure that only the committed updates are applied to the DB2 spaces.
- Treat IMS and DB2 separately by creating separate SLBs for IMS and DB2 at different points in time. The SLBs and the archived logs for IMS and DB2 are transmitted together to the remote site. At the remote site, IMS and DB2 use PITR to recover the databases to the same point in time selected by the Recovery Expert products. While DB2 can do PITR by using the recovery utility provided in the DB2 Utilities Suite, IMS Recovery Expert drives the IMS Database Recovery Facility (DRF) utility, which is included in the IMS Recovery Solution Pack.
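For the second approach, one plausible way to think about the common recovery point is as the latest timestamp covered by the archived logs of both subsystems. The sketch below illustrates only that reasoning; it is an assumption, not a description of how the Recovery Expert products select the point. The function name and the timestamps are hypothetical.

```python
from datetime import datetime


def coordinated_recovery_point(ims_log_ends, db2_log_ends):
    """Return the latest timestamp covered by the archived logs of both subsystems.

    Each argument lists the end timestamps of the archived logs that reached
    the remote site for one subsystem.
    """
    if not ims_log_ends or not db2_log_ends:
        raise ValueError("each subsystem needs at least one archived log")
    # Neither subsystem can be recovered past the end of its own last good log.
    return min(max(ims_log_ends), max(db2_log_ends))


# Placeholder timestamps for illustration only.
ims_log_ends = [datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)]
db2_log_ends = [datetime(2024, 1, 1, 10, 15), datetime(2024, 1, 1, 10, 20)]

point = coordinated_recovery_point(ims_log_ends, db2_log_ends)
print(point.isoformat())
```

Here the DB2 logs end earlier than the IMS logs, so both subsystems would be recovered to the end of the last DB2 log.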
The two coordinated IMS and DB2 disaster recovery solutions are shown in Table 5.
Table 5. Coordinated IMS and DB2 disaster recovery solutions
|Restart strategy||Point of recovery||Resources for remote site|
|Coordinated IMS and DB2 restart from SLB||Combined SLB timestamp||Combined SLB, remote PDS (restart JCL)|
|Coordinated IMS and DB2 PITR recovery from SLBs and logs||Coordinated log timestamp||IMS SLB, DB2 SLB, remote PDS (recovery JCL), IMS and DB2 archived logs|
This article has briefly introduced IMS recovery concepts. Upcoming pieces describe each disaster recovery and local application recovery solution in detail.
|IMS Disaster Recovery Demonstrations (1)||IMS_Backup_and_Recovery_Demos.zip||3330KB||HTTP|
|Download Instructions for Demonstrations (2)||IMSBackupandRecoveryDemoInstructions.pdf||768KB||HTTP|
- (1) This download is the same for all parts of this article/tutorial series.
- (2) This PDF file is the same for all parts of this series. If you have downloaded it for a previous article or tutorial in the series, there is no need to download it again.
Glenn Galler is a certified IT specialist for the IMS product in the IBM Advanced Technical Skills (ATS) group. He is a senior programmer specializing in disaster recovery. He joined the ATS group in March 2007. Galler is also the campus recruiting manager for the IBM Software Group for the University of Michigan, holding this position since 1998. He joined IBM in 1982 receiving his bachelor's degree in computer science from the University of Michigan. In 1989, he received a master's degree in computer engineering from the University of Santa Clara. Galler has worked in many areas of IMS, including testing, development, marketing and management. From 1992 to 1997, he held an international assignment in England as the European program manager for the IMS Quality Partnership Program (QPP).
Ron Bisceglia is a lead software developer for Rocket Software, based in Houston. He has worked with IMS for more than 24 years, and for the past 20 years has been involved in the design, development, and support of a range of IMS tools. He has been involved in the development of database reorganization utilities, data propagation tools, database monitoring and analysis solutions, data replication, and backup and recovery products.