Exploring IMS disaster recovery solutions, Part 1: Overview

Every customer needs a disaster recovery (DR) plan. Strategies differ from one customer to another, both in the time needed to recover and in the amount of data that may be lost. For IMS™, there are five types of disaster recovery solutions: recovery, restart, recovery and restart, coordinated IMS and DB2® restart, and coordinated IMS and DB2 disaster recovery and restart. Although storage-mirroring solutions are classified as restart solutions, this series focuses only on the non-storage-mirroring IMS disaster recovery solutions.

Glenn Galler (gallerg@us.ibm.com), Certified IT IMS Specialist, IBM

Glenn Galler is a certified IT specialist for the IMS product in the IBM Advanced Technical Skills (ATS) group. He is a senior programmer specializing in disaster recovery. He joined the ATS group in March 2007. Galler is also the campus recruiting manager for the IBM Software Group for the University of Michigan, a position he has held since 1998. He joined IBM in 1982, having received his bachelor's degree in computer science from the University of Michigan. In 1989, he received a master's degree in computer engineering from the University of Santa Clara. Galler has worked in many areas of IMS, including testing, development, marketing, and management. From 1992 to 1997, he held an international assignment in England as the European program manager for the IMS Quality Partnership Program (QPP).



Ron Bisceglia (RBisceglia@rocketsoftware.com), Lead Software Developer, Rocket Software

Ron Bisceglia is a lead software developer for Rocket Software, based in Houston. He has worked with IMS for more than 24 years, and for the past 20 years has been involved in the design, development, and support of a range of IMS tools. He has been involved in the development of database reorganization utilities, data propagation tools, database monitoring and analysis solutions, data replication, and backup and recovery products.



29 March 2012

Introduction

This article describes the IMS disaster recovery solutions available with the IMS base product and the IBM Tools products. The following sections discuss the key concepts for each solution, and downloadable demonstrations are provided to help you deepen your understanding of each one. This "Exploring IMS Disaster Recovery Solutions" series covers all of the non-storage-mirroring DR solutions described in this article.


Recovery Time Objective and Recovery Point Objective

Disaster recovery is the ability to restart one or more production systems at a remote site following a disaster that makes the primary site unusable. Customers must determine the amount of time and the amount of data they can afford to lose when setting disaster recovery goals.

The Recovery Time Objective (RTO) is the amount of time it takes to restore critical operations at the remote site. The Recovery Point Objective (RPO) is the amount of data the user can afford to lose in the event of a disaster. RTO and RPO values are often determined on an application-by-application basis. The cost of the disaster recovery solution needs to be in proportion to the business value of the IT environment.

The amount of data transferred to a remote site and the currency of the data will greatly affect the RTO and RPO values of the underlying disaster recovery solution. If the production system can simply be emergency-restarted without recovering any databases, the RTO will be less than if many databases need to be recovered.

Once the RTO and RPO are established, users can determine the disaster recovery methodology that best meets their needs.
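
As a rough illustration of how the two objectives constrain the choice of methodology, the following Python sketch compares estimates for candidate solutions against an application's RTO and RPO. The class names, field names, and all of the numbers are assumptions made up for this example; they are not measurements of any IMS solution. The sketch also assumes that the worst-case data loss equals one full interval between transmissions to the remote site.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Objectives:
    """Recovery goals set by the business for one application."""
    rto: timedelta  # maximum tolerable time to restore operations
    rpo: timedelta  # maximum tolerable window of lost data


@dataclass(frozen=True)
class CandidateSolution:
    """Hypothetical estimates for one disaster recovery approach."""
    name: str
    estimated_recovery_time: timedelta  # time to rebuild, recover, and restart
    transmission_interval: timedelta    # how often recovery data reaches the remote site


def meets_objectives(solution: CandidateSolution, goals: Objectives) -> bool:
    """Assumption: worst-case data loss is one full transmission interval."""
    return (solution.estimated_recovery_time <= goals.rto
            and solution.transmission_interval <= goals.rpo)


# Illustrative numbers only.
goals = Objectives(rto=timedelta(hours=4), rpo=timedelta(hours=1))
candidates = [
    CandidateSolution("daily image copies", timedelta(hours=12), timedelta(hours=24)),
    CandidateSolution("SLB plus archived logs", timedelta(hours=3), timedelta(minutes=30)),
]
acceptable = [c.name for c in candidates if meets_objectives(c, goals)]
print(acceptable)
```

With these made-up estimates, only the second candidate satisfies both objectives. In practice, the estimates would come from disaster recovery tests rather than from fixed constants.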


IMS recovery vs. IMS restart vs. IMS restart and recovery

There are specific ways for customers to recover and restart their IMS subsystems at a remote site. However, prior to any IMS restart, the customer must build the remote system environment. This may include setting up the machines, bringing up the z/OS® environment, and establishing the network. These components are the foundation of any System z® environment.

There are five categories of IMS disaster recovery solutions:

  1. IMS recovery solutions
  2. IMS restart solutions
  3. IMS recovery and restart solutions
  4. Coordinated IMS and DB2 disaster restart solutions
  5. Coordinated IMS and DB2 disaster recovery and restart solutions

These are the primary categories for IMS disaster recovery. The coordinated IMS and DB2 disaster solutions bring IMS and DB2 back to a consistent point in time.

IMS recovery solutions

With IMS recovery solutions, the user needs to transmit the resources required to recover the IMS database data sets. These might include image copies, archived logs (SLDS/RLDS), change-accumulation (CA) data sets, and a backup of the DBRC RECON data set that is consistent with the latest recovery resources being transmitted. These data sets are used by IMS utilities and IBM Tools to recover databases to a point in time at which the databases are considered to be in a consistent state. The specific recovery type employed determines how IMS is to be restarted (cold start or emergency restart, for example) and whether uncommitted updates need to be backed out dynamically during restart or with batch back-out. The IMS base product is capable of performing full database recovery and timestamp recovery to a recovery point.

The IMS base recovery solutions are summarized in Table 1.

Table 1. IMS base disaster recovery solutions
Type of recovery | Point of recovery | Resources for remote site
Timestamp recovery | Image copy timestamp | Clean image copies, backup RECON
Timestamp recovery | Valid recovery point | Clean or fuzzy image copies, change accumulations, SLDS or RLDS, backup RECON
Full database recovery | End of last good log | Clean or fuzzy image copies, change accumulations, SLDS or RLDS, backup RECON

There are more sophisticated IMS disaster recovery solutions available that use IBM Tools. The IMS Recovery Solution Pack — which includes the IMS Database Recovery Facility (DRF), IMS Database Recovery Facility eXtended Function (DRF/XF), IMS High Performance Image Copy (HPIC), IMS Index Builder (IIB), and the IMS High Performance Change Accumulation (HPCA) utilities, along with the IMS High Performance Pointer Checker (HPPC) product — provides the functionality needed for these solutions.

With DRF and HPIC, it is possible to create incremental image copies (IICs), which are standalone image copies. The term incremental image copy is used differently by different products. With some products, an IIC is a subset of a full image copy, and all incremental image copies are used together to recover a database data set. This is not the case for DRF and HPIC, where an IIC is created by taking an existing image copy and applying log updates to it to produce a new image copy. The logs are applied offline, without affecting the online IMS subsystems and without taking the affected databases offline. When the log updates end at a valid recovery point, the IIC is registered in the RECON as a batch (clean) image copy. Because the log updates are already reflected in IICs created at the production site, the archived log data sets are not needed for recovery at the remote site.
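
The idea of building a new image copy from an existing one plus log updates can be sketched with a toy model. The Python below is only a conceptual illustration: the `ImageCopy` and `LogRecord` classes, the integer timestamps, and the key/value view of a database are all assumptions for this example and do not reflect how DRF or HPIC store or process data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LogRecord:
    timestamp: int  # simplified logical clock, not a real IMS timestamp
    key: str
    value: str


@dataclass(frozen=True)
class ImageCopy:
    timestamp: int          # point in time the copy represents
    data: Dict[str, str]    # toy stand-in for database contents


def build_incremental_image_copy(base: ImageCopy,
                                 log: List[LogRecord],
                                 recovery_point: int) -> ImageCopy:
    """Apply updates logged after the base copy, up to the recovery point.

    The base copy is left untouched; the result is a new standalone copy.
    """
    data = dict(base.data)  # work on a copy so the base image copy is preserved
    for rec in sorted(log, key=lambda r: r.timestamp):
        if base.timestamp < rec.timestamp <= recovery_point:
            data[rec.key] = rec.value
    return ImageCopy(timestamp=recovery_point, data=data)


base = ImageCopy(timestamp=100, data={"A": "1", "B": "2"})
log = [
    LogRecord(90, "A", "0"),    # already reflected in the base copy, skipped
    LogRecord(110, "A", "5"),
    LogRecord(120, "C", "7"),
    LogRecord(130, "B", "9"),   # after the recovery point, skipped
]
iic = build_incremental_image_copy(base, log, recovery_point=120)
print(iic)
```

The resulting copy can be used on its own: in this model, recovering to timestamp 120 needs neither the base copy nor the log records.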

DRF also lets the user perform a point-in-time recovery (PITR) to any timestamp in the archived logs. With PITR, DRF applies only the updates on the log that were committed as of the specified timestamp. As a result, no uncommitted updates remain to be backed out with dynamic back-out in an emergency restart.
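
The following Python sketch models the PITR behavior described above: only updates belonging to units of work that committed at or before the target timestamp are applied. The record layout, the `uow` identifiers, and the function name are hypothetical and chosen for illustration; the sketch does not represent the DRF implementation or the IMS log record format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogRecord:
    timestamp: int               # simplified logical clock
    uow: str                     # unit-of-work identifier (hypothetical)
    kind: str                    # "update" or "commit"
    key: Optional[str] = None    # set for update records only
    value: Optional[str] = None  # set for update records only


def point_in_time_recover(image: Dict[str, str],
                          log: List[LogRecord],
                          target: int) -> Dict[str, str]:
    """Apply updates from units of work committed at or before `target`."""
    in_scope = [r for r in log if r.timestamp <= target]
    committed = {r.uow for r in in_scope if r.kind == "commit"}
    db = dict(image)  # leave the image copy itself unchanged
    for rec in sorted(in_scope, key=lambda r: r.timestamp):
        if rec.kind == "update" and rec.uow in committed:
            db[rec.key] = rec.value
    return db


image = {"A": "1"}
log = [
    LogRecord(10, "U1", "update", "A", "2"),
    LogRecord(11, "U2", "update", "B", "x"),
    LogRecord(12, "U1", "commit"),
    LogRecord(15, "U2", "commit"),
]
print(point_in_time_recover(image, log, target=13))
print(point_in_time_recover(image, log, target=15))
```

At target 13, unit of work U2 has not yet committed, so its update is never applied and nothing has to be backed out afterward. At target 15, both units of work are included.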

These IMS Tools recovery solutions are summarized in Table 2.

Table 2. IMS Tools disaster recovery solutions
Type of recovery | Point of recovery | Resources for remote site
Timestamp recovery | Incremental image copy timestamp | Incremental image copies, backup RECON
Point-in-time recovery | Any timestamp | Clean or fuzzy image copies, change accumulations, SLDS or RLDS, backup RECON

With all of these IMS recovery solutions, a significant task is to condition the DBRC RECON data set so it can be used for recovery purposes at the remote site. The backup RECON is created at the production site each time an image copy is created or after an online log is archived. The backup RECON is an exact replica of the production RECON. The information in the RECON reflects a healthy IMS environment. However, for it to be used at the remote site, it must be changed to show that a disaster occurred and the IMS subsystems were terminated abnormally. The Online Log Data Set (OLDS) was not transmitted to the remote site, so it must be closed and archived in the RECON. If dual image copies are used, the primary image copy needs to be invalidated if it was not transmitted to the remote site. There may also be other conditioning required on the RECON. These steps can be performed manually, or they can be done using the IBM IMS Tools products. For example, the DRF/XF tool allows the user to use the RECON Clean Up (RCU) function to automatically condition the RECON. The IMS Recovery Expert product also has the capability of conditioning the RECON data set for use at the remote site.
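
The conditioning steps named above can be summarized in a small conceptual model. The Python below is not DBRC syntax and does not call any IMS tool; the `ReconModel` structure, its field names, and the status strings are invented for this sketch. It only mirrors the three changes the paragraph describes: marking the subsystem as abnormally terminated, showing the OLDS as closed and archived, and invalidating primary image copies that were not transmitted.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageCopyEntry:
    dbds: str                   # database data set name (illustrative)
    primary_transmitted: bool   # did the primary copy reach the remote site?
    primary_valid: bool = True


@dataclass
class OldsEntry:
    ddname: str
    closed: bool = False
    archived: bool = False


@dataclass
class ReconModel:
    """Toy stand-in for the recovery information kept in a RECON."""
    subsystem_status: str = "ACTIVE"
    olds: List[OldsEntry] = field(default_factory=list)
    image_copies: List[ImageCopyEntry] = field(default_factory=list)


def condition_for_remote_site(backup: ReconModel) -> ReconModel:
    """Return a conditioned copy; the backup itself is not modified."""
    recon = copy.deepcopy(backup)
    recon.subsystem_status = "ABNORMAL"    # the disaster ended IMS abnormally
    for olds in recon.olds:                # OLDS never reached the remote site
        olds.closed = True
        olds.archived = True
    for ic in recon.image_copies:          # only transmitted copies stay usable
        if not ic.primary_transmitted:
            ic.primary_valid = False
    return recon


backup = ReconModel(
    olds=[OldsEntry("OLDS00")],
    image_copies=[
        ImageCopyEntry("DB1.DS1", primary_transmitted=True),
        ImageCopyEntry("DB2.DS1", primary_transmitted=False),
    ],
)
conditioned = condition_for_remote_site(backup)
print(conditioned)
```

Working on a copy keeps the backup RECON intact, so the conditioning can be repeated or checked if a step turns out to be wrong.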

IMS restart and IMS recovery and restart solutions

The IMS restart DR solutions generally involve the mirroring of IMS production volumes to the remote site. The mirroring can be done by transmitting the IMS Recovery Expert System Level Backup (SLB) to the remote site or by continuously transmitting data to the remote site using storage-mirroring techniques. The SLB is created by copying source volumes to target volumes for all IMS production volumes. The SLB includes database data sets, logs, RECON, IMS system data sets and libraries, and the associated ICF user catalogs. When the SLB is restored at the remote site, all production data and libraries are restored exactly as they were at the production site. After the SLB is restored and the IMS system environment that existed at the production site when the SLB was created is recreated, the IMS system is emergency-restarted. The emergency restart backs out the uncommitted updates on the active log data sets, which leaves the databases in a consistent state.
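
The back-out of uncommitted updates during emergency restart can be pictured with a toy model. The Python sketch below is an assumption-laden illustration, not IMS restart logic: the `UpdateRecord` layout with before and after values, the `uow` identifiers, and the key/value database are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class UpdateRecord:
    uow: str               # unit-of-work identifier (hypothetical)
    key: str
    before: Optional[str]  # value before the update; None if the key was new
    after: str


def back_out_uncommitted(db: Dict[str, str],
                         active_log: List[UpdateRecord],
                         committed: Set[str]) -> Dict[str, str]:
    """Undo updates of uncommitted units of work, newest first."""
    restored = dict(db)
    for rec in reversed(active_log):
        if rec.uow not in committed:
            if rec.before is None:
                restored.pop(rec.key, None)    # the update inserted the key
            else:
                restored[rec.key] = rec.before  # put the old value back
    return restored


# State restored from the backup: U2 was still in flight when it was taken.
restored_from_slb = {"A": "2", "B": "x"}
active_log = [
    UpdateRecord("U1", "A", before="1", after="2"),
    UpdateRecord("U2", "B", before=None, after="x"),
]
restarted = back_out_uncommitted(restored_from_slb, active_log, committed={"U1"})
print(restarted)
```

Processing the log in reverse order matters when one uncommitted unit of work updated the same key more than once: the oldest before-value is the one that must survive.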

The IMS recovery and restart DR solutions combine the IMS Recovery Expert SLB with the transmission of additional archived log data sets, conditioned RECON data sets, and, optionally, CA data sets and more recent image copies. The SLB is restored first at the remote site, but prior to restarting the IMS subsystem, database data sets are recovered to a later point in time using the DRF utility. DRF performs PITR using the latest timestamp in the last good log transmitted to the remote site. The RECON was conditioned by the IMS Recovery Expert product at the production site after each log was archived. The conditioned RECON data set is transmitted to the remote site in the remote PDS data set, along with the recovery JCL and a copy of the IMS Recovery Expert repository.

The storage-mirroring solutions are not described in this series. For those, see the whitepaper titled "IMS Disaster Recovery with GDPS."

The IMS restart and IMS recovery and restart disaster recovery solutions are summarized in Table 3.

Table 3. IMS Tools disaster restart and disaster recovery and restart solutions
Restart strategy | Point of recovery | Resources for remote site
IMS restart from SLB | SLB timestamp | SLB, remote PDS (restart JCL)
IMS recovery and restart from SLB and log data sets | Timestamp of last good log | SLB, logs, ICs, CAs, remote PDS (recovery JCL)
Storage mirroring | Last successful data transmission | Volume consistency group

The SLB has multiple uses, which make it a valuable recovery resource in the IMS environment. The SLB can reduce the need to create thousands of image copy data sets. Each SLB contains the database data sets at a specific point in time, which is functionally equivalent to creating an image copy. A few reasons to create image copies remain (for example, IMS requires an image copy following a database reorganization), but the vast majority of image copy executions can be eliminated. The SLB is therefore also useful for local application recovery: individual databases or groups of databases in an application can be recovered using data from the SLB along with image copies and log data sets. The IMS Recovery Expert product can drive the IMS Database Recovery Facility (DRF), which is included in the IMS Recovery Solution Pack, to perform timestamp recovery to a recovery point or PITR to any timestamp. These solutions are shown in Table 4.

Table 4. IMS Tools local application recovery solutions
Restart strategy | Point of recovery | Recovery resources
IMS local application timestamp recovery using SLB | Recovery point | SLB, image copies, change accumulations
IMS local application PITR using SLB | Any timestamp | SLB, image copies, change accumulations

Coordinated IMS and DB2 disaster recovery solutions

With IMS Recovery Expert and DB2 Recovery Expert, it is possible to have coordinated IMS and DB2 disaster recovery, in which IMS and DB2 are recovered at the remote site to a consistent point in time. There are two ways to accomplish this:

  1. Combine the IMS and DB2 volumes into a single SLB. After restoring this combined SLB, IMS and DB2 are then restarted from the point in time when the SLB was created. Since uncommitted updates are likely on the active log data sets at the time the SLB was created, IMS must perform dynamic back-out during emergency restart to back out these uncommitted updates, and DB2 must do UNDO/REDO processing to ensure that only the committed updates are applied to the DB2 spaces.
  2. Treat IMS and DB2 separately by creating separate SLBs for IMS and DB2 at different points in time. The SLBs and the archived logs for IMS and DB2 are transmitted together to the remote site. At the remote site, IMS and DB2 use PITR to recover the databases to the same point in time selected by the Recovery Expert products. While DB2 can perform PITR by using the recovery utility provided in the DB2 Utilities Suite, IMS Recovery Expert drives the IMS Database Recovery Facility (DRF) utility, which is included in the IMS Recovery Solution Pack.
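
For the second approach, both subsystems must be recovered to one common timestamp that is covered by the logs of each. The source does not describe how the Recovery Expert products select that timestamp, so the Python sketch below rests on a stated assumption: the coordinated point is the latest time for which both IMS and DB2 log data reached the remote site, that is, the earlier of the two last log end times. The function name and the sample times are hypothetical.

```python
from datetime import datetime
from typing import Sequence


def coordinated_recovery_point(ims_log_ends: Sequence[datetime],
                               db2_log_ends: Sequence[datetime]) -> datetime:
    """Latest timestamp covered by transmitted logs of both subsystems.

    Assumption for this sketch: each sequence holds the end times of the
    archived logs that arrived at the remote site for that subsystem.
    """
    if not ims_log_ends or not db2_log_ends:
        raise ValueError("both subsystems need at least one transmitted log")
    return min(max(ims_log_ends), max(db2_log_ends))


ims_logs = [datetime(2012, 3, 29, 9, 0), datetime(2012, 3, 29, 10, 45)]
db2_logs = [datetime(2012, 3, 29, 9, 30), datetime(2012, 3, 29, 10, 30)]
point = coordinated_recovery_point(ims_logs, db2_logs)
print(point.isoformat())
```

In this example the IMS logs extend 15 minutes beyond the DB2 logs, so the common point is limited by DB2. Recovering IMS any further would leave the two subsystems inconsistent with each other.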

The two coordinated IMS and DB2 disaster recovery solutions are shown in Table 5.

Table 5. Coordinated IMS and DB2 disaster recovery solutions
Restart strategy | Point of recovery | Resources for remote site
Coordinated IMS and DB2 restart from SLB | Combined SLB timestamp | Combined SLB, remote PDS (restart JCL)
Coordinated IMS and DB2 PITR recovery from SLBs and logs | Coordinated log timestamp | IMS SLB, DB2 SLB, remote PDS (recovery JCL), IMS and DB2 archived logs

Conclusion

This article has briefly introduced IMS recovery concepts. Upcoming parts of this series describe each disaster recovery and local application recovery solution in detail.


Downloads

Description | Name | Size
IMS Disaster Recovery Demonstrations (note 1) | IMS_Backup_and_Recovery_Demos.zip | 3330KB
Download Instructions for Demonstrations (note 2) | IMSBackupandRecoveryDemoInstructions.pdf | 768KB

Notes

  1. This download is the same for all parts of this article/tutorial series.
  2. This PDF file is the same for all parts of this series. If you have downloaded it for a previous article or tutorial in the series, there is no need to download it again.
