Recovering data with XRC

Before you begin: If the XRC system data mover is running at the recovery site at the time of the failure, first issue an XEND command, then go to step 3.

Perform the following steps to have the recovery system take over for the primary system:

  1. Reconfigure the path connections, if necessary, to connect local systems to the recovery system disk.
  2. Start the DFSMS address space, if it is not already active on the recovery system. Run the IDCAMS recatalog function before activating the recovery system. The state, control, journal, and master (if applicable) data sets need to be cataloged on the recovery system. If you use a cluster session to couple sessions, catalog the cluster data set on the recovery system.

    If you are running multiple XRC sessions, you must issue the XRECOVER command for each individual session. If the sessions are coupled, the XRECOVER command coordinates the recovery of the sessions so that each session is recovered to the same consistency time. A master XRECOVER command does not exist. If you couple sessions to different cluster sessions, issue the XRECOVER commands for those sessions coupled to one cluster session before you issue the commands for another cluster session.

  3. Issue the XRC command XRECOVER for each XRC session on the recovery system to add all valid, nonapplied journal data to the secondary (target) volumes. This action also sets the secondary volume serial numbers equal to the serial numbers of the primary volumes for volumes that had previously reached a duplex state.

    Rule: The XRC recovery function must have access to the appropriate journal, control, and state data sets that were in use on the recovery system at the time of the failure. When the session is coupled, the recovery function must have access to the master data set. If you couple the sessions by using a cluster session, you must have access to the cluster data set.

    The XRECOVER command does not process volumes that are in the pending state, or that were suspended before reaching duplex state, because those volumes have never reached duplex state. XRC recovery does not apply data to secondary volumes if the session was ended or suspended by command. The exception is when an error occurs during an update to a secondary volume while data from the journal is waiting to be written to that volume. This exception allows the volume to be recovered to the same consistency time as the rest of the secondary volumes in the session.

    Note: For a coupled session that was not ended or suspended by an XRC command, the starting consistency_group_time reported is the timestamp for the last known update for the session when the following conditions exist:
    • The session status is noninterlocked.
    • Updates were not occurring when the session ended (the session was idle).
    If these conditions exist, the consistency time reported might be earlier than the master recovery time or the last session consistency time indicated in the output of an XQUERY command.

    If XRC locates valid updates to suspended volumes, XRC applies those updates to the secondary volumes and increments their suspension times accordingly. If the volume suspension time is different from the recovery time of the XRC session, the data on the volume might not be consistent with data on other volumes.

    The XRC recovery function uses the appropriate journal, control, and state data sets to put all secondary volumes in a known consistent state. The XRECOVER command, as part of XRC’s recovery function, automatically creates an XQUERY volume report to assist you with recovery (see Creating a recovery volume report for more information). XRC also initiates a request sequence to vary the secondary volumes offline, and then online to bring them to a ready state.
    Note: In an actual disaster recovery, the primary volume would be offline when you issue the XRECOVER command. You must manually vary the primary volume offline during a disaster recovery verification test.
  4. Change the volume serial numbers, if needed. Some recovery system disk devices might still be in an error state after you complete the preceding steps. Changing the volume serial numbers prevents applications from accessing those volumes until recovery procedures have restored or updated their contents.
    Note: It is not enough to vary these devices offline, as other applications can bring the devices online to another system.
    If the secondary volume has more cylinders than the primary volume, run the ICKDSF REFORMAT REFVTOC function against the secondary volume from a system in the production SMSPlex. This refreshes the VTOC to reflect the additional space and ensures that DFSMS space statistics accurately reflect the space on the secondary volume.

    You might occasionally find that you cannot access data on a successfully recovered volume because the volume’s indexed VTOC has been disabled. In this case, run the BUILDIX function of ICKDSF to enable the indexed VTOC.

  5. Restart primary systems and perform the same system and application startup procedures that are performed on the primary system when applications start up following a system failure.
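
The ordering rule in step 2 for sessions that are coupled to different cluster sessions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the session and cluster names (SESSA, CLUSTER1, and so on) are hypothetical, the helper order_xrecover is not part of XRC, and the generated command text shows only the session ID and omits any operands that your installation requires.

```python
def order_xrecover(sessions):
    """Return XRECOVER commands so that all sessions coupled to one
    cluster session are listed before those of the next cluster session.

    `sessions` is a list of (session_id, cluster_id) pairs. All names
    are hypothetical placeholders, not values taken from a real system.
    """
    groups = {}  # cluster_id -> session IDs, in first-seen order
    for session_id, cluster_id in sessions:
        groups.setdefault(cluster_id, []).append(session_id)

    commands = []
    for members in groups.values():
        for session_id in members:
            # Only the session ID is shown; other operands are omitted.
            commands.append(f"XRECOVER {session_id}")
    return commands


# Hypothetical input: SESSA and SESSB share CLUSTER1, SESSC is in CLUSTER2.
example = [("SESSA", "CLUSTER1"), ("SESSC", "CLUSTER2"), ("SESSB", "CLUSTER1")]
commands = order_xrecover(example)
```

With this input, the commands for SESSA and SESSB are listed together before the command for SESSC, which matches the requirement to finish one cluster session before starting another. A command is still generated for every session, because a master XRECOVER command does not exist.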

You might want to include catalog volumes as part of the data that you copy to the recovery system. Recovery system catalog entries will then be consistent with those on the primary system. Use the procedures that are listed in Copying the catalog and control data sets to manage catalog updates that are not made to the recovery system.
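
Returning to step 3, the statements about which volumes XRECOVER processes, and about suspension times that differ from the session recovery time, can be expressed as a small check. The following Python sketch is an assumption-laden illustration, not an XRC interface: the SecondaryVolume record, the review_volumes helper, and the volume serials are hypothetical, and in practice the times would be read manually from the XQUERY volume report rather than parsed by this code.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SecondaryVolume:
    volser: str                                  # hypothetical volume serial
    reached_duplex: bool                         # False if pending or suspended before duplex
    suspension_time: Optional[datetime] = None   # None if the volume was never suspended


def review_volumes(volumes, session_recovery_time):
    """Split volumes into those XRECOVER does not process and those whose
    suspension time differs from the session recovery time."""
    not_processed, needs_review = [], []
    for vol in volumes:
        if not vol.reached_duplex:
            # Never reached duplex state, so XRECOVER does not process it.
            not_processed.append(vol.volser)
        elif vol.suspension_time is not None and vol.suspension_time != session_recovery_time:
            # Data might not be consistent with the other volumes.
            needs_review.append(vol.volser)
    return not_processed, needs_review


recovery_time = datetime(2024, 1, 1, 12, 0, 0)   # hypothetical session recovery time
volumes = [
    SecondaryVolume("VOL001", True),
    SecondaryVolume("VOL002", False),
    SecondaryVolume("VOL003", True, datetime(2024, 1, 1, 11, 30, 0)),
    SecondaryVolume("VOL004", True, recovery_time),
]
not_processed, needs_review = review_volumes(volumes, recovery_time)
```

In this example, VOL002 is reported as not processed, and VOL003 is flagged for review because its suspension time is earlier than the session recovery time.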