Possible failures during CDS and journal backup

If a CDS or journal backup fails, DFSMShsm documents the failure with message ARC0744E. This topic explains how to proceed after the most likely type of error and gives more general suggestions for other error conditions.

Growth of your control data sets

Over time, the number of data sets migrated and backed up, the number of tape volumes owned by DFSMShsm, and so on, typically increase, causing your control data sets to grow in size. A control data set can grow until the space allocated to its DASD backup copy is no longer sufficient, which produces return code 12, 13, 18, 47, or 48 in message ARC0744E during an attempted backup.

If the messages from DFSMSdss or Access Method Services indicate that a backup copy is full, proceed as follows:

  1. Issue the QUERY CDSVERSIONBACKUP command. The resulting message ARC0376I, for example,
    ARC0376I
    BACKUPCOPIES=3,
    BACKUPDEVICECATEGORY=DASD,
    LATESTFINALQUALIFIER=V0000170,
    DATAMOVER=HSM

    gives you the latest version qualifier, and how many backup copies need to be reallocated.

  2. Examine the message for the control data set that is identified in the ARC0744E message and determine which backup copies (MCDS, BCDS, OCDS, or JRNL) must be made larger. Assume that the backup copies of the BCDS must be reallocated.
  3. Rename the backup data sets uid.BCDS.BACKUP.V0000168 through uid.BCDS.BACKUP.V0000170 so they can be kept until there are successful backups.
  4. Compare the (used) size of the current BCDS to its size when the backup data sets for the BCDS were last allocated. This allows you to get a rough idea of how much larger the backup data sets must be.
  5. Use the ratio from step 4, and reallocate backup data sets uid.BCDS.BACKUP.V0000168 through uid.BCDS.BACKUP.V0000170 to their new larger sizes.
  6. Issue the commands
    RELEASE BACKUP
    BACKVOL CDS

    If the failed backup was for the journal, DFSMShsm will have inhibited journaling. A successful BACKVOL CDS command should back up the journal, then null the online journal.

  7. Once the BACKVOL command is successful, issue RELEASE commands for the other DFSMShsm functions that you are using.
  8. After DFSMShsm has made three successful backups, delete the backup data sets that were renamed in step 3. Each successful backup advances the latest final qualifier, so after three successful backups the output from a QUERY CDSV includes the following message:
    ARC0376I
    . . .
    LATESTFINALQUALIFIER=V0000173,
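
The arithmetic behind steps 1, 4, and 5 can be sketched in a few lines of Python. This is only an illustration, not part of DFSMShsm: the function names, the `headroom` factor, and the assumption that the qualifier always has the form `V` followed by seven digits are hypothetical, and the sizes can be in any unit as long as the same unit is used throughout.

```python
import math
import re


def backup_version_names(prefix, latest_qualifier, copies):
    """List the backup data set names for the `copies` most recent versions.

    Assumes (hypothetically) that the qualifier is 'V' plus seven digits,
    as in the ARC0376I example, and that versions are numbered consecutively.
    """
    match = re.fullmatch(r"V(\d{7})", latest_qualifier)
    if match is None:
        raise ValueError(f"unexpected qualifier format: {latest_qualifier!r}")
    latest = int(match.group(1))
    first = latest - copies + 1
    if first < 0:
        raise ValueError("fewer versions exist than the number of copies")
    return [f"{prefix}.V{n:07d}" for n in range(first, latest + 1)]


def scaled_allocation(old_allocation, used_then, used_now, headroom=1.0):
    """Scale an old allocation by the growth ratio from step 4.

    `headroom` is a hypothetical safety factor (1.0 means no extra space).
    The result is rounded up to a whole unit.
    """
    ratio = used_now / used_then
    return math.ceil(old_allocation * ratio * headroom)


if __name__ == "__main__":
    # Values from the ARC0376I example: three copies, latest V0000170.
    for name in backup_version_names("uid.BCDS.BACKUP", "V0000170", 3):
        print(name)
    # Made-up sizes: the BCDS used 400 units when the backups were last
    # allocated at 100 units each, and uses 500 units now.
    print(scaled_allocation(100, 400, 500))
```

With the example values, the first function returns the three names used in steps 3 and 5. The second function only gives a rough starting point for the new allocation; choosing a `headroom` above 1.0 leaves room for further growth before the next reallocation.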

Other failures

By considering the return code in message ARC0744E, you can determine which type of failure occurred:

  • If there is an indication of an error in the structure or contents of the control data set itself, perform the following steps:
    1. Take the action to correct the error as indicated by the error message number and the associated return code. This information is contained in the message documentation.
    2. You might need to recover the CDS to a previous level.

  • If there is an indication of an error in the system environment outside of the control data set (for example, an output tape could not be allocated), perform the following steps:
    1. Stop DFSMShsm if necessary (if, for example, the error was fragmented storage such that IDCAMS could not be loaded).
    2. Correct the environmental condition, if necessary.
    3. Restart DFSMShsm, if you stopped it.
    4. Issue the BACKVOL CDS command to make up for the backup that failed.
  • If the non-intrusive journal backup method was intended, but was not used, perform the following steps:
    1. Review the output of message ARC0750I, which is issued at the start of journal backup. This message will indicate which journal backup method was used and why non-intrusive journal backup was not used.
    2. Verify that all requirements for using non-intrusive journal backup are satisfied. For a list of requirements, see Using non-intrusive journal backup.
    When the conditions preventing non-intrusive journal backup are corrected, future CDS backup processing will use non-intrusive journal backup.
  • If there is an indication of a probable program error, gather the necessary data for contacting the IBM® Support Center.