Case 6: Overwritten migration tape

Important: This recovery scenario applies only to an overwritten header label. If the damage to the tape is more extensive than an overwritten label, the following recovery procedure will not work. Also, if the tape has been encrypted, there is similarly no way to recover it. If the tape cannot be recovered due to hardware problems related to the drive itself, consider sending it to your tape drive manufacturer for recovery. This scenario applies only to native tape; it will not work for the recovery of data from virtual tape.

If you have a duplex or a tape copy, perform a tape replace. This procedure is described in TAPEREPL command: Replacing cartridge-type tape volumes with their alternate volumes.

In either case, you should route the recovery assistance through IBM®. You may open a software problem report to diagnose the root cause of the damage, if it is not known.

Ensure that you have a backup copy of any data set that is going to tape (migration level 2). If a backup copy is available, you can recover from a backup version without having to recover from the damaged tape.

The following information is relevant to know or have available when you open a problem record:

  1. Specific failure indications: the full message text of any related message that was issued
  2. Type of drives that you are using
  3. Microcode levels for the drive and controller
  4. Reasonably available history for the tape
  5. Pervasiveness of the problem
  6. Recovery actions already attempted: RECALL, RECYCLE, DELVOL, and so forth
  7. Customers should gather the following information when the problem arises:
    • List TTOC of the affected volume
    • Library Manager Log for an ATL
    • CTRACE (if you are already writing to an external writer)
    • SYSLOG
    • DFSMShsm PDA trace
    • A dump of the failure, if it can be captured with a SLIP trap (ABEND637, for example)

In some cases, depending on the extent of the damage, you cannot recover some or all of the data. A backup of the tape ensures that this is not a concern. If the data matters, back it up before it migrates to migration level 2 (ML2).

The most common cause of this type of damage is that the wrong tape volume is mounted to satisfy a mount request for a scratch tape. The problem typically remains undiscovered until you need the overwritten tape volume to recall a data set.

Use the following recovery procedure to restore access to DFSMShsm data on tape cartridges. This procedure will not work for reel-type tape because of a difference in file format.

  1. Copy the damaged tape volume.

    (This step is optional.) If you want to keep the user data that was written over the DFSMShsm data, create a copy of that data now. At the end of this procedure, the damaged tape volume is returned to scratch or to DFSMShsm for later use, and the user data that was written on the volume is lost.

  2. Reinitialize the damaged tape volume.
    Reinitialize the tape volume with the data set name that DFSMShsm uses for all of its tape volumes. Write a dummy data set as the first file on the volume, with the following name:

    prefix.HMIGTAPE.DATASET

    The prefix can be any valid qualifier. For example, you can use the migration prefix as specified on the SETSYS MIGRATEPREFIX command, or the UID. This step will create a good tape label on the damaged tape. Once the tape label is valid, the volume can be processed by RECYCLE in step 4 of this procedure.

    IEBGENER can be used to perform the reinitialization task.

    The sample JCL that follows can be used to reinitialize a damaged migration volume. Modify the JCL to fit your needs by doing the following tasks:
    • Change the UNIT parameter on the SYSUT2 DD statement, if necessary.
    • Change prefix to the appropriate prefix.
    • Change ml2volser to your volume serial number.

    Included in the example that follows are the various messages that are generated from this job and the appropriate replies.

    Attention: Do not perform the AUDIT MEDIACONTROLS function. AUDIT MEDIACONTROLS will remove all of your data.
    //GENR     JOB
    //STEP1    EXEC PGM=IEBGENER
    //SYSUT1   DD DUMMY,LRECL=16384,BLKSIZE=16384,RECFM=F
    //SYSUT2   DD DISP=(NEW,KEEP),DCB=*.SYSUT1,UNIT=?3480,
    //            DSN=prefix.HMIGTAPE.DATASET,VOL=SER=?ml2volser
    //SYSPRINT DD SYSOUT=*
    //SYSIN    DD DUMMY
    /*
                          J E S 2  J O B  L O G
    
     JOB09371  $HASP373 HAMMERTA STARTED - INIT 3 - CLASS A - SYS ..
     JOB09371 *IEF233A M 5A8,HAMMRT,,HAMMERTA,STEP1
     JOB09371  IEC512I LBL ERR 5A8,SMFDMP,SL,HAMMRT,SL,HAMMERTA,STEP1
     JOB09371 *IEC507D E 5A8,SMFDMP,HAMMERTA,STEP1,DMP.T241T240.DATA
     JOB09371  IEC507D REPLY 'U'-USE OR 'M'-UNLOAD
     JOB09371  R 78,U
     JOB09371 *IEC534D A 5A8,SMFDMP,SL,HAMMERTA,STEP1
     JOB09371  IEC534D REPLY 'U'-USE OR 'M'-UNLOAD
     JOB09371  R 84,U
     JOB09371 *IEC704A L 5A8,HAMMRT,SL,NOCOMP,HAMMERTA,STEP1
     JOB09371  IEC704A REPLY 'VOLSER,OWNER INFORMATION','M'OR'U'
     JOB09371  R 87,'HAMMRT,DFSMSHSM'
     JOB09371  IEC705I TAPE ON 5A8,HAMMRT,SL,NOCOMP,HAMMERTA,STEP1
     JOB09371  IEF234E K 5A8,HAMMRT,PVT,HAMMERTA,STEP1
     JOB09371  HAMMERTA STEP1  : 00:17:30/ 00:00:00/    53 - R000
     JOB09371  $HASP395 HAMMERTA ENDED
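
The three substitutions listed for the sample JCL can also be scripted. The following Python sketch shows one possible way to do that; it is not part of DFSMShsm. The function name build_reinit_jcl and the validation rules are assumptions made for this example: the prefix is checked as a single 1 to 8 character qualifier, and the volume serial check (1 to 6 uppercase letters or digits) is deliberately stricter than what z/OS accepts.

```python
import re

# Sample JCL from this step, with the three site-specific values as fields.
JCL_TEMPLATE = """\
//GENR     JOB
//STEP1    EXEC PGM=IEBGENER
//SYSUT1   DD DUMMY,LRECL=16384,BLKSIZE=16384,RECFM=F
//SYSUT2   DD DISP=(NEW,KEEP),DCB=*.SYSUT1,UNIT={unit},
//            DSN={prefix}.HMIGTAPE.DATASET,VOL=SER={volser}
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
/*
"""

# One data set name qualifier: 1-8 characters, first one alphabetic or national.
QUALIFIER = re.compile(r"[A-Z#@$][A-Z0-9#@$-]{0,7}")
# Simplified checks (assumptions for this sketch, stricter than z/OS itself).
VOLSER = re.compile(r"[A-Z0-9]{1,6}")
UNIT = re.compile(r"[A-Z0-9]{1,8}")


def build_reinit_jcl(prefix, volser, unit="3480"):
    """Return the reinitialization JCL with the placeholders filled in."""
    if not QUALIFIER.fullmatch(prefix):
        raise ValueError(f"not a valid qualifier: {prefix!r}")
    if not VOLSER.fullmatch(volser):
        raise ValueError(f"not a valid volume serial: {volser!r}")
    if not UNIT.fullmatch(unit):
        raise ValueError(f"not a valid unit name: {unit!r}")
    jcl = JCL_TEMPLATE.format(unit=unit, prefix=prefix, volser=volser)
    # JCL statement text must not extend past column 71.
    if any(len(line) > 71 for line in jcl.splitlines()):
        raise ValueError("generated JCL exceeds column 71")
    return jcl


if __name__ == "__main__":
    print(build_reinit_jcl("HSM", "HAMMRT"))
```

Review the generated JCL before you submit it, and add whatever JOB statement parameters your installation requires.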
  3. Determine if the first data set on the damaged tape volume spanned from another volume.
    1. Issue the LIST TTOC(volser) ODS(dsname) command to determine if your damaged tape volume is associated with a previous tape volume. If your damaged tape is associated with a previous tape, it will be listed under the heading “PREV VOL.” When there is a previous tape, it means that the first data set on your damaged tape began on the previous tape.

      To verify whether your damaged tape is associated with a previous tape, enter the LIST TTOC(prev_volser) command and check to see that the last data set name on the previous tape is the same as the first data set name on your damaged tape. Note that if your installation is using extended tape table of contents, the LIST TTOC command might require an extended period of time to process each tape volume.

    2. Delete the first data set on your damaged tape volume if it spanned from a previous tape volume. To delete the first data set on a damaged migration tape, issue the following command:
      DELETE dsn
    3. Issue the following command to recover the first data set, which spanned from the previous volume:
      RECOVER dsn
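
The check in step 3 can be expressed as a short Python sketch. It assumes that you have already extracted the data set names, in tape order, from the LIST TTOC output for both volumes; parsing that output is not shown because its layout is not reproduced in this section. The function names are hypothetical and exist only for this illustration.

```python
def spanned_first_data_set(prev_volume_dsns, damaged_volume_dsns):
    """Return the data set that spans onto the damaged volume, or None.

    Both arguments are lists of data set names in the order in which
    they appear on each tape (an assumption of this sketch).
    """
    if not prev_volume_dsns or not damaged_volume_dsns:
        return None
    first_on_damaged = damaged_volume_dsns[0]
    # Spanning is confirmed when the last name on the previous tape
    # matches the first name on the damaged tape.
    if prev_volume_dsns[-1] == first_on_damaged:
        return first_on_damaged
    return None


def commands_for_spanned_data_set(dsn):
    """Return the two commands that step 3 issues for a spanned data set."""
    return [f"DELETE {dsn}", f"RECOVER {dsn}"]


if __name__ == "__main__":
    dsn = spanned_first_data_set(["A.ONE", "B.TWO"], ["B.TWO", "C.THREE"])
    if dsn is not None:
        print("\n".join(commands_for_spanned_data_set(dsn)))
```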
  4. Recycle the damaged tape volume. Depending on the type of error that you are recovering from, you might need to run RECYCLE several times. First issue the following patch command, which causes DFSMShsm to re-drive the POINT macro up to ten times when an I/O error is encountered during recycle. The re-drive allows DFSMShsm to recover data from a tape that has been overwritten but still contains valid DFSMShsm data beyond the tape marks created by the overwrite. The patch value (X'0A') is appropriate for IBM 3590-technology tapes. For other tape technologies, contact your tape-drive vendor to determine the correct number of re-points.
    HSEND PATCH .MCVT.+267 X'0A' VERIFY(.MCVT.+267 X'00')
    When all of the recovery procedures are complete, issue the following patch command to restore the original value and stop the re-drive of the POINT macro:
    HSEND PATCH .MCVT.+267 X'00' VERIFY(.MCVT.+267 X'0A')
    To recycle the damaged volume, issue the following command:
    RECYCLE VOLUME(volser) EXECUTE
    Repeat this command so that every readable data set is moved before you use the FORCE parameter of the RECYCLE command in step 6.
    Note: Readable, complete, and valid data sets are subsequently recycled to a new volume.
  5. Identify data sets that have been overwritten and build appropriate recovery commands.
    Create a list showing which migrated data sets you need to recover (because the RECYCLE command was not able to remove them from the volume), by issuing the following command:
    LIST TTOC(volser) ODS(dsn)

    Build RECOVER commands for each data set on the list, but do not process the commands until step 7.

    An alternative to using the LIST TTOC command is to use DFSORT. The following is an example of a DFSORT job that identifies and builds RECOVER commands for each migrated data set that was overwritten. This job only builds the RECOVER commands; it does not process them.
    //RCVCMDS JOB
    //*******************************************************************
    //* Please change all the following to the appropriate values:
    //* ?USERID - to your user id
    //* ?DFSMSHSM - to the DFSMShsm prefix for your MCDS
    //* ?VLSER - to the volser of your damaged tape volume
    //*******************************************************************
    //* This step uses SORT utility to build RECOVER
    //* commands.
    //*******************************************************************
    //STEP1 EXEC PGM=SORT,REGION=4096K
    //*******************************************************************
    //* The INCLUDE statement selects MCD records (X'00'
    //* at offset 51), that have the volser of your damaged tape and
    //* have the MCDFASN bit on.
    //* The OUTREC statement builds a RECOVER command for each
    //* migrated data set found.
    //*******************************************************************
    //SYSOUT DD SYSOUT=*
    //SORTIN DD DSN=?DFSMSHSM.MCDS,DISP=SHR
    //SORTOUT DD DSN=?USERID.TEST.SORTMCD,DISP=(NEW,CATLG),
    // DCB=(LRECL=255,BLKSIZE=0,RECFM=VB),
    // SPACE=(CYL,(5,1),RLSE),UNIT=SYSDA
    //SYSIN DD *
     SORT FIELDS=(COPY)
     INCLUDE COND=((51,1,BI,EQ,X'00'),AND,
                   (69,6,CH,EQ,C'?VLSER'),AND,
                   (75,1,BI,EQ,B'1.......'))
     OUTREC FIELDS=(1,4,C' HSEND RECOVER ',5,44)
    /*
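
To make the selection logic of the INCLUDE and OUTREC statements easier to follow, the following Python sketch applies the same three tests to a single record. It is an illustration only and does not read the MCDS. It assumes that DFSORT positions count a 4-byte record descriptor word (RDW) first, as the 1,4 field in the OUTREC statement suggests, that the record is passed in without that RDW, and that character data is EBCDIC in code page 037. The function names are hypothetical.

```python
from typing import Optional

RDW_LENGTH = 4  # assumption: DFSORT positions include a 4-byte RDW


def field(record: bytes, position: int, length: int) -> bytes:
    """Return the bytes at a 1-based DFSORT position of an RDW-less record."""
    start = position - 1 - RDW_LENGTH
    return record[start:start + length]


def recover_command(record: bytes, volser: str) -> Optional[str]:
    """Return a RECOVER command if the record passes the INCLUDE tests."""
    if len(record) < 75 - RDW_LENGTH:
        return None  # too short to hold the flag byte at position 75
    # (51,1,BI,EQ,X'00'): record type byte selects MCD records.
    if field(record, 51, 1) != b"\x00":
        return None
    # (69,6,CH,EQ,C'volser'): the damaged tape's volume serial.
    if field(record, 69, 6) != volser.ljust(6).encode("cp037"):
        return None
    # (75,1,BI,EQ,B'1.......'): high-order bit (MCDFASN) must be on.
    if not field(record, 75, 1)[0] & 0x80:
        return None
    # OUTREC copies positions 5-48, the 44-byte data set name.
    dsn = field(record, 5, 44).decode("cp037").rstrip()
    return f"HSEND RECOVER {dsn}"


if __name__ == "__main__":
    sample = bytearray(80)
    sample[0:44] = "USER.TEST.DATA".ljust(44).encode("cp037")
    sample[64:70] = "HAMMRT".encode("cp037")
    sample[70] = 0x80
    print(recover_command(bytes(sample), "HAMMRT"))
```

Unlike the OUTREC statement, the sketch does not put a leading blank in front of HSEND.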
  6. Issue the RECYCLE VOLUME(volser) FORCE EXECUTE command for the damaged volume.

    The FORCE parameter of the RECYCLE command causes all data sets with errors to be deleted from the control data sets and the catalog. The deletion makes the tape empty, and the tape is removed from DFSMShsm's inventory of tapes that contain data.

    If the RECYCLE command fails with return code 16, you must issue a second RECYCLE VOLUME command to recycle the valid data sets while processing in single buffer mode. For an explanation of return code 16, refer to z/OS MVS System Messages, Vol 2 (ARC-ASA) and the table in the appendix titled "Reason Codes for Message ARC0734I when the Action is RECYCLE and Return Code is 16."

  7. Issue the RECOVER commands for the migrated data sets that were lost. The RECOVER commands were generated in step 5.
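
The order of the commands in steps 4 through 7 can be summarized in a short Python sketch. It only builds a list of command strings taken from this procedure; it does not issue them. The function name recovery_plan and the default of two RECYCLE passes are assumptions made for this example (the procedure says only that RECYCLE might need to run several times), and placing the patch reset last is this sketch's reading of "at the end of the recovery procedures". The list of lost data sets must be collected in step 5, before the RECYCLE with FORCE removes those data sets from the control data sets.

```python
PATCH_ON = "HSEND PATCH .MCVT.+267 X'0A' VERIFY(.MCVT.+267 X'00')"
PATCH_OFF = "HSEND PATCH .MCVT.+267 X'00' VERIFY(.MCVT.+267 X'0A')"


def recovery_plan(volser, lost_dsns, recycle_passes=2):
    """Return the commands of steps 4 to 7 in the order they are issued.

    lost_dsns is the list built in step 5; recycle_passes is a
    hypothetical parameter of this sketch.
    """
    if recycle_passes < 1:
        raise ValueError("at least one RECYCLE pass is required")
    plan = [PATCH_ON]                                           # step 4
    plan += [f"RECYCLE VOLUME({volser}) EXECUTE"] * recycle_passes
    plan.append(f"RECYCLE VOLUME({volser}) FORCE EXECUTE")      # step 6
    plan += [f"RECOVER {dsn}" for dsn in lost_dsns]             # step 7
    plan.append(PATCH_OFF)                                      # reset patch
    return plan


if __name__ == "__main__":
    for command in recovery_plan("HAMMRT", ["USER.TEST.DATA"]):
        print(command)
```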