Repairing storage pools from container-copy storage pool volumes

If files, directories, or storage pools on a source server are damaged, you can repair data extents in a directory-container storage pool on the source server by retrieving the deduplicated data extents from onsite or offsite container-copy storage pool tape volumes.

Before you begin

Complete the following steps:
  1. Evaluate your storage environment to determine whether outages, network issues, or hardware failures are causing damage to data or causing the data to appear damaged. If issues in your environment are causing damage to data, identify and resolve the issues.
  2. Ensure that enough space is available in the directory-container storage pool for the recovered data. To determine how much data will be repaired, issue the REPAIR STGPOOL command with the PREVIEW=YES parameter. If the space is insufficient, use the DEFINE STGPOOLDIRECTORY command to provision space.
  3. Back up the IBM Spectrum® Protect server database by using one of the following methods:
    • On the Operations Center Overview page, click Servers, select a server, and click Back Up.
    • Issue the BACKUP DB administrative command.
  4. Review the latest information about repairing and recovering data in technote 2013682.
  5. To plan the next steps, review the following restrictions about using the AUDIT CONTAINER command.
    Attention:
    • If you issue the AUDIT CONTAINER command with the ACTION=MARKDAMAGED setting for an entire storage pool, referenced data is unavailable for restore operations until the storage pool is repaired. Depending on the database size, network bandwidth, media speed, and other factors, the REPAIR STGPOOL command might run for hours or days. For this reason, if some of the data in the storage pool is available, or the status of data in the storage pool is unknown, follow these guidelines:
      1. Consider running the AUDIT CONTAINER command with the ACTION=SCANALL setting first. The ACTION=SCANALL setting identifies database records that refer to data extents with inconsistencies. Only those data extents are marked as damaged in the database.
      2. After the extents are marked as damaged, you can run the REPAIR STGPOOL command.
    • If you plan to run the AUDIT CONTAINER command with the ACTION=REMOVEDAMAGED setting, follow these guidelines:
      1. Consider running the QUERY DAMAGED command first to determine the scope of damaged data extents in the storage pool.
      2. After that, you can run the REPAIR STGPOOL command to repair damaged extents in the storage pool.
      3. Finally, you can run the AUDIT CONTAINER command with the ACTION=REMOVEDAMAGED setting to remove any damaged data extents that remain in the storage pool.
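
The preparation steps above might translate into administrative commands such as the following sketch. The pool name STGPOOL1 follows the examples in this topic. The directory n:\pooldir2 and the device class DBBACK are hypothetical names, so substitute values from your own environment.

```
/* Preview the repair to see how much data would be recovered */
repair stgpool stgpool1 srclocation=local preview=yes
/* If space is insufficient, add a directory (n:\pooldir2 is a hypothetical path) */
define stgpooldirectory stgpool1 n:\pooldir2
/* Back up the server database (DBBACK is a hypothetical device class) */
backup db devclass=dbback type=full
/* Mark only the extents with inconsistencies as damaged, then check the scope */
audit container stgpool=stgpool1 action=scanall
query damaged stgpool1
```

The comment lines use the syntax for administrative client macros. Omit them if you enter the commands interactively.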

About this task

Use this procedure to repair the following types of damage:
  • Minor damage that is caused by accidental deletion of files or directories, overwritten files, accidental changes in file permissions, or disk errors caused by hardware issues.
  • Moderate damage that is caused by disk errors or disk mount errors. This type of damage results in the loss of one or more directories, but not a loss of the entire storage pool.
Damaged deduplicated extents are repaired with extents that were protected to container-copy storage pools.
Restriction: You can issue the REPAIR STGPOOL command for a specified storage pool only if you already copied the data to container-copy storage pools by using the PROTECT STGPOOL command.
When you repair a directory-container storage pool from container-copy pools, the REPAIR STGPOOL command fails if any of the following conditions occur:
  • The container-copy storage pool is unavailable.
  • The container-copy storage pool is damaged.
  • The container-copy storage pool volumes are unavailable or damaged.
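
As a sketch of the prerequisite in the preceding restriction, the following command protects STGPOOL1 to container-copy storage pools on the same server. The example assumes that a container-copy storage pool is already defined and is specified as a local protection target for STGPOOL1.

```
/* Copy data extents from STGPOOL1 to its local container-copy storage pools */
protect stgpool stgpool1 type=local
```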

Procedure

  1. If you suspect minor damage, issue the AUDIT CONTAINER command for the container storage pool at the directory level to identify inconsistencies between the database and the directory-container storage pool. By identifying the damaged data extents in the directory-container storage pool, you can determine which data extents to repair. To conserve time and resources, audit only containers that you suspect are damaged. If you suspect that your container storage pool has more serious damage, issue the AUDIT CONTAINER command at the storage pool level.
    For example, to audit a directory, n:\pooldir, in a storage pool that is named STGPOOL1, issue the following command:
    audit container stgpool=stgpool1 stgpooldirectory=n:\pooldir
    To audit a storage pool that is named STGPOOL1, issue the following command:
    audit container stgpool=stgpool1

    The audit process might run for several hours.

    During the repair operation, the server prompts you for the volumes that it requires. In step 3, you bring any offsite volumes onsite and check them into the library.

  2. To preview the repair operation and generate the list of tape volumes that are needed for the repair operation, issue the REPAIR STGPOOL command and specify the SRCLOCATION=LOCAL and PREVIEW=YES parameters.
    For example, to preview the repair operation for a storage pool that is named STGPOOL1 from container-copy storage pools, issue the following command:
    repair stgpool stgpool1 srclocation=local preview=yes

    The preview process might take some time to finish.

  3. If some of the required volumes are offsite, complete the following steps:
    1. Use the list from the preview operation to determine which volumes need to be brought onsite.
    2. When the volumes are back onsite, check them into the library by issuing the CHECKIN LIBVOLUME command and specifying the STATUS=PRIVATE parameter.
    3. Update the status of the volumes by issuing the UPDATE VOLUME command and specifying the ACCESS=READWRITE parameter.
    For detailed instructions about the disaster recovery manager (DRM) function, see Using disaster recovery manager for tape environments (V7.1.1).
  4. Based on the information that you obtained during the preview operation, ensure that the storage pool contains enough space for the recovered data. If there is not enough space, use the DEFINE STGPOOLDIRECTORY command to provision space.
  5. To repair the directory-container storage pool, issue the REPAIR STGPOOL command and specify the SRCLOCATION=LOCAL parameter.
    For example, to repair a storage pool that is named STGPOOL1 from a container-copy storage pool, issue the following command:
    repair stgpool stgpool1 srclocation=local

    When you issue the REPAIR STGPOOL command, the damaged extents are deleted from the volume immediately after they are repaired. The damaged extents are not retained according to the value specified by the REUSEDELAY parameter.

  6. Identify any additional damaged extents by issuing the QUERY DAMAGED command.
  7. If damage is detected and deduplicated extents cannot be repaired from the container-copy storage pools, the extents might still be repaired. In some cases, the client node resends data during a backup operation and the damaged extents are repaired. Wait two backup cycles to allow client backup operations to occur. After two backup cycles, complete the following steps:
    1. To confirm that the damage is repaired, reissue the QUERY DAMAGED command.
    2. If an entire storage pool directory is damaged, create a new replacement storage pool directory using the DEFINE STGPOOLDIRECTORY command.
    3. To remove objects that refer to damaged data, issue the AUDIT CONTAINER command and specify the ACTION=REMOVEDAMAGED parameter.
      For example, to audit a directory-container storage pool that is named STGPOOL1 and remove damaged objects, issue the following command:
      audit container stgpool=stgpool1 action=removedamaged
    4. Optionally, issue the DELETE STGPOOLDIRECTORY command to delete the empty storage pool directory that you replaced with a new directory in substep 2.
  8. If you repaired an entire storage pool directory, delete the original directory, which is empty and was replaced by a new directory. Delete the original directory by issuing the DELETE STGPOOLDIRECTORY command.
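
Several steps in this procedure are described without command examples. The following sketch shows how they might look. It assumes a library that is named TAPELIB1, a volume that is named VOL001, and a replacement directory n:\pooldir2, all of which are hypothetical names. Depending on the library type, the CHECKIN LIBVOLUME command might require additional parameters.

```
/* Step 3: check a returned volume into the library (hypothetical names) */
checkin libvolume tapelib1 vol001 status=private
/* Steps 4 and 7: add space or a replacement directory (hypothetical path) */
define stgpooldirectory stgpool1 n:\pooldir2
/* Steps 6 and 7: list the damaged data in the storage pool */
query damaged stgpool1
/* Step 8: delete the original, empty directory that was replaced */
delete stgpooldirectory stgpool1 n:\pooldir
```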

What to do next

If you continue to detect damaged data over time, issue the AUDIT CONTAINER command for the directory-container storage pool to determine whether there is more widespread damage. For example, to audit a storage pool that is named STGPOOL1, issue the following command:
audit container stgpool=stgpool1