IBM Support

How to initiate an optimized restore operation for a large number of objects stored by IBM Spectrum Protect on tape

How To


Summary

This technote describes options for restoring a large number of objects that were backed up to an IBM Spectrum Protect server and are stored on sequential-access media such as tape.

Objective

The objective of this technote is to describe the available restore methods and provide a detailed procedure for restoring a large number of objects from multiple tape cartridges when the no-query restore method is not an option. The technote also provides a sample script to facilitate the restore process.
Problem description
Large file systems and file servers that hold millions of objects are typically used over the course of many years. To accommodate the large number of objects, high-capacity backup storage such as tape is required. As objects are added, updated, and deleted across the entire file system, the IBM Spectrum Protect backup-archive client runs incremental backup operations, which send data to the IBM Spectrum Protect server for storage on tape. In a typical situation, the data is distributed across many different tape cartridges.
The restore process requires access to these tape cartridges. If the restore process is not coordinated and does not occur in a tape-optimized manner, the process of mounting and spooling the tape cartridges can consume considerable time and resources.
Solutions
IBM Spectrum Protect supports two options for optimizing the restore operations:
  1. The no-query restore method. This method can be used to restore an entire file system or large directories. To use the IBM Spectrum Protect backup-archive client for this operation, follow the instructions in the No-query restore process topic in IBM Knowledge Center.
  2. A file list-based restore method. This method can be used when not all objects in a file system must be restored and the no-query restore method is not an option. This technote describes the file list-based restore method.
Tip: IBM Spectrum Protect provides options to improve the collocation of data when multiple nodes or multiple filespaces are protected in the same storage pool. Collocating data at the node or filespace level improves restore performance. To use these options, follow the instructions in the collocation topic in IBM Knowledge Center.

 

Environment

The described method is supported in environments where the IBM Spectrum Protect Version 8.1.2 or later backup-archive client protects file system objects. The backup-archive client can be on any operating system and system architecture that are supported for the selected level.

 

Steps

To restore file system objects that are stored on sequential-access backup storage like tape, complete the following steps:
  1. Ensure that your environment includes an output directory with sufficient free space to query backup information about the objects that you want to restore.
    To calculate the amount of space that is required for each queried object, use this formula:
    (Average length of a single object path, in bytes) + 500 bytes = A
    Then, multiply A by the number of objects that you plan to query.
     
  2. Query backup information about the objects that you want to restore by issuing the following backup-archive client command on a computer that is connected to the specific IBM Spectrum Protect node:
    dsmc query backup -subdir=yes -detail > /tmp/backup_query.out
    The output is similar to the following example:
    1,048,576  B  12/10/2020 10:46:58             DEFAULT              A  /gpfs/test2/file_99
    Modified: 12/10/2020 10:45:44  Accessed: 12/10/2020 10:45:44  Inode changed: 12/10/2020 10:45:44
    Compression Type: None  Encryption Type:        None  Client-deduplicated: NO  Migrated: NO  Inode#: 61554
    ACL Size: 0  Media Class: Fixed  Volume ID: 0003  Restore Order: 00000000-0000C95F-00000000-00300417
     
  3. For each object, extract the file name, the volume ID, and the restore order from the output file.
    In the following sample output, the required information is the file name at the end of the first line and the Volume ID and Restore Order values in the last line:

    1,048,576  B  12/10/2020 10:46:58             DEFAULT              A  /gpfs/test2/file_99
    Modified: 12/10/2020 10:45:44  Accessed: 12/10/2020 10:45:44  Inode changed: 12/10/2020 10:45:44
    Compression Type: None  Encryption Type:        None  Client-deduplicated: NO  Migrated: NO  Inode#: 61554
    ACL Size: 0  Media Class: Fixed  Volume ID: 0003  Restore Order: 00000000-0000C95F-00000000-00300417

     
  4. Write the information that you extracted to a text file in the following format:
    volume_id restore_order file_name

    For example, if you extract information from the previous sample output, you obtain the following result:
    0003 00000000 0000C95F 00000000 00300417 /gpfs/test2/file_99
     
  5. In the text file, organize the entries in ascending order based on volume ID, where the lowest volume ID is placed at the top of the list. Within each volume ID, keep the entries in ascending restore order.
     
  6. Split the list based on the first column, which reflects the volume ID, and remove the volume ID and restore order information from the list so that only the file names appear:
    file_name

    The previous example produces the following result:
    /gpfs/test2/file_99
    After you complete the process, you will have several file lists with one file list per volume. All of the files are listed in a tape-optimized restore order.
     
  7. Use the backup-archive client to restore each of the per-volume file lists separately.
    Multiple processes of the backup-archive client can be started at the same time.
    The maximum number of parallel backup-archive client processes is limited by the available tape drives and the setting of the IBM Spectrum Protect option MAXNUMMP.
    Issue the following backup-archive client command:
    dsmc restore -filelist=/tmp/backup_query_sorted_volume_0001.out

    Note: Do not start multiple restore processes against the same volume at the same time. Otherwise, restore processes must wait for tape cartridges that are in use.
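
Step 7 can be sketched as a small shell function, assuming that the per-volume file lists already exist. The function name run_volume_restores and the DSMC variable are hypothetical and are not part of the IBM sample script. The sketch relies on the -P option of xargs, which GNU and BSD xargs provide but strict POSIX does not. Each file list is passed to exactly one client process, so no two processes work on the same volume. Choose a limit that does not exceed the number of available tape drives or the MAXNUMMP setting.

```shell
#!/bin/bash
# Hypothetical helper (not part of the IBM sample script):
# start one "dsmc restore -filelist=..." process per volume file list,
# with at most MAX_PARALLEL client processes running at the same time.
# Usage: run_volume_restores MAX_PARALLEL FILELIST [FILELIST ...]
# Set DSMC to another command (for example, DSMC=echo) for a dry run.
run_volume_restores() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_volume_restores MAX_PARALLEL FILELIST..." >&2
        return 1
    fi
    max_parallel=$1
    shift
    # One file list per input line; xargs -P caps the number of concurrent
    # processes. Assumes that file list paths contain no quotes,
    # backslashes, or newlines.
    printf '%s\n' "$@" |
        xargs -P "$max_parallel" -I '{}' "${DSMC:-dsmc}" restore -filelist='{}'
}
```

For example, with two tape drives and the file lists from the sample script output, the call might look like this: run_volume_restores 2 ./tape_optimized_restore/sorted_volume.out.Fixed.*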

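The space calculation in Step 1 can be checked with a short shell function. This is a minimal sketch: the function name estimate_query_space and the example values (100-byte paths, 1,000,000 objects) are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: estimate the space that the query output requires.
# Arguments: average object path length in bytes, number of objects.
estimate_query_space() {
    avg_path_len=$1
    num_objects=$2
    # A = average path length + 500 bytes; total = A * number of objects
    echo $(( (avg_path_len + 500) * num_objects ))
}

# Example with assumed values: 100-byte paths and 1,000,000 objects
estimate_query_space 100 1000000   # prints 600000000 (about 600 MB)
```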
 
Tip: To automate the ordering and sorting process (Steps 3-6), you can use a script that is based on the following sample script: prepare_tape_opt_restore.sh
 
The only parameter of the script is the file that contains the result of the backup query that was run in Step 2.
prepare_tape_opt_restore.sh /tmp/backup_query.out
The script reads the input file, sorts the content, and creates output files. One output file is created for each volume that contains backup data, as shown in the following example:
scorpio:/gpfs # ./prepare_tape_opt_restore.sh /tmp/backup_query.out
Tue Jan 12 15:39:25 CET 2021 ... start ...
Tue Jan 12 15:39:25 CET 2021 ... remove directory ./tape_optimized_restore
Tue Jan 12 15:39:25 CET 2021 ... parse input file and write to file ./tape_optimized_restore/parser.out
Tue Jan 12 15:39:25 CET 2021 ... sort parser output and write to file ./tape_optimized_restore/sort.out
Tue Jan 12 15:39:25 CET 2021 ... create sorted lists and write to files ./tape_optimized_restore/sorted_volume.out*
Tue Jan 12 15:39:25 CET 2021 ...... sorted restore list for volume 0000  file name ./tape_optimized_restore/sorted_volume.out.Fixed.0000
Tue Jan 12 15:39:25 CET 2021 ...... sorted restore list for volume 0002  file name ./tape_optimized_restore/sorted_volume.out.Fixed.0002

Tue Jan 12 15:39:25 CET 2021 ...... sorted restore list for volume 0018  file name ./tape_optimized_restore/sorted_volume.out.Fixed.0018
Tue Jan 12 15:39:25 CET 2021 ... end ...
The sample script was implemented and tested on a Linux® operating system and is provided as a convenience, without guarantees or support.
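
For orientation, Steps 3-6 can be approximated with standard awk and sort. The following is a minimal sketch and is not the IBM sample script. The function name build_volume_filelists and the output file names filelist.<volume_id> are hypothetical. The sketch assumes that the query output follows the four-line layout of the sample in Step 2, that object paths are absolute UNIX-style paths, and that file names contain no tab or newline characters. It keys on the volume ID only and ignores the media class.

```shell
#!/bin/bash
# Hypothetical sketch of Steps 3-6; not the IBM sample script.
# Usage: build_volume_filelists QUERY_OUTPUT OUTPUT_DIR
build_volume_filelists() {
    query_out=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1

    # Steps 3-4: emit "volume_id<TAB>restore_order<TAB>file_name" per object.
    awk '
        # Object line as in the sample: size, "B", date, time, class, A, path.
        # The path is taken as the text from the first " /" onward.
        /^[ \t]*[0-9,]+[ \t]+B[ \t]/ {
            i = index($0, " /")
            name = (i > 0) ? substr($0, i + 1) : ""
            next
        }
        # Detail line that carries the volume ID and the restore order.
        /Volume ID:/ && /Restore Order:/ && name != "" {
            vol = $0; sub(/.*Volume ID:[ \t]*/, "", vol); sub(/[ \t].*/, "", vol)
            ord = $0; sub(/.*Restore Order:[ \t]*/, "", ord); sub(/[ \t].*/, "", ord)
            printf "%s\t%s\t%s\n", vol, ord, name
            name = ""
        }
    ' "$query_out" |
    # Step 5: sort by volume ID, then by restore order. Fixed-width
    # uppercase hex strings sort correctly as plain bytes in the C locale.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2 |
    # Step 6: write one file list per volume and keep only the file names.
    awk -F '\t' -v dir="$out_dir" '
        $1 != prev { if (prev != "") close(f); f = dir "/filelist." $1; prev = $1 }
        { print $3 > f }
    '
}
```

A possible invocation is build_volume_filelists /tmp/backup_query.out ./filelists, after which each resulting file can be passed to dsmc restore with the -filelist option as described in Step 7.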

 

Document Location

Worldwide

Document Information

Modified date:
20 January 2021

UID

ibm16380770