How To
Summary
This technote describes options for restoring a large number of objects that were backed up to an IBM Spectrum Protect server and are stored on sequential-access media such as tape.
Objective
The objective of this technote is to describe the available restore methods and provide a detailed procedure for restoring a large number of objects from multiple tape cartridges when the no-query restore method is not an option. The technote also provides a sample script to facilitate the restore process.
Problem description
Large file systems and file servers that hold millions of objects are typically used over the course of many years. To accommodate the large number of objects, high-capacity backup storage such as tape is required. As objects are added, updated, and deleted across the entire file system, the IBM Spectrum Protect backup-archive client runs incremental backup operations, which send data to the IBM Spectrum Protect server for storage on tape. In a typical situation, the data is distributed across many different tape cartridges.
The restore process requires access to these tape cartridges. If the restore process is not coordinated and does not occur in a tape-optimized manner, the process of mounting and spooling the tape cartridges can consume considerable time and resources.
Solutions
IBM Spectrum Protect supports two options for optimizing the restore operations:
- The no-query restore method. This method can be used to restore an entire file system or large directories. To use the IBM Spectrum Protect backup-archive client for this operation, follow the instructions in the No-query restore process topic in IBM Knowledge Center.
- A file list-based restore method. This method can be used when not all objects in a file system must be restored and the no-query restore method is not an option. This technote describes the file list-based restore method.
Tip: IBM Spectrum Protect provides options to improve the collocation of data when multiple nodes or multiple filespaces are protected in the same storage pool. Collocating data at the node or filespace level improves restore performance. To use these options, follow the instructions in the collocation topic in IBM Knowledge Center.
Environment
The described method is supported in environments where the IBM Spectrum Protect Version 8.1.2 or later backup-archive client protects file system objects. The backup-archive client can be on any operating system and system architecture that are supported for the selected level.
Steps
To restore file system objects that are stored on sequential-access backup storage such as tape, complete the following steps:
- Ensure that your environment includes an output directory with sufficient free space to query backup information about the objects that you want to restore.
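To calculate the amount of space that is required for each queried object, use this formula:
(Average path length of a single object, in bytes) + 500 bytes = A
Then, multiply A by the number of objects that you plan to query. For example, with an average path length of 100 bytes, A is 600 bytes, and querying 10 million objects requires approximately 6 GB of free space.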
- Query backup information about the objects that you want to restore by issuing the following backup-archive client command on a computer that is connected to the specific IBM Spectrum Protect node:
dsmc query backup -subdir=yes -detail > /tmp/backup_query.out
- For each object, extract the file name, the volume ID, and the restore order from the output file.
In the following sample output, the required information is the file name (/gpfs/test2/file_99), the volume ID (0003), and the restore order (00000000-0000C95F-00000000-00300417):
1,048,576 B 12/10/2020 10:46:58 DEFAULT A /gpfs/test2/file_99
Modified: 12/10/2020 10:45:44 Accessed: 12/10/2020 10:45:44 Inode changed: 12/10/2020 10:45:44
Compression Type: None Encryption Type: None Client-deduplicated: NO Migrated: NO Inode#: 61554
ACL Size: 0 Media Class: Fixed Volume ID: 0003 Restore Order: 00000000-0000C95F-00000000-00300417
- Write the information that you extracted to a text file in the following format:
volume_id restore_order file_name
For example, if you extract information from the previous sample output, you obtain the following result:
0003 00000000 0000C95F 00000000 00300417 /gpfs/test2/file_99
- In the text file, sort the entries in ascending order by volume ID, so that the lowest volume ID is at the top of the list. Within each volume ID, sort the entries in ascending order by restore order.
- Split the list based on the first column, which reflects the volume ID, and remove the restore order information from the list so that only the file names appear:
file_name
The previous example produces the following result:
/gpfs/test2/file_99
After you complete the process, you will have several file lists with one file list per volume. All of the files are listed in a tape-optimized restore order.
- Use the backup-archive client to restore each of the per-volume file lists separately.
Multiple processes of the backup-archive client can be started at the same time.
The maximum number of parallel backup-archive client processes is limited by the available tape drives and the setting of the IBM Spectrum Protect option MAXNUMMP.
Issue the following backup-archive client command:
dsmc restore -filelist=/tmp/backup_query_sorted_volume_0001.out
Note: To prevent restore processes from waiting for tape cartridges that are in use, do not start multiple restore processes against the same volume at the same time.
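The last step can be illustrated with a minimal Python sketch that starts one restore process per file list and caps the number of processes that run at the same time. The sketch is not part of the technote's sample script. The function names build_restore_command and restore_all are hypothetical, and the value of max_parallel is an assumption that you would choose based on the available tape drives and the MAXNUMMP setting. Only the dsmc restore -filelist option from the step above is used.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_restore_command(file_list):
    """Return the restore command for one per-volume file list."""
    # Only the -filelist option shown in the restore step is used here.
    return ["dsmc", "restore", f"-filelist={file_list}"]


def restore_all(file_lists, max_parallel, run=subprocess.call):
    """Run one restore per file list, at most max_parallel at a time.

    Each file list is assumed to cover exactly one volume, so no two
    processes started here target the same volume. Returns a dict that
    maps each file list to the exit code returned by `run`.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    commands = [build_restore_command(path) for path in file_lists]
    # The thread pool size caps the number of concurrent client processes.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        exit_codes = list(pool.map(run, commands))
    return dict(zip(file_lists, exit_codes))
```

For example, restore_all(list_of_file_lists, max_parallel=2) would keep at most two backup-archive client processes active. The run parameter exists only so that the command runner can be replaced, for instance with a dry-run function that prints each command instead of running it.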
Tip: To automate the ordering and sorting process (Steps 3-6), you can use a script that is based on the following sample script: prepare_tape_opt_restore.sh
The only parameter of the script is the file that contains the result of the backup query that was run in Step 2.
prepare_tape_opt_restore.sh /tmp/backup_query.out
The script reads the input file, sorts the content, and creates output files. One output file is created for each volume that contains backup data, as shown in the following example:
scorpio:/gpfs # ./prepare_tape_opt_restore.sh /tmp/backup_query.out
Tue Jan 12 15:39:25 CET 2021 ... start ...
Tue Jan 12 15:39:25 CET 2021 ... remove directory ./tape_optimized_restore
Tue Jan 12 15:39:25 CET 2021 ... parse input file and write to file ./tape_optimized_restore/parser.out
Tue Jan 12 15:39:25 CET 2021 ... sort parser output and write to file ./tape_optimized_restore/sort.out
Tue Jan 12 15:39:25 CET 2021 ... create sorted lists and write to files ./tape_optimized_restore/sorted_volume.out*
Tue Jan 12 15:39:25 CET 2021 ...... sorted restore list for volume 0000 file name ./tape_optimized_restore/sorted_volume.out.Fixed.0000
Tue Jan 12 15:39:25 CET 2021 ...... sorted restore list for volume 0002 file name ./tape_optimized_restore/sorted_volume.out.Fixed.0002
Tue Jan 12 15:39:25 CET 2021 ...... sorted restore list for volume 0018 file name ./tape_optimized_restore/sorted_volume.out.Fixed.0018
Tue Jan 12 15:39:25 CET 2021 ... end ...
The sample script was implemented and tested on a Linux® operating system and is provided as a convenience, without guarantees or support.
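To make Steps 3-6 concrete, the following minimal Python sketch parses the query output, sorts the entries, and builds one file list per volume. It is an illustration only and is not the prepare_tape_opt_restore.sh sample script. It assumes that every entry has the layout of the sample output shown earlier (a first line that ends with the file name, and a later line that contains Volume ID and Restore Order), and that restore orders compare field by field as hexadecimal numbers. All function names and the output file naming are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed first line of an entry, based on the sample output:
# size, unit, backup date, backup time, management class, state (A or I), file name.
ENTRY_RE = re.compile(
    r"^\s*[\d,]+\s+\S+\s+\S+\s+\S+\s+\S+\s+[AI]\s+(?P<name>\S.*?)\s*$"
)
# Assumed line that carries the volume ID and the restore order.
VOLUME_RE = re.compile(
    r"Volume ID:\s*(?P<vol>\S+)\s+Restore Order:\s*(?P<order>[0-9A-Fa-f-]+)"
)


def parse_query_output(lines):
    """Yield (volume_id, restore_order, file_name) for each parsed entry."""
    current_name = None
    for line in lines:
        entry = ENTRY_RE.match(line)
        if entry:
            current_name = entry.group("name")
            continue
        volume = VOLUME_RE.search(line)
        if volume and current_name is not None:
            yield volume.group("vol"), volume.group("order"), current_name
            current_name = None


def order_key(restore_order):
    """Turn a restore order such as 00000000-0000C95F-... into a sortable tuple."""
    # Assumption: the hyphen-separated fields are hexadecimal numbers.
    return tuple(int(part, 16) for part in restore_order.split("-"))


def build_file_lists(lines):
    """Return {volume_id: [file names in ascending restore order]}."""
    per_volume = defaultdict(list)
    for volume_id, restore_order, file_name in parse_query_output(lines):
        per_volume[volume_id].append((order_key(restore_order), file_name))
    return {
        volume_id: [name for _, name in sorted(entries)]
        for volume_id, entries in sorted(per_volume.items())
    }


def write_file_lists(file_lists, out_dir):
    """Write one file per volume with one file name per line; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for volume_id, names in file_lists.items():
        # Hypothetical naming scheme, not the one used by the sample script.
        path = out_dir / f"filelist_volume_{volume_id}.out"
        path.write_text("".join(name + "\n" for name in names))
        paths.append(path)
    return paths
```

A possible use is to open /tmp/backup_query.out, pass the file object to build_file_lists, and pass the result to write_file_lists together with an output directory. Entries whose layout differs from the sample output are skipped silently, so it is worth comparing the number of parsed entries with the number of objects that you expect to restore.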
Document Location
Worldwide
Document Information
Modified date:
20 January 2021
UID
ibm16380770