Sizing a cold-data-cache storage pool

Before you back up object client data to a cold-data-cache storage pool, size the cold-data-cache storage pool. The cold-data-cache storage pool acts as the initial disk-based storage location for object client data that is copied to IBM® Storage Protect for archiving to tape storage. By correctly sizing the cold-data-cache storage pool, you can help to improve the throughput of archive operations, reduce the risk of archive failures, and ensure that enough storage capacity is available for data ingestion and restore operations.

Before you begin

The object client must be an IBM Storage Protect Plus server.
Tip: In previous releases, the process of copying data from IBM Storage Protect Plus to secondary backup storage was known as offloading data. Beginning with IBM Storage Protect 8.1.9, the process is known as copying data.

About this task

Data that is copied from IBM Storage Protect Plus is stored temporarily on disk in file volumes that are specified for the cold-data-cache storage pool. Then, data is migrated to the next storage pool that is defined on the DEFINE STGPOOL command for the cold-data-cache storage pool. After the data is migrated to a tape storage pool, the data is deleted from the cold-data-cache storage pool.

Tip: The tape storage pool is defined as a next storage pool by specifying the NEXTSTGPOOL parameter on the DEFINE STGPOOL command for the cold-data-cache storage pool.
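
For example, the following command sketch shows how a cold-data-cache storage pool might be defined with a tape storage pool as the next storage pool. The pool names and directory paths are placeholders, and the STGTYPE and DIRECTORY parameters are assumptions about the exact syntax; verify them against the DEFINE STGPOOL command reference for your release:

  define stgpool ccpool stgtype=colddatacache directory=/cc/fs1,/cc/fs2 nextstgpool=ltopool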

Similarly, during a restore operation, object data is staged temporarily in the cold-data-cache storage pool before it can be read by the object client. When IBM Storage Protect Plus issues a request to restore object data from tape storage, the IBM Storage Protect server temporarily copies the data from the tape storage pool to the cold-data-cache storage pool, where IBM Storage Protect Plus can then restore it. Requested data is retained in the cold-data-cache storage pool for a specified number of days before it is deleted.

Consider the following guidelines for running migration processes on cold-data-cache storage pools:
  • Data becomes eligible for migration from the cold-data-cache storage pool as file volumes become full or are closed.
  • Processes to ingest new data and migrate eligible data to next storage pools can occur in parallel. As the data is migrated, it is deleted from the cold-data-cache storage pool. You can configure the number of parallel processes by specifying the MIGPROCESS parameter on the DEFINE STGPOOL command for the cold-data-cache storage pool. The number of parallel processes might be limited by the number of drives that are available for migration on the tape storage pool.
  • Migration performance can be limited by the throughput capability of the tape storage pool drives. For example, throughput rates of 300 - 400 MB per second are common with LTO-8 tape drives and volumes during migration.
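
To verify that migration is keeping pace with data ingestion, you can query the server. In the following sketch, the pool name is a placeholder:

  query stgpool ccpool format=detailed
  query process

The detailed QUERY STGPOOL output includes the pool's utilization, and QUERY PROCESS lists any active migration processes.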

To accommodate both recently copied data and data copies that are staged for restore operations back to the object client, adequate space must be provisioned for the cold-data-cache storage pool. The IBM Storage Protect server reads and writes to the cold-data-cache storage pool predominantly in 256 KB blocks.
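
For example, if an object client copies 20 TB of data on a peak day and up to 5 TB of restored data might be staged at the same time, provision at least 25 TB for the cold-data-cache storage pool, plus headroom for days when migration to tape is delayed. These figures are illustrative only; substitute the peak copy and restore volumes from your own environment.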

Procedure

To size and tune the cold-data-cache storage pool, use the following guidelines. A combined command sketch appears after the list:

  • Use the tsmdiskperf.pl Perl script as a benchmarking tool to size the cold-data-cache storage pool.
    • Benchmark the directory paths that you plan to use for the cold-data-cache storage pool with an overlapped, sequential read-and-write workload that uses a 256 KB block size.
    • To run the script, issue the following command:
      perl tsmdiskperf.pl workload=stgpool fslist=directory_list
      where directory_list is a comma-separated list of directory paths.
    • Ensure that the data ingestion rate that is obtainable for these directory locations satisfies the speed requirements for data-ingestion operations in your environment.
    For benchmarking tools and sample benchmarking tests, see the IBM Storage Protect Blueprints. The benchmarking tool tsmdiskperf.pl is available in the Blueprint configuration scripts package.
  • Ensure that the cold-data-cache storage pool is large enough to hold the daily volume of data from a copy operation. In this way, if an issue with the next tape storage pool prevents or slows migration, sufficient space is available to contain the daily workload and avoid failures.
  • Where possible, optimize disk system performance by configuring the disk system for random read/write operations rather than sequential read/write operations.
  • Use RAID 5, RAID 6, or other disk protection for the cold-data-cache directory file system disks to avoid data loss.
  • On the DEFINE STGPOOL or UPDATE STGPOOL commands for the cold-data-cache storage pool, set the MIGPROCESS parameter value to match the number of tape drives in the next tape storage pool that can be used for migration activities. To optimize migration performance and ensure that the cold-data-cache storage pool releases space as quickly as possible, set the MIGPROCESS parameter to as high a value as possible. You can enter a value in the range 1 - 999.
    Tip: When you specify the MIGPROCESS parameter, consider other uses of the tape storage pool that might compete for resources. For example, you might use the tape storage pool to back up the IBM Storage Protect database.
  • To achieve optimal throughput for the object client node that runs backup and restore operations to the cold-data-cache storage pool, set the MAXNUMMP parameter on the REGISTER NODE or UPDATE NODE command to a value of at least 100.
    Tip: This parameter limits how many mount points a node can use on the server. The IBM Storage Protect object agent can distribute backup and restore data movement across as many as 100 sessions for a single client node.
  • On the DEFINE STGPOOL or UPDATE STGPOOL commands for the tape storage pool, set the COLLOCATE parameter to match your requirements. By default, collocation by group is used for sequential-access storage pools; if no collocation groups exist on the server, collocation by node is used instead. Each migration process from the cold-data-cache storage pool attempts to use a drive in the next tape storage pool, if one is available. When collocation is used, the IBM Storage Protect server attempts to store group, node, or file space data together on as few tape volumes as possible.
    Tip: During an operation to restore data from tape storage, the IBM Storage Protect server might attempt to use multiple tape volume mounts, depending on the number of tape volumes in use. By default, the IBM Storage Protect server attempts to use up to four processes to restore data from tape volumes. The number of volumes limits the number of processes.
  • To release space by allowing the ingestion of recently copied data to preempt restored data copies, specify the REMOVERESTOREDCOPYBEFORELIFETIMEEND=YES setting on the DEFINE STGPOOL or UPDATE STGPOOL command for the cold-data-cache storage pool. When this parameter is set to YES, IBM Storage Protect removes restored data copies that are eligible for early deletion, according to defined conditions, to create space for new data copy operations.
  • By default, the MAXSCRATCH parameter on the DEFINE STGPOOL command is set to 5000 for a cold-data-cache storage pool. This parameter controls the maximum number of scratch file volumes that can be created in the storage pool during data ingestion and restore operations. By default, the device class that is created when you define the cold-data-cache storage pool has a volume size of 10 GB for an overall default capacity of 50,000 GB.
    If a larger capacity is needed, use the UPDATE STGPOOL command to increase the MAXSCRATCH parameter value for the cold-data-cache storage pool. The maximum value for this parameter is 9999. If more capacity is needed, you can also increase the cold-data-cache storage pool's device class volume size by issuing the UPDATE DEVCLASS command.
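
The following sketch brings together the tuning parameters from the preceding list in one place. All names and values are illustrative placeholders: the example assumes that four tape drives are available for migration, that the automatically created device class is named ccpool_devclass, and that the object client node is named objclient. Verify each parameter against the command reference for your release:

  update stgpool ccpool migprocess=4 maxscratch=9999 removerestoredcopybeforelifetimeend=yes
  update stgpool ltopool collocate=group
  update node objclient maxnummp=100
  update devclass ccpool_devclass maxcapacity=50g

Increasing the MAXCAPACITY value on the device class enlarges each file volume, which raises the overall capacity of the cold-data-cache storage pool beyond what a higher MAXSCRATCH value alone provides.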

Example architecture of data flows for copy and restore operations

The following image shows an example of a typical data flow to copy data from IBM Storage Protect Plus to the cold-data-cache storage pool on an IBM Storage Protect server so that the server can move the data to tape storage.
Figure 1. Data flow for copying data
Tip: For detailed instructions, see Configuring operations for copying data to tape.

The following image shows an example of a typical data flow to restore data from tape storage to the IBM Storage Protect Plus object client by using cold-data-cache storage pools on the IBM Storage Protect server.

Figure 2. Data flow for restoring data
Tip: For detailed instructions, see Restoring data from tape to IBM Storage Protect Plus.

What to do next

  • Monitor used space within the cold-data-cache storage pool. If the storage pool frequently runs out of space, the performance of disk-read and tape-write operations might be insufficient to handle the target data ingestion workload.
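    For example, you can check utilization with a standard query, where the pool name is a placeholder:
      query stgpool ccpool
    In the output, the Pct Util value indicates how much of the pool's estimated capacity is in use. A persistently high value can indicate that migration to tape is not keeping pace with data ingestion.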