Sizing a cold-data-cache storage pool

Before you back up object client data to a cold-data-cache storage pool, size the cold-data-cache storage pool. The cold-data-cache storage pool acts as the initial disk-based storage location for data from an object client that is offloaded to IBM Spectrum Protect for archiving to tape storage. By correctly sizing the cold-data-cache storage pool, you can help to improve the throughput of archive operations, reduce the risk of archive failures, and ensure that enough storage capacity is available for data ingestion and restore operations.

Before you begin

An object client must be an IBM Spectrum Protect Plus server. For instructions about setting up IBM Spectrum Protect Plus as an object client to the IBM Spectrum Protect server, see Offloading data from IBM Spectrum Protect Plus.

About this task

Data that is offloaded from an object client is stored temporarily on disk in file volumes that are specified for the cold-data-cache storage pool. The data is then migrated to the next storage pool, which is defined on the DEFINE STGPOOL command for the cold-data-cache storage pool. After the data is migrated to a tape storage pool, the data is deleted from the cold-data-cache storage pool. Similarly, during a restore operation, the object data is restored temporarily to the cold-data-cache storage pool before the data can be read by an object client.

Consider the following guidelines for running migration processes on cold-data-cache storage pools:
  • Data becomes eligible for migration from the cold-data-cache storage pool as file volumes become full or are closed.
  • Processes to ingest new data and migrate eligible data to next storage pools can occur in parallel. As the data is migrated, it is deleted from the cold-data-cache storage pool. You can configure the number of parallel processes by specifying the MIGPROCESS parameter on the DEFINE STGPOOL command for the cold-data-cache storage pool. The number of parallel processes might be limited by the number of drives that are available for migration on the tape storage pool.
  • Migration performance can be limited by the throughput capability of the tape storage pool drives. For example, throughput rates of 300 - 400 MB per second are common with LTO-8 tape drives and volumes during migration.
If IBM Spectrum Protect Plus issues a restore request to restore the offloaded data that was archived to tape storage, the IBM Spectrum Protect server copies the data from the tape storage pool to the cold-data-cache storage pool temporarily. The data can then be restored by IBM Spectrum Protect Plus. Requested data is stored on the cold-data-cache storage pool for a specified number of days before deletion.
Tip: The tape storage pool is defined as a next storage pool by specifying the NEXTSTGPOOL parameter on the DEFINE STGPOOL command for the cold-data-cache storage pool.
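
The migration guidelines above can be turned into a rough planning estimate. The following Python sketch is illustrative only and is not part of IBM Spectrum Protect. It assumes that each migration process drives one tape drive at a steady rate, that the number of parallel streams is the smaller of the MIGPROCESS value and the number of available drives, and that decimal units apply (1 TB = 1,000,000 MB). The function name and its parameters are hypothetical.

```python
def migration_hours(daily_offload_tb, migprocess, tape_drives, drive_mb_per_s=300.0):
    """Estimate the hours needed to migrate one day's offloaded data to tape.

    Assumption: every stream sustains drive_mb_per_s for the whole run.
    """
    if min(daily_offload_tb, migprocess, tape_drives, drive_mb_per_s) <= 0:
        raise ValueError("all inputs must be positive")
    # Parallel migration is capped by whichever is smaller:
    # the MIGPROCESS setting or the drives that are available for migration.
    streams = min(migprocess, tape_drives)
    total_mb = daily_offload_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    seconds = total_mb / (streams * drive_mb_per_s)
    return seconds / 3600.0


# Example: 20 TB per day, MIGPROCESS=4, four drives at 300 MB per second each.
hours = migration_hours(daily_offload_tb=20, migprocess=4, tape_drives=4)
print(f"{hours:.1f} hours")  # prints "4.6 hours"
```

In this sketch, raising the MIGPROCESS value above the number of available drives does not shorten the estimate, which mirrors the drive limit that is described in the guidelines.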

To accommodate both newly offloaded data and data copies that are staged for restore operations back to the object client, adequate space must be provisioned for the cold-data-cache storage pool. The IBM Spectrum Protect server reads and writes to the cold-data-cache storage pool predominantly in 256 KB blocks.
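
As a starting point for provisioning, the two space consumers that are named above (newly offloaded data and staged restore copies) can be added together. The following Python sketch is a simplified planning aid under stated assumptions, not an IBM sizing formula: the buffer_days and headroom parameters are hypothetical planning inputs, and the function name is illustrative.

```python
def cache_capacity_tb(daily_offload_tb, buffer_days=1.0, staged_restore_tb=0.0, headroom=0.2):
    """Rough capacity target for the cold-data-cache storage pool, in TB.

    Assumptions (not from the product documentation):
      - buffer_days: how many days of offloaded data the pool should hold
        if migration to tape is slowed or stopped
      - staged_restore_tb: data that is expected to be staged for restores
      - headroom: fractional safety margin added on top
    """
    if min(daily_offload_tb, buffer_days, staged_restore_tb, headroom) < 0:
        raise ValueError("inputs must not be negative")
    base = daily_offload_tb * buffer_days + staged_restore_tb
    return base * (1.0 + headroom)


# Example: 20 TB offloaded per day, 5 TB staged for restores, 20% headroom.
print(round(cache_capacity_tb(20, staged_restore_tb=5), 1))  # prints 30.0
```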

Procedure

To size and tune the cold-data-cache storage pool, follow these guidelines:

  • Use the tsmdiskperf.pl Perl script as a benchmarking tool to help you to size the cold-data-cache storage pool. Benchmark the directory paths to be used for the cold-data-cache storage pool with an overlapped, sequential read-and-write workload with a 256 KB block size.
    To run the script, issue the following command:
      perl tsmdiskperf.pl workload=stgpool fslist=directory_list
    where directory_list is a comma-separated list of directory paths.
    Ensure that the data ingestion rate that is obtainable for these directory locations satisfies the speed requirements for data-ingestion operations in your environment.
    For benchmarking tools and sample benchmarking tests, see the IBM Spectrum Protect Blueprints. The benchmarking tool tsmdiskperf.pl is available in the Blueprint configuration scripts package.
  • Ensure that the cold-data-cache storage pool is large enough to hold the daily volume of offloaded data. In this way, if an issue with the next tape storage pool prevents or slows migration, sufficient space is available to contain the daily workload and to avoid failures.
    You can also size the cold-data-cache storage pool to be smaller than the daily workload. However, archive failures might occur if the cold-data-cache storage pool runs out of space. In this case, retry the offload operation.
  • Where possible, optimize disk system performance by configuring the disk system for random read/write operations rather than sequential read/write operations.
  • Use RAID 5, RAID 6, or other disk protection for the cold-data-cache directory file system disks to avoid data loss.
  • On the DEFINE STGPOOL or UPDATE STGPOOL command for the cold-data-cache storage pool, set the MIGPROCESS parameter value to match the number of tape drives from the next tape storage pool that can be used for migration activities. To optimize migration performance and ensure that the cold-data-cache storage pool releases space as quickly as possible, set the MIGPROCESS parameter to as high a value as the available tape drives allow. You can enter a value in the range 1 - 999.
    Tip: When you specify the MIGPROCESS parameter, consider other uses of the tape storage pool that might compete for resources. For example, you might use the tape storage pool to back up the IBM Spectrum Protect database.
  • For optimal throughput for the object client node that is running the backup and restore operations to the cold-data-cache storage pool, set the MAXNUMMP parameter on the REGISTER NODE or UPDATE NODE command to a value of at least 100.
    Tip: This parameter limits how many mount points a node can use on the server. The IBM Spectrum Protect object agent can distribute backup and restore data movement across as many as 100 sessions for a single client node.
  • On the tape storage pool, set the COLLOCATE parameter to match your requirements. By default, group-level collocation is used for sequential-access storage pools. If no collocation groups exist on the server, collocation by node is used by default. Each migration process from the cold-data-cache storage pool attempts to use a drive on the next tape storage pool, if available. When collocation is used, the IBM Spectrum Protect server attempts to store group, node, or file space data together and on as few tape volumes as possible.
  • During restore processing for data on tape storage, the IBM Spectrum Protect server might attempt to use multiple tape volume mounts, depending on the number of tape volumes in use. By default, the IBM Spectrum Protect server attempts to use up to four processes to restore data from tape volumes. The number of volumes limits the number of processes.
  • To help free up space and to allow the ingestion of newly offloaded data to preempt data restore operations, specify the REMOVERESTOREDCOPYBEFORELIFETIMEEND=YES setting on the DEFINE STGPOOL or UPDATE STGPOOL command for the cold-data-cache storage pool. When you specify REMOVERESTOREDCOPYBEFORELIFETIMEEND=YES, IBM Spectrum Protect can remove restored data copies that are eligible for early deletion according to defined conditions, which creates space for new data offloads.
  • By default, the MAXSCRATCH parameter on the DEFINE STGPOOL command is set to 5000 for a cold-data-cache storage pool. This parameter controls the maximum number of scratch file volumes that can be created in the storage pool during data ingestion and restore operations. By default, the device class that is created when you define the cold-data-cache storage pool has a volume size of 10 GB for an overall default capacity of 50,000 GB. If a larger capacity is needed, use the UPDATE STGPOOL command to increase the MAXSCRATCH parameter value for the cold-data-cache storage pool. The maximum value for this parameter is 9999. If more capacity is needed, you can also increase the cold-data-cache storage pool's device class volume size by issuing the UPDATE DEVCLASS command.
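
The capacity arithmetic in the last guideline (number of scratch volumes multiplied by volume size) can be checked with a short calculation. The following Python sketch is illustrative only. It uses the default values and the MAXSCRATCH maximum as they are stated above, and the function names are hypothetical. It does not issue any server commands.

```python
import math

MAXSCRATCH_LIMIT = 9999  # maximum MAXSCRATCH value as stated in the guideline above


def pool_capacity_gb(maxscratch, volume_gb):
    """Overall pool capacity: scratch volume count times volume size."""
    return maxscratch * volume_gb


def maxscratch_for(target_gb, volume_gb=10):
    """Smallest MAXSCRATCH value that reaches target_gb at the given volume size."""
    if target_gb <= 0 or volume_gb <= 0:
        raise ValueError("target_gb and volume_gb must be positive")
    needed = math.ceil(target_gb / volume_gb)
    if needed > MAXSCRATCH_LIMIT:
        # Beyond this point, the volume size of the device class must grow instead.
        raise ValueError(
            f"{needed} volumes exceeds {MAXSCRATCH_LIMIT}; increase the volume size"
        )
    return needed


print(pool_capacity_gb(5000, 10))  # prints 50000 (the default capacity in GB)
print(maxscratch_for(80_000))      # prints 8000
```

If the required number of volumes exceeds the limit, the sketch raises an error rather than returning a value, which corresponds to the case where the device class volume size must be increased instead.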

What to do next

  • Monitor used space within the cold-data-cache storage pool. If the storage pool frequently runs out of space, then the disk read or tape write performance might be too slow to handle the target data ingestion workload.
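
To reason about why a storage pool fills up, compare the rate at which data arrives with the rate at which migration releases space. The following Python sketch is a simplified model and not a monitoring tool: it assumes constant ingest and migration rates, decimal units (1 GB = 1,000 MB), and hypothetical function and parameter names. Actual rates must be measured in your environment.

```python
def hours_until_full(free_gb, ingest_mb_per_s, migrate_mb_per_s):
    """Estimate how long the free space lasts at constant rates.

    Returns None when migration keeps pace with ingestion,
    because the pool does not fill under that assumption.
    """
    if free_gb < 0 or ingest_mb_per_s < 0 or migrate_mb_per_s < 0:
        raise ValueError("inputs must not be negative")
    net_mb_per_s = ingest_mb_per_s - migrate_mb_per_s
    if net_mb_per_s <= 0:
        return None
    free_mb = free_gb * 1000  # decimal units: 1 GB = 1,000 MB
    return free_mb / net_mb_per_s / 3600.0


# Example: 10,000 GB free, 1500 MB/s ingestion, 1200 MB/s migration to tape.
hours = hours_until_full(10_000, 1500, 1200)
print("migration keeps pace" if hours is None else f"{hours:.1f} hours")  # prints "9.3 hours"
```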