Before you back up object client data to a cold-data-cache storage pool, size the
cold-data-cache storage pool. The cold-data-cache storage pool acts as the initial
disk-based storage location for data from an object client that is offloaded to IBM
Spectrum Protect for archiving to tape storage. By correctly sizing
the cold-data-cache storage pool, you can improve the throughput of archive operations,
reduce the risk of archive failures, and ensure that enough storage capacity is available for data
ingestion and restore operations.
Before you begin
An object client must be an IBM Spectrum Protect Plus server. For instructions about setting up IBM Spectrum Protect Plus as an object client to the IBM Spectrum Protect server, see Offloading data from IBM Spectrum Protect Plus.
About this task
Data that is offloaded from an object client is stored temporarily on disk in file volumes that
are specified for the cold-data-cache storage pool. Then, data is migrated to the next storage pool
that is defined on the DEFINE STGPOOL command for the cold-data-cache storage
pool. After the data is migrated to a tape storage pool, the data is deleted from the
cold-data-cache storage pool. Similarly, during a restore operation, the object data is restored
temporarily to the cold-data-cache pool before the data can be read by an object client.
Consider the following guidelines for running migration processes on cold-data-cache storage
pools:
- Data becomes eligible for migration from the cold-data-cache storage pool as file volumes become
full or are closed.
- Processes to ingest new data and migrate eligible data to next storage pools can occur in
parallel. As the data is migrated, it is deleted from the cold-data-cache storage pool. You can
configure the number of parallel processes by specifying the MIGPROCESS
parameter on the DEFINE STGPOOL command for the cold-data-cache storage pool. The
number of parallel processes might be limited by the number of drives that are available for
migration on the tape storage pool.
- Migration performance can be limited by the throughput capability of the tape storage pool
drives. For example, throughput rates of 300 - 400 MB per second are common with LTO-8 tape drives and volumes during migration.
If IBM Spectrum Protect Plus issues a request to restore offloaded data that was archived to tape storage, the IBM Spectrum Protect server copies the data from the tape storage pool to the cold-data-cache storage pool temporarily. The data can then be restored by IBM Spectrum Protect Plus. Requested data is stored in the cold-data-cache storage pool for a specified number of days before deletion.
Tip: The tape storage pool
is defined as a next storage pool by specifying the NEXTSTGPOOL parameter on
the DEFINE STGPOOL command for the cold-data-cache storage pool.
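For example, assuming a cold-data-cache storage pool named COLDCACHE and a tape storage pool named LTOPOOL (both names are placeholders), the next storage pool relationship might be set as follows:

```
update stgpool coldcache nextstgpool=ltopool
```

You can verify the storage pool chain afterward with the QUERY STGPOOL command.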
To accommodate both newly offloaded data and data copies that are staged for restore operations back to the object client, provision adequate space for the cold-data-cache storage pool. The IBM Spectrum Protect server reads and writes to the cold-data-cache storage pool predominantly in 256 KB blocks.
Procedure
To size and tune the cold-data-cache storage pool, follow the guidelines:
- Use the tsmdiskperf.pl Perl script as a benchmarking tool to help you size the cold-data-cache storage pool. Benchmark the directory paths to be used for the cold-data-cache storage pool with an overlapped, sequential read-and-write workload with a 256 KB block size.
To run the script, issue the following command: perl tsmdiskperf.pl workload=stgpool fslist=directory_list, where directory_list is a comma-separated list of directory paths.
Ensure that the data ingestion rate that is obtainable for these directory locations satisfies the speed requirements for data-ingestion operations in your environment.
For benchmarking tools and sample benchmarking tests, see the IBM Spectrum Protect Blueprints. The tsmdiskperf.pl benchmarking tool is available in the Blueprint configuration scripts package.
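For example, to benchmark two directories that are planned for the cold-data-cache storage pool (the directory paths are placeholders for your own file systems):

```
perl tsmdiskperf.pl workload=stgpool fslist=/coldcache/fs1,/coldcache/fs2
```

Compare the reported throughput with your expected daily offload rate before you commit the directories to the storage pool.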
- Ensure that the cold-data-cache storage pool is large enough to hold the daily volume of
offloaded data. In this way, if an issue with the next tape storage pool prevents or slows
migration, sufficient space is available to contain the daily workload and to avoid
failures.
You can also size the cold-data-cache storage pool to be smaller than the daily
workload. However, archive failures might occur if the cold-data-cache storage pool runs out of
space. In this case, retry the offload operation.
- Where possible, optimize disk system performance by configuring the disk system for
random read/write operations rather than sequential read/write operations.
- Use RAID 5, RAID 6, or other disk protection for the cold-data-cache directory file
system disks to avoid data loss.
- On the DEFINE STGPOOL or UPDATE STGPOOL commands
for the cold-data-cache storage pool, set the MIGPROCESS parameter value to
match the number of tape drives from the next tape storage pool that can be used for migration
activities. To optimize migration performance and ensure that the cold-data-cache storage pool
releases space as quickly as possible, set the MIGPROCESS parameter with as
high a value as possible. You can enter a value in the range 1 - 999.
Tip: When you specify the MIGPROCESS parameter, consider other uses of the tape storage pool that might compete for resources. For example, you might use the tape storage pool to back up the IBM Spectrum Protect database.
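For example, if four tape drives are typically available for migration (a hypothetical drive count; COLDCACHE is a placeholder pool name), you might set:

```
update stgpool coldcache migprocess=4
```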
- For optimal throughput for the object client node that is running the backup and restore
operations to the cold-data-cache storage pool, set the MAXNUMMP parameter on
the REGISTER NODE or UPDATE NODE commands to a value of at
least 100.
Tip: This parameter limits how many mount points a node can use on the server. The IBM Spectrum Protect object agent can distribute backup and restore data movement across as many as 100 sessions for a single client node.
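For example, assuming an object client node that is registered as SPPLUS_NODE (a placeholder node name):

```
update node spplus_node maxnummp=100
```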
- On the tape storage pool, set the COLLOCATE parameter to match your
requirements. By default, group level collocation is used for sequential-access storage pools. If no
collocation groups exist on the server, collocation by node is used by default. Each migration
process from the cold-data-cache storage pool attempts to use a drive on the next tape storage pool,
if available. When collocation is used, the IBM Spectrum Protect server attempts to store group, node, or file space data together on as few tape volumes as possible.
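For example, to collocate the tape storage pool by node (LTOPOOL is a placeholder pool name):

```
update stgpool ltopool collocate=node
```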
- During restore processing for data on tape storage, the IBM Spectrum Protect server might attempt to use multiple tape volume mounts, depending on the number of tape volumes in use. By default, the IBM Spectrum Protect server attempts to use up to four processes to restore data from tape volumes. The number of tape volumes that contain the requested data limits the number of processes.
- To help free up space and to allow the ingestion of newly offloaded data to preempt data restore operations, specify the REMOVERESTOREDCOPYBEFORELIFETIMEEND=YES setting on the DEFINE STGPOOL or UPDATE STGPOOL commands for the cold-data-cache storage pool. By specifying REMOVERESTOREDCOPYBEFORELIFETIMEEND=YES, IBM Spectrum Protect can remove restored data copies that are eligible for early deletion according to defined conditions, to create space for new data offloads.
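For example (COLDCACHE is a placeholder pool name):

```
update stgpool coldcache removerestoredcopybeforelifetimeend=yes
```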
- By default, the MAXSCRATCH parameter on the DEFINE
STGPOOL command is set to 5000 for a cold-data-cache storage pool. This parameter controls
the maximum number of scratch file volumes that can be created in the storage pool during data
ingestion and restore operations. By default, the device class that is created when you define the
cold-data-cache storage pool has a volume size of 10 GB for an overall default capacity of 50,000
GB. If a larger capacity is needed, use the UPDATE STGPOOL command to increase
the MAXSCRATCH parameter value for the cold-data-cache storage pool. The
maximum value for this parameter is 9999. If more capacity is needed, you can also increase the cold-data-cache storage pool's device class volume size by issuing the UPDATE DEVCLASS command.
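For example, to raise the scratch volume limit and, if needed, the file volume size (COLDCACHE and COLDDEVC are placeholder names, and MAXCAPACITY is assumed here to be the volume-size parameter of the pool's FILE device class):

```
update stgpool coldcache maxscratch=9999
update devclass colddevc maxcapacity=20G
```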
What to do next
- Monitor used space within the cold-data-cache storage pool. If the storage pool frequently runs
out of space, then the disk read or tape write performance might be too slow to handle the target
data ingestion workload.
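To check utilization, you can query the pool periodically (COLDCACHE is a placeholder pool name) and review the estimated capacity and percent utilized values in the output:

```
query stgpool coldcache format=detailed
```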