Determining whether tiering is appropriate for your storage environment

Before you tier data from disk storage on directory-container storage pools to cloud or tape storage, review the information about storage types and determine which type best meets your business requirements. In general, cloud and tape are appropriate for long-term storage of data that is infrequently accessed and that does not require quick retrieval.

Consider the advantages and disadvantages of each storage type:

Disk storage on directory-container storage pools
Disk storage on directory-container storage pools (subsequently referred to as disk storage) is useful for data that must be frequently accessed and quickly retrieved. However, disk storage is typically more expensive than cloud or tape storage. For this reason, some users tier their data to cloud or tape storage to achieve space savings and reduce costs. The amount of space savings depends on how well the data is deduplicated on disk. If the data is efficiently deduplicated on disk, you might not achieve the expected space savings by tiering the data.
Tip: To determine whether data is effectively deduplicated on disk, take one of the following actions:
  • Generate data deduplication statistics by using the GENERATE DEDUPSTATS command. Then, view the statistics by using the QUERY DEDUPSTATS command.
  • From the administrative client, run the SELECT * from SUMMARY command. For examples, see Viewing data deduplication statistics.
In general, if the data deduplication ratio is 4:1 or better, the data is efficiently deduplicated. For example, if you have 100 KB of data and reduce the size to 25 KB, the data deduplication is considered efficient. If the ratio is less than 4:1, deduplication is not efficient.
If the data on disk storage is not efficiently deduplicated, and there is no requirement to quickly retrieve the data, consider tiering the data to cloud or tape storage frequently and with a brief tiering delay.
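The 4:1 guideline can be checked with a short calculation. The following Python sketch is illustrative only: the function names and the `threshold` parameter are hypothetical and are not part of any IBM Spectrum Protect interface. It assumes that you already obtained the original and deduplicated sizes, in the same unit, from the data deduplication statistics.

```python
def dedup_ratio(original_size: float, stored_size: float) -> float:
    """Return the ratio of the original size to the size after deduplication."""
    if original_size <= 0 or stored_size <= 0:
        raise ValueError("sizes must be positive")
    return original_size / stored_size


def is_efficiently_deduplicated(original_size: float,
                                stored_size: float,
                                threshold: float = 4.0) -> bool:
    """Apply the general guideline: a ratio of 4:1 or better is efficient."""
    return dedup_ratio(original_size, stored_size) >= threshold


# Example from the text: 100 KB of data reduced to 25 KB.
print(dedup_ratio(100, 25))                  # 4.0
print(is_efficiently_deduplicated(100, 25))  # True
```

A result of `False` suggests that the data is a candidate for frequent tiering, provided that quick retrieval is not required.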
Cloud storage
Cloud storage can be more scalable and cost effective than disk storage. However, the time that is required to retrieve data from the cloud is typically longer than from disk. For cloud tiering, you must find a cloud provider that uses the Microsoft Azure cloud computing system or a cloud computing system that uses the Simple Storage Service (S3) protocol, such as IBM Cloud™ Object Storage or the Amazon Simple Storage Service (Amazon S3). For the latest information about cloud object storage services that are supported, see technote 2000915.
Tape storage
Tape storage can be more scalable and cost effective than disk storage, but the time that is required to retrieve the data from tape is typically longer than from disk. For small file workloads (with an average file size of 50 KB or smaller), the process of tiering data to tape can take more time than a typical tiering window allows for. You must also consider the effort that is required to share tape drives and coordinate access to tape volumes.
Tip: To calculate average file size, use the IBM® Db2® Command Line Processor to run the following commands:
db2 "connect to tsmdb1"
db2 "set schema tsmdb1"
db2 "select avg(logical_size) from sc_all_objects for read only with ur"
Before you tier data to tape, review the following flowchart. The conditions in the flowchart might not apply to all storage environments. Additional considerations, which are not in the flowchart, might influence your decision. Make the decision that helps you meet your business requirements.
The flowchart reflects the following decision points:
  1. If the data must be retrieved quickly, tiering data to tape might not be appropriate.
  2. If data deduplication on disk is highly successful, tiering data to tape might not help you achieve the expected space savings.
  3. If the data must be accessed frequently, tiering data to tape might not be appropriate because accessing data from disk is typically faster.
  4. If the data includes many small files, tiering data to tape might not be appropriate because tiering operations might be slow.
However, if none of the previously listed conditions are true, tiering data to tape might help to optimize storage and reduce costs. You must balance the benefits against the additional effort that is associated with sharing tape drives and coordinating access to tape volumes.
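The decision points can also be expressed as a simple checklist. The following Python sketch is an assumption-laden illustration, not an IBM tool: the `Workload` class, its field names, and both functions are hypothetical. The 50 KB limit comes from the small-file guideline in the tape storage description, and the caller is expected to supply the average file size in KB.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    """Hypothetical description of the data that is considered for tape tiering."""
    needs_quick_retrieval: bool
    dedup_highly_successful: bool
    frequently_accessed: bool
    many_small_files: bool


def is_small_file_workload(avg_file_size_kb: float) -> bool:
    """Small file workloads have an average file size of 50 KB or smaller."""
    return avg_file_size_kb <= 50


def tape_tiering_concerns(w: Workload) -> List[str]:
    """Return the decision points that argue against tiering to tape."""
    concerns = []
    if w.needs_quick_retrieval:
        concerns.append("data must be retrieved quickly")
    if w.dedup_highly_successful:
        concerns.append("deduplication on disk is already highly successful")
    if w.frequently_accessed:
        concerns.append("data must be accessed frequently")
    if w.many_small_files:
        concerns.append("data includes many small files")
    return concerns


workload = Workload(
    needs_quick_retrieval=False,
    dedup_highly_successful=False,
    frequently_accessed=False,
    many_small_files=is_small_file_workload(30),
)
for concern in tape_tiering_concerns(workload):
    print("Review before tiering to tape:", concern)
```

An empty list means only that none of the listed conditions apply. The effort of sharing tape drives and coordinating access to tape volumes, and any considerations outside the flowchart, still must be weighed separately.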
If you decide to tier data to cloud or tape, you must also select a tiering option. The following options are available:
Tier data by age
When data meets a specified age threshold, the data is tiered.
Tip: In the Operations Center, to tier data by age, specify the All data option.
Tier data by state
Only inactive data that meets a specified age threshold is tiered.
Tip: In the Operations Center, to tier data by state, specify the Inactive data option.
To select a tiering option, review the flowchart.
The flowchart depicts client types and describes how to tier data for each client type:
  • IBM Spectrum Protect™ backup-archive client: For frequently accessed data, tier by state. For infrequently accessed data, tier by age.
  • IBM Spectrum Protect for Databases: Data Protection for Microsoft SQL Server: For frequently accessed data, tier by state. For infrequently accessed data, tier by age.
  • IBM Spectrum Protect for Databases: Data Protection for Oracle: For all data, tier by age.
  • IBM Spectrum Protect for Databases: Data Protection for SAP: For all data, specify either option; in both cases, data is tiered by age.
  • IBM Spectrum Protect HSM for Windows: For all data, specify either option; in both cases, data is tiered by age.
  • IBM Spectrum Protect for Mail: Data Protection for IBM Domino: For frequently accessed data, tier by state. For infrequently accessed data, tier by age.
  • IBM Spectrum Protect for Mail: Data Protection for Microsoft Exchange Server: For frequently accessed data, tier by state. For infrequently accessed data, tier by age.
  • IBM Spectrum Protect for Space Management: For all data, tier by age.
  • IBM Spectrum Protect for Virtual Environments: For production environments, do not tier data. For test and development environments, tier data by age.
  • IBM Spectrum Protect Plus: For all data, tier by age.
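The per-client guidance can be summarized as a lookup. The following Python sketch is a hypothetical planning aid: the shortened client names, the set names, and the `recommended_tiering` function are all assumptions made for illustration and do not correspond to any product command or API.

```python
from typing import Optional

# Clients for which the option depends on how often the data is accessed.
BY_ACCESS_FREQUENCY = {
    "backup-archive client",
    "Data Protection for Microsoft SQL Server",
    "Data Protection for IBM Domino",
    "Data Protection for Microsoft Exchange Server",
}

# Clients whose data is tiered by age in every case. For Data Protection
# for SAP and HSM for Windows, either option can be specified, but the
# data is tiered by age in both cases.
ALWAYS_BY_AGE = {
    "Data Protection for Oracle",
    "Data Protection for SAP",
    "HSM for Windows",
    "Space Management",
    "IBM Spectrum Protect Plus",
}

VIRTUAL_ENVIRONMENTS = "Virtual Environments"


def recommended_tiering(client: str,
                        frequently_accessed: bool = False,
                        production: bool = False) -> Optional[str]:
    """Return "state", "age", or None when tiering is not recommended."""
    if client in BY_ACCESS_FREQUENCY:
        return "state" if frequently_accessed else "age"
    if client in ALWAYS_BY_AGE:
        return "age"
    if client == VIRTUAL_ENVIRONMENTS:
        # Production environments: do not tier. Test and development: by age.
        return None if production else "age"
    raise KeyError(f"unknown client type: {client}")


print(recommended_tiering("backup-archive client", frequently_accessed=True))
print(recommended_tiering(VIRTUAL_ENVIRONMENTS, production=True))
```

In the Operations Center, a result of "age" corresponds to the All data option and "state" corresponds to the Inactive data option.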

For more detailed guidelines, see Detailed guidelines about tiering by age versus tiering by state.