Data reduction pools

Data reduction pools reduce the amount of data that is stored on internal drives and external storage systems through data reduction techniques including compression, deduplication, and thin provisioning. A volume in a data reduction pool can be configured to use different capacity savings methods simultaneously. Data reduction pools reclaim previously used capacity that is no longer needed by host systems. Support for the host SCSI unmap command is enabled by default.

To estimate the potential capacity savings that data reduction can provide on the system, use the Data Reduction Estimation Tool (DRET). This tool analyzes existing user workloads that are being migrated to a new system. The tool scans target workloads on all attached storage arrays, consolidates these results, and generates an estimate of potential data reduction savings for the entire system.

For more information about DRET, see https://www.ibm.com/support/pages/node/6217841. For more information about Comprestimator, see https://www.ibm.com/support/pages/node/6209688.

If you use external storage systems that support data reduction technologies, you can also configure data reduction on the storage systems. For standard-provisioned volumes, the system fully controls storage on these storage systems. When a volume is deleted, capacity is freed on the system and can be reallocated, but the storage system is not aware of this freed space. However, if the storage system uses compression, thin provisioning, or deduplication, the storage system controls the use of the usable capacity. In this configuration, when capacity is freed, the system notifies the storage system that the capacity is no longer needed. The storage system can then reclaim that freed capacity, either reusing it or freeing it as reclaimable capacity, and reorganize the data on other volumes to use the capacity more efficiently. The system also supports reclaimable capacity from certain internal drives, such as the 15 TB tier 1 flash drives, which can improve performance on these types of drives.

When you create a data reduction pool, ensure that the usable capacity of the pool includes overhead capacity. Overhead capacity is an amount of usable capacity that contains the metadata for tracking unmap and reclaim operations within the pool. As a general guideline, ensure that the provisioned capacity in the data reduction pool does not exceed 85% of the total usable capacity of the pool. Table 1 lists the minimum data reduction pool capacity that is required to create a volume within the pool.

Table 1. Minimum overhead capacity requirements for data reduction pools

Extent size (in gigabytes)    Overhead capacity requirements (in terabytes) (1)
1 GB or smaller               1.1 TB
2 GB                          2.1 TB
4 GB                          4.2 TB
8 GB                          8.5 TB

(1) Standard-provisioned volumes are not included in the minimum overhead capacity values. When you are planning usable capacity for data reduction pools, determine the usable capacity that is needed for any standard-provisioned volumes first, then ensure that the minimum usable capacity values for the data reduction pools are included.

Note: If your system contains self-compressed drives, ensure that volumes in data reduction pools are created with compression enabled or with no data savings. If volumes in data reduction pools are created as thin-provisioned volumes without compression, the system cannot calculate accurate usable capacity.
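
The capacity guidance above can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the product: the names min_pool_capacity_tb, check_pool_plan, and PoolPlan are hypothetical, the minimum values are copied from Table 1, and the sketch assumes that capacity for standard-provisioned volumes is added on top of the Table 1 minimum, which is one reading of the table footnote.

```python
from dataclasses import dataclass

# Minimum usable pool capacity in TB, keyed by extent size in GB (from Table 1).
MIN_POOL_CAPACITY_TB = {1: 1.1, 2: 2.1, 4: 4.2, 8: 8.5}

# General guideline from the text: provisioned capacity should not exceed
# 85% of the total usable capacity of the data reduction pool.
PROVISIONED_LIMIT = 0.85


@dataclass
class PoolPlan:
    minimum_usable_tb: float   # Table 1 minimum plus standard-provisioned capacity
    max_provisioned_tb: float  # 85% of the usable capacity
    meets_minimum: bool        # usable capacity covers the minimum
    within_guideline: bool     # provisioned capacity is within the 85% guideline


def min_pool_capacity_tb(extent_size_gb: float) -> float:
    """Look up the Table 1 minimum; extents of 1 GB or smaller share one row."""
    if extent_size_gb <= 0:
        raise ValueError("extent size must be positive")
    if extent_size_gb <= 1:
        return MIN_POOL_CAPACITY_TB[1]
    try:
        return MIN_POOL_CAPACITY_TB[extent_size_gb]
    except KeyError:
        raise ValueError(
            f"no Table 1 entry for a {extent_size_gb} GB extent size"
        ) from None


def check_pool_plan(
    usable_tb: float,
    provisioned_tb: float,
    extent_size_gb: float,
    standard_tb: float = 0.0,
) -> PoolPlan:
    """Compare a planned pool against the Table 1 minimum and the 85% guideline.

    Assumption: capacity for standard-provisioned volumes (standard_tb) is
    planned first and added to the Table 1 minimum.
    """
    minimum = standard_tb + min_pool_capacity_tb(extent_size_gb)
    max_provisioned = usable_tb * PROVISIONED_LIMIT
    return PoolPlan(
        minimum_usable_tb=minimum,
        max_provisioned_tb=max_provisioned,
        meets_minimum=usable_tb >= minimum,
        within_guideline=provisioned_tb <= max_provisioned,
    )


if __name__ == "__main__":
    # Example: a 100 TB pool with 4 GB extents and 80 TB provisioned.
    print(check_pool_plan(usable_tb=100, provisioned_tb=80, extent_size_gb=4))
```

In this example, a pool with 100 TB of usable capacity and 4 GB extents exceeds the 4.2 TB minimum, and 80 TB of provisioned capacity stays under the 85 TB guideline. These checks are planning aids only; the system reports the actual capacity values.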

For more information about creating data reduction pools by using the command-line interface, see the mkmdiskgrp command.