Start of change

Using Huffman compression to compress your data

With IBM z14®, Db2 can use Huffman compression to reduce the space required for a table by using data compression. Compressing the data in a table space can significantly reduce the amount of disk space that is needed to store data and can help improve buffer pool performance.

Before you begin

FL 504The TS_COMPRESSION_TYPE subsystem parameter specifies the default compression algorithm that is used to compress Db2 data at the subsystem level.

FL 509 The COMPRESS option of the ALTER TABLESPACE, CREATE TABLE, and CREATE TABLESPACE statements can define the compression algorithm that is used to compress Db2 data at the table space or partition level.

You can use the DSN1COMP utility to determine how well compression of your data will work.

To enable Huffman compression:
  • The Db2 subsystem or data sharing group must be running on IBM z14 or later
  • The function level must be V12R1M504 or later
  • The table space to be compressed is a universal table space
  • Any of the following are true:
    • The table space is defined with the COMPRESS YES clause and the TS_COMPRESSION_TYPE subsystem parameter is set to HUFFMAN
    • FL 509 The table space is defined with the COMPRESS YES HUFFMAN clause

Table spaces that are defined with the ORGANIZE BY HASH clause cannot be compressed.

About this task

When you compress data, bit strings that occur frequently are replaced by shorter strings. Information about the mapping of bit strings to their replacements is stored in a compression dictionary. Huffman compression uses IBM z14 hardware to compress data before it is stored and to decompress the data that is retrieved from a page in the buffer pool. In many cases, using Huffman compression can significantly reduce the amount of disk space needed to store data, but the compression ratio that you achieve depends on the characteristics of your data.

With compressed data, you might see some of the following performance benefits, depending on the SQL workload and the amount of compression:
  • Higher buffer pool hit ratios
  • Fewer I/Os
  • Fewer getpage operations

Procedure

If the requirements for Huffman compression are met, the following actions will compress Db2 data using IBM z14 Huffman compression:

  • Run the REORG TABLESPACE utility without the KEEPDICTIONARY option.
    A new compression dictionary (using Huffman compression) will be built for each partition included in the REORG. The REORG utility output will show compression statistics.
  • Add new records to a new (or almost empty) partition of a partition by range table space using INSERT or MERGE operations.
    If no compression dictionary already exists, and the amount of data in the tables space reaches a threshold determined by Db2, a compression dictionary is created. After the compression dictionary is built, Db2 uses it to compress all subsequent data added to the table space.
  • Run the LOAD utility with REPLACE.
    When enough data has been loaded to build a complete dictionary, a Huffman dictionary is created. Original records remain uncompressed in the page set while new records are compressed with the new dictionary. Since the dictionary is built on a potentially small portion of the loaded data, the dictionary efficiency might not be optimal.
  • Notes:
    • The increased number of rows per page (due to space savings of more efficient compression) might have an impact on locking granularity if page-level locking is used.
    • Very short rows might not have significant space savings from switching from Fixed-length compression to Huffman compression.
    • Db2 will tolerate mixed dictionary types in the same table space.
    • If Db2 moves to a lower star (*) function level than V12R1M504, all new REORG, LOAD, or insert operations that produce compression dictionaries will produce fixed-length dictionaries. Any existing Huffman dictionaries will continue to be used for compression and expansion of any associated partitions.
    • If Db2 moves to a star (*) function level between V12R1M504 and V12R1M508, all new REORG, LOAD, or insert operations will produce Huffman dictionaries for table spaces that are defined to use Huffman compression.

End of change