Determining the effectiveness of compression

Determine the effectiveness of your compressed data by learning how much space can be saved and how much processing the compression of your data requires. Analyzing compression effectiveness can help you decide if the performance overhead of compression is worth the space savings.

About this task

Before compressing data, you can use the DSN1COMP stand-alone utility to estimate how well the data can be compressed. After data is compressed, you can use compression reports and catalog statistics to determine how effectively it was compressed.

Procedure

To find the effectiveness of data compression:

  • Use the DSN1COMP stand-alone utility to find out how much space can be saved and how much processing the compression of your data requires. Run DSN1COMP on a data set that contains a table space, a table space partition, or an image copy.
    DSN1COMP generates a report of compression statistics in message DSN1940I but does not compress the data.

    Start of changeIf DSN1COMP runs on IBM z14® or newer hardware, the default report contains compression estimates for both fixed length compression and Huffman compression, which requires the hardware compression support in IBM z14 or newer hardware. You can also specify the COMPTYPE parameter to specify which estimates to include the report. When DSN1COMP runs on IBM z13® or earlier hardware the default report includes estimates for fixed-length compression only. However, if you specify COMPTYPE(HUFFMAN) or COMPTYPE(ALL) the report contains the expected estimates for Huffman compression after migration to IBM z14 hardware.End of change

  • Examine the compression reports after you use REORG or LOAD to build the compression dictionary and compress the data.
    Both utilities issue a report message (DSNU234I or DSNU244I). The report message gives information about how well the data is compressed and how much space is saved. (REORG with the KEEPDICTIONARY option does not produce the report.)
  • Query catalog tables to find information about data compression
    • PAGESAVE column of the SYSIBM.SYSTABLEPART tells you the percentage of pages that are saved by compressing the data.
    • PCTROWCOMP columns of SYSIBM.SYSTABLES and SYSIBM.SYSTABSTATS tells you the percentage of the rows that were compressed in the table or partition the last time RUNSTATS was run. Use the RUNSTATS utility to update these catalog columns.
    • Start of changeFL 509 COMPRESS_USED column of SYSIBM.SYSTABLEPART tells you the algorithm used to compress the data in the table space or partition.
      The following operations or utilities update this catalog column:
      • Insert or merge operations that create a compression dictionary
      • LOAD SHRLEVEL NONE or SHRLEVEL REFERENCE
      • REORG TABLESPACE
      • RUNSTATS TABLESPACE
      • RECOVER to point-in-time
      • REPAIR CATALOG
      Note: The following conditions might result in Db2 not updating the catalog column (no warning is issued in these cases):
      • An insert or merge operation that creates a compression dictionary also holds exclusive usage on an object or accesses an object that is started with UT access.
      • An insert operation, a merge operation, or a utility encounters an internal error when attempting to update this catalog column. Db2 does not retry the update or wait to update this column. The operation or utility does not fail due to this error.
      You can use the following methods to determine if Db2 did not update this catalog column:
      • If an insert or merge operation created a compression dictionary, find the associated DSNU241I message output in the console. Check the values of the SYSTABLEPART.COMPRESS column and TS_COMPRESSION_TYPE subsystem parameter at the time the DSNU241I message is output. Note that there are cases where Db2 will use fixed-length compression even when Huffman compression is defined.
      • If the LOAD or REORG utility created a compression dictionary, find the associated DSNU234I or DSNU244I message in the utility output and compare that output to the value of the SYSTABLEPART.COMPRESS_USED column.
      • Run the DSN1PRNT utility on the corresponding table space or partition data set and compare the HPGZLD value to the value of the SYSTABLEPART.COMPRESS_USED column.
      End of change