DB2 Version 9.7 for Linux, UNIX, and Windows

Row compression

Row compression, sometimes referred to as deep compression, compresses data rows by replacing patterns of values that repeat across rows with shorter symbol strings. Of the various data compression techniques available in DB2® Version 9.7, row compression offers the most dramatic possibilities for storage savings.

The main benefit of using row compression is that you can store data in less space, which can yield significant savings in storage costs. Also, because you use storage at a slower rate, future expenditures for additional storage can be delayed.

In addition to the cost savings, compression can improve performance. Many queries against compressed data can be performed with fewer I/O operations because each read from disk brings in more data. Similarly, more data can be cached in the buffer pool, increasing buffer pool hit ratios. (However, there is a trade-off in the form of extra CPU cycles needed to compress and decompress data.) The storage savings and performance impact of data row compression are tied to the characteristics of the data within the database, the layout and tuning of the database, and application workload. The query optimizer includes decompression cost in its cost model.

Finally, because row compression can reduce the size of a database, backup and restore operations use less space and run faster.

The remainder of this topic discusses the following points:

How compression works

Row compression uses a static dictionary-based compression algorithm to compress data by row. The dictionary is used to map repeated byte patterns from table rows to much smaller symbols; these symbols then replace the longer byte patterns in the table rows. The compression dictionary is stored along with the table data rows in the data object portions of the table.

What data gets compressed?

Data stored in base table rows and log records is eligible for row compression. In addition, the data in XML storage objects is eligible for compression. LOB data that you place inline in a table row can be compressed; however storage objects for long data objects are not compressed.

Restriction: Data in XML columns that you created with DB2 Version 9.5 or DB2 Version 9.1 cannot be compressed. However, XML columns that you add using DB2 Version 9.7 to a table without XML columns that you created with an earlier release of the product can be compressed. If a table that you created in an earlier release already has one or more XML columns, and you want to add a compressed XML column using DB2 Version 9.7, you must use the ADMIN_MOVE_TABLE stored procedure to migrate the table before you can use compression.

Turning row compression on or off

To use row compression, you must have a license for the DB2 Storage Optimization Feature. You compress table data by setting the COMPRESS attribute of the table to YES. You can set this attribute when you create the table by including the COMPRESS YES option on the CREATE TABLE statement; you can also alter an existing table to use compression using the same option on the ALTER TABLE statement. After you enable compression, operations that add data to the table, such as an INSERT, LOAD INSERT, or IMPORT INSERT operation can use row compression. In addition, index compression is enabled for the table; indexes for the table are created as compressed indexes unless you specify otherwise and if they are the types of indexes that can be compressed.
Important: When you enable row compression for a table, you enable it for the entire table, even if a table comprises more than one table partition.

To disable compression for a table, use the ALTER TABLE statement with the COMPRESS NO option; rows that you subsequently add are not compressed. To decompress the entire table, you must perform a table reorganization with the REORG TABLE command.

If you enable theDB2 Storage Optimization Feature, compression for temporary tables is enabled automatically . You cannot enable or disable compression for temporary tables.

Effects of UPDATE activity on logs and compressed tables

Depending upon UPDATE activity and the positioning of update changes within a data row, log usage might increase. For information about update logging and column ordering, see Ordering columns to minimize update logging.

If a row increases in size, the new version of the row might not fit on the current data page. Rather, the new image of the row is stored on an overflow page. To minimize the creation of pointer-overflow records, increase the percentage of each page that is to be left as free space after a reorganization using the PCTFREE option on the ALTER TABLE statement. For example, if PCTFREE was set to 5% before you enabled compression, you might change it to 10% when you enable compression. This recommendation is especially important for data that is heavily updated.

Row compression for temporary tables

Compression for temporary table is enabled automatically with the DB2 Storage Optimization Feature. When executing queries, the DB2 optimizer considers the storage savings and the impact to query performance that compression of temporary tables offers to determine whether it is worthwhile to use compression. If it is worthwhile, compression is used automatically. The minimum size that a table must be before compression is used is larger for temporary tables than for regular tables.

You can use the explain facility or the db2pd tool to see whether the optimizer chose to use compression for temporary tables.

Reclaiming space freed by compression

You can reclaim space that has been freed by compressing data. For more information, see Reclaimable storage.