Row compression, sometimes referred to as deep compression, compresses data rows by replacing patterns of values that repeat across rows with shorter symbol strings. Of the various data compression techniques available in DB2® Version 9.7, row compression offers the most dramatic possibilities for storage savings.
The main benefit of using row compression is that you can store data in less space, which can yield significant savings in storage costs. Also, because you use storage at a slower rate, future expenditures for additional storage can be delayed.
In addition to the cost savings, compression can improve performance. Many queries against compressed data can be performed with fewer I/O operations because each read from disk brings in more data. Similarly, more data can be cached in the buffer pool, increasing buffer pool hit ratios. (However, there is a trade-off in the form of extra CPU cycles needed to compress and decompress data.) The storage savings and performance impact of data row compression are tied to the characteristics of the data within the database, the layout and tuning of the database, and application workload. The query optimizer includes decompression cost in its cost model.
Finally, because row compression can reduce the size of a database, backup and restore operations use less space and run faster.
The remainder of this topic discusses the following points:
Row compression uses a static dictionary-based compression algorithm to compress data by row. The dictionary is used to map repeated byte patterns from table rows to much smaller symbols; these symbols then replace the longer byte patterns in the table rows. The compression dictionary is stored along with the table data rows in the data object portions of the table.
Data stored in base table rows and log records is eligible for row compression. In addition, the data in XML storage objects is eligible for compression. LOB data that you place inline in a table row can be compressed; however storage objects for long data objects are not compressed.
To disable compression for a table, use the ALTER TABLE statement with the COMPRESS NO option; rows that you subsequently add are not compressed. To decompress the entire table, you must perform a table reorganization with the REORG TABLE command.
If you enable theDB2 Storage Optimization Feature, compression for temporary tables is enabled automatically . You cannot enable or disable compression for temporary tables.
Depending upon UPDATE activity and the positioning of update changes within a data row, log usage might increase. For information about update logging and column ordering, see Ordering columns to minimize update logging.
If a row increases in size, the new version of the row might not fit on the current data page. Rather, the new image of the row is stored on an overflow page. To minimize the creation of pointer-overflow records, increase the percentage of each page that is to be left as free space after a reorganization using the PCTFREE option on the ALTER TABLE statement. For example, if PCTFREE was set to 5% before you enabled compression, you might change it to 10% when you enable compression. This recommendation is especially important for data that is heavily updated.
Compression for temporary table is enabled automatically with the DB2 Storage Optimization Feature. When executing queries, the DB2 optimizer considers the storage savings and the impact to query performance that compression of temporary tables offers to determine whether it is worthwhile to use compression. If it is worthwhile, compression is used automatically. The minimum size that a table must be before compression is used is larger for temporary tables than for regular tables.
You can use the explain facility or the db2pd tool to see whether the optimizer chose to use compression for temporary tables.