Compression dictionaries
Table-level compression dictionaries
To build table-level dictionaries, entire rows in the table are scanned for repeated patterns. The database manager then builds a compression dictionary, assigning short, numeric keys to those repeated patterns. In general, text strings provide greater opportunities for compression than numeric data; compressing numeric data involves replacing one number with another. Depending on the size of the numbers that are being replaced, the storage savings might not be as significant as those achieved by compressing text.
When a table-level dictionary is first created, it is built using a sample of data in the table. The dictionary is static, which means that it is not updated again unless you explicitly cause the dictionary to be rebuilt by using a classic, offline table reorganization. Even if you rebuild the dictionary, the dictionary reflects only a sample of the data from the entire table.
The table-level compression dictionary is stored in hidden rows in the same object that they apply to and is cached in memory for quick access. This dictionary does not occupy much space. Even for extremely large tables, the compression dictionary typically occupies only approximately 100 KB.
Page-level compression dictionaries
Adaptive compression uses page-level dictionaries in addition to table-level dictionaries. However, unlike table-level dictionaries, page-level dictionaries are automatically created or re-created as the database manager fills pages. Like table-level compression dictionaries, page-level dictionaries are stored in hidden rows within the table.
Column compression dictionaries
A column compression dictionary is used to compress data in a column of a column-organized table. When you load data into a column-organized table, the first phase is the analyze phase, which is unique to column-organized tables. The analyze phase occurs only if column compression dictionaries must be built, which happens during a LOAD REPLACE operation, a LOAD REPLACE RESETDICTIONARY operation, a LOAD REPLACE RESETDICTIONARYONLY operation, or a LOAD INSERT operation (if the column-organized table is empty). The load utility analyzes the input data to determine the best encoding schemes for building column compression dictionaries.
If column compression dictionaries exist when you add a column to a table, the column compression dictionary for the new column contains only the default value for the new column. That means that any nondefault values that you later add to that column remain uncompressed. If column compression dictionaries do not exist when you add a column to the table, dictionaries for the new column and pre-existing columns are created when automatic dictionary creation (ADC) is triggered for the table. Although added columns generally do not compress as well as pre-existing columns of the same table, new columns can still benefit from page-level compression.