Table-level compression dictionary creation
Table-level compression dictionaries for tables that you enable for adaptive or classic row compression can be built automatically or manually. Tables that you enable for adaptive compression include page-level data dictionaries, which are always automatically created.
Automatic dictionary creation
- You set the COMPRESS attribute for the table by using the CREATE TABLE or ALTER TABLE statement with the COMPRESS YES ADAPTIVE or COMPRESS YES STATIC clause.
- A table-level compression dictionary does not already exist for the table.
- The table contains sufficient data for constructing a dictionary of repeated data.
- A compression dictionary is not yet created, because the table is empty.
- Data is inserted into the table by using insert or load operations and remains uncompressed.
- As more data is inserted or loaded into the table, the data remains uncompressed.
- After a threshold is reached, dictionary creation is triggered automatically if the COMPRESS attribute is set to YES ADAPTIVE or YES STATIC.
- The dictionary is created.
- The dictionary is appended to the table.
- From this point forward, table-level compression is enabled, and the rows that are later inserted or added are compressed by the table-level compression dictionary.
- You set the COMPRESS attribute on the table to YES ADAPTIVE or YES STATIC.
- A compression dictionary does not exist within that XML storage object.
- There is sufficient data in the XML storage object.
The mechanism for creating table-level compression dictionaries for temporary tables is similar to the mechanism that is used for permanent tables. However, the database manager automatically makes the determination whether to use classic row compression for temporary tables, based on factors such as query complexity and the size of the result set.
Manual dictionary creation
Although dictionaries are created automatically when compression-enabled tables grow to a sufficient size, you can also force a table-level compression dictionary to be created if none exists by using the REORG TABLE command with the RESETDICTIONARY parameter. This command forces the creation of a compression dictionary if there is at least one row of data in the table. Table reorganization is an offline operation; one benefit of using automatic dictionary creation is that the table remains online as the dictionary is built.
Instead of using the REORG TABLE command to force the creation of a new dictionary, you can also use the INSPECT command with the ROWCOMPESTIMATE parameter. This command creates a compression dictionary if the table does not already have one. The advantage of this approach over performing a table reorganization is that the table remains online. Rows that you add later are subject to compression; however, rows that existed before you ran the INSPECT command remain uncompressed until you perform a table reorganization. However, if compression is enabled, automatic dictionary creation will usually take place shortly after you activate compression, likely before you even have a chance to use the INSPECT command.
Resetting compression dictionaries
Whether a table-level compression dictionary is created automatically or manually, the dictionary is static; after it is built, it does not change. As you add or update rows, they are compressed based on the data that exists in the compression dictionary. For many situations, this behavior is appropriate. Consider, for example, a table in a database that is used for maintaining customer accounts for a city water utility. Such a table might have columns such as STREET_ADDRESS, CITY, PROVINCE, TELEPHONE_NUM, POSTAL_CODE, and ACCOUNT_TYPE. If a compression dictionary is built with data from a such table, even if it is only a modestly sized table, there is likely sufficient repetitive information for classic row compression to yield significant space savings. Much of the data might be common from customer to customer, for example, the values of the CITY, POSTAL_CODE, or PROVINCE column or portions of the value in the STREET_ADDRESS or TELEPHONE_NUM column.
- A master table is used to accumulate data on a month-by-month basis.
- Each month, a new set of records is loaded into the table.