LOAD INSERT and LOAD REPLACE operations on tables for which compression is enabled can trigger the creation of compression dictionaries. Depending on what type of row compression a table uses, dictionary creation happens in different ways.
Classic row compression uses a single table-level compression dictionary to compress data. Adaptive compression uses multiple page-level compression dictionaries to compress individual pages of data, along with the table-level compression dictionaries used in classic row compression.
Table-level dictionaries are created automatically for both LOAD INSERT and LOAD REPLACE operations if no dictionary exists; however, if a table-level dictionary does exist, by default, the dictionary is not updated. More specifically, LOAD REPLACE operations assume the KEEPDICTIONARY option by default. You can specify the RESETDICTIONARY option to remove the existing table-level dictionary and create a new one.
LOAD INSERT always follows the behavior implied by the KEEPDICTIONARY option.
When building table-level dictionaries for non-XML data, the load utility uses the data that exists in the target table to build the dictionaries, under the assumption that this preexisting data is representative of the type of data that will be stored in that table. In cases where there is insufficient preexisting data in the target table, the load utility builds the dictionaries once it has sampled enough input data. In this situation, the load utility uses only the input data to build the dictionary.
For XML data, the load utility samples incoming data only.
When dictionaries are created for range-partitioned tables, each partition is treated like an individual table. There will not be any cross-partition dictionaries and dictionary creation does not occur on partitions already containing dictionaries. For table data, the dictionary generated for each partition is based on the preexisting table data (and, if necessary, the loaded data) in that partition only. In Version 9.7 Fix Pack 1 and later, if the preexisting data in a partition is less than the minimum threshold, the dictionary is generated based only on the loaded data. For XML data, the dictionary generated for each partition is based the data being loaded into that partition.
A LOAD REPLACE that uses the KEEPDICTIONARY option keeps the existing dictionaries and uses them to compress the loaded data, as long as the target table has the COMPRESS attribute enabled. If dictionaries do not exist, the load utility generates new ones (provided the data that is being loaded into the table surpasses a predetermined threshold for table rows or XML documents stored in the default XML storage object) for tables with the COMPRESS attribute enabled. Since the data in the target table is replaced, the load utility uses only the input data to build the dictionaries. After a dictionary has been created, it is inserted into the table and the load operation continues.
While some operations related to the building of dictionaries can affect the CPU utilization by the LOAD command, load operations are generally I/O bound. That is, much of the time spent waiting for the load to complete is taken up waiting for data to be written to disk. The increased load on the CPU caused by dictionary creation generally does not increase the elapsed time required to perform the load; indeed, because data is written in compressed format, I/O times can actually decrease as compared to loading data into uncompressed tables.