Best practices for table partitioning in a warehouse environment (Updated September 2019)

Table or Range partitioning is a powerful feature of Db2 that facilitates good database design principles that can help lead to easier maintenance operations, increased data availability and more optimized queries.

Note: Table partitioning is a feature of row store tables only.

But why is table partitioning so good in a warehousing environment? Here are some reasons:

  1. Range-specific maintenance operations
    • Where data partitions (ranges of data within a table) are placed in individual table spaces, maintenance operations can be targeted at active data only.
    • Many Db2 commands, for example REORG and BACKUP, can be specified to execute against specific table spaces or data partitions. This can significantly reduce maintenance times.
  2. Data lifecycle management
    • Table partitioning large fact tables by date means that older data in a table can be detached as an online operation. This can help eliminate the need to perform costly delete statements.
    • In addition, as data ages, it can be moved as an online operation to less costly storage. In Db2 V11.5 this multi-temperature data management is facilitated by the new storage groups feature.
  3. Partition elimination
    • Range Partitioning benefits queries where the query spans one or a subset of the range partitioning key. The Db2 optimizer can then eliminate entire data partitions from the query, and this reduction in rows/read (I/O) can help increase query performance.
  4. Local indexes
    • Local indexes can help to significantly reduce index maintenance and increase query performance where significant sorting is not required.
    • In addition, local indexes can be placed in separate table spaces which provides more flexibility in building a backup schedule and a recovery strategy. For example, in a restore scenario you have the choice between restore or rebuild of indexes.
  5. Backup perfomance
    • Backup performance can be improved by backing up just those table spaces (table ranges) that are active.
    • By balancing the average size of your table spaces, parallelism within the BACKUP operations can also increase, helping to reduce the elapsed time of your backup operations.