Data organization schemes
With the introduction of table partitioning, a Db2® database offers a three-level data organization scheme. There are three clauses of the CREATE TABLE statement that include an algorithm to indicate how the data is to be organized.
- DISTRIBUTE BY to spread data evenly across database partitions (to enable intraquery parallelism and to balance the load across each database partition) (database partitioning)
- PARTITION BY to group rows with similar values of a single dimension in the same data partition (table partitioning)
- ORGANIZE BY to group rows with similar values on multiple dimensions in the same table extent (multidimensional clustering) or to group rows according to the time of the insert operation (insert time clustering table).
This syntax allows consistency between the clauses and allows for future algorithms of data organization. Each of these clauses can be used in isolation or in combination with one another. By combining the DISTRIBUTE BY and PARTITION BY clauses of the CREATE TABLE statement data can be spread across database partitions spanning multiple table spaces. This approach allows for similar behavior to the Informix® Dynamic Server and Informix Extended Parallel Server hybrid.
In a single table, you can combine the clauses used in each data organization scheme to create more sophisticated partitioning schemes. For example, partitioned database environments are not only compatible, but also complementary to table partitioning.


The salient distinction between multidimensional clustering (MDC) and table partitioning is multi-dimension versus single dimension. MDC is suitable to cubes (that is, tables with multiple dimensions), and table partitioning works well if there is a single dimension which is central to the database design, such as a DATE column. MDC and table partitioning are complementary when both of these conditions are met. This is demonstrated in Figure 3.

There is another data organization scheme which cannot be used with any of the schemes that were listed previously. This scheme is ORGANIZE BY KEY SEQUENCE. It is used to insert each record into a row that was reserved for that record at the time of table creation (Range-clustered table).
Data organization terminology
- Database partitioning
- A data organization scheme in which table data is divided across multiple database partitions based on the hash values in one or more distribution key columns of the table, and based on the use of a distribution map of the database partitions. Data from a given table is distributed based on the specifications provided in the DISTRIBUTE BY HASH clause of the CREATE TABLE statement.
- Database partition
- A portion of a database on a database partition server consisting of its own user data, indexes, configuration file, and transaction logs. Database partitions can be logical or physical.
- Table partitioning
- A data organization scheme in which table data is divided across multiple data partitions according to values in one or more partitioning columns of the table. Data from a given table is partitioned into multiple storage objects based on the specifications provided in the PARTITION BY clause of the CREATE TABLE statement. These storage objects can be in different table spaces.
- Data partition
- A set of table rows, stored separately from other sets of rows, grouped by the specifications provided in the PARTITION BY RANGE clause of the CREATE TABLE statement.
- Multidimensional clustering (MDC)
- A table whose data is physically organized into blocks along one or more dimensions, or clustering keys, specified in the ORGANIZE BY DIMENSIONS clause.
- Insert time clustering (ITC)
- A table whose data is physically clustered based on row insert time, specified by the ORGANIZE BY INSERT TIME clause.
Benefits of each data organization scheme
Understanding the benefits of each data organization scheme can help you to determine the best approach when planning, designing, or reassessing your database system requirements. Table 1 provides a high-level view of common customer requirements and shows how the various data organization schemes can help you to meet those requirements.
Issue | Recommended scheme | Explanation |
---|---|---|
Data roll-out | Table partitioning | Uses detach to roll out large amounts of data with minimal disruption |
Parallel query execution (query performance) | Database Partitioning Feature | Provides query parallelism for improved query performance |
Data partition elimination (query performance) | Table partitioning | Provides data partition elimination for improved query performance |
Maximization of query performance | Both | Maximum query performance when used together: query parallelism and data partition elimination are complementary |
Heavy administrator workload | Database Partitioning Feature | Execute many tasks for each database partition |
Issue | Recommended scheme | Explanation |
---|---|---|
Data availability during roll-out | Table partitioning | Use the DETACH PARTITION clause to roll out large amounts of data with minimal disruption. |
Query performance | Both | MDC is best for querying multiple dimensions. Table partitioning helps through data partition elimination. |
Minimal reorganization | MDC | MDC maintains clustering, which reduces the need to reorganize. |