Database partitioning applied to data flows and data flow operators

To improve the performance of data flows and data flow operators, the Design Studio supports hash partitioning by using the existing DB2® table spaces, DB2 partitioning groups, and information that you specify in the Design Studio for tables that are generated at run time.

An effectively partitioned database optimizes query performance and data handling. Optimal query and data handling facilitate database scalability and can allow for massive amounts of parallel processing.

In the Design Studio, you can specify database partitioning information at the data flow level, at the data flow operator level, or at both levels. If you specify partitioning information at both levels, the information that you specify for any operator in the data flow overrides the information that you specify for the data flow. Partitioning information at the operator level includes distribution keys that further fine tune and improve performance. You select the key columns from the input port schema for the operator.

Specifying partitioning information for a data flow generates partition-aware SQL code for that data flow at run time. You can specify partitioning information for tables that are created by the partition-aware operators. These tables include those that data station operators use and the internal staging tables that the data station, distinct, key lookup, splitter, and table source operators generate. You can also specify partition-related information for bulk load target and file import operators.

Because the internal staging tables are generated only when required, the partitioning information that you specify is used only when an internal staging table is generated. To verify whether the code generator used your partitioning information, you can open the Code Generation page and look for the table name.

You can specify partitioning information for data flows and data flow operators by:

Specifying database partitioning information for a data flow
Specifying staging table settings for the data station, distinct, key lookup, splitter, and table source data flow operators
Specifying partition options for the bulk load target and file import data flow operators

Feedback | Information roadmap

https://www.ibm.com/docs/en/db2/10.1.0?topic=ddf-database-partitioning-applied-data-flows-data-flow-operators