IBM InfoSphere Information Server, Version 8.5

Partitioning method

Use this property to specify how to partition the data for parallel reads.

Each partitioning method adds an expression to the SQL statement that determines how the records are divided into subsets, one for each processing node. The method to use depends on how the values in the partitioning column are distributed.

Both methods use the following information in their calculations:
  • Total number of processing nodes
  • Number of the current processing node
  • Integer partitioning column value specified in the Column name property

This property is available only if you set Enable partitioning to Yes.

The following methods are available:
Modulus
The connector adds a modulus expression as a prefix to the WHERE clause of the SQL statement. This expression divides the value of the partitioning column by the total number of nodes or logical processors. When the remainder of this operation matches the number for the current processing node, the current node receives the record. The result is that each node receives a different subset of records.

Modulus is the recommended choice when the values in the partitioning column are not evenly distributed, for example: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 80.
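
The modulus rule can be illustrated outside the connector with a short Python sketch. It is not the connector's implementation: the function name modulus_partition, the choice of four nodes, and node numbering that starts at 0 are assumptions made for the example, and the exact SQL expression that the connector generates is not shown here.

```python
from typing import Dict, List


def modulus_partition(values: List[int], total_nodes: int) -> Dict[int, List[int]]:
    """Assign each value to the node whose number equals value mod total_nodes.

    Node numbers are assumed to start at 0 (an assumption for this sketch).
    """
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    partitions: Dict[int, List[int]] = {node: [] for node in range(total_nodes)}
    for value in values:
        # The remainder of the division selects the receiving node.
        partitions[value % total_nodes].append(value)
    return partitions


# Unevenly distributed sample values from the text, split across 4 nodes.
skewed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 80]
for node, subset in modulus_partition(skewed, 4).items():
    print(node, subset)
# 0 [4, 8, 20, 40, 80]
# 1 [1, 5, 9]
# 2 [2, 6, 10]
# 3 [3, 7]
```

Although the values are clustered at the low end, every node in this sketch receives some records, because the remainder depends on each individual value rather than on where it falls between the minimum and maximum.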

Min/max range
The connector adds a range expression as a prefix to the WHERE clause of the SQL statement. This expression checks whether the value of the partitioning column falls within the unique range of values assigned to the current node or logical processor. The result is that each node receives a different subset of records. Min/max range is the default method.

Min/max range is the recommended choice when the values in the partitioning column are evenly distributed, for example: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13.
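
A range-based split can be sketched in Python in the same way. The source does not state how the connector computes each node's range, so this sketch assumes equal-width integer ranges between the minimum and maximum column values; the names node_range and range_partition, the four-node setup, and node numbering from 0 are likewise hypothetical.

```python
from typing import List, Tuple


def node_range(min_value: int, max_value: int, total_nodes: int, node: int) -> Tuple[int, int]:
    """Return the inclusive (lower, upper) bounds assumed for one node.

    Equal-width ranges are an assumption of this sketch, not documented behavior.
    """
    span = max_value - min_value + 1
    lower = min_value + (span * node) // total_nodes
    upper = min_value + (span * (node + 1)) // total_nodes - 1
    return lower, upper


def range_partition(values: List[int], total_nodes: int) -> List[List[int]]:
    """Give each node the values that fall inside its own range."""
    lo, hi = min(values), max(values)
    result = []
    for node in range(total_nodes):
        lower, upper = node_range(lo, hi, total_nodes, node)
        result.append([v for v in values if lower <= v <= upper])
    return result


even = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
skewed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 80]

print(range_partition(even, 4))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12, 13]]
print(range_partition(skewed, 4))
# [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20], [40], [], [80]]
```

Under these assumptions, the evenly distributed values are split almost equally, while the unevenly distributed values leave one node with most of the records and another with none, which is consistent with recommending the modulus method for such data.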


This topic is also in the IBM InfoSphere DataStage and QualityStage Connectivity Guide for ODBC.

Last updated: 2011-3-2