Specifying hash keys

Hash keys determine the partition to which the hash partitioner assigns a record. The hash partitioner guarantees that all records with identical hash key values are assigned to the same partition.
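The guarantee above can be illustrated with a minimal sketch. It is not the hash function the product uses: the function name partition_for, the CRC32 hash, and the sample records are assumptions chosen only to show that identical key values always map to the same partition number.

```python
import zlib

def partition_for(record, key_fields, num_partitions):
    """Return a partition number derived only from the record's key values.

    Hypothetical helper: CRC32 stands in for whatever hash function the
    real partitioner uses.
    """
    # Join the key values in order, with a separator that is unlikely
    # to appear in the data, then hash the resulting bytes.
    key_bytes = "\x1f".join(str(record[f]) for f in key_fields).encode("utf-8")
    return zlib.crc32(key_bytes) % num_partitions

# Illustrative records; field names are made up for this example.
records = [
    {"id": 1, "state": "CA", "name": "Ann"},
    {"id": 2, "state": "NY", "name": "Bob"},
    {"id": 3, "state": "CA", "name": "Cy"},
]

# Partition on the "state" field across 4 partitions.
partitions = [partition_for(r, ["state"], 4) for r in records]
```

In this sketch the first and third records share the key value "CA", so they receive the same partition number. Records with different key values might or might not share a partition.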

The hash partitioner lets you set one primary key and any number of secondary keys. You must define exactly one primary key; secondary keys are optional, and you can define as many as your job requires. Each record field can be used only once as a key, so the total number of primary and secondary keys cannot exceed the number of fields in the record.
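The key rules above can be expressed as a small check. This is an illustrative sketch only; validate_keys and its parameters are hypothetical names, not part of the product.

```python
def validate_keys(record_fields, primary_key, secondary_keys=()):
    """Check a key definition against the rules described above.

    Hypothetical helper: exactly one primary key, any number of
    secondary keys, and each record field used at most once.
    """
    keys = [primary_key, *secondary_keys]
    # Each field can be used only once as a key.
    if len(set(keys)) != len(keys):
        raise ValueError("each field can be used only once as a key")
    # Every key must name a field that exists in the record.
    unknown = [k for k in keys if k not in record_fields]
    if unknown:
        raise ValueError(f"unknown key fields: {unknown}")
    # Return the keys in order: primary first, then secondary.
    return keys

fields = ["id", "state", "name"]
keys = validate_keys(fields, "state", ["name"])
```

Because every key must be a distinct field of the record, a definition that passes this check can never contain more keys than the record has fields.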

The data type of a partitioning key can be any InfoSphere® DataStage® data type except raw, subrecord, tagged aggregate, or vector.

By default, the hash partitioner uses a case-sensitive hash function for strings. You can override this default to perform case-insensitive hashing on string fields. In this case, records containing string keys that differ only in case are assigned to the same partition.
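The effect of case-insensitive hashing can be sketched as follows. The function partition_with_case_option, its case_sensitive parameter, and the use of CRC32 are assumptions made for illustration; they do not reflect the product's actual option name or hash function.

```python
import zlib

def partition_with_case_option(record, key_fields, num_partitions,
                               case_sensitive=True):
    """Hash the key fields, optionally ignoring case for string values.

    Hypothetical helper; CRC32 is a stand-in hash function.
    """
    parts = []
    for field in key_fields:
        value = record[field]
        # Normalize case before hashing so that strings differing
        # only in case produce the same hash input.
        if isinstance(value, str) and not case_sensitive:
            value = value.casefold()
        parts.append(str(value))
    key_bytes = "\x1f".join(parts).encode("utf-8")
    return zlib.crc32(key_bytes) % num_partitions

a = {"city": "Boston"}
b = {"city": "BOSTON"}

# With case-insensitive hashing, both records map to the same partition.
same_partition = (
    partition_with_case_option(a, ["city"], 8, case_sensitive=False)
    == partition_with_case_option(b, ["city"], 8, case_sensitive=False)
)
```

With case_sensitive left at its default, the two records are hashed from different inputs, so they are not guaranteed to share a partition, although they can still do so by coincidence.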