Generating surrogate keys

To generate surrogate keys, add a Surrogate Key Generator stage to a job with a single output link to another stage.

If you want to pass input columns to the next stage in the job, the Surrogate Key Generator stage can also have an input link.

  1. Open the Surrogate Key Generator stage editor.
  2. On the Stage page, define the stage properties:
    1. Click the Properties tab.
    2. Type a name for the surrogate key column in the Generated Output Column Name property.
    3. Type or browse for the source name.
    4. Select the source type.
    5. If the source type is a database sequence, define the Database Type properties.
    6. If the key source is a flat file, specify how keys are generated:
      • To generate keys in sequence from the highest value that was last used, set the Generate Key from Last Highest Value property to Yes. Any gaps in the key range are ignored.
      • To specify a value to initialize the key source, add the File Initial Value property to the Options group, and specify the start value for key generation.
      • To control the block size for key ranges, add the File Block Size property to the Options group, set this property to User specified, and specify a value for the block size.
      Note: When the Generate Key from Last Highest Value property is used, there may be gaps between job runs in multiple of 1000.

      The default system block size is 1000. This means the surrogate key generator allocates 1000 keys each time a job is run. If the number of records processed is less than 1000, the highest value that was last used is 1000. When the job is run the next time, the key range from 1001 to 2000 is allocated. Also there is a option called User-Specified Block Size. This option can be used if you know the maximum number of records to be processed. If you use this option along with a one-node configuration you will get these numbers in sequence.

    7. If there is no input link, add the Number of Records property to the Options group, and specify how many records to generate.
    8. Optional: On the Advanced tab, change the processing settings for the stage.
  3. Optional: If the stage has an input link, on the Input page, define the input data:
    1. On the Partitioning tab, change the partition settings for the input link.
    2. On the Advanced tab, change the buffer settings for the input link.
  4. On the Output page, define the output data:
    1. On the Mapping tab, map the surrogate key column to the output link.
      If the stage has an input link, you can also map input columns to the output link.
    2. Optional: On the Advanced tab, change the buffer settings for the output link.
  5. Click OK to save your changes and to close the Surrogate Key Generator stage editor.