Make Subrecord stage in DataStage

The Make Subrecord stage combines specified vectors in an input data set into a vector of subrecords whose columns have the names and data types of the original vectors. You can specify the vector columns to be made into a vector of subrecords and the name of the new subrecord.

The Make Subrecord stage is a restructure stage. It can have a single input link and a single output link.

Figure: Four separate columns combined into a single subrecord.

The length of the subrecord vector that this stage creates equals the length of the longest vector column from which it is created. If a variable-length vector column is used in subrecord creation, the subrecord vector is also of variable length.

Vectors that are shorter than the longest combined vector are padded with default values: NULL for nullable columns and the corresponding type-dependent default value for non-nullable columns. When the Make Subrecord stage encounters mismatched vector lengths, it writes a warning to the job log.
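The length and padding rules above can be sketched in a few lines of Python. This is only an illustrative model, not DataStage code: the function name `make_subrecord_vector`, the `nullable` and `defaults` dictionaries, and the returned warnings list (a stand-in for the job log) are all assumptions made for the example. Python's `None` stands in for NULL, and the caller supplies the type-dependent pad values because the source does not list them.

```python
def make_subrecord_vector(vectors, nullable, defaults):
    """Combine named vector columns into one vector of subrecords.

    vectors:  dict mapping column name -> list of values (hypothetical input shape)
    nullable: dict mapping column name -> True if the column is nullable
    defaults: dict mapping column name -> pad value for non-nullable columns
              (assumed values; the real type-dependent defaults are not given here)
    Returns (subrecords, warnings), where warnings models job-log messages.
    """
    # The result is as long as the longest input vector.
    longest = max((len(v) for v in vectors.values()), default=0)

    warnings = []
    if any(len(v) != longest for v in vectors.values()):
        warnings.append("mismatched vector lengths; shorter vectors were padded")

    subrecords = []
    for i in range(longest):
        subrecord = {}
        for name, values in vectors.items():
            if i < len(values):
                subrecord[name] = values[i]      # real value from the input vector
            elif nullable[name]:
                subrecord[name] = None           # NULL pad for nullable columns
            else:
                subrecord[name] = defaults[name] # default pad for non-nullable columns
        subrecords.append(subrecord)
    return subrecords, warnings
```

For example, calling the function with vectors of lengths 3, 1, and 2 yields three subrecords, with the shorter columns padded in the later elements and one warning recorded.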

You can also use the stage to make a simple subrecord rather than a vector of subrecords. If your input columns are simple data types instead of vectors, the data is used to build a vector of subrecords of length 1, which is effectively a simple subrecord.

Figure: Four columns combined into a vector of subrecords.
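The simple-subrecord case, where the input columns are simple data types rather than vectors, can be modelled the same way. The sketch below is an assumption-laden illustration rather than DataStage behavior: `make_simple_subrecord`, the dictionary representation of a row, and the example column names are hypothetical, and the sketch does not model what happens to columns that are not selected.

```python
def make_simple_subrecord(row, columns, subrecord_name):
    """Wrap selected simple (non-vector) columns into a length-1 vector of subrecords.

    row:            dict mapping column name -> single value (hypothetical row shape)
    columns:        names of the columns to combine
    subrecord_name: name of the new subrecord column
    """
    # Each selected column becomes a field of the one and only subrecord.
    subrecord = {name: row[name] for name in columns}
    # A vector of length 1 is effectively a simple subrecord.
    return {subrecord_name: [subrecord]}


# Example usage with made-up column names and values.
row = {"id": 7, "city": "Oslo", "score": 3.5, "active": True}
result = make_simple_subrecord(row, ["id", "city", "score", "active"], "subrec")
```

Here `result["subrec"]` is a list holding exactly one subrecord whose fields keep the names of the original columns.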

When you double-click the Make Subrecord stage, the properties panel opens. The stage editor has three tabs:

  • Stage tab. This tab is always present and is used to specify general information about the stage.
  • Input tab. This tab is where you specify the details about the single input data set whose columns are combined.
  • Output tab. This tab is where you specify details about the processed data being output from the stage.

The Split Subrecord stage performs the inverse operation. See "Split Subrecord Stage."

Input tab

Use the Input tab to specify details about the incoming data set. The Make Subrecord stage expects one incoming data set.

Specify an optional description of the input link in the Description section. In the Partitions section, specify how incoming data is partitioned before the data is converted. In the Columns section, specify the column definitions of incoming data. In the Advanced section, you can change the default buffering settings for the input link.

You can specify the partitioning method for the Make Subrecord stage. For more information, see Partitioning and collecting data in DataStage.

Output tab

Use the Output tab to specify details about data output from the Make Subrecord stage. The Make Subrecord stage can have only one output link.

Use the Columns section to specify the column definitions of the data. In the Advanced section, you can change the default buffering settings for the output link.