Reference: Stage Editor user interface
The Parallel job stage editors all use a generic user interface.
The exceptions are the Transformer, Lookup, Shared Container, and Complex Flat File stages.
The following table lists the available stage types and gives a quick guide to their function:
Stage | Type | Function |
---|---|---|
Data Set | File | Allows you to read data from or write data to a persistent data set. |
Sequential File | File | Allows you to read data from or write data to one or more flat files. |
File Set | File | Allows you to read data from or write data to a file set. File sets enable you to spread data across a set of files referenced by a single control file. |
Lookup File Set | File | Allows you to create a lookup file set or reference one for a lookup. |
External Source | File | Allows you to read data that is output from one or more source programs. |
External Target | File | Allows you to write data to one or more target programs. |
Complex Flat File | File | Allows you to read or write complex flat files on a mainframe machine. |
SAS Data Set | File | Allows you to read data from or write data to a parallel SAS data set in conjunction with an SAS stage. |
DB2 Enterprise | Database | Allows you to read data from and write data to a DB2 database. |
Oracle Enterprise | Database | Allows you to read data from and write data to an Oracle database. |
Teradata Enterprise | Database | Allows you to read data from and write data to a Teradata database. |
Informix Enterprise | Database | Allows you to read data from and write data to an Informix database. |
Transformer | Processing | Handles extracted data, performs any conversions required, and passes data to another active stage or a stage that writes data to a target database or file. |
BASIC Transformer | Processing | Same as Transformer stage, but gives access to DataStage BASIC functions. |
Aggregator | Processing | Classifies incoming data into groups, computes totals and other summary functions for each group, and passes them to another stage in the job. |
Join | Processing | Performs join operations on two or more data sets input to the stage and then outputs the resulting data set. |
Merge | Processing | Combines a sorted master data set with one or more sorted update data sets. |
Lookup | Processing | Performs lookup operations on a data set read into memory from any other Parallel job stage that can output data, or on data provided by one of the database stages that support reference output links. It can also perform a lookup on a lookup table contained in a Lookup File Set stage. |
Sort | Processing | Sorts input columns. |
Funnel | Processing | Copies multiple input data sets to a single output data set. |
Remove Duplicates | Processing | Takes a single sorted data set as input, removes all duplicate records, and writes the results to an output data set. |
Compress | Processing | Uses the UNIX compress or GZIP utility to compress a data set. It converts a data set from a sequence of records into a stream of raw binary data. |
Expand | Processing | Uses the UNIX uncompress or GZIP utility to expand a data set. It converts a previously compressed data set back into a sequence of records from a stream of raw binary data. |
Copy | Processing | Copies a single input data set to a number of output data sets. |
Modify | Processing | Alters the record schema of its input data set. |
Filter | Processing | Transfers, unmodified, the records of the input data set which satisfy requirements that you specify and filters out all other records. |
External Filter | Processing | Allows you to specify a UNIX command that acts as a filter on the data you are processing. |
Change Capture | Processing | Takes two input data sets, denoted before and after, and outputs a single data set whose records represent the changes made to the before data set to obtain the after data set. |
Change Apply | Processing | Takes the change data set from the Change Capture stage, which encodes the differences between the before and after data sets, and applies the encoded change operations to a before data set to compute an after data set. |
Difference | Processing | Performs a record-by-record comparison of two input data sets, which are different versions of the same data set. |
Compare | Processing | Performs a column-by-column comparison of records in two pre-sorted input data sets. |
Encode | Processing | Encodes a data set using a UNIX encoding command that you supply. |
Decode | Processing | Decodes a data set using a UNIX decoding command that you supply. |
Switch | Processing | Takes a single data set as input and assigns each input record to an output data set based on the value of a selector field. |
SAS | Processing | Allows you to execute part or all of an SAS application in parallel. |
Generic | Processing | Lets you incorporate an Orchestrate Operator in your job. |
Surrogate Key | Processing | Generates one or more surrogate key columns and adds them to an existing data set. |
Column Import | Restructure | Imports data from a single column and outputs it to one or more columns. |
Column Export | Restructure | Exports data from a number of columns of different data types into a single column of data type string or binary. |
Make Subrecord | Restructure | Combines specified vectors in an input data set into a vector of sub-records whose columns have the names and data types of the original vectors. |
Split Subrecord | Restructure | Creates one new vector column for each element of the original sub-record. |
Combine Records | Restructure | Combines records, in which particular key-column values are identical, into vectors of sub-records. |
Promote Subrecord | Restructure | Promotes the columns of an input sub-record to top-level columns. |
Make Vector | Restructure | Combines specified columns of an input data record into a vector of columns of the same type. |
Split Vector | Restructure | Promotes the elements of a fixed-length vector to a set of similarly named top-level columns. |
Head | Development/Debug | Selects the first N records from each partition of an input data set and copies the selected records to an output data set. |
Tail | Development/Debug | Selects the last N records from each partition of an input data set and copies the selected records to an output data set. |
Sample | Development/Debug | Samples an input data set. |
Peek | Development/Debug | Lets you print record column values either to the job log or to a separate output link as the stage copies records from its input data set to one or more output data sets. |
Row Generator | Development/Debug | Produces a set of mock data fitting the specified metadata. |
Column Generator | Development/Debug | Adds columns to incoming data and generates mock data for these columns for each data row processed. |
Write Range Map | Development/Debug | Allows you to write data to a range map. The stage can have a single input link. |
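
The Head and Tail entries in the table select records from each partition, not from the data set as a whole. The following Python sketch illustrates that behavior only. It is not DataStage code, and the `head` and `tail` helper names and the list-of-lists representation of partitions are assumptions made for the illustration.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def head(partitions: Sequence[Sequence[T]], n: int) -> List[List[T]]:
    """Keep the first n records of every partition (hypothetical helper)."""
    return [list(part[:n]) for part in partitions]


def tail(partitions: Sequence[Sequence[T]], n: int) -> List[List[T]]:
    """Keep the last n records of every partition (hypothetical helper)."""
    # part[-0:] would return the whole partition, so n == 0 is handled explicitly.
    return [list(part[-n:]) if n > 0 else [] for part in partitions]


# A data set modeled as three partitions of unequal size.
data = [[1, 2, 3, 4], [5, 6], [7, 8, 9]]

first_two = head(data, 2)
last_two = tail(data, 2)
```

With N set to 2 and three partitions, each call selects up to six records in total, because the limit applies to every partition separately.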
All of the stage types use the same basic stage editor, but the pages that appear when you edit the stage depend on the exact type of stage you are editing. The following sections describe all the page types and subtypes that are available. The individual descriptions of stage editors in the following chapters tell you exactly which features of the generic editor each stage type uses.