Sequential file stage

The Sequential File stage is a file stage that allows you to read data from, or write data to, one or more flat files.

The stage can have a single input link or a single output link, and a single rejects link.
Figure: a job in which a Sequential File stage writes to a sequential file and outputs a reject link.

When you edit a Sequential File stage, the Sequential File stage editor appears. This is based on the generic stage editor described in "Stage Editors."

The stage executes in parallel mode if reading multiple files but executes sequentially if it is only reading one file. By default a complete file will be read by a single node (although each node might read more than one file). For fixed-width files, however, you can configure the stage to behave differently:

  • You can specify that single files can be read by multiple nodes. This can improve performance on cluster systems. See "Read From Multiple Nodes".
  • You can specify that a number of readers run on a single node. This means, for example, that a single file can be partitioned as it is read (even though the stage is constrained to running sequentially on the conductor node). See "Number Of Readers Per Node".

(These two options are mutually exclusive.)
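The reason these options apply to fixed-width files can be illustrated with a short sketch: when every record has the same length, a file can be cut into record-aligned byte ranges without scanning for delimiters, so each reader can start at a known offset. The following Python is only an illustration under that assumption, not the way DataStage implements either option; the function name `reader_byte_ranges` and its parameters are hypothetical.

```python
def reader_byte_ranges(file_size, record_length, num_readers):
    """Split a fixed-width file into contiguous, record-aligned byte ranges.

    Hypothetical helper for illustration only. Returns one (start, end)
    pair per reader, where `end` is exclusive.
    """
    if record_length <= 0 or num_readers <= 0:
        raise ValueError("record_length and num_readers must be positive")
    if file_size % record_length != 0:
        raise ValueError("file size is not a whole number of records")

    total_records = file_size // record_length
    # Give every reader `base` records; the first `extra` readers get one more.
    base, extra = divmod(total_records, num_readers)

    ranges = []
    start_record = 0
    for reader in range(num_readers):
        count = base + (1 if reader < extra else 0)
        start = start_record * record_length
        end = start + count * record_length
        ranges.append((start, end))
        start_record += count
    return ranges


if __name__ == "__main__":
    # Ten 100-byte records shared among three readers.
    for start, end in reader_byte_ranges(1000, 100, 3):
        print(start, end)
```

A delimited file offers no such shortcut, because a record boundary can only be found by reading the data that precedes it.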

The stage executes in parallel if writing to multiple files, but executes sequentially if writing to a single file. Each file is written by a single node, although a node can write more than one file.
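The default relationship between files and nodes (a whole file is handled by one node, but one node can handle several files) can be pictured as a simple assignment of files to nodes. The sketch below uses round-robin assignment purely as an assumption for illustration; this section does not say how DataStage actually distributes files, and the name `assign_files_to_nodes` is hypothetical.

```python
def assign_files_to_nodes(files, nodes):
    """Assign each file to exactly one node; a node may receive several files.

    Round-robin is an assumed policy for illustration, not DataStage's
    documented behavior.
    """
    if not nodes:
        raise ValueError("at least one node is required")

    assignment = {node: [] for node in nodes}
    for index, path in enumerate(files):
        # Each file goes to a single node, chosen by its position in the list.
        assignment[nodes[index % len(nodes)]].append(path)
    return assignment


if __name__ == "__main__":
    print(assign_files_to_nodes(["a.txt", "b.txt", "c.txt"], ["node1", "node2"]))
```

With a single file, only one node has any work to do, which matches the sequential behavior described above.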

When reading or writing a flat file, InfoSphere® DataStage® needs to know something about the format of the file. The information required is how the file is divided into rows and how rows are divided into columns. You specify this on the Format tab. Settings for individual columns can be overridden on the Columns tab using the Edit Column Metadata dialog box.
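As a rough picture of what this format information amounts to, the sketch below splits text into rows with a record delimiter and then splits each row into columns with a field delimiter. It is not DataStage code: the function name, the delimiter defaults, and the column names are all assumptions chosen for the example. Collecting malformed records separately is likewise only an analogy for a reject link, not a statement of how the stage decides what to reject.

```python
def parse_flat_file(text, column_names, record_delimiter="\n", field_delimiter=","):
    """Split text into rows, then rows into named columns.

    Hypothetical helper for illustration. Returns (rows, rejects), where
    rejects holds records whose field count does not match column_names.
    """
    rows = []
    rejects = []
    for record in text.split(record_delimiter):
        if not record:
            # Skip the empty string left after a trailing record delimiter.
            continue
        fields = record.split(field_delimiter)
        if len(fields) != len(column_names):
            rejects.append(record)
            continue
        rows.append(dict(zip(column_names, fields)))
    return rows, rejects


if __name__ == "__main__":
    data = "1,Ann\n2,Bob\nbad\n"
    rows, rejects = parse_flat_file(data, ["id", "name"])
    print(rows)
    print(rejects)
```

Changing `record_delimiter` or `field_delimiter` here plays the same role as changing the corresponding settings on the Format tab: the data is unchanged, but the stage's view of where rows and columns begin and end is different.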

The stage editor has up to three pages, depending on whether you are reading or writing a file:

  • Stage Page. This is always present and is used to specify general information about the stage.
  • Input Page. This is present when you are writing to a flat file. This is where you specify details about the file or files being written to.
  • Output Page. This is present when you are reading from a flat file or have a reject link. This is where you specify details about the file or files being read from.

Some special considerations apply when you use runtime column propagation (RCP) with Sequential File stages. See "Using RCP with Sequential File stages" for details.