File set stage

The File Set stage is a file stage that allows you to read data from or write data to a file set.

The stage can have a single input link, a single output link, and a single rejects link. It only executes in parallel mode.

What is a file set?

InfoSphere® DataStage® can generate and name exported files, write them to their destination, and list the files it has generated in a file whose extension is, by convention, .fs. Together, the data files and the file that lists them are called a file set. This capability is useful because some operating systems impose a 2 GB limit on the size of a file, so you need to distribute files among nodes to prevent overruns.
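The idea of a file set (several data files plus one file that lists them) can be illustrated with a minimal Python sketch. This is only a conceptual illustration and not how DataStage implements file sets: the function names `write_file_set` and `read_file_set`, the `.partNNNN` data file naming, the `max_bytes` limit, and the one-path-per-line layout of the `.fs` file are all assumptions made for the example. A real DataStage `.fs` file also carries formatting information and has its own layout.

```python
import os
import tempfile
from typing import Iterable, List


def write_file_set(records: Iterable[str], dest_dir: str, name: str,
                   max_bytes: int) -> str:
    """Spread records over several data files and list them in <name>.fs.

    Hypothetical helper: the naming scheme and listing layout are assumptions.
    A record larger than max_bytes still gets written, alone in its own file.
    Returns the path of the listing file.
    """
    os.makedirs(dest_dir, exist_ok=True)
    data_paths: List[str] = []
    out = None
    written = 0
    for record in records:
        line = record + "\n"
        size = len(line.encode("utf-8"))
        # Start a new data file when this record would push the current
        # file past the size limit.
        if out is None or written + size > max_bytes:
            if out is not None:
                out.close()
            path = os.path.join(dest_dir, f"{name}.part{len(data_paths):04d}")
            out = open(path, "w", encoding="utf-8", newline="")
            data_paths.append(path)
            written = 0
        out.write(line)
        written += size
    if out is not None:
        out.close()

    # The listing file names every data file that was generated.
    listing = os.path.join(dest_dir, f"{name}.fs")
    with open(listing, "w", encoding="utf-8") as fs:
        for path in data_paths:
            fs.write(path + "\n")
    return listing


def read_file_set(listing: str) -> List[str]:
    """Read all records back by following the listing file, in order."""
    with open(listing, encoding="utf-8") as fs:
        data_paths = [line.rstrip("\n") for line in fs if line.strip()]
    records: List[str] = []
    for path in data_paths:
        with open(path, encoding="utf-8") as data:
            records.extend(line.rstrip("\n") for line in data)
    return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fs_path = write_file_set(["a,1", "b,2", "c,3"], tmp, "demo", max_bytes=8)
        print(read_file_set(fs_path))
```

The point of the sketch is that a reader needs only the listing file to locate every data file, which is why the data can be split to stay under a per-file size limit.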

The amount of data that can be stored in each destination data file is limited by the characteristics of the file system and the amount of free disk space available. The number of files created by a file set depends on:

  • The number of processing nodes in the default node pool
  • The number of disks in the export or default disk pool connected to each processing node in the default node pool
  • The size of the partitions of the data set

The File Set stage enables you to create and write to file sets, and to read data back from file sets.

Figure: A job that starts by reading from a file set and concludes by writing to another file set. Both File Set stages have reject links.

Unlike data sets, file sets carry formatting information that describes the format of the files to be read or written.

When you edit a File Set stage, the File Set stage editor appears. This is based on the generic stage editor described in "Stage Editors."

The stage editor has up to three pages, depending on whether you are reading or writing a file set:

  • Stage Page. This is always present and is used to specify general information about the stage.
  • Input Page. This is present when you are writing to a file set. This is where you specify details about the file set being written to.
  • Output Page. This is present when you are reading from a file set. This is where you specify details about the file set being read from.

A few special points apply when you use runtime column propagation (RCP) with File Set stages. See "Using RCP With File Set Stages" for details.