Defining Sequential File Input Data

When you write data to a sequential file, the Sequential File stage has an input link. The properties of this link and the column definitions of the data are defined on the Inputs page in the Sequential File Stage dialog box.

The Inputs page has the following field and three tabs:

  • Input name. The name of the input link. Choose the link you want to edit from the Input name list. This list displays all the input links to the Sequential File stage.
  • General. Displayed by default. Contains the following parameters:
    • File name. The path name of the file the data is written to. You can enter a job parameter to represent the file created during run time. For details about how to define job parameters, see Making your jobs adaptable. You can also browse for the file. The file name will default to the link name if you do not specify one here.
    • Filter command. Specifies a filter program that processes the data before it is written to the file, for example a zip program that compresses the data. You can type in or browse for the filter program, and specify any command line arguments it requires in the text box. This text box is enabled only if you have selected the Stage uses filter commands check box on the Stage page General tab (see Using a Sequential File Stage). If you specify a filter command, data browsing is not available and the View Data button is disabled.
    • Description. Contains an optional description of the input link.

      The General tab also contains options that determine how the data is written to the file. These are displayed under the Update action area:

    • Overwrite existing file. This is the default option. If this option is selected, the existing file is truncated and new data records are written to the empty file.
    • Append to existing file. If you select this option, the data records are appended to the end of the existing file.
    • Backup existing file. If you select this check box, a backup copy of the existing file is made before any data is written. The new data records are then appended to the existing file or overwrite it, depending on which of the two preceding options you chose.
      Note: The backup can be used to reset the file if a job is stopped or aborted at run time.
  • Format. Contains parameters that determine the format of the data in the file. There are up to four check boxes:
    • Fixed-width columns. If you select this check box, the data is written to the file in fixed-width columns. The width of each column is specified by the SQL display size (set in the Display column in the Columns grid). This option is cleared by default.
    • First line is column names. Select this check box if the first row of the file is to contain column names rather than data. This option is cleared by default, that is, the first row in the file contains data.
    • Omit last new-line. Select this check box if you want to remove the last newline character in the file. This option is cleared by default, that is, the newline character is not removed.
    • Flush after every row. This only appears if you have selected Stage uses named pipes on the Stage page. Selecting this check box causes data to be passed between the reader and writer of the pipe one record at a time.

      There are up to seven fields on the Format tab:

    • Delimiter. Only active if you have not specified fixed-width columns. Contains the delimiter that separates the data fields in the file. By default this field contains a comma. You can enter a single printable character, or a decimal or hexadecimal number that represents the ASCII code of the character you want to use. Valid ASCII codes are in the range 1 to 253. Decimal values 1 through 9 must be preceded by a zero. Hexadecimal values must be prefixed with &h. Enter 000 to suppress the delimiter.
    • Quote character. Only active if you have not specified fixed-width columns. Contains the character used to enclose strings. By default this field contains a double quotation mark. You can enter a single printable character, or a decimal or hexadecimal number that represents the ASCII code of the character you want to use. Valid ASCII codes are in the range 1 to 253. Decimal values 1 through 9 must be preceded by a zero. Hexadecimal values must be prefixed with &h. Enter 000 to suppress the quote character.
    • Spaces between columns. This field is only active when you select the Fixed-width columns check box. Contains a number to represent the number of spaces used between columns.
    • Default NULL string. Contains the default characters that are written to the file when a column contains an SQL null (this can be overridden for individual column definitions on the Columns tab).
    • Default padding. Contains the character used to pad missing columns. This is # by default, but can be set to another character here to apply to all columns, or can be overridden for individual column definitions in the Columns tab.

      The following fields appear only if you have selected Stage uses named pipes on the Stage page:

    • Wait for reader timeout. Specifies how long the stage will wait for a connection when reading from a pipe before timing out. Recommended values are from 30 to 600 seconds. If the stage times out, an error is raised and the job is aborted.
    • Write timeout. Specifies how long the stage will attempt to write data to a pipe before timing out. Recommended values are from 30 to 600 seconds. If the stage times out, an error is raised and the job is aborted.
  • Columns. Contains the column definitions for the data on the chosen input link. In addition to the standard column definition fields (Column name, Key, SQL Type, Length, Scale, Nullable, Display, Data Element, and Description), the Columns tab of the Sequential File stage also has the following fields:
    • Null string. Fill this in if you want to override the default setting on the Format tab for this particular column.
    • Padding. Fill this in if you want to override the default setting on the Format tab for this particular column.
    • Contains terminators. Does not apply to input links.
    • Incomplete column. Does not apply to input links.

      Note that the Scale for a sequential file column has a practical limit of 14. If you use higher values, the results might be ambiguous.

      The SQL data type properties affect how data is written to a sequential file. The SQL display size determines the size of fixed-width columns. The SQL data type determines how the data is justified in a column: character data types are quoted and left justified; numeric data types are not quoted and are right justified. The SQL properties are displayed in the Columns grid when you edit an input column.
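The formatting rules described for the Format and Columns tabs can be pictured with a short Python sketch. It is an illustration of the documented behavior, not the stage's own implementation. The Column class and both functions are hypothetical names introduced here. The sketch also assumes that the null string is written unquoted and that fixed-width values are justified with spaces; it does not handle values longer than the display size or quote characters embedded in the data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Column:
    """Hypothetical stand-in for one row of the Columns grid."""
    name: str
    is_character: bool     # True for character SQL types, False for numeric types
    display: int           # SQL display size, used as the fixed column width
    null_string: str = ""  # written when the value is an SQL null (None here)


def format_delimited(row: Sequence[Optional[object]],
                     columns: Sequence[Column],
                     delimiter: str = ",",
                     quote: str = '"') -> str:
    """Build one delimited line: character values quoted, numeric values not."""
    fields = []
    for value, col in zip(row, columns):
        if value is None:
            # Assumption: the null string is written without quotes.
            fields.append(col.null_string)
        elif col.is_character:
            fields.append(f"{quote}{value}{quote}")
        else:
            fields.append(str(value))
    return delimiter.join(fields)


def format_fixed_width(row: Sequence[Optional[object]],
                       columns: Sequence[Column],
                       spaces_between: int = 0) -> str:
    """Build one fixed-width line: character left justified, numeric right."""
    fields = []
    for value, col in zip(row, columns):
        text = col.null_string if value is None else str(value)
        # Assumption: spaces are used to fill each column to its display size.
        if col.is_character:
            fields.append(text.ljust(col.display))
        else:
            fields.append(text.rjust(col.display))
    return (" " * spaces_between).join(fields)
```

For example, with a character column NAME (display size 6) and a numeric column QTY (display size 4), format_delimited(["Ann", 12], columns) returns "Ann",12 with the quotation marks included, while format_fixed_width pads Ann on the right and 12 on the left.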

Click View Data... to open the Data Browser. This enables you to look at the data associated with the input link. For a description of the Data Browser, see Using the Data Browser.
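The entry rules for the Delimiter and Quote character fields on the Format tab can be summarized in a small Python sketch. The function name parse_char_field is hypothetical, and the logic only restates the rules given above; it is not taken from the product. In particular, treating a single typed digit as a literal character is an assumption, inferred from the rule that decimal values 1 through 9 need a leading zero.

```python
from typing import Optional


def parse_char_field(text: str) -> Optional[str]:
    """Interpret a Delimiter or Quote character field value.

    Returns the character to use, or None when the value 000
    suppresses the delimiter or quote character.
    """
    if text == "000":
        return None
    if len(text) == 1:
        # A single printable character is used as typed.
        # Assumption: this includes a single digit such as "9".
        return text
    if text.lower().startswith("&h"):
        code = int(text[2:], 16)   # hexadecimal ASCII code, for example &h09
    elif text.isdigit():
        code = int(text, 10)       # decimal ASCII code, for example 09 or 124
    else:
        raise ValueError(f"not a character or ASCII code: {text!r}")
    if not 1 <= code <= 253:
        raise ValueError(f"ASCII code out of range 1 to 253: {code}")
    return chr(code)
```

Under these assumptions, 09 and &h09 both select the tab character, 124 and &h7C both select the vertical bar, and 254 is rejected.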