Setting options for the Fixed File node
The File tab of the Fixed File node enables you to bring data into IBM® SPSS® Modeler and to specify the position of columns and length of records. Using the data preview pane in the center of the dialog box, you can click to add arrows specifying the break points between fields.
File. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. Once you have selected a file, the file path is shown and its contents are displayed with delimiters in the panel below.
The data preview pane can be used to specify column position and length. The ruler at the top of the preview window helps to measure the length of variables and to specify the break point between them. You can specify break point lines by clicking in the ruler area above the fields. Break points can be moved by dragging and can be discarded by dragging them outside of the data preview region. The ruler is designed to handle ASCII characters.
- Each break-point line automatically adds a new field to the fields table below.
- Start positions indicated by the arrows are automatically added to the Start column in the table below.
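The relationship between break points and the Start and Length columns can be sketched in Python. This is an illustration only, not how IBM SPSS Modeler is implemented. The function name `fields_from_break_points` is hypothetical. The sketch assumes that break points are 0-based character offsets between columns, that Start values are 1-based, and that the first field begins at the first character of the record.

```python
def fields_from_break_points(break_points, record_length):
    """Return a list of (start, length) pairs, with 1-based start positions.

    Assumption: each break point is the 0-based offset at which a new
    field begins; the first field always begins at offset 0.
    """
    # Keep only distinct break points that fall strictly inside the record.
    cuts = sorted({bp for bp in break_points if 0 < bp < record_length})
    edges = [0] + cuts + [record_length]
    # Each adjacent pair of edges delimits one field.
    return [(left + 1, right - left) for left, right in zip(edges, edges[1:])]


# Two break points in a 30-character record yield three fields.
print(fields_from_break_points([15, 20], 30))
```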
Line oriented. Select if you want to skip the new-line character at the end of each record.
Skip header lines. Specify how many lines to ignore at the beginning of the file, before the first record. This is useful for ignoring column headers.
Record length. Specify the number of characters in each record.
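One possible reading of how Record length and Line oriented interact is sketched below in Python. This is an assumption-laden illustration, not the product's actual behavior. The function name `split_records` is hypothetical, and the sketch assumes that every record has exactly the given number of characters and that, when line oriented, at most one line terminator follows each record.

```python
def split_records(text, record_length, line_oriented=True):
    """Split text into fixed-length records.

    Assumption: with line_oriented=True, a single "\r\n" or "\n" after
    each record is skipped; otherwise records follow each other directly.
    """
    records = []
    pos = 0
    while pos + record_length <= len(text):
        records.append(text[pos:pos + record_length])
        pos += record_length
        if line_oriented:
            # Skip one line terminator, if present.
            if text.startswith("\r\n", pos):
                pos += 2
            elif text.startswith("\n", pos):
                pos += 1
    return records


print(split_records("AAAA\nBBBB\n", 4))
print(split_records("AAAABBBB", 4, line_oriented=False))
```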
Field. All fields that you have defined for this data file are listed here. There are two ways to define fields:
- Specify fields interactively using the data preview pane above.
- Specify fields manually by adding empty field rows to the table below. Click the button to the right of the fields pane to add new fields. Then, in the empty field, enter a field name, a start position, and a length. Entering these values automatically adds arrows to the data preview pane, which you can then adjust.
To remove a previously defined field, select the field in the list and click the red delete button.
Start. Specify the position of the first character in the field. For example, if the second field of a record begins on the sixteenth character, you would enter 16 as the starting point.
Length. Specify how many characters are in the longest value for each field. This determines the cutoff point for the next field.
Strip lead and trail spaces. Select to discard leading and trailing spaces in strings on import.
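The way Start, Length, and space stripping combine can be sketched in Python. This is a minimal illustration under stated assumptions rather than the product's implementation. The function name `extract_fields` and the field names are hypothetical. Start is treated as 1-based, matching the example in which a field beginning on the sixteenth character has a Start of 16.

```python
def extract_fields(record, fields, strip_spaces=False):
    """Cut one record into named values.

    fields is a list of (name, start, length) tuples; start is 1-based.
    """
    row = {}
    for name, start, length in fields:
        # Convert the 1-based start position to a 0-based slice.
        value = record[start - 1:start - 1 + length]
        if strip_spaces:
            # Discard leading and trailing spaces only.
            value = value.strip(" ")
        row[name] = value
    return row


# A 15-character name field followed by a 5-character id field.
record = "Smith, John".ljust(15) + "00042"
fields = [("name", 1, 15), ("id", 16, 5)]

print(extract_fields(record, fields))
print(extract_fields(record, fields, strip_spaces=True))
```

Without stripping, the name value keeps its padding to the full field length. With stripping, only the padding is removed and the id value is unchanged.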
Invalid characters. Select Discard to remove invalid characters from the data input. Select Replace with to replace invalid characters with the specified symbol (one character only). Invalid characters are null (0) characters or any character that does not exist in the current encoding.
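The difference between Discard and Replace with can be illustrated with a short Python sketch. It is not the product's implementation. The function name `clean_invalid` is hypothetical, and the sketch treats a character as invalid if it is a null character or a byte sequence that cannot be decoded in the chosen encoding. A side effect of this simple approach is that a genuine U+FFFD character in the input is also treated as invalid.

```python
def clean_invalid(raw, encoding="utf-8", replacement=None):
    """Decode raw bytes, discarding or replacing invalid characters.

    replacement=None discards invalid characters; otherwise it must be
    a single character that is substituted for each invalid one.
    """
    if replacement is not None and len(replacement) != 1:
        raise ValueError("replacement must be exactly one character")
    # Undecodable bytes become U+FFFD so they can be located afterwards.
    text = raw.decode(encoding, errors="replace")
    substitute = "" if replacement is None else replacement
    # Handle both undecodable bytes and null characters.
    return text.replace("\ufffd", substitute).replace("\x00", substitute)


raw = b"ab\x00c\xffd"  # contains a null byte and a byte invalid in UTF-8
print(clean_invalid(raw))                   # discard
print(clean_invalid(raw, replacement="?"))  # replace with "?"
```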
Encoding. Specifies the text-encoding method used. You can choose between the system default, stream default, or UTF-8.
- The system default is specified in the Windows Control Panel or, if running in distributed mode, on the server computer.
- The stream default is specified in the Stream Properties dialog box.
Decimal symbol. Select the type of decimal separator used in your data source. Stream default is the character selected from the Options tab of the stream properties dialog box. Otherwise, select either Period (.) or Comma (,) to read all data in this dialog box using the chosen character as the decimal separator.
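The effect of the decimal symbol on numeric values can be sketched in Python. This is a simplified illustration, not the product's parsing logic. The function name `parse_number` is hypothetical, and the sketch assumes that values contain no thousands separators.

```python
def parse_number(text, decimal_symbol="."):
    """Parse a numeric string using the given decimal separator."""
    if decimal_symbol not in (".", ","):
        raise ValueError("decimal_symbol must be '.' or ','")
    text = text.strip()
    if decimal_symbol == ",":
        if "." in text:
            # A period is not a valid separator when comma is selected.
            raise ValueError(f"unexpected '.' in {text!r}")
        text = text.replace(",", ".")
    # float() rejects a comma, so "3,14" fails when period is selected.
    return float(text)


print(parse_number("3,14", decimal_symbol=","))
print(parse_number("3.14"))
```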
Automatically recognize dates and times. Select this check box to have IBM SPSS Modeler attempt to recognize data entries as dates or times automatically. For example, an entry such as 07-11-1965 is identified as a date and 02:35:58 is identified as a time. Ambiguous entries such as 07111965 or 023558, however, are read as integers because there are no delimiters between the numbers.
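The role that delimiters play in this recognition can be illustrated with a short Python sketch. It does not reproduce the product's recognition rules. The function name `recognize` is hypothetical, and the two formats tried here (day-month-year with hyphens, and hours:minutes:seconds) are assumptions chosen only to match the examples above.

```python
from datetime import datetime

# Assumed formats for illustration; the product's formats may differ.
CANDIDATE_FORMATS = (("date", "%d-%m-%Y"), ("time", "%H:%M:%S"))


def recognize(entry):
    """Classify an entry as 'date', 'time', 'integer', or 'string'."""
    for kind, fmt in CANDIDATE_FORMATS:
        try:
            # strptime requires the literal delimiters in the format.
            datetime.strptime(entry, fmt)
            return kind
        except ValueError:
            continue
    if entry.isdigit():
        # Without delimiters, the entry is just a run of digits.
        return "integer"
    return "string"


for entry in ("07-11-1965", "02:35:58", "07111965", "023558"):
    print(entry, recognize(entry))
```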
Lines to scan for type. Specify how many lines to scan when determining the data type of each field.
At any point while working in this dialog box, click Refresh to reload fields from the data source. This is useful when altering data connections to the source node or when working between tabs on the dialog box.