Setting options for the Variable File Node
You set the options on the File tab of the Variable File node dialog box.
File. Specify the name of the file. You can enter a filename or click the ellipsis button (...) to select a file. The file path is shown once you select a file, and its contents are displayed with delimiters in the panel below it.
The sample text that is displayed from your data source can be copied and pasted into the following controls: EOL comment characters and user-specified delimiters. Use Ctrl-C and Ctrl-V to copy and paste.
Read field names from file. Selected by default, this option treats the first row in the data file as the column labels. If your first row is not a header, deselect this option to give each field a generic name automatically, such as Field1, Field2, and so on, one for each field in the dataset.
Specify number of fields. Specify the number of fields in each record. The number of fields can be detected automatically as long as the records are new-line terminated. You can also set a number manually.
Skip header characters. Specify how many characters you want to ignore at the beginning of the first record.
EOL comment characters. Specify characters, such as # or !, to indicate annotations in the data. Wherever one of these characters appears in the data file, everything up to but not including the next new-line character will be ignored.
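The effect of EOL comment characters can be illustrated with a short Python sketch. This is not Modeler's parser; the function name `strip_eol_comments` and the sample data are hypothetical, and the sketch assumes that the comment character itself is discarded along with the text that follows it.

```python
def strip_eol_comments(line, comment_chars="#!"):
    """Return the line with everything from the first comment character
    up to the end of the line removed (assumed behavior, for illustration)."""
    cut = len(line)
    for ch in comment_chars:
        pos = line.find(ch)
        if pos != -1:
            cut = min(cut, pos)  # keep only the text before the earliest comment character
    return line[:cut]


# Hypothetical sample data with annotations introduced by # and !
sample = "id,amount # monthly totals\n1,20.5\n2,31.0 ! provisional\n"

# The newline characters are preserved; only the annotations are dropped.
for raw_line in sample.splitlines():
    print(repr(strip_eol_comments(raw_line)))
```

Because any occurrence of a comment character starts an annotation, a character that also appears inside real data values is a poor choice for this setting.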
Strip lead and trail spaces. Select options for discarding leading and trailing spaces in strings on import.
Invalid characters. Select Discard to remove invalid characters from the data source. Select Replace with to replace invalid characters with the specified symbol (one character only). Invalid characters are null characters or any character that does not exist in the encoding method specified.
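As a rough illustration of the Discard and Replace with choices, the following Python sketch decodes raw bytes and then removes or substitutes anything that is null or not valid in the chosen encoding. The function `clean_invalid` is hypothetical and only approximates the behavior described above; it relies on Python's standard `errors="replace"` decoding, which marks undecodable bytes with U+FFFD.

```python
def clean_invalid(raw, encoding="utf-8", replacement=None):
    """Decode raw bytes, then discard invalid characters (replacement=None)
    or swap each one for a single replacement character."""
    if replacement is not None and len(replacement) != 1:
        raise ValueError("replacement must be exactly one character")
    # Undecodable bytes become U+FFFD; null bytes decode to "\x00".
    text = raw.decode(encoding, errors="replace")
    substitute = "" if replacement is None else replacement
    # Caveat: a genuine U+FFFD in the source is treated as invalid too.
    return text.replace("\ufffd", substitute).replace("\x00", substitute)


raw = b"caf\xff\x00e"  # hypothetical input: one invalid byte and one null byte
print(clean_invalid(raw))                   # Discard
print(clean_invalid(raw, replacement="?"))  # Replace with "?"
```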
Encoding. Specifies the text-encoding method used. You can choose between the system default, stream default, or UTF-8.
- The system default is specified in the Windows Control Panel or, if running in distributed mode, on the server computer.
- The stream default is specified in the Stream Properties dialog box.
Decimal symbol. Select the type of decimal separator that is used in your data source. The Stream default is the character that is selected from the Options tab of the stream properties dialog box. Otherwise, select either Period (.) or Comma (,) to read all data in this dialog box using the chosen character as the decimal separator.
Line delimiter is newline character. To use the newline character as the line delimiter, instead of a field delimiter, select this option. For example, this may be useful if there is an odd number of delimiters in a row that causes the line to wrap. Note that selecting this option means you cannot select Newline in the Delimiters list.
Delimiters. Using the check boxes listed for this control, you can specify which characters, such as the comma (,), define field boundaries in the file. You can also specify more than one delimiter, such as ", |" for records that use multiple delimiters. The default delimiter is the comma.
Select Allow multiple blank delimiters to treat multiple adjacent blank delimiter characters as a single delimiter. For example, if one data value is followed by four spaces and then another data value, this group would be treated as two fields rather than five.
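The four-spaces example can be reproduced with a small Python sketch. It is an approximation for illustration only: the function `split_fields` is hypothetical, and it assumes that "blank" delimiters are spaces and tabs.

```python
import re


def split_fields(line, allow_multiple_blank=False):
    """Split on blank delimiters (assumed here to be space and tab).
    With allow_multiple_blank, a run of blanks counts as one delimiter."""
    pattern = r"[ \t]+" if allow_multiple_blank else r"[ \t]"
    return re.split(pattern, line)


line = "alpha    beta"  # two values separated by four spaces

print(split_fields(line))                             # five fields, three of them empty
print(split_fields(line, allow_multiple_blank=True))  # two fields
```

Without the option, each of the four spaces ends a field, which is why the same text yields five fields instead of two.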
Lines to scan for column and type. Specify how many lines and columns to scan for specified data types.
Automatically recognize dates and times. To enable IBM® SPSS® Modeler to automatically attempt to recognize data entries as dates or times, select this check box. For example, an entry such as 07-11-1965 is identified as a date and 02:35:58 is identified as a time; however, ambiguous entries such as 07111965 or 023558 are read as integers because there are no delimiters between the numbers.
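Why the delimiters matter can be seen in a minimal Python sketch of this kind of recognition. It is not Modeler's detection logic: the function `guess_storage` is hypothetical, and the two formats it tries (day-month-year with hyphens, and hours:minutes:seconds) are assumptions chosen to match the examples above.

```python
from datetime import datetime

# Assumed formats for illustration; the formats Modeler actually applies
# depend on its own settings.
CANDIDATE_FORMATS = (("%d-%m-%Y", "date"), ("%H:%M:%S", "time"))


def guess_storage(value):
    """Classify a raw string as date, time, integer, or string."""
    for fmt, label in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return label
        except ValueError:
            continue  # the value does not fit this format; try the next one
    if value.isdigit():
        return "integer"
    return "string"


for entry in ("07-11-1965", "02:35:58", "07111965", "023558"):
    print(entry, "->", guess_storage(entry))
```

The undelimited entries fail both format checks because the expected hyphens or colons are missing, so they fall through to the integer test.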
Treat square brackets as lists. If you select this check box, the data included between opening and closing square brackets is treated as a single value, even if the content includes delimiter characters such as commas and double quotes. For example, this might include two- or three-dimensional geospatial data, where the coordinates contained within square brackets are processed as a single list item. For more information, see Importing geospatial data into the Variable File Node.
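The idea of keeping bracketed content together can be sketched in Python as a splitter that ignores delimiters while inside square brackets. This is an illustrative approximation, not Modeler's implementation; the function `split_outside_brackets` and the sample record are hypothetical.

```python
def split_outside_brackets(line, delimiter=","):
    """Split on the delimiter, except where it appears inside [ ... ]."""
    fields, current, depth = [], [], 0
    for ch in line:
        if ch == "[":
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
        if ch == delimiter and depth == 0:
            # A delimiter outside any brackets ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


# Hypothetical record with a two-dimensional coordinate pair
record = "site1,[51.5,-0.12],open"
print(split_outside_brackets(record))
```

Tracking the nesting depth rather than a simple on/off flag lets nested lists, such as a list of coordinate pairs, stay together as one value.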
' " ab c" , "d ef " , " gh i " ' will result in 'ab c, d ef, gh i'). When using Include as text, quotes are treated as normal characters, so leading and trailing spaces will be stripped naturally.
At any point while you are working in this dialog box, click Refresh to reload fields from the data source. This is useful when you are altering data connections to the source node or when you are working between tabs in the dialog box.