Reading Data from a Text File

Text files represent another common source of data. Many spreadsheet programs and databases can save their contents in one of many text file formats. Comma- or tab-delimited files refer to rows of data that use commas or tabs to indicate each variable. In this example, the data is tab-delimited.

  1. From the menus choose:

    File > Import Data > Text Data

  2. Go to the Samples\English folder and select demo.txt.

    The Text Import Wizard guides you through the process of defining how the specified text file is interpreted.

    Figure 1. Text Import Wizard: Step 1 of 6
    Text Import Wizard: Step 1 of 6
  3. In Step 1, you can choose a predefined format or create a new format in the wizard. Select No.
  4. Click Next to continue.

    As stated earlier, this file uses tab-delimited formatting. Also, the variable names are defined on the top line of this file.

  5. In step 2 of the wizard, select Delimited to indicate that the file has a delimited formatting structure.
  6. Select Yes to indicate that the file includes variable names at the top of the file.
  7. Click Next to continue.
  8. In step 3, enter 2 for the line number where the first case of data begins (because variable names are on the first line).
  9. Keep the default values for the remainder of this step, and click Next to continue.

    The Data preview in Step 4 provides you with a quick way to ensure that the file is read correctly

  10. Select Tab and deselect the other options for delimiters. Space is selected by default because the file contains spaces. For this file, spaces are part of the data values, not delimiters. You need to deselect Space to read the file correctly.
  11. Select Remove leading spaces for string values. Spaces at the start of string values affect how string values are evaluated in expressions. In this file, some values for Gender have leading spaces that are not part of the value. If you do not remove those spaces, a value of " f" is treated as a different value than "f".
    Figure 2. Text Import Wizard: Step 4 of 6
    Text Import Wizard: Step 4 of 6, select delimiters and remove leading spaces
  12. Click Next to continue.

    Because the variable names are modified to conform to naming rules, step 5 gives you the opportunity to edit any undesirable names.

    Data types can be defined here as well. For example, you can change Income to dollar currency format.

    To change a data type:

  13. In the Data preview, select Income.
  14. Select Dollar from the Data format drop-down list.
    Figure 3. Change the data type
    Step 5 of the Text Import Wizard with the Income variable selected. Dollar is selected for the variable's Data format.

    The variable MaritalStatus contains both string and numeric values. Less than five percent of the values are strings. With the default setting of 95% for Percentage of values that determine the Automatic data format, the variable is treated as numeric and the string values are set to system-missing. If no data format meets the percentage value, the variable is treated as a string variable. If you change the setting to 100, all values are preserved, but all numeric values are treated as strings.

  15. Click Next to continue.
  16. Leave the default selections in the last step, and click Finish to import the data.