Applying the Model to Another Data File

Having determined that the model is reasonably good, we can now apply that model to other data files containing similar age, income, and education variables and generate a new variable that represents the predicted vehicle purchase price for each case in that file. This process is often referred to as scoring.

When we generated the model, we specified that "rules" for assigning values to cases should be saved in a text file, in the form of command syntax. We will now use the commands in that file to generate scores in another data file.

  1. Open the data file tree_score_car.sav. See the topic Sample Files for more information.
  2. Next, from the menus choose:

    File > New > Syntax

  3. In the command syntax window, type:
    INSERT FILE=
     '/temp/car_scores.sps'.

    If you used a different filename or location, make the appropriate changes.

    The INSERT command will run the commands in the specified file, which is the "rules" file that was generated when we created the model.

  4. From the command syntax window menus choose:

    Run > All

    Figure 1. Predicted values added to data file

    This adds two new variables to the data file:

    • nod_001 contains the terminal node number predicted by the model for each case.
    • pre_001 contains the predicted value for vehicle purchase price for each case.

    Since we requested rules for assigning values for terminal nodes, the number of possible predicted values is the same as the number of terminal nodes, which in this case is 15. For example, every case with a predicted node number of 10 will have the same predicted vehicle purchase price: 30.56. This is, not coincidentally, the mean value reported for terminal node 10 in the original model.

    Although you would typically apply the model to data for which the value of the dependent variable is not known, in this example the data file to which we applied the model actually contains that information, so you can compare the model predictions to the actual values.

  5. From the menus choose:

    Analyze > Correlate > Bivariate...

  6. Select Price of primary vehicle and pre_001.
    Figure 2. Bivariate Correlations dialog box
  7. Click OK to run the procedure.
Figure 3. Correlation of actual and predicted vehicle price

The correlation of 0.92 shows a very strong positive relationship between actual and predicted vehicle price, which suggests that the model works well on this data file.
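
The scoring and the follow-up check described above can be sketched outside the product in a few lines of Python. This is only an illustration of the idea, not the contents of the generated rules file, which is command syntax. The single-predictor rules, the income thresholds, node numbers 8 and 9 with their predicted values, and the sample cases are all invented for the sketch. Only the predicted value of 30.56 for node 10 is taken from the example above.

```python
from math import sqrt

# Hypothetical terminal-node rules: (node number, lower bound, upper bound, predicted price).
# A case matches a rule when lower < income <= upper. The thresholds and the
# values for nodes 8 and 9 are invented; only 30.56 for node 10 is from the text.
RULES = [
    (8, 0.0, 30.0, 12.40),
    (9, 30.0, 60.0, 21.75),
    (10, 60.0, float("inf"), 30.56),
]


def score(income):
    """Return (node number, predicted value) for one case, like nod_001 and pre_001."""
    for node, low, high, predicted in RULES:
        if low < income <= high:
            return node, predicted
    raise ValueError(f"no rule matches income={income}")


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)


# Invented cases: (income, actual vehicle price).
cases = [(22.0, 10.9), (28.0, 14.1), (45.0, 20.3), (55.0, 23.9), (72.0, 29.8), (90.0, 31.5)]

scored = [score(income) for income, _ in cases]   # one (node, predicted) pair per case
actual = [price for _, price in cases]
predicted = [value for _, value in scored]

r = pearson_r(actual, predicted)
print(f"correlation of actual and predicted price: {r:.2f}")
```

Because each rule assigns one fixed value, the scored cases can never contain more distinct predicted values than there are terminal nodes, which mirrors the point made in step 4. The correlation computed here applies only to the invented cases and is not comparable to the 0.92 reported for the sample data file.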
