Predictor Effects

Figure 1. Predicted probabilities

The predicted values and predicted probabilities are saved to the working dataset, including model-predicted values for the cases that were not in the training or test samples (cases 701 to 850). The predicted value is set to the category with the highest predicted probability. The procedure provides no indication of how the predictors affect the model-predicted probability of response, but you can get a reasonable idea by looking at the distributions of predictor values by the predicted values.
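The rule for assigning the predicted value can be sketched in a few lines of Python. This is only an illustration of "pick the category with the highest predicted probability"; the category labels and probability values are hypothetical and are not taken from the dataset.

```python
def predicted_value(probabilities):
    """Return the category with the highest predicted probability.

    `probabilities` maps each category label to its predicted probability.
    """
    return max(probabilities, key=probabilities.get)


# Hypothetical predicted probabilities for one case (not from bankloan.sav).
case_probabilities = {"No": 0.62, "Yes": 0.38}
label = predicted_value(case_probabilities)
```

Here `label` is the category whose probability is largest, so a case is only classified as a defaulter when that category's probability exceeds the other's.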

From the menus, choose:

Analyze > Compare Means > Means...

Figure 2. Means dialog box
  1. Select Years with current employer, Years at current address, Debt to income ratio (x100), and Credit card debt in thousands as dependent variables.
  2. Select Predicted Value as the independent variable.
  3. Click OK.
    Figure 3. Means report

    Roughly speaking, people who have worked fewer years for the same employer, have lived fewer years at the same address, and have a higher debt to income ratio and higher credit card debt appear more likely to be classified as potential defaulters. These relationships are only approximate, because the Naive Bayes procedure works with discretized scale variables; however, they make intuitive sense.

    Likewise, you can use Crosstabs to examine the relationship between Level of education and Predicted Value. From the menus, choose:

    Analyze > Descriptive Statistics > Crosstabs...

    Figure 4. Crosstabs dialog box
  4. Select Level of education as the row variable.
  5. Select Predicted Value as the column variable.
  6. Click Cells.
    Figure 5. Crosstabs Cell Display dialog box
  7. Select Row in the Percentages group.
  8. Click Continue.
  9. Click OK in the Crosstabs dialog box.
Figure 6. Crosstabulation of Level of education by Predicted Value

From this table, it appears that people with higher levels of education are more likely to be classified as potential defaulters. The drop in percentage for people with post-undergraduate degrees is inconclusive because of the small number of people in that category.
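The two summaries used in this section, means of a predictor by predicted value and row percentages of a crosstabulation, can be sketched outside the dialog boxes as follows. This is a minimal illustration, not the output of the procedures; the field names (`education`, `employ`, `predicted`) and all records are hypothetical.

```python
from collections import Counter, defaultdict


def group_means(rows, group_key, value_key):
    """Mean of `value_key` within each level of `group_key`."""
    totals = defaultdict(float)
    counts = Counter()
    for row in rows:
        totals[row[group_key]] += row[value_key]
        counts[row[group_key]] += 1
    return {group: totals[group] / counts[group] for group in counts}


def row_percentages(rows, row_key, col_key):
    """Crosstab of counts, expressed as percentages within each row."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[row_key]][row[col_key]] += 1
    table = {}
    for level, cols in counts.items():
        total = sum(cols.values())
        table[level] = {col: 100.0 * n / total for col, n in cols.items()}
    return table


# Hypothetical records; field names and values are invented for illustration.
records = [
    {"education": "High school", "employ": 12, "predicted": "No"},
    {"education": "High school", "employ": 9, "predicted": "No"},
    {"education": "High school", "employ": 2, "predicted": "Yes"},
    {"education": "College", "employ": 3, "predicted": "Yes"},
    {"education": "College", "employ": 8, "predicted": "No"},
]

means = group_means(records, "predicted", "employ")
percentages = row_percentages(records, "education", "predicted")
```

Row percentages hide the row totals, so a category with very few cases can show a large swing in percentage from only one or two cases; it is worth checking the counts before reading much into such a row.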

Figure 7. Bankloan.sav data

With this information in hand, you can begin to understand how the predicted probabilities are computed, though the precise contribution of each predictor is still unclear. For example, case 706 is classified as a defaulter and 703 is classified as a non-defaulter. Case 706 has a higher level of education, fewer years of employment, and fewer years at the same address, which are indicators of a potential defaulter (according to the model), but case 703 has higher debt. Exactly why a particular combination of variable values leads to a particular predicted value is not clear at all.
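To see why no single predictor decides the outcome, it can help to recall the general Naive Bayes rule: the score for each category is its prior probability multiplied by the conditional probability of each (binned) predictor value given that category, and the scores are then normalized to sum to 1. The sketch below applies that rule to invented numbers. The priors, bin labels, and conditional probabilities are all hypothetical, and the code does not reproduce how the procedure bins variables or estimates these quantities.

```python
import math


def naive_bayes_posterior(priors, conditionals, case):
    """Posterior probability of each category for one case.

    priors:       {category: P(category)}
    conditionals: {category: {predictor: {bin: P(bin | category)}}}
    case:         {predictor: bin}
    """
    scores = {}
    for category, prior in priors.items():
        likelihood = math.prod(
            conditionals[category][predictor][value]
            for predictor, value in case.items()
        )
        scores[category] = prior * likelihood
    total = sum(scores.values())
    return {category: score / total for category, score in scores.items()}


# All numbers below are hypothetical, chosen only to illustrate the rule.
priors = {"No": 0.75, "Yes": 0.25}
conditionals = {
    "No": {
        "employ": {"short": 0.3, "long": 0.7},
        "debt": {"low": 0.6, "high": 0.4},
    },
    "Yes": {
        "employ": {"short": 0.8, "long": 0.2},
        "debt": {"low": 0.3, "high": 0.7},
    },
}

posterior = naive_bayes_posterior(
    priors, conditionals, {"employ": "short", "debt": "low"}
)
```

In this toy setup, a short employment history favors "Yes" while low debt favors "No", and the predicted category depends on how the factors multiply together with the prior. Changing only the debt bin to "high" tips the same case toward "Yes", which is the kind of trade-off that makes individual classifications hard to explain from the summary tables alone.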
