Missing Values with CHAID

Figure 1. Credit data with missing values

Like the credit risk example (see Using Decision Trees to Evaluate Credit Risk), this example builds a model to classify good and bad credit risks. The main difference is that this data file contains missing values for some of the independent variables used in the model.

  1. To run a Decision Tree analysis, from the menus choose:

    Analyze > Classify > Tree...

    Figure 2. Decision Tree dialog box
  2. Select Credit rating as the dependent variable.
  3. Select all of the remaining variables as independent variables. (The procedure will automatically exclude any variables that don't make a significant contribution to the final model.)
  4. For the growing method, select CHAID.

    For this example, we want to keep the tree fairly simple, so we'll limit tree growth by raising the minimum number of cases for parent and child nodes.

  5. In the main Decision Tree dialog box, click Criteria.
    Figure 3. Criteria dialog box, Growth Limits tab
  6. For Minimum Number of Cases, type 400 for Parent Node and 200 for Child Node.
  7. Click Continue, and then click OK to run the procedure.
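
To make the effect of the growth limits in step 6 concrete, the following is a minimal Python sketch of the rule they express: a node is considered for splitting only if it holds at least the Parent Node minimum, and a split is kept only if every resulting child holds at least the Child Node minimum. The function name `split_allowed` and the case counts in the usage lines are hypothetical illustrations, not part of the procedure or its output, and the procedure's actual internal logic may differ in detail.

```python
from typing import Sequence

# Values typed in step 6 of this example.
MIN_PARENT_CASES = 400  # Parent Node
MIN_CHILD_CASES = 200   # Child Node


def split_allowed(parent_cases: int,
                  child_cases: Sequence[int],
                  min_parent: int = MIN_PARENT_CASES,
                  min_child: int = MIN_CHILD_CASES) -> bool:
    """Return True if a proposed split satisfies both growth limits.

    Hypothetical helper: assumes a node is left unsplit when it has
    fewer than min_parent cases, or when any proposed child would
    have fewer than min_child cases.
    """
    if parent_cases < min_parent:
        return False  # node is too small to be a parent
    if len(child_cases) < 2:
        return False  # a split must produce at least two children
    return all(n >= min_child for n in child_cases)


# Made-up case counts, for illustration only.
print(split_allowed(1000, [600, 400]))  # both children large enough
print(split_allowed(1000, [850, 150]))  # one child below the minimum
print(split_allowed(350, [200, 150]))   # parent below the minimum
```

Raising either minimum makes fewer splits qualify, which is why larger values produce a simpler tree.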
