CHAID Subcommand (TREE command)

The CHAID subcommand sets parameters for a CHAID tree. Except where noted, all parameters also apply to Exhaustive CHAID. The subcommand is ignored for CRT and QUEST trees, and a warning is issued.

Each keyword in the subcommand is followed by an equals sign (=) and the value for that keyword.

Example

TREE risk [o] BY income age creditscore
 /METHOD TYPE=CHAID
 /CHAID ALPHASPLIT=.01 INTERVALS=age income (10) creditscore (5).

ALPHASPLIT Keyword

The ALPHASPLIT keyword specifies the significance level for splitting nodes. An independent variable is not used in the tree if the significance level for the split statistic (chi-square or F) is greater than the specified value.

  • Specify a value greater than zero and less than 1.
  • The default value is 0.05.

ALPHAMERGE Keyword

The ALPHAMERGE keyword specifies the significance level for merging of predictor categories. Small values tend to result in a greater degree of merging.

  • Specify a value greater than zero and less than or equal to 1.
  • The default value is 0.05.
  • If you specify a value of 1, predictor categories are not merged.
  • ALPHAMERGE is available only for the CHAID method. For Exhaustive CHAID, the keyword is ignored, and a warning is issued.
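
Example

A minimal sketch of the special value 1, reusing the variables from the example at the top of this section (the variable names are illustrative only). With ALPHAMERGE=1, the categories of each predictor are left unmerged:

TREE risk [o] BY income age creditscore
 /METHOD TYPE=CHAID
 /CHAID ALPHAMERGE=1.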

SPLITMERGED Keyword

The SPLITMERGED keyword specifies whether predictor categories that are merged in a CHAID analysis are allowed to be resplit.

NO. Merged predictor categories cannot be resplit. This is the default.

YES. Merged predictor categories can be resplit.
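
Example

The following sketch, which uses the illustrative variables from the subcommand example, allows predictor categories that were merged during the analysis to be split again:

TREE risk [o] BY income age creditscore
 /METHOD TYPE=CHAID
 /CHAID SPLITMERGED=YES.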

CHISQUARE Keyword

For nominal dependent variables, the CHISQUARE keyword specifies the chi-square measure used in CHAID analysis. For ordinal and scale dependent variables, the keyword is ignored and a warning is issued.

PEARSON. Pearson chi-square. This is the default.

LR. Likelihood-ratio chi-square.
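
Example

The following sketch requests the likelihood-ratio chi-square. It assumes that the dependent variable can be declared nominal with [n], by analogy with the [o] used in the subcommand example; the variable names are illustrative only.

TREE risk [n] BY income age creditscore
 /METHOD TYPE=CHAID
 /CHAID CHISQUARE=LR.

Because the keyword applies only to nominal dependent variables, the same CHISQUARE=LR specification with risk [o] would be ignored, and a warning would be issued.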

CONVERGE Keyword

For nominal and ordinal dependent variables, the CONVERGE keyword specifies the convergence value for estimation of the CHAID model.

  • Specify a value greater than zero and less than 1.
  • The default value is 0.05.
  • If the dependent variable is scale, this keyword is ignored, and a warning is issued.

MAXITERATIONS Keyword

For nominal and ordinal dependent variables, the MAXITERATIONS keyword specifies the maximum number of iterations for estimation of the CHAID model.

  • Specify a positive integer value.
  • The default value is 100.
  • If the dependent variable is scale, this keyword is ignored, and a warning is issued.
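
Example

A sketch that sets both model estimation keywords, CONVERGE and MAXITERATIONS, for an ordinal dependent variable, using the illustrative variables from the subcommand example. The values .01 and 200 are arbitrary choices for illustration, not recommendations:

TREE risk [o] BY income age creditscore
 /METHOD TYPE=CHAID
 /CHAID CONVERGE=.01 MAXITERATIONS=200.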

ADJUST Keyword

The ADJUST keyword specifies how to adjust significance values for multiple comparisons.

BONFERRONI. Significance values are adjusted using the Bonferroni method. This is the default.

NONE. Significance values are not adjusted.
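
Example

The following sketch, which uses the illustrative variables from the subcommand example, turns off the default Bonferroni adjustment:

TREE risk [o] BY income age creditscore
 /METHOD TYPE=CHAID
 /CHAID ADJUST=NONE.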

INTERVALS Keyword

In CHAID analysis, scale independent (predictor) variables are always banded into discrete groups (for example, 0-10, 11-20, 21-30, and so on) prior to analysis. You can use the INTERVALS keyword to control the number of discrete intervals for scale predictors.

  • By default, each scale predictor is divided into 10 intervals that have approximately equal numbers of cases.
  • The INTERVALS keyword is ignored if the model contains no scale independent variables.

value. The specified value applies to all scale predictors. For example: INTERVALS=5. The value must be a positive integer less than or equal to 64.

varlist (value). The specified value applies to the preceding variable list only. Specify a list of variables followed by the number of intervals in parentheses. Multiple lists can be specified. For example: INTERVALS=age income (10) creditscore (5). The value must be a positive integer less than or equal to 64.

For the varlist (value) form:

  • If a variable in the list is not a scale variable or is not specified as an independent variable, the interval specification is ignored for that variable, and a warning is issued.
  • If a variable appears in more than one list, the last specification is used.
  • For any scale variables not included in the list(s), the default number of intervals (10) is used.
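
Example

The rules for the varlist (value) form can be sketched with a specification in which income appears in two lists. The variable names are illustrative, taken from the subcommand example, and all three predictors are assumed to be scale variables:

TREE risk [o] BY income age creditscore
 /METHOD TYPE=CHAID
 /CHAID INTERVALS=age income (10) income (20).

Under these rules, age is banded into 10 intervals, income into 20 (the last specification is used), and creditscore, which is not listed, into the default 10.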