CRT Criteria
The CRT growing method attempts to maximize within-node homogeneity. The extent to which a node does not represent a homogeneous subset of cases is a measure of its impurity. For example, a terminal node in which all cases have the same value for the dependent variable is a homogeneous node that requires no further splitting because it is "pure."
You can select the method used to measure impurity and the minimum decrease in impurity required to split nodes.
Impurity Measure. For scale dependent variables, the least-squared deviation (LSD) measure of impurity is used. It is computed as the within-node variance, adjusted for any frequency weights or influence values.
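As an illustration of the LSD measure, the following Python sketch computes a within-node variance with optional frequency weights. The function name lsd_impurity and its parameters are hypothetical, chosen for this example. The sketch shows the general idea only, is not the product's internal computation, and does not cover influence values.

```python
def lsd_impurity(values, weights=None):
    """Weighted within-node variance of a scale dependent variable.

    values  -- dependent-variable values for the cases in one node
    weights -- optional frequency weights (assumed to default to 1 per case)
    """
    if weights is None:
        weights = [1.0] * len(values)
    total = sum(weights)
    if total == 0:
        return 0.0  # an empty node is treated as pure in this sketch
    # Weighted mean of the dependent variable within the node
    mean = sum(w * y for w, y in zip(weights, values)) / total
    # Weighted average of squared deviations from that mean
    return sum(w * (y - mean) ** 2 for w, y in zip(weights, values)) / total


print(lsd_impurity([5.0, 5.0, 5.0]))        # pure node: 0.0
print(lsd_impurity([1.0, 3.0]))             # 1.0
print(lsd_impurity([1.0, 3.0], [3, 1]))     # 0.75
```

A node in which every case has the same value has zero variance, which matches the description of a "pure" node above.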
For categorical (nominal, ordinal) dependent variables, you can select the impurity measure:
- Gini. Splits are found that maximize the homogeneity of child nodes with respect to the value of the dependent variable. Gini is based on squared probabilities of membership for each category of the dependent variable. It reaches its minimum (zero) when all cases in a node fall into a single category. This is the default measure.
- Twoing. Categories of the dependent variable are grouped into two subclasses. Splits are found that best separate the two groups.
- Ordered twoing. Similar to twoing except that only adjacent categories can be grouped. This measure is available only for ordinal dependent variables.
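The categorical measures above can be sketched in Python as follows. The function names gini_impurity and twoing_value are hypothetical. The Gini function follows the description given here (one minus the sum of squared category proportions). The twoing function uses a common textbook form of the criterion, which is an assumption and may differ from the product's exact formula by a constant factor. Ordered twoing is not shown.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of one node: 1 minus the sum of squared category proportions."""
    n = len(labels)
    if n == 0:
        return 0.0  # an empty node is treated as pure in this sketch
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def twoing_value(left_labels, right_labels):
    """Textbook twoing value for a candidate split (larger means better separation).

    Assumed form: (pL * pR / 4) * (sum over categories of |p(j|L) - p(j|R)|) ** 2
    """
    n_left, n_right = len(left_labels), len(right_labels)
    n = n_left + n_right
    if n_left == 0 or n_right == 0:
        return 0.0  # a split that leaves one child empty separates nothing
    left, right = Counter(left_labels), Counter(right_labels)
    categories = set(left) | set(right)
    # Counter returns 0 for categories absent from a child node
    spread = sum(abs(left[c] / n_left - right[c] / n_right) for c in categories)
    return (n_left / n) * (n_right / n) * spread ** 2 / 4.0


print(gini_impurity(["a", "a", "a"]))            # single category: 0.0
print(gini_impurity(["a", "a", "b", "b"]))       # 0.5
print(twoing_value(["a", "a"], ["b", "b"]))      # perfect separation: 0.25
print(twoing_value(["a", "b"], ["a", "b"]))      # no separation: 0.0
```

Gini is evaluated per node and is lowest when a node is pure, whereas the twoing value is evaluated per candidate split and is highest when the two child nodes have the most different category distributions.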
Minimum change in improvement. This is the minimum decrease in impurity required to split a node. The default is 0.0001. Higher values tend to produce trees with fewer nodes.
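The following Python sketch illustrates how a minimum-improvement threshold can act as a stopping rule. The function names, and the choice to weight each child's impurity by its share of the parent's cases, are assumptions made for this example. The product may scale the improvement differently, for instance relative to the whole sample, and may handle the boundary case differently.

```python
def impurity_decrease(parent_impurity, left_impurity, right_impurity,
                      n_left, n_right):
    """Decrease in impurity from splitting a parent node into two children.

    Each child's impurity is weighted by its share of the parent's cases
    (an assumption of this sketch).
    """
    n = n_left + n_right
    weighted_children = (n_left / n) * left_impurity + (n_right / n) * right_impurity
    return parent_impurity - weighted_children


def should_split(decrease, min_improvement=0.0001):
    """Allow a split only if the decrease reaches the threshold (default 0.0001)."""
    return decrease >= min_improvement


# A split that produces two pure children from a parent with impurity 0.5
gain = impurity_decrease(0.5, 0.0, 0.0, n_left=2, n_right=2)
print(gain, should_split(gain))          # 0.5 True

# A split whose children are exactly as impure as the parent
no_gain = impurity_decrease(0.5, 0.5, 0.5, n_left=2, n_right=2)
print(no_gain, should_split(no_gain))    # 0.0 False
```

Raising min_improvement rejects more candidate splits, which is why higher values tend to produce trees with fewer nodes.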
To Specify CRT Criteria
This feature requires the Decision Trees option.
- From the menus choose:
- For the growing method, select CRT.
- Click Criteria.
- Click the CRT tab.