Custom Costs (TREE command)
For each combination of dependent variable categories:
- Specify the actual category, predicted category, and misclassification cost, in that order.
- The cost value must be enclosed in square brackets.
- At least one cost specification must be provided, but you don’t have to specify all possible combinations. Any combination for which a cost is not specified defaults to 1.
- If multiple cost specifications are provided for the same combination of actual and predicted categories, the last one is used.
- Costs specified for categories that do not exist in the data are ignored, and a warning is issued.
- Cost values must be non-negative.
- Category values must be consistent with the data type of the dependent variable. String and date values must be quoted. Date values must be consistent with the variable’s print format.
- Correct classifications, where the predicted category and the actual category match, are always assigned a cost of zero. A warning is issued if a nonzero cost is specified for a correct classification. For example, COSTS CUSTOM=1 1 [3] would be ignored and the cost would be set to 0.
Example
TREE risk [o] BY age income employment
  /COSTS CUSTOM=3 2 [2]
                3 1 [8].
- Assuming that the dependent variable is coded 1=low, 2=medium, and 3=high, the cost of misclassifying a high-risk individual as medium risk is 2.
- The cost of misclassifying a high-risk individual as low risk is 8.
- All other misclassifications, such as classifying a medium-risk individual as high risk, are assigned the default cost, 1.
- Correct classifications such as classifying a high-risk individual as high risk are always assigned a cost of 0.
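The effective cost table that results from this example can be sketched in Python. This is only an illustration of the rules described above, not how TREE is implemented; the function name resolve_costs and its arguments are hypothetical. Raising an error for a negative cost is an assumption, since the text says only that cost values must be non-negative.

```python
def resolve_costs(categories, specs):
    """Illustrative only: build the effective misclassification cost table.

    categories: category values present in the data (hypothetical input).
    specs: (actual, predicted, cost) triples in the order given after CUSTOM=.
    Returns (costs, warnings); costs[(actual, predicted)] is the effective cost.
    """
    # Defaults: 0 for correct classifications, 1 for every misclassification.
    costs = {(a, p): (0 if a == p else 1) for a in categories for p in categories}
    warnings = []
    for actual, predicted, cost in specs:
        if cost < 0:
            # Assumption: the sketch rejects negative costs outright.
            raise ValueError("cost values must be non-negative")
        if actual not in categories or predicted not in categories:
            warnings.append(f"ignored: category not in data ({actual}, {predicted})")
            continue
        if actual == predicted:
            # Correct classifications always keep a cost of 0.
            if cost != 0:
                warnings.append(f"ignored: nonzero cost for correct classification ({actual})")
            continue
        costs[(actual, predicted)] = cost  # a later specification overwrites an earlier one
    return costs, warnings


# The example above: 1=low, 2=medium, 3=high.
categories = [1, 2, 3]
specs = [(3, 2, 2), (3, 1, 8)]
costs, warnings = resolve_costs(categories, specs)

# Rows are actual categories, columns are predicted categories.
for a in categories:
    print(a, [costs[(a, p)] for p in categories])
# 1 [0, 1, 1]
# 2 [1, 0, 1]
# 3 [8, 2, 0]
```

Only the two specified cells in the row for actual category 3 differ from the defaults; every other off-diagonal cell stays at 1 and the diagonal stays at 0.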