Custom Costs (TREE command)

For each combination of dependent variable categories:

  • Specify the actual category, predicted category, and misclassification cost, in that order.
  • The cost value must be enclosed in square brackets.
  • At least one cost specification must be provided, but you don’t have to specify all possible combinations. Any combination for which a cost is not specified defaults to 1.
  • If multiple cost specifications are provided for the same combination of actual and predicted categories, the last one is used.
  • Costs specified for categories that do not exist in the data are ignored, and a warning is issued.
  • Cost values must be non-negative.
  • Category values must be consistent with the data type of the dependent variable. String and date values must be quoted. Date values must be consistent with the variable’s print format.
  • Correct classifications, where the predicted category and the actual category match, are always assigned a cost of zero. A warning is issued if a nonzero cost is specified for a correct classification. For example, the specification COSTS CUSTOM=1 1 [3] would be ignored, and the cost would remain 0.
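
The rules above can be sketched as a small resolver that turns a list of cost specifications into an effective cost table. This is an illustrative Python sketch, not SPSS syntax and not the procedure's actual implementation. The function name resolve_costs, the use of ValueError for violated "must" rules, and the warning texts are assumptions made for the illustration; the text does not say how the procedure reports those violations.

```python
import warnings


def resolve_costs(categories, specs):
    """Build an effective cost table from (actual, predicted, cost) triples.

    Hypothetical helper: mirrors the documented rules, not SPSS internals.
    """
    if not specs:
        # At least one cost specification must be provided.
        raise ValueError("at least one cost specification is required")

    known = set(categories)
    # Defaults: 0 for correct classifications, 1 for every misclassification.
    costs = {(a, p): (0 if a == p else 1) for a in categories for p in categories}

    for actual, predicted, cost in specs:
        if cost < 0:
            # Cost values must be non-negative.
            raise ValueError(f"cost must be non-negative: {cost}")
        if actual not in known or predicted not in known:
            # Categories that do not exist in the data are ignored with a warning.
            warnings.warn(f"category not in data, ignored: {actual} {predicted}")
            continue
        if actual == predicted:
            # Correct classifications always cost 0; warn on a nonzero request.
            if cost != 0:
                warnings.warn(f"nonzero cost ignored for: {actual} {predicted}")
            continue
        # A later specification for the same pair overwrites an earlier one.
        costs[(actual, predicted)] = cost

    return costs


# Usage: duplicate pair (3, 1), a correct classification, and an unknown category.
table = resolve_costs([1, 2, 3], [(1, 1, 3), (3, 1, 5), (3, 1, 8), (4, 1, 2)])
print(table[(3, 1)], table[(1, 1)], table[(2, 3)])
```

In this sketch the pair (3, 1) ends up with the last cost given, the (1, 1) request is dropped with a warning, and any pair that was never specified keeps the default of 1.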

Example

TREE risk [o] BY age income employment
 /COSTS CUSTOM=3 2 [2]
                3 1 [8]. 
  • Assuming that the dependent variable is coded 1=low, 2=medium, and 3=high, the cost of misclassifying a high-risk individual as medium risk is 2.
  • The cost of misclassifying a high-risk individual as low risk is 8.
  • All other misclassifications, such as classifying a medium-risk individual as high risk, are assigned the default cost, 1.
  • Correct classifications, such as classifying a high-risk individual as high risk, are always assigned a cost of 0.
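
The full set of costs implied by this example can be laid out as a 3-by-3 table. The following Python sketch is only an illustration of the example's outcome under the assumed coding 1=low, 2=medium, 3=high. The names LABELS and costs are hypothetical, and none of this is SPSS syntax.

```python
# Category coding assumed in the example: 1=low, 2=medium, 3=high.
LABELS = {1: "low", 2: "medium", 3: "high"}

# Start from the defaults: 0 for correct classifications, 1 otherwise.
costs = {(a, p): (0 if a == p else 1) for a in LABELS for p in LABELS}

# Apply the two custom costs from the COSTS subcommand, as (actual, predicted).
costs[(3, 2)] = 2  # high risk classified as medium risk
costs[(3, 1)] = 8  # high risk classified as low risk

# Print one row per actual category and one column per predicted category.
print("actual \\ predicted".ljust(20) + "".join(LABELS[p].rjust(8) for p in LABELS))
for a in LABELS:
    print(LABELS[a].ljust(20) + "".join(str(costs[(a, p)]).rjust(8) for p in LABELS))
```

Only the row for actual category high differs from the defaults: its costs are 8, 2, and 0 for predicted low, medium, and high.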