Regression tree growing
Regression tree growing creates a regression tree from a data set. Splits are selected so as to minimize the dispersion of the target attribute, and target values are assigned to leaves when no further splits are required or possible.
Regression tree growing repeats the following operations:
- Stopping criteria
This operation determines whether a node is split, or whether it becomes a leaf because it is not split further. The decision is based on the subset of training instances that corresponds to the node.
No split is done if any of the following conditions applies:
- The number of instances in the corresponding subset is less than a specified minimum
- The level of the current node is equal to a specified maximum
- The improvement in the dispersion of target values from the best available split is less than a specified minimum
- Target value assignment
Target values assigned to leaves and internal nodes are the mean target values of the corresponding subsets of instances.
- Split selection
Candidate splits are evaluated by the average dispersion of target values in the subsets obtained after the split; the split that yields the lowest dispersion is selected.
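The operations above can be sketched as a small recursive grower. This is a minimal illustration, not the actual implementation: the hyperparameter names (`MIN_INSTANCES`, `MAX_DEPTH`, `MIN_IMPROVEMENT`) and the use of the sum of squared deviations as the dispersion measure are assumptions chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical stopping-criteria parameters (names are illustrative)
MIN_INSTANCES = 2       # minimum number of instances in a node's subset
MAX_DEPTH = 3           # maximum level of a node
MIN_IMPROVEMENT = 1e-6  # minimum dispersion improvement of the best split

@dataclass
class Node:
    value: float            # target value assignment: mean target of the subset
    feature: int = None     # split attribute (None for a leaf)
    threshold: float = None
    left: "Node" = None
    right: "Node" = None

def dispersion(ys):
    # Sum of squared deviations from the mean, one common dispersion measure
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(X, y):
    # Split selection: choose the split that minimizes the total
    # dispersion of the two resulting subsets
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue
            score = dispersion(left) + dispersion(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow(X, y, depth=0):
    node = Node(value=mean(y))  # assign the subset's mean target value
    # Stopping criteria: too few instances or maximum depth reached
    if len(y) < MIN_INSTANCES or depth >= MAX_DEPTH:
        return node
    split = best_split(X, y)
    # Stopping criterion: best split improves dispersion too little
    if split is None or dispersion(y) - split[0] < MIN_IMPROVEMENT:
        return node
    _, f, t = split
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    node.feature, node.threshold = f, t
    node.left = grow([X[i] for i in li], [y[i] for i in li], depth + 1)
    node.right = grow([X[i] for i in ri], [y[i] for i in ri], depth + 1)
    return node

def predict(node, x):
    # Descend to a leaf and return its assigned target value
    while node.left is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value
```

For instance, growing a tree on `X = [[1], [2], [10], [11]]` with targets `[1.0, 1.0, 5.0, 5.0]` produces a single split near `x = 2`, after which both children stop splitting because no further dispersion improvement is possible.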