Decision tree pruning

Decision tree pruning reduces the risk of overfitting by removing overgrown subtrees that do not improve the expected accuracy on new data.

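Although you could keep these subtrees from growing in the first place by specifying aggressive stopping criteria, a separate pruning step is usually more reliable: a stopping criterion has to judge a split before the splits beneath it exist, whereas pruning evaluates each subtree after it has been fully grown.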

Decision tree pruning takes a decision tree and a separate data set as input and produces a pruned version of the tree that ideally reduces the risk of overfitting. To obtain the two data sets, you can split a single data set into a growing data set and a pruning data set, which are used to grow and to prune the decision tree, respectively.
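
The workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the names Node, split_growing_pruning, count_errors and prune are hypothetical, the 30% pruning fraction is arbitrary, and the pruning rule shown is reduced-error pruning, which is one common way to use a separate pruning data set and not necessarily the method intended here. The small tree is built by hand to stand in for an overgrown tree.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Binary decision tree node (hypothetical structure for this sketch)."""
    prediction: int                    # majority class seen while growing
    feature: Optional[int] = None      # index of the tested feature (None for a leaf)
    threshold: Optional[float] = None  # go left when x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None


def split_growing_pruning(X, y, pruning_fraction=0.3, seed=0):
    """Shuffle one data set and split it into a growing part and a pruning part."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_prune = int(len(indices) * pruning_fraction)
    prune_idx, grow_idx = indices[:n_prune], indices[n_prune:]
    growing = ([X[i] for i in grow_idx], [y[i] for i in grow_idx])
    pruning = ([X[i] for i in prune_idx], [y[i] for i in prune_idx])
    return growing, pruning


def predict(node, x):
    """Follow the splits from `node` down to a leaf and return its class."""
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction


def count_errors(node, X, y):
    """Number of misclassified examples for the tree rooted at `node`."""
    return sum(predict(node, x) != label for x, label in zip(X, y))


def prune(node, X, y):
    """Return a pruned copy of `node`, using (X, y) as the pruning data set.

    Assumption: reduced-error pruning. Children are pruned first; a subtree
    is then replaced by a leaf whenever the leaf makes no more errors on the
    pruning examples that reach it.
    """
    if node.is_leaf():
        return node
    go_left = [x[node.feature] <= node.threshold for x in X]
    left = prune(node.left,
                 [x for x, l in zip(X, go_left) if l],
                 [t for t, l in zip(y, go_left) if l])
    right = prune(node.right,
                  [x for x, l in zip(X, go_left) if not l],
                  [t for t, l in zip(y, go_left) if not l])
    subtree = Node(node.prediction, node.feature, node.threshold, left, right)
    leaf = Node(node.prediction)
    if count_errors(leaf, X, y) <= count_errors(subtree, X, y):
        return leaf
    return subtree


# Hand-built, deliberately overgrown tree: the split on feature 1 is noise.
tree = Node(
    prediction=1, feature=0, threshold=0.5,
    left=Node(prediction=0),
    right=Node(prediction=1, feature=1, threshold=0.5,
               left=Node(prediction=1), right=Node(prediction=0)),
)

# Pruning data set, kept separate from whatever data grew the tree.
X_prune = [[0.0, 0.0], [0.2, 1.0], [1.0, 0.0], [1.0, 1.0], [0.9, 1.0]]
y_prune = [0, 0, 1, 1, 1]

pruned = prune(tree, X_prune, y_prune)
print(count_errors(tree, X_prune, y_prune), count_errors(pruned, X_prune, y_prune))
```

In this sketch the noisy split on feature 1 is collapsed into a leaf, while the root split is kept because replacing it with a leaf would increase the error on the pruning data set. Note that a subtree reached by no pruning example is also collapsed, since the comparison uses "no more errors" rather than "strictly fewer errors"; that tie-breaking choice is an assumption of the sketch.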