Model Selection (linear-AS models)

Model selection method. Choose one of the model selection methods (details below) or Include all predictors, which simply enters all available predictors as main effects model terms. By default, Forward stepwise is used.

Forward Stepwise Selection. This starts with no effects in the model and adds or removes one effect at each step until no further effect can be added or removed according to the stepwise criteria.
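The following Python sketch is not the product's implementation; it only illustrates the add/remove loop under simplifying assumptions: ordinary least squares main-effects fits, AICC as the single entry and removal criterion, and smaller criterion values treated as better. All function names are hypothetical.

```python
import numpy as np

def aicc(y, y_hat, n_params):
    """Corrected AIC: AIC plus the small-sample penalty 2k(k+1)/(n-k-1)."""
    n = len(y)
    sse = np.sum((y - y_hat) ** 2)
    aic = n * np.log(sse / n) + 2 * n_params
    return aic + (2 * n_params * (n_params + 1)) / (n - n_params - 1)

def fit_ols(X, y, cols):
    """OLS fit with an intercept plus the selected main-effect columns."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ beta, len(cols) + 1  # predictions, parameter count

def forward_stepwise(X, y):
    """Start with no effects; add or remove one effect per step while the criterion improves."""
    selected, remaining = [], list(range(X.shape[1]))
    best_crit = aicc(y, *fit_ols(X, y, []))
    while True:
        # Candidate moves: add any remaining effect, or remove any selected effect.
        moves = [("add", c) for c in remaining] + [("drop", c) for c in selected]
        best_move, improved = None, False
        for action, c in moves:
            trial = selected + [c] if action == "add" else [s for s in selected if s != c]
            crit = aicc(y, *fit_ols(X, y, trial))
            if crit < best_crit:
                best_crit, best_move, improved = crit, (action, c), True
        if not improved:
            return selected  # no single add or drop improves the criterion
        action, c = best_move
        if action == "add":
            selected.append(c); remaining.remove(c)
        else:
            selected.remove(c); remaining.append(c)
```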

Best Subsets Selection. This checks "all possible" models, or at least a larger subset of the possible models than forward stepwise, to choose the best according to the best subsets criterion.

Information Criterion (AICC). Based on the likelihood of the training set given the model, and adjusted to penalize overly complex models.

Adjusted R-squared. Based on the fit of the training set, and adjusted to penalize overly complex models.

Overfit Prevention Criterion (ASE). Based on the fit (average squared error, or ASE) of the overfit prevention set. The overfit prevention set is a random subsample of approximately 30% of the original dataset that is not used to train the model.

The model with the greatest value of the criterion is chosen as the best model.
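The sketch below is likewise hypothetical, not the product's algorithm: it enumerates every subset of predictors and scores each candidate by average squared error on a random holdout of roughly 30% of the rows, standing in for the overfit prevention set. The AICC and adjusted R-squared criteria would plug into the same loop in place of the ASE score.

```python
import itertools
import numpy as np

def fit_and_predict(X_train, y_train, X_eval, cols):
    """OLS fit (intercept plus selected columns) on training rows; predict evaluation rows."""
    design_tr = np.column_stack([np.ones(len(y_train))] + [X_train[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(design_tr, y_train, rcond=None)
    design_ev = np.column_stack([np.ones(len(X_eval))] + [X_eval[:, c] for c in cols])
    return design_ev @ beta

def best_subsets(X, y, seed=0):
    """Score every predictor subset by ASE on a ~30% overfit prevention holdout."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    holdout = rng.random(n) < 0.30            # overfit prevention set (~30% of rows)
    X_tr, y_tr = X[~holdout], y[~holdout]
    X_ho, y_ho = X[holdout], y[holdout]

    best_cols, best_score = None, np.inf
    # 2**p candidate models, which is why best subsets is far costlier than stepwise.
    for k in range(p + 1):
        for cols in itertools.combinations(range(p), k):
            preds = fit_and_predict(X_tr, y_tr, X_ho, cols)
            ase = np.mean((y_ho - preds) ** 2)  # average squared error on the holdout
            if ase < best_score:
                best_cols, best_score = list(cols), ase
    return best_cols, best_score
```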

Note: Best subsets selection is more computationally intensive than forward stepwise selection. When best subsets is used in conjunction with boosting, bagging, or very large datasets, the resulting model can take considerably longer to build than a standard model built using forward stepwise selection.