Auto Numeric Node Model Options

The Model tab of the Auto Numeric node enables you to specify the number of models to be saved, along with the criteria used to compare models.

Model name. You can generate the model name automatically based on the target or ID field (or on the model type when no such field is specified), or you can specify a custom name.

Use partitioned data. If a partition field is defined, this option ensures that only data from the training partition is used to build the model.

Create split models. Builds a separate model for each possible value of input fields that are specified as split fields. See the topic Building Split Models for more information.

Rank models by. Specifies the criterion used to compare and rank models: correlation, number of fields used, or relative error.
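As an illustration of two measures commonly used to compare numeric models, the following Python sketch computes a Pearson correlation and a relative error from observed and predicted values. It assumes the usual textbook definitions (relative error taken as squared prediction error divided by squared deviation from the mean); this topic does not state the exact formulas the node uses, and the function names are hypothetical.

```python
import math

def pearson_correlation(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

def relative_error(observed, predicted):
    """Squared prediction error relative to a predict-the-mean baseline.

    Assumed definition: values below 1 mean the model does better than
    always predicting the mean of the observed values.
    """
    mean_o = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_o) ** 2 for o in observed)
    return sse / sst

# Predictions that are consistently 0.5 too high.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

r = pearson_correlation(observed, predicted)
err = relative_error(observed, predicted)
```

In this example the correlation is perfect because a constant offset does not affect it, while the relative error is greater than zero. This is one reason the two measures can rank the same set of models differently.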

Rank models using. If a partition is in use, you can specify whether ranks are based on the training partition or the testing partition. With large datasets, use of a partition for preliminary screening of models may greatly improve performance.

Number of models to use. Specifies the maximum number of models to be shown in the model nugget produced by the node. The top-ranking models are listed according to the specified ranking criterion. Increasing this limit will enable you to compare results for more models but may slow performance. The maximum allowable value is 100.
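The effect of the ranking criterion and the model limit can be sketched in Python as follows. This is a minimal illustration, not the node's implementation: the Candidate class, the criterion keys, and the assumption that higher correlation, lower relative error, and fewer fields rank better are all hypothetical. Only the upper limit of 100 comes from the text above.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    correlation: float
    relative_error: float
    n_fields: int

# Criterion key -> (attribute name, True if a larger value ranks higher).
# The ranking directions are assumptions made for this sketch.
CRITERIA = {
    "correlation": ("correlation", True),
    "relative_error": ("relative_error", False),
    "number_of_fields": ("n_fields", False),
}

MAX_MODELS = 100  # maximum allowable value stated in this topic

def top_models(candidates, rank_by, number_of_models):
    """Return the top-ranking candidates, best first."""
    if not 1 <= number_of_models <= MAX_MODELS:
        raise ValueError("number_of_models must be between 1 and 100")
    attr, larger_is_better = CRITERIA[rank_by]
    ranked = sorted(candidates, key=lambda c: getattr(c, attr),
                    reverse=larger_is_better)
    return ranked[:number_of_models]

candidates = [
    Candidate("regression", 0.90, 0.20, 10),
    Candidate("neural net", 0.95, 0.10, 12),
    Candidate("CHAID", 0.80, 0.40, 3),
]

best_two = top_models(candidates, "correlation", 2)
simplest = top_models(candidates, "number_of_fields", 1)
```

Changing the ranking criterion can change which models survive the cut, so a model that ranks first on correlation may not appear at all when ranking by number of fields with a small limit.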

Calculate predictor importance. For models that produce an appropriate measure of importance, you can display a chart that indicates the relative importance of each predictor in estimating the model. Typically, you will want to focus your modeling efforts on the predictors that matter most and consider dropping or ignoring those that matter least. Note that calculating predictor importance may extend the time needed to build some models, so it is not recommended if you simply want a broad comparison across many different models. It is more useful once you have narrowed your analysis to a handful of models that you want to explore in greater detail. See the topic Predictor Importance for more information.

Do not keep models if. Specifies threshold values for correlation, relative error, and number of fields used. A model that fails to meet any one of these criteria is discarded and is not listed in the summary report.
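The discard logic can be sketched in Python as follows. This is an illustration under stated assumptions: the keep_model function, the dictionary keys, and the threshold directions (discard when correlation is below its threshold, or when relative error or number of fields used is above its threshold) are hypothetical and are not taken from this topic.

```python
def keep_model(model, min_correlation=None, max_relative_error=None,
               max_fields=None):
    """Return False if the model fails any threshold that is set.

    A threshold left as None is treated as disabled.
    """
    if min_correlation is not None and model["correlation"] < min_correlation:
        return False
    if (max_relative_error is not None
            and model["relative_error"] > max_relative_error):
        return False
    if max_fields is not None and model["fields_used"] > max_fields:
        return False
    return True

models = [
    {"name": "regression", "correlation": 0.91, "relative_error": 0.18, "fields_used": 8},
    {"name": "neural net", "correlation": 0.95, "relative_error": 0.10, "fields_used": 25},
    {"name": "CHAID", "correlation": 0.62, "relative_error": 0.55, "fields_used": 5},
]

# Each discarded model fails at least one threshold.
kept = [m["name"] for m in models
        if keep_model(m, min_correlation=0.8, max_relative_error=0.3,
                      max_fields=20)]
```

Here the neural net is discarded even though it has the best correlation and relative error, because it uses more fields than the limit allows. Failing a single threshold is enough.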

Optionally, you can configure the node to stop execution the first time a model is generated that meets all specified criteria. See the topic Automated Modeling Node Stopping Rules for more information.
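A stopping rule of this kind can be sketched in Python as follows. The sketch is hypothetical: the function names, the model dictionaries, and the example criteria are assumptions made for illustration, and the actual stopping options are described in the topic referenced above.

```python
def build_with_stopping_rule(builders, meets_all_criteria):
    """Build models in order and stop after the first one that qualifies.

    Returns every model built up to and including the qualifying one.
    If no model qualifies, all models are built.
    """
    built = []
    for build in builders:
        model = build()
        built.append(model)
        if meets_all_criteria(model):
            break  # later builders are never run
    return built

def make_builder(name, correlation, relative_error):
    """Return a function that stands in for an expensive model build."""
    def build():
        return {"name": name, "correlation": correlation,
                "relative_error": relative_error}
    return build

builders = [
    make_builder("tree", 0.70, 0.50),
    make_builder("regression", 0.92, 0.15),
    make_builder("neural net", 0.96, 0.08),
]

def meets_all_criteria(model):
    # Example criteria; every one must hold for execution to stop.
    return model["correlation"] >= 0.9 and model["relative_error"] <= 0.2

built = build_with_stopping_rule(builders, meets_all_criteria)
stopped_on = built[-1]["name"]
```

In this example the third model is never built, even though it would have scored better. Stopping early trades a possibly better model for a shorter run.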