Decision Tree Node Model Options

On the Model Options tab, you can choose whether to specify a name for the model or generate a name automatically. You can also choose to obtain predictor importance information, as well as raw and adjusted propensity scores for flag targets.

Model name. You can generate the model name automatically based on the target or ID field (or model type in cases where no such field is specified) or specify a custom name.

Model Evaluation

Calculate predictor importance. For models that produce an appropriate measure of importance, you can display a chart that indicates the relative importance of each predictor in estimating the model. Typically you will want to focus your modeling efforts on the predictors that matter most, and consider dropping or ignoring those that matter least. Note that predictor importance may take longer to calculate for some models, particularly when working with large datasets, and is off by default for some models as a result. Predictor importance is not available for decision list models. See Predictor Importance for more information.

Propensity Scores

Propensity scores can be enabled in the modeling node or on the Settings tab in the model nugget. This functionality is available only when the selected target is a flag field. See the topic Propensity Scores for more information.

Calculate raw propensity scores. Raw propensity scores are derived from the model based on the training data only. If the model predicts the true value (will respond), then the propensity is the same as P, where P is the probability of the prediction. If the model predicts the false value, then the propensity is calculated as (1 – P).

  • If you choose this option when building the model, propensity scores will be enabled in the model nugget by default. However, you can always choose to enable raw propensity scores in the model nugget whether or not you select them in the modeling node.
  • When scoring the model, raw propensity scores will be added in a field with the letters RP appended to the standard prefix. For example, if the predictions are in a field named $R-churn, the name of the propensity score field will be $RRP-churn.
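The raw propensity calculation and the field naming described above can be sketched as follows. This is an illustrative sketch only, not Modeler's internal code; the function name and the example field name are hypothetical.

```python
def raw_propensity(prediction: bool, probability: float) -> float:
    """Return the raw propensity score for a flag target.

    prediction  -- the model's predicted flag value (True = "will respond")
    probability -- P, the probability associated with that prediction
    """
    # If the model predicts the true value, the propensity equals P;
    # if it predicts the false value, the propensity is 1 - P.
    return probability if prediction else 1.0 - probability

# Field naming: the letters RP are appended to the standard prefix,
# so predictions in $R-churn yield raw propensities in $RRP-churn.
print(raw_propensity(True, 0.8))   # 0.8
print(raw_propensity(False, 0.8))  # 0.2
```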

Calculate adjusted propensity scores. Raw propensities are based purely on estimates given by the model, which may be overfitted, leading to over-optimistic estimates of propensity. Adjusted propensities attempt to compensate by looking at how the model performs on the test or validation partitions and adjusting the propensities to give a better estimate accordingly.

  • This setting requires that a valid partition field is present in the stream.
  • Unlike raw propensity scores, adjusted propensity scores must be calculated when building the model; otherwise, they will not be available when scoring the model nugget.
  • When scoring the model, adjusted propensity scores will be added in a field with the letters AP appended to the standard prefix. For example, if the predictions are in a field named $R-churn, the name of the propensity score field will be $RAP-churn. Adjusted propensity scores are not available for logistic regression models.
  • When calculating the adjusted propensity scores, the test or validation partition used for the calculation must not have been balanced. To avoid this, be sure the Only balance training data option is selected in any upstream Balance nodes. In addition, if a complex sample has been taken upstream this will invalidate the adjusted propensity scores.
  • Adjusted propensity scores are not available for "boosted" tree and rule set models. See the topic Boosted C5.0 Models for more information.
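The adjustment idea described above — comparing the model's raw propensities against its observed performance on a held-out partition — can be sketched with a simple bin-based recalibration. This is a minimal illustration under assumed mechanics; Modeler's actual adjustment algorithm, and the bin count used here, are not specified in this topic.

```python
def adjusted_propensities(raw_scores, outcomes, n_bins=10):
    """Map each raw propensity to the response rate observed for its bin.

    raw_scores -- raw propensity scores on the test/validation partition (0..1)
    outcomes   -- observed flag outcomes on that partition (1 = responded)
    """
    # Group the held-out records into equal-width propensity bins.
    bins = [[] for _ in range(n_bins)]
    for score, outcome in zip(raw_scores, outcomes):
        idx = min(int(score * n_bins), n_bins - 1)
        bins[idx].append(outcome)
    # Observed response rate per bin; fall back to the bin midpoint if empty.
    rates = [sum(b) / len(b) if b else (i + 0.5) / n_bins
             for i, b in enumerate(bins)]
    # Each raw score is replaced by its bin's observed rate.
    return [rates[min(int(s * n_bins), n_bins - 1)] for s in raw_scores]
```

For an over-fitted model, the response rates actually observed on the (unbalanced) test or validation partition tend to be lower than the raw scores, so the adjusted values are pulled back toward realistic estimates.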

Based on. For adjusted propensity scores to be computed, a partition field must be present in the stream. You can specify whether to use the testing or validation partition for this computation. For best results, the testing or validation partition should include at least as many records as the partition used to train the original model.