Decision Tree/Rule Set model nugget settings

The Settings tab for a decision tree or rule set model nugget lets you specify options for confidences and for SQL generation during model scoring. This tab is available only after the model nugget has been added to a stream.

Calculate confidences. Select this option to include confidences in scoring operations. When scoring models in the database, excluding confidences enables you to generate more efficient SQL. For regression trees, confidences are not assigned.
Note: If you select the Create a model for very large datasets option on the Build Options tab - Method panel for CHAID models, this check box is available only in model nuggets whose categorical target is nominal or flag.
Calculate raw propensity scores. For models with a flag target (which return a yes or no prediction), you can request propensity scores that indicate the likelihood of the true outcome specified for the target field. These are in addition to other prediction and confidence values that may be generated during scoring.
Note: If you select the Create a model for very large datasets option on the Build Options tab - Method panel for CHAID models, this check box is available only in model nuggets whose categorical target is a flag.
Calculate adjusted propensity scores. Raw propensity scores are based only on the training data and may be overly optimistic due to the tendency of many models to overfit this data. Adjusted propensities attempt to compensate by evaluating model performance against a test or validation partition. This option requires that a partition field be defined in the stream and adjusted propensity scores be enabled in the modeling node before generating the model.
Note: Adjusted propensity scores are not available for boosted tree and rule set models. See the topic Boosted C5.0 Models for more information.
Rule identifier. For CHAID, QUEST, and C&R Tree models, this option adds a field to the scoring output that indicates the ID of the terminal node to which each record is assigned.
Note: When this option is selected, SQL generation is not available.
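
To make the relationship between a prediction, its confidence, and the raw propensity score described above more concrete, the following is a minimal sketch. It assumes that the confidence can be read as the probability of the predicted value, so that the propensity equals the confidence when the true value is predicted and one minus the confidence otherwise. The function name, parameter names, and the "T"/"F" labels are hypothetical and are not part of SPSS Modeler.

```python
def raw_propensity(prediction, confidence, true_value="T"):
    """Return an illustrative raw propensity for a flag target.

    Assumption: `confidence` is the probability of the predicted value.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # Predicting the true value: propensity is the confidence itself.
    if prediction == true_value:
        return confidence
    # Predicting the false value: propensity is the complement.
    return 1.0 - confidence
```

Under this assumption, a record predicted as the true value with confidence 0.8 and a record predicted as the false value with confidence 0.2 both receive the same propensity for the true outcome.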

Generate SQL for this model. When using data from a database, SQL code can be pushed back to the database for execution, providing superior performance for many operations.

Select one of the following options to specify how SQL generation is performed.

  • Default: Score using Server Scoring Adapter (if installed) otherwise in process. If you are connected to a database with a scoring adapter installed, this option generates SQL using the scoring adapter and associated user-defined functions (UDFs) and scores your model within the database. When no scoring adapter is available, this option fetches your data back from the database and scores it in SPSS® Modeler.
  • Score by converting to native SQL without missing value support. If selected, generates native SQL to score the model within the database, without the overhead of handling missing values. This option simply sets the prediction to null ($null$) when a missing value is encountered while scoring a case.
    Note: This option is not available for CHAID models. For other model types, it is only available for decision trees (not rule sets).
  • Score by converting to native SQL with missing value support. For CHAID, QUEST, and C&R Tree models, you can generate native SQL to score the model within the database with full missing value support. This means that SQL is generated so that missing values are handled as specified in the model. For example, C&R Trees use surrogate rules and biggest child fallback.
    Note: For C5.0 models, this option is only available for rule sets (not decision trees).
  • Score outside of the Database. If selected, this option fetches your data back from the database and scores it in SPSS Modeler.
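
The difference between the two native SQL options can be illustrated with a small sketch. It is not the SQL that SPSS Modeler generates; it only shows the general idea of turning a tree into a nested CASE expression. The tree, the field names, the thresholds, and the "biggest" child markers are invented for illustration, and the missing value handling is reduced to a simple biggest-child fallback (surrogate rules are omitted). SQLite is used only so that the example runs without a database server.

```python
import sqlite3

# Hypothetical tree: fields, thresholds, labels, and "biggest" child markers
# are invented for illustration.
TREE = {
    "field": "age", "threshold": 30, "biggest": "high",
    "low": {"prediction": "no"},
    "high": {
        "field": "income", "threshold": 50000, "biggest": "low",
        "low": {"prediction": "no"},
        "high": {"prediction": "yes"},
    },
}


def tree_to_sql(node, missing_support=False):
    """Render a tree as a nested SQL CASE expression."""
    if "prediction" in node:
        return "'{}'".format(node["prediction"])
    low = tree_to_sql(node["low"], missing_support)
    high = tree_to_sql(node["high"], missing_support)
    if missing_support:
        # Simplified fallback: a missing value follows the biggest child.
        on_missing = low if node["biggest"] == "low" else high
    else:
        # No missing value support: the prediction becomes NULL.
        on_missing = "NULL"
    return (
        "CASE WHEN {f} IS NULL THEN {m} "
        "WHEN {f} < {t} THEN {lo} ELSE {hi} END"
    ).format(f=node["field"], m=on_missing, t=node["threshold"], lo=low, hi=high)


def score(rows, missing_support=False):
    """Score (age, income) rows inside an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (age REAL, income REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    query = "SELECT {} FROM customers ORDER BY rowid".format(
        tree_to_sql(TREE, missing_support)
    )
    predictions = [row[0] for row in conn.execute(query)]
    conn.close()
    return predictions
```

The explicit IS NULL branch matters because a comparison with NULL evaluates to unknown in SQL, so without it a record with a missing value would silently fall through to the ELSE branch. The version with missing value support produces a longer expression, which reflects the extra overhead that the option without missing value support avoids.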