Selection and Scoring Rules
The Rules tab lets you generate selection or classification/prediction rules in the form of command syntax, SQL, or simple (plain English) text. You can display these rules in the Viewer, save them to an external file, or both.
Syntax. Controls the form of the selection rules in both output displayed in the Viewer and selection rules saved to an external file.
- IBM® SPSS® Statistics. Command syntax language. Rules are expressed as a set of commands that define a filter condition that can be used to select subsets of cases or as COMPUTE statements that can be used to score cases.
- SQL. Standard SQL rules are generated to select or extract records from a database or assign values to those records. The generated SQL rules do not include any table names or other data source information.
- Simple text. Plain English pseudo-code. Rules are expressed as a set of logical "if...then" statements that describe the model's classifications or predictions for each node. Rules in this form can use defined variable and value labels or variable names and data values.
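To make the difference between the rule forms concrete, the following Python sketch renders one node's membership criteria as a plain-English "if...then" rule and as a SQL-style condition with no table name. It is only an illustration: the node, the variable names (age, income), and the helper functions are hypothetical, and the strings are not the exact text the product generates.

```python
# Hypothetical node: a path of (variable, operator, value) conditions
# plus the model's predicted category. None of these names come from the product.
node = {
    "id": 5,
    "conditions": [("age", "<=", 40), ("income", ">", 50000)],
    "prediction": "Good",
}

def to_simple_text(node):
    """Render the node as a plain-English if...then rule."""
    clause = " AND ".join(f"{var} {op} {val}" for var, op, val in node["conditions"])
    return f"IF {clause} THEN prediction = {node['prediction']} (node {node['id']})"

def to_sql_condition(node):
    """Render the node's membership test as a SQL-style boolean expression.

    No table name is included, mirroring the statement that generated SQL
    rules carry no data source information.
    """
    return " AND ".join(f"({var} {op} {val})" for var, op, val in node["conditions"])

simple_rule = to_simple_text(node)
sql_condition = to_sql_condition(node)
print(simple_rule)
print(sql_condition)
```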
Type. For IBM SPSS Statistics and SQL rules, controls the type of rules generated: selection or scoring rules.
- Assign values to cases. The rules can be used to assign the model’s predictions to cases that meet node membership criteria. A separate rule is generated for each node that meets the node membership criteria.
- Select cases. The rules can be used to select cases that meet node membership criteria. For IBM SPSS Statistics and SQL rules, a single rule is generated to select all cases that meet the selection criteria.
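The contrast between the two rule types can be sketched in Python: scoring produces one rule per node, while selection combines the node conditions into a single rule. The nodes, conditions, and rule wording below are hypothetical and stand in for whatever the chosen syntax form would actually emit.

```python
# Hypothetical nodes that meet the membership criteria; the conditions
# and predictions are made up for illustration.
nodes = [
    {"id": 3, "condition": "(age <= 40) AND (income > 50000)", "prediction": 1},
    {"id": 4, "condition": "(age > 40)", "prediction": 0},
]

def scoring_rules(nodes, target="predicted"):
    """Assign values to cases: one rule per node."""
    return [
        f"IF {n['condition']} THEN {target} = {n['prediction']}"
        for n in nodes
    ]

def selection_rule(nodes):
    """Select cases: a single rule that matches a case in any listed node."""
    return " OR ".join(f"({n['condition']})" for n in nodes)

assign = scoring_rules(nodes)
select = selection_rule(nodes)
print("\n".join(assign))
print(select)
```

In this sketch the scoring list always has as many entries as there are nodes, whereas the selection result is always one expression, however many nodes are in scope.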
Include surrogates in IBM SPSS Statistics and SQL rules. For CRT and QUEST, you can include surrogate predictors from the model in the rules. Rules that include surrogates can be quite complex. In general, if you just want to derive conceptual information about your tree, exclude surrogates. If some cases have incomplete independent variable (predictor) data and you want rules that mimic your tree, include surrogates. See the topic Surrogates for more information.
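As a rough illustration of why rules with surrogates grow complex, the Python sketch below decides one split for a case by trying the primary predictor first and then each surrogate in order when values are missing. It rests on simplifying assumptions: every predictor is numeric, every split has the form "value <= threshold goes left", and the variable names are hypothetical. It does not describe how the product orders surrogates or handles a case with every predictor missing.

```python
def split_side(case, primary, surrogates):
    """Return 'left' or 'right' for one split, falling back to surrogates.

    primary and each surrogate are (variable, threshold) pairs. The first
    predictor with a non-missing value decides the direction.
    """
    for var, threshold in [primary] + list(surrogates):
        value = case.get(var)
        if value is not None:
            return "left" if value <= threshold else "right"
    return None  # all candidate predictors are missing; left undecided here

primary = ("income", 50000)      # hypothetical primary split
surrogates = [("age", 40)]       # hypothetical surrogate split

complete = split_side({"income": 60000, "age": 30}, primary, surrogates)
incomplete = split_side({"income": None, "age": 30}, primary, surrogates)
unknown = split_side({"income": None, "age": None}, primary, surrogates)
print(complete, incomplete, unknown)
```

Each split in a rule needs this fallback chain when surrogates are included, which is why the generated rules become much longer than the tree itself suggests.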
Nodes. Controls the scope of the generated rules. A separate rule is generated for each node included in the scope.
- All terminal nodes. Generates rules for each terminal node.
- Best terminal nodes. Generates rules for the top n terminal nodes based on index values. If the number exceeds the number of terminal nodes in the tree, rules are generated for all terminal nodes. (See Note 1 below.)
- Best terminal nodes up to a specified percentage of cases. Generates rules for terminal nodes for the top n percentage of cases based on index values. (See Note 1 below.)
- Terminal nodes whose index value meets or exceeds a cutoff value. Generates rules for all terminal nodes with an index value greater than or equal to the specified value. An index value greater than 100 means that the percentage of cases in the target category in that node exceeds the percentage in the root node. (See Note 1 below.)
- All nodes. Generates rules for all nodes.
Note 1: Node selection based on index values is available only for categorical dependent variables with defined target categories. If you have specified multiple target categories, a separate set of rules is generated for each target category.
Note 2: For IBM SPSS Statistics and SQL rules for selecting cases (not rules for assigning values), All nodes and All terminal nodes will effectively generate a rule that selects all cases used in the analysis.
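The index-based node options can be sketched in Python. The sketch takes the index value to be the percentage of target-category cases in a node divided by the percentage in the root node, times 100, which is consistent with the description above. The node counts are made up, the function names are hypothetical, and the handling of a node that would push the case percentage past the limit is an assumption of this sketch, not documented behavior.

```python
# Hypothetical terminal nodes: n = cases in node, hits = cases in the target category.
terminal = [
    {"id": 4, "n": 50, "hits": 30},
    {"id": 5, "n": 100, "hits": 30},
    {"id": 6, "n": 50, "hits": 0},
]

# Terminal nodes partition the cases, so their totals give the root node counts.
root_n = sum(t["n"] for t in terminal)
root_hits = sum(t["hits"] for t in terminal)

def index_value(node):
    """100 * (target percentage in node) / (target percentage in root)."""
    return 100 * node["hits"] * root_n / (node["n"] * root_hits)

# Rank terminal nodes from highest to lowest index value.
ranked = sorted(terminal, key=index_value, reverse=True)

def best_n(k):
    """Top k terminal nodes; slicing returns all nodes if k is too large."""
    return [t["id"] for t in ranked[:k]]

def meets_cutoff(cutoff):
    """Terminal nodes whose index value is at least the cutoff."""
    return [t["id"] for t in ranked if index_value(t) >= cutoff]

def up_to_percent(pct):
    """Best nodes while the running share of cases stays within pct (assumed rule)."""
    chosen, cases = [], 0
    for t in ranked:
        if 100 * (cases + t["n"]) / root_n > pct:
            break
        chosen.append(t["id"])
        cases += t["n"]
    return chosen

print([(t["id"], index_value(t)) for t in ranked])
print(best_n(2), meets_cutoff(100), up_to_percent(25))
```

Here node 4 has 60% of its cases in the target category against 30% in the root node, so its index value is 200; node 5 matches the root node at 100.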
Export rules to a file. Saves the rules in an external text file.
You can also generate and save selection or scoring rules interactively, based on selected nodes in the final tree model. See the topic Case Selection and Scoring Rules for more information.
Note: If you apply rules in the form of command syntax to another data file, that data file must contain variables with the same names as the independent variables included in the final model, measured in the same metric, with the same user-defined missing values (if any).
To Specify Selection or Scoring Rules
This feature requires the Decision Trees option.
- From the menus choose:
- In the main Decision Tree dialog, click Output.
- Click the Rules tab.