C&R Tree, CHAID, QUEST, and C5.0 decision tree model nuggets

Decision tree model nuggets represent the tree structures, discovered by one of the decision tree modeling nodes (C&R Tree, CHAID, QUEST, or C5.0), for predicting a particular output field. Tree models can be generated directly from the tree-building node, or indirectly from the interactive tree builder. See the topic The Interactive Tree Builder for more information.

In the model nugget, different options are available depending on the objective specified on the modeling node.

Scoring Tree Models

When you run a stream containing a tree model nugget, the specific result depends on the type of tree.

  • For classification trees (categorical target), two new fields, containing the predicted value and the confidence for each record, are added to the data. The prediction is the most frequent category in the terminal node to which the record is assigned. For example, if the majority of records in a given node have the value yes, the prediction for all records assigned to that node is yes.
  • For regression trees, only predicted values are generated; confidences are not assigned.
  • Optionally, for CHAID, QUEST, and C&R Tree models, an additional field can be added that indicates the ID for the node to which each record is assigned.
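
The scoring behavior for a classification tree can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name score_terminal_node is hypothetical, and the confidence is assumed here to be the simple proportion of the majority category in the node, which the text above does not specify.

```python
from collections import Counter

def score_terminal_node(node_id, node_targets):
    """Return (prediction, confidence, node_id) for one terminal node.

    node_targets holds the target values of the records that fell into the
    node when the tree was built. The confidence formula (majority share)
    is an assumption made for illustration only.
    """
    counts = Counter(node_targets)
    prediction, top_count = counts.most_common(1)[0]
    confidence = top_count / len(node_targets)
    return prediction, confidence, node_id

# Three of the four records in this hypothetical node have the value "yes",
# so every record assigned to the node is scored as "yes".
node_targets = ["yes", "yes", "no", "yes"]
print(score_terminal_node(7, node_targets))
```

Every record routed to the same terminal node receives the same three values, which is why the optional node identifier field is useful for grouping scored records.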

The new field names are derived from the model name by adding prefixes. For C&R Tree, CHAID, and QUEST, the prefixes are $R- for the prediction field, $RC- for the confidence field, and $RI- for the node identifier field. For C5.0 trees, the prefixes are $C- for the prediction field and $CC- for the confidence field. If multiple tree model nodes are present, the new field names include numbers in the prefix to distinguish them where necessary, for example, $R1-, $RC1-, and $R2-.
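
The naming scheme can be sketched in a few lines of Python. The helper output_field_names, the family keys, and the model name churn are hypothetical. Placing the number directly before the hyphen follows the $R1- and $RC1- examples above; applying the same pattern to the $RI-, $C-, and $CC- prefixes is an assumption.

```python
# Prefix stems taken from the text above. "rule" covers C&R Tree, CHAID, QUEST.
PREFIXES = {
    "rule": {"prediction": "$R", "confidence": "$RC", "node_id": "$RI"},
    "c50": {"prediction": "$C", "confidence": "$CC"},
}

def output_field_names(model_name, family, index=None):
    """Build the generated field names for one tree model.

    index is the disambiguating number used when several tree models are
    present in the same stream; None means no number is needed.
    """
    number = "" if index is None else str(index)
    return {
        role: f"{stem}{number}-{model_name}"
        for role, stem in PREFIXES[family].items()
    }

print(output_field_names("churn", "rule"))
print(output_field_names("churn", "c50", index=2))
```

Downstream nodes that reference scored fields by name need to account for the numbered form when more than one tree model is present.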

Working with Tree Model Nuggets

You can save or export information related to the model in a number of ways.

Note: Many of these options are also available from the tree builder window.

From either the tree builder or a tree model nugget, you can:

  • Generate a Filter or Select node based on the current tree. See Generating Filter and Select Nodes for more information.
  • Generate a Rule Set nugget that represents the tree structure as a set of rules defining the terminal branches of the tree. See Generating a Rule Set from a Decision Tree for more information.
  • For tree model nuggets only, export the model in PMML format. See The models palette for more information. If the model includes any custom splits, this information is not preserved in the exported PMML. (The split is preserved, but the fact that it is custom rather than chosen by the algorithm is not.)
  • Generate a graph based on a selected part of the current tree. For a nugget, this works only when the nugget is attached to other nodes in a stream. See Generating Graphs for more information.
  • For boosted C5.0 models only, you can choose Single Decision Tree (Canvas) or Single Decision Tree (GM Palette) to create a new single rule set derived from the currently selected rule. See the topic Boosted C5.0 Models for more information.

Note: Although the Build Rule node was replaced by the C&R Tree node, decision tree nodes in existing streams that were originally created using a Build Rule node will still function properly.
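
The idea behind generating a rule set from a tree, one rule per terminal branch, can be shown with a small Python sketch. The nested-dictionary tree format, the field names age and income, and the function tree_to_rules are all hypothetical; they are not the format the product uses internally or exports.

```python
def tree_to_rules(node, conditions=()):
    """Return one (conditions, prediction) rule per terminal node.

    A terminal node is a dict with a "prediction" key. Any other node has a
    "children" list of (test, child) pairs, where test is the split condition
    that leads to that child.
    """
    if "prediction" in node:
        return [(list(conditions), node["prediction"])]
    rules = []
    for test, child in node["children"]:
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules

# Hypothetical two-level tree with three terminal nodes.
tree = {
    "children": [
        ("age < 30", {"prediction": "no"}),
        ("age >= 30", {
            "children": [
                ("income < 50000", {"prediction": "no"}),
                ("income >= 50000", {"prediction": "yes"}),
            ]
        }),
    ]
}

for conds, outcome in tree_to_rules(tree):
    print("if " + " and ".join(conds) + " then " + outcome)
```

Each rule is the conjunction of the split conditions on the path from the root to one terminal node, so the rule set contains exactly as many rules as the tree has terminal nodes.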