Random Trees model nugget output
After you create a Random Trees model, the following information is available in the Output viewer:
Model information table
- The name of the target field that is selected in either the Type node or the Random Trees node Fields tab.
- The model building method - Random Trees.
- The number of predictors input into the model.
The additional details that are shown in the table depend on whether you build a classification or regression model, and on whether the model is built to handle imbalanced data:
- Classification model (default settings)
  - Model accuracy
  - Misclassification rule
- Classification model (Handle imbalanced data selected)
  - Gmean
  - True positive rate, which is subdivided into classes.
- Regression model
  - Root mean squared error
  - Relative error
  - Variance explained
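To make the measures in this table concrete, the following Python sketch computes them from lists of actual and predicted values. It is an illustration under stated assumptions, not the product's implementation: the function names are hypothetical, Gmean is assumed to be the geometric mean of the per-class true positive rates, and relative error is assumed to be the model's squared error divided by the squared error of always predicting the mean, with variance explained as its complement.

```python
import math
from collections import Counter


def classification_summary(actual, predicted):
    """Accuracy, misclassification, per-class true positive rate, and G-mean."""
    n = len(actual)
    accuracy = sum(a == p for a, p in zip(actual, predicted)) / n
    # True positive rate for each class that occurs in the actual values.
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    tpr = {c: hits[c] / totals[c] for c in totals}
    # Assumption: G-mean is the geometric mean of the per-class rates.
    gmean = math.prod(tpr.values()) ** (1 / len(tpr))
    return {
        "accuracy": accuracy,
        "misclassification": 1 - accuracy,
        "true_positive_rate": tpr,
        "gmean": gmean,
    }


def regression_summary(actual, predicted):
    """RMSE, relative error, and variance explained (assumed definitions)."""
    n = len(actual)
    mean = sum(actual) / n
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean) ** 2 for a in actual)
    # Assumption: relative error compares the model with a mean-only predictor.
    relative_error = sse / sst
    return {
        "rmse": math.sqrt(sse / n),
        "relative_error": relative_error,
        "variance_explained": 1 - relative_error,
    }
```

With imbalanced data, a G-mean of this kind drops sharply when any single class is predicted poorly, which is why it is more informative than overall accuracy in that setting.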
Records Summary
The summary shows how many records were used to fit the model, and how many were excluded. Both the number of records and the percentage of the total are shown. If the model was built with a frequency weight, the unweighted numbers of included and excluded records are also shown.
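As an illustration of how such a summary could be tallied, the sketch below counts included and excluded records, with and without a frequency weight. The function name and the input layout (one inclusion flag and one optional weight per record) are hypothetical; how the product decides which records to exclude is not described here.

```python
def records_summary(included_flags, weights=None):
    """Counts and percentages of included and excluded records."""
    weighted = weights is not None
    if not weighted:
        weights = [1] * len(included_flags)
    total = sum(weights)
    included = sum(w for flag, w in zip(included_flags, weights) if flag)
    excluded = total - included
    summary = {
        "included": {"n": included, "percent": 100 * included / total},
        "excluded": {"n": excluded, "percent": 100 * excluded / total},
    }
    if weighted:
        # With a frequency weight, also report the raw record counts.
        raw_included = sum(1 for flag in included_flags if flag)
        summary["unweighted"] = {
            "included": raw_included,
            "excluded": len(included_flags) - raw_included,
        }
    return summary
```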
Predictor Importance
The Predictor Importance graph shows the importance of the top 10 inputs (predictors) in the model as a bar chart.
If there are more than 10 predictors, you can change which predictors are included in the chart by using the slider beneath the chart. The indicator marks on the slider are a fixed width, and each mark represents 10 fields. You can move the indicator marks along the slider to display the next or previous 10 fields, ordered by predictor importance.
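The slider behavior amounts to paging through the predictors in blocks of 10 after ranking them by importance. A minimal sketch, assuming the importances are available as a dictionary that maps field names to scores (the function name and data layout are hypothetical):

```python
def importance_page(importances, page=0, page_size=10):
    """Return one block of (field, importance) pairs, most important first."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    start = page * page_size
    return ranked[start:start + page_size]
```

Here `page=0` corresponds to the default view of the top 10 predictors, and each later page corresponds to moving the slider one mark along.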
You can double-click the chart to open a separate dialog box in which you can edit the graph size. When you close this separate editing dialog box, the changes are applied to the chart that is displayed in the Output tab.
Top Decision Rules table
By default, this interactive table displays the statistics of the top rules, which are sorted by interestingness.
You can double-click the table to open a separate dialog box in which you can edit the rule information that is shown in the table. The information that is displayed, and the options that are available in the dialog box, depend on the data type of the target, such as categorical or continuous.
Depending on the target type, the table can show the following rule information:
- The details of how the rule is applied and made up
- If the results are in the most frequent category
- Rule accuracy
- Trees accuracy
- Interestingness index
The formula for the interestingness index uses the following terms:
- P(A(t)) is the trees accuracy.
- P(B(t)) is the rule accuracy.
- P(B(t)|A(t)) represents correct predictions by both the trees and the node.
- The remaining piece of the formula represents incorrect predictions by both the trees and the node.
The rules table can use either of the following layouts:
- Top decision rules: The top five decision rules, which are sorted by the interestingness index.
- All rules: The table contains all of the rules that are produced by the model but shows only 20 rules per page. When you select this layout, you can search for a rule by using the additional options of Find rule by ID and Page.
In addition, for a categorical target, you can alter the rule table layout by using the Top rules by category option. The top five decision rules are sorted by the percentage of total records for a Target category that you select.
If you change the layout of the rules table, you can copy the modified rules table back to the Output viewer by clicking the Copy to Viewer button at the upper left of the dialog box.
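The table layouts described above can be pictured as simple selections over the full list of rules. The sketch below assumes each rule is a dictionary with hypothetical keys `id`, `interestingness`, `category`, and `percent_of_records`; these names are for illustration only and are not taken from the product.

```python
def top_decision_rules(rules, n=5):
    """Top n rules, sorted by interestingness index (highest first)."""
    return sorted(rules, key=lambda r: r["interestingness"], reverse=True)[:n]


def rules_page(rules, page=1, page_size=20):
    """One page of the All rules layout; pages are numbered from 1."""
    start = (page - 1) * page_size
    return rules[start:start + page_size]


def find_rule_by_id(rules, rule_id):
    """Return the rule with the given ID, or None if there is no such rule."""
    return next((r for r in rules if r["id"] == rule_id), None)


def top_rules_by_category(rules, category, n=5):
    """Top n rules for one target category, by percentage of total records."""
    matching = [r for r in rules if r["category"] == category]
    return sorted(matching, key=lambda r: r["percent_of_records"], reverse=True)[:n]
```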
Confusion Matrix
For classification models, the confusion matrix shows the number of predicted results versus the actual observed results, including the proportion of correct predictions.
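As a small illustration of what such a matrix contains, the following sketch tallies actual versus predicted categories and reports the overall proportion of correct predictions. The layout (rows for actual values, columns for predicted values) and the function name are assumptions made for this example; the Output viewer may arrange or subdivide the proportions differently.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return (matrix, proportion_correct); matrix[actual][predicted] is a count."""
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = {a: {p: counts[(a, p)] for p in labels} for a in labels}
    # Correct predictions lie on the diagonal of the matrix.
    correct = sum(matrix[c][c] for c in labels)
    return matrix, correct / len(actual)
```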