Exploring a decision tree visualization

A decision tree visualization illustrates how the underlying data predicts a chosen target and highlights key insights about the decision tree.

About this task

The predictive strength of a decision tree indicates the degree to which the decisions that are represented by the branches in the tree predict the value of the target.

Decision trees have a single target. If the target field of the decision tree is continuous, then the key insight indicators highlight unusually high or low groups. If the target field of the decision tree is categorical, then the key insight is the mode of the node. The mode of the node is the most frequently occurring category or categories of the target field within the group.
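The idea of a node's mode can be sketched in a few lines of Python. This is only an illustration of the concept, not the product's implementation; the function name `node_mode` and the sample marital-status values are assumptions made for this example. It returns more than one category when several are tied for most frequent.

```python
from collections import Counter


def node_mode(target_values):
    """Return the most frequent category or categories in a node, sorted by name."""
    counts = Counter(target_values)
    if not counts:
        return []
    highest = max(counts.values())
    # Keep every category that reaches the highest count, so ties are preserved.
    return sorted(cat for cat, n in counts.items() if n == highest)


# Hypothetical target values for the records that fall into one node.
single_mode = node_mode(["Married", "Single", "Married", "Divorced"])
tied_modes = node_mode(["Married", "Single", "Married", "Single"])
print(single_mode)
print(tied_modes)
```

In the first call one category dominates the node; in the second call two categories are equally frequent, so both are reported as the mode.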

To improve performance with a large number of rows in the data source, the analysis is based on a representative sample of the entire data.

When you review a decision tree:

- If you want to see all the drivers, use either the Tree diagram tab or the Rules tab.
- If you want to focus on key drivers, use the Tree sunburst tab.

To edit or add key drivers, click the More icon on the target field.

Insights differ depending on the type of your target. If you are predicting a continuous measure, for example income, age, or profit, then each node in the decision tree shows the average value of the target for the group that the node represents, given the conditions that lead to that node. For example, suppose that you have a tree that predicts income, with a branch that splits on gender and then on city. If you follow the path from male to Chicago, then the value in the Chicago node is the average income of males in Chicago.
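The node value for a continuous target can be reproduced by hand as a conditional average. The following Python sketch assumes a small, made-up set of records and a hypothetical helper named `node_average`; it shows only the arithmetic behind the number in a node, not how the product builds the tree.

```python
from statistics import mean

# Hypothetical records; the field names and values are made up for illustration.
records = [
    {"gender": "male", "city": "Chicago", "income": 60000},
    {"gender": "male", "city": "Chicago", "income": 80000},
    {"gender": "male", "city": "Boston", "income": 50000},
    {"gender": "female", "city": "Chicago", "income": 90000},
]


def node_average(rows, path, target):
    """Average the target over the rows that satisfy every condition on the path."""
    matching = [r[target] for r in rows if all(r[f] == v for f, v in path)]
    # A node with no matching rows has no average.
    return mean(matching) if matching else None


# Follow the path from male to Chicago, as in the example above.
chicago_males = node_average(
    records, [("gender", "male"), ("city", "Chicago")], "income"
)
print(chicago_males)
```

Only the two records that are both male and in Chicago contribute to the result, which mirrors how each additional split narrows the group that a node represents.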

Procedure

If you have a continuous measure, the following example illustrates the decision tree.
The color shows whether the value of the node is associated with high, medium, or low values of the target. The color of the node is based on the average of the target for the measure. The higher the average value of the target for a node, the darker the color.
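One way to picture the relationship between a node's average and its color is a simple linear scale. The sketch below is an assumption for illustration only: the product's actual color scale is not described here, and the function name `shade_for_nodes` and the sample averages are hypothetical.

```python
def shade_for_nodes(node_averages):
    """Map each node's average to a shade from 0.0 (lightest) to 1.0 (darkest)."""
    if not node_averages:
        return {}
    low = min(node_averages.values())
    high = max(node_averages.values())
    span = high - low
    if span == 0:
        # All nodes have the same average, so give them the same middle shade.
        return {name: 0.5 for name in node_averages}
    # Assumed linear scale: the higher the average, the darker the shade.
    return {name: (avg - low) / span for name, avg in node_averages.items()}


# Hypothetical average claim amounts for three nodes.
averages = {"Suburban": 900.0, "Urban": 600.0, "Rural": 300.0}
shades = shade_for_nodes(averages)
print(shades)
```

Under this assumed scale, the node with the highest average gets the darkest shade and the node with the lowest average gets the lightest.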

For example, shown next is the detailed visualization for Total Claim Amount on automobile insurance policies. A strong predictor of a high claim amount is a claim that originates from a policy holder who lives in a suburban location, drives a luxury car, and is employed. A predictor of a low claim amount is a claim that originates from a policy holder who lives in a rural location.

The minimap helps you move around the areas of the tree. The minimap is especially useful if there are many nodes.
In this example, the top five highest target values are indicated with a number. You can choose between the following options:
- Full tree. No highest or lowest values are specifically indicated.
- Top five highest target values. The top five highest target values are shown.
- Top five lowest target values. The five lowest target values are shown.
If you have a categorical measure, use the Top 5 nodes for: menu or the Bottom 5 nodes for: menu to select the category for which you want to see the top five or bottom five nodes.
If you zoom in too far, the top five or bottom five nodes might not be visible.
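The top five and bottom five options amount to ranking the nodes by their target value. As a minimal sketch, assuming hypothetical node names and averages and a made-up helper named `top_nodes`, the ranking could be expressed in Python as follows. It is not the product's implementation.

```python
import heapq

# Hypothetical node names with their average target values.
node_averages = {
    "A": 120.0, "B": 950.0, "C": 430.0, "D": 780.0,
    "E": 60.0, "F": 510.0, "G": 300.0,
}


def top_nodes(averages, count=5, highest=True):
    """Return the names of the nodes with the highest (or lowest) averages."""
    pick = heapq.nlargest if highest else heapq.nsmallest
    # Rank node names by their average value.
    return pick(count, averages, key=averages.get)


highest_five = top_nodes(node_averages)
lowest_five = top_nodes(node_averages, highest=False)
print(highest_five)
print(lowest_five)
```

The highest-value list is ordered from the largest average downward, and the lowest-value list from the smallest average upward.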
If you have a categorical measure, the following example illustrates the decision tree.

The color shows which field value or values are represented the most.

In the Tree sunburst tab, you can see that when the measures within the decision tree are strong predictors of a target value or target values, the colors of those values prevail in the node. The non-significant values are left out.

For example, shown next is the detailed visualization of the marital status in the Tree sunburst tab. It shows that being employed is a strong predictor for being married.

In the Tree diagram tab, the nodes visually show the distribution of the people by marital status.