Text View

The Text View shows an overview of the relevant clusters, their sizes, and the characteristics of their fields. The fields in each cluster are shown in boldface.

The following figure shows the Text View of the Clustering Visualizer:

Figure 1. The Text View of the Clustering Visualizer

Each row in the table describes one cluster by showing how the field values are distributed for the records in this cluster.

The background color of a row represents a particular piece of information about the corresponding cluster, for example, the size of the cluster. You can also set the background color to represent the average value of a specific field, or an aggregation of all record values in the class of a specific field. You specify the background color and the sorting criteria on the Color Coding page of the Properties notebook.

In the figure above, the clusters are sorted by size. The largest cluster is displayed at the top of the window. The legend shows that the cluster colors are determined by their average salaries.
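The ordering and color coding described above can be illustrated with a minimal sketch. It is not the Visualizer's implementation: the record layout, the salary figures, and the function name summarize_clusters are all hypothetical, chosen only to show how a size-based sort and a per-cluster average could be derived.

```python
from statistics import mean

# Hypothetical records as (cluster_id, salary) pairs; not taken from the product.
records = [
    (1, 52000.0), (1, 61000.0), (1, 58000.0),
    (2, 31000.0), (2, 29000.0),
    (3, 45000.0),
]

def summarize_clusters(rows):
    """Return (cluster_id, size, mean_salary) tuples, largest cluster first."""
    by_cluster = {}
    for cluster_id, salary in rows:
        by_cluster.setdefault(cluster_id, []).append(salary)
    summary = [(cid, len(vals), mean(vals)) for cid, vals in by_cluster.items()]
    # Sort by size in descending order so the largest cluster comes first.
    summary.sort(key=lambda item: item[1], reverse=True)
    return summary

cluster_summary = summarize_clusters(records)
for cluster_id, size, avg_salary in cluster_summary:
    print(cluster_id, size, avg_salary)
```

In this sketch the average salary is the quantity a color legend would be mapped from; how the Visualizer assigns actual colors is not covered here.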

For each cluster, the name and the state of the 15 most relevant fields are shown. Which fields are most relevant depends on the field-sorting mode that is defined in the Sorting properties. How the state of a field is determined depends on the field type:
Categorical
A categorical string is displayed. For example, Marital_Status is predominantly m.

The state is determined by the most frequent field value in the cluster. For example, for cluster 5 in the figure above, the field MARITAL_STATUS contributes the following to the cluster description: MARITAL_STATUS is predominantly s.

Continuous numerical
A value is mapped to low, medium, or high, or to the boundary values of the field specification.

The state is determined by the interval with the highest frequency in that cluster relative to the frequencies in the whole population. If statistical information is available, the labels low, medium, or high are used to describe the state. For example, in cluster 1 in the figure above, the state for the field Salary is high.

Discrete numerical
The numeric value is displayed as a string. For example, Year_1st_Policy is predominantly 1991.

Categorical and discrete numerical fields are described by the value with the highest frequency (the modal value).
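As a rough illustration of the modal-value rule, the following sketch picks the most frequent value of a field within one cluster and formats it in the style of the Text View. The function name describe_modal and the sample values are hypothetical, and the tie-breaking behavior (first value encountered wins) is an assumption of this sketch, not documented product behavior.

```python
from collections import Counter

def describe_modal(field_name, values):
    """Describe a field by its most frequent value within one cluster."""
    counts = Counter(values)
    # most_common(1) returns a list holding the single (value, count) pair
    # with the highest count.
    modal_value, _ = counts.most_common(1)[0]
    return f"{field_name} is predominantly {modal_value}"

# Hypothetical field values for the records of one cluster.
categorical_text = describe_modal("MARITAL_STATUS", ["s", "m", "s", "s", "d"])
discrete_text = describe_modal("Year_1st_Policy", [1991, 1990, 1991])
print(categorical_text)
print(discrete_text)
```

The same function covers both field types because a discrete numeric value is simply rendered as a string in the description.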

Continuous numerical fields are described by the interval with the highest frequency. If statistical information is available, the labels low, medium, or high are used to describe the state.
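One plausible reading of "highest frequency relative to the whole population" is to compare, for each interval, the share of cluster records with the share of all records, and to pick the interval where that ratio is largest. The sketch below follows this reading only as an assumption; the interval boundaries, the salary values, and the names interval_shares and continuous_state are hypothetical and do not come from the product.

```python
from bisect import bisect_right

# Labels for three intervals; two boundaries are therefore expected.
LABELS = ["low", "medium", "high"]

def interval_shares(values, boundaries):
    """Return the fraction of values that falls into each interval."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(boundaries) + 1)
    for value in values:
        # bisect_right gives the index of the interval containing the value.
        counts[bisect_right(boundaries, value)] += 1
    return [count / len(values) for count in counts]

def continuous_state(cluster_values, population_values, boundaries):
    """Pick the interval that is most over-represented in the cluster."""
    cluster = interval_shares(cluster_values, boundaries)
    population = interval_shares(population_values, boundaries)
    # Ratio of cluster share to population share; intervals that are
    # empty in the population get a ratio of 0.
    ratios = [c / p if p > 0 else 0.0 for c, p in zip(cluster, population)]
    best = max(range(len(ratios)), key=ratios.__getitem__)
    return LABELS[best]

# Hypothetical salaries: the whole population and the records of one cluster.
population = [20000, 25000, 30000, 40000, 45000, 48000, 72000, 85000, 90000]
cluster = [48000, 72000, 85000]
state = continuous_state(cluster, population, boundaries=[35000, 60000])
print("Salary is", state)
```

Using a ratio rather than a raw count keeps an interval that is common everywhere from dominating every cluster description; whether the Visualizer uses exactly this measure is not stated in this topic.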


https://www.ibm.com/docs/en/db2/10.5.0?topic=visualizer-text-view