Creating a Distribution Graph

During data mining, it is often useful to explore the data by creating visual summaries. IBM® SPSS® Modeler offers several different types of graphs to choose from, depending on the kind of data that you want to summarize. For example, to find out what proportion of the patients responded to each drug, use a Distribution node.

Figure 1. Sample of available graphs

Add a Distribution node to the stream and connect it to the Source node. Then double-click the Distribution node to edit its display options.

Select Drug as the target field whose distribution you want to show. Then click Run in the dialog box.

Figure 2. Selecting drug as the target field
Figure 3. Distribution of response to drug type

The resulting graph helps you see the "shape" of the data. It shows that patients responded to drug Y most often and to drugs B and C least often.
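For readers who want to see what such a graph summarizes, the following is a minimal sketch in plain Python, not SPSS Modeler code. It counts how often each category occurs in a field and reports the proportion, which is the information a distribution graph displays. The sample values and the function name distribution are hypothetical and are not taken from the tutorial data.

```python
from collections import Counter

# Hypothetical sample values for the Drug field; not the tutorial's data.
drug_values = ["drugY", "drugX", "drugY", "drugA",
               "drugY", "drugX", "drugB", "drugC"]


def distribution(values):
    """Return (category, count, percent) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(cat, n, 100.0 * n / total) for cat, n in counts.most_common()]


# Print a simple text bar chart, one row per category.
for category, count, percent in distribution(drug_values):
    bar = "#" * count
    print(f"{category:<6} {bar:<4} {count:>2} {percent:5.1f}%")
```

With these made-up values, the most frequent category appears first, in the same way the tallest bar stands out in the graph.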

Figure 4. Results of a data audit
Figure 5. Data Audit node

Alternatively, you can attach and run a Data Audit node for a quick look at the distributions and histograms of all fields at once. The Data Audit node is available on the Output tab.
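The idea of auditing every field in one pass can be sketched in plain Python as follows. This is an illustration only, not the Data Audit node's implementation or its full set of statistics. The records, the field names other than Drug, and the function name audit are assumptions made for the example.

```python
from statistics import mean

# Hypothetical records; field names and values are illustrative only.
records = [
    {"Age": 23, "Sex": "F", "Drug": "drugY"},
    {"Age": 47, "Sex": "M", "Drug": "drugC"},
    {"Age": 61, "Sex": "F", "Drug": "drugY"},
    {"Age": 35, "Sex": "M", "Drug": "drugX"},
]


def audit(rows):
    """Summarize every field: range and mean for numeric fields,
    number of distinct values and record count for all other fields."""
    summary = {}
    for field in rows[0]:
        values = [row[field] for row in rows]
        if all(isinstance(v, (int, float)) for v in values):
            summary[field] = {"min": min(values), "max": max(values),
                              "mean": mean(values)}
        else:
            summary[field] = {"unique": len(set(values)),
                              "valid": len(values)}
    return summary


# One summary line per field.
for field, stats in audit(records).items():
    print(field, stats)
```

Numeric fields get a range and a mean, while categorical fields get a count of distinct values, which mirrors the split between histograms and distribution bars in an audit view.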