Stream-building overview

Data mining with IBM® SPSS® Modeler centers on running data through a series of nodes; this series is referred to as a stream. The nodes represent operations to be performed on the data, while links between the nodes indicate the direction of data flow. Typically, you use a data stream to read data into IBM SPSS Modeler, run it through a series of manipulations, and then send it to a destination, such as a table or a viewer.

For example, suppose that you want to open a data source, add a new field, select records based on values in the new field, and then display the results in a table. In this case, your data stream would consist of four nodes:

A Variable File node, which you set up to read the data from the data source.
A Derive node, which you use to add the new, calculated field to the data set.
A Select node, which you use to set up selection criteria to exclude records from the data stream.
A Table node, which you use to display the results of your manipulations onscreen.
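
The four-node stream above can be pictured as a chain of functions, each passing its records to the next. The following is a minimal conceptual sketch in plain Python, not IBM SPSS Modeler scripting: the function names, the sample data, the derived field `total`, and the threshold of 5.0 are all hypothetical and chosen only to illustrate the direction of data flow.

```python
import csv
import io

# Hypothetical sample data standing in for the data source.
SAMPLE = "name,price,quantity\nwidget,2.50,4\ngadget,10.00,1\ngizmo,0.75,2\n"

def variable_file(text):
    """Source step: read delimited text into one dictionary per record."""
    return csv.DictReader(io.StringIO(text))

def derive(records):
    """Derive step: add a new, calculated field (here called 'total')."""
    for record in records:
        record = dict(record)
        record["total"] = float(record["price"]) * int(record["quantity"])
        yield record

def select(records, threshold=5.0):
    """Select step: exclude records whose derived field is below the threshold."""
    return (record for record in records if record["total"] >= threshold)

def table(records):
    """Terminal step: collect the surviving records for display."""
    return list(records)

# Linking the steps in order mirrors the links between nodes in a stream.
rows = table(select(derive(variable_file(SAMPLE))))
for row in rows:
    print(row["name"], row["total"])
```

Each function corresponds to one node, and the nesting order in the final call plays the role of the links between nodes. In this sketch, only the records that satisfy the selection criterion reach the table step.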