Introduction to Modeling

A model is a set of rules, formulas, or equations that can be used to predict an outcome based on a set of input fields or variables. For example, a financial institution might use a model to predict whether loan applicants are likely to be good or bad risks, based on information that is already known about past applicants.

The ability to predict an outcome is the central goal of predictive analytics, and understanding the modeling process is the key to using IBM® SPSS® Modeler.

Figure 1. A simple decision tree model

This example uses a decision tree model, which classifies records (and predicts a response) using a series of decision rules, for example:

IF income = Medium
AND cards < 5
THEN -> 'Good'
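
For readers who think in code, the rule above can be sketched as a small Python function. This is only an illustration of how a single decision rule reads, not how IBM SPSS Modeler represents it internally; the function name `apply_rule` and its parameters are hypothetical, and the sketch assumes `cards` is a raw count of cards.

```python
from typing import Optional


def apply_rule(income: str, cards: int) -> Optional[str]:
    """Apply the single example rule.

    Returns 'Good' when the rule fires, and None when it does not
    (a full tree would have other rules covering the remaining records).
    """
    if income == "Medium" and cards < 5:
        return "Good"
    return None


# A medium-income applicant with three cards matches the rule.
prediction = apply_rule("Medium", 3)
```

A complete decision tree is a set of such rules in which every record matches exactly one rule.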

While this example uses a CHAID (Chi-squared Automatic Interaction Detection) model, it is intended as a general introduction, and most of the concepts apply broadly to other modeling types in IBM SPSS Modeler.

To understand any model, you first need to understand the data that go into it. The data in this example contain information about the customers of a bank. The following fields are used:

Field name      Description
Credit_rating   Credit rating: 0=Bad, 1=Good, 9=missing values
Age             Age in years
Income          Income level: 1=Low, 2=Medium, 3=High
Credit_cards    Number of credit cards held: 1=Less than five, 2=Five or more
Education       Level of education: 1=High school, 2=College
Car_loans       Number of car loans taken out: 1=None or one, 2=More than two
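
Because most of these fields are stored as numeric codes, it can help to see the coding scheme written out as a lookup. The following is a minimal sketch in Python that mirrors the field descriptions above; the names `CODEBOOK` and `decode`, and the sample record, are hypothetical and are not part of IBM SPSS Modeler or the demo data.

```python
# Code-to-label mappings taken from the field descriptions above.
# Age is a plain number, so it has no entry here.
CODEBOOK = {
    "Credit_rating": {0: "Bad", 1: "Good", 9: None},  # 9 marks a missing value
    "Income": {1: "Low", 2: "Medium", 3: "High"},
    "Credit_cards": {1: "Less than five", 2: "Five or more"},
    "Education": {1: "High school", 2: "College"},
    "Car_loans": {1: "None or one", 2: "More than two"},
}


def decode(record: dict) -> dict:
    """Return a copy of the record with coded values replaced by labels."""
    decoded = {}
    for field, value in record.items():
        labels = CODEBOOK.get(field)
        # Fields without a codebook entry (such as Age) pass through unchanged.
        decoded[field] = labels.get(value, value) if labels else value
    return decoded


# A made-up record for illustration only.
sample = {"Credit_rating": 9, "Age": 41, "Income": 2,
          "Credit_cards": 1, "Education": 2, "Car_loans": 1}
decoded_sample = decode(sample)
```

Note that a credit rating of 9 decodes to `None`, so records with a missing rating can be identified and set aside before modeling.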

The bank maintains a database of historical information on customers who have taken out loans, including whether they repaid the loans (Credit rating = Good) or defaulted (Credit rating = Bad). Using this existing data, the bank wants to build a model that predicts how likely future loan applicants are to default.

Using a decision tree model, you can analyze the characteristics of the two groups of customers and predict the likelihood of loan defaults.
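
To make the idea of comparing groups concrete, the sketch below computes the share of bad ratings within each income level for a handful of made-up records. It is a rough illustration of the kind of comparison a tree-building algorithm performs when it evaluates a candidate split, not the CHAID procedure itself; the records and the function name `bad_rate_by_group` are hypothetical and do not come from tree_credit.sav.

```python
from collections import defaultdict

# Hypothetical (income level, credit rating) pairs, for illustration only.
records = [
    ("Low", "Bad"), ("Low", "Bad"), ("Low", "Good"),
    ("Medium", "Good"), ("Medium", "Bad"),
    ("High", "Good"), ("High", "Good"),
]


def bad_rate_by_group(rows):
    """Return the fraction of 'Bad' ratings within each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [bad count, total count]
    for group, rating in rows:
        if rating == "Bad":
            counts[group][0] += 1
        counts[group][1] += 1
    return {group: bad / total for group, (bad, total) in counts.items()}


rates = bad_rate_by_group(records)
```

When the rates differ noticeably between groups, as they do in this toy sample, the field is a plausible candidate for splitting the records.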

This example uses the stream named modelingintro.str, available in the Demos folder under the streams subfolder. The data file is tree_credit.sav. See the topic Demos Folder for more information.

Let's take a look at the stream.

  1. Choose the following from the main menu:

    File > Open Stream

  2. Click the gold nugget icon on the toolbar of the Open dialog box and choose the Demos folder.
  3. Double-click the streams folder.
  4. Double-click the file named modelingintro.str.
