# Introduction to modeling

A model is a set of rules, formulas, or equations that can be used to predict an outcome based on a set of input fields or variables. For example, a financial institution might use a model to predict whether loan applicants are likely to be good or bad risks, based on information that is already known about them.

The ability to predict an outcome is the central goal of predictive analytics, and understanding the modeling process is the key to using flows in Watson Studio.

The model in this example shows how a bank can predict whether future loan applicants might default on their loans. The bank's database already stores data on customers who previously took loans, and the model uses that data to estimate how likely an applicant is to default.

An important part of any model is the data that goes into it. The bank maintains a database of historical information on customers, including whether they repaid the loans (Credit rating = Good) or defaulted (Credit rating = Bad). The bank wants to use this existing data to build the model. The following fields are used:

| Field name | Description |
| --- | --- |
| Credit_rating | Credit rating: 0=Bad, 1=Good, 9=missing values |
| Age | Age in years |
| Income | Income level: 1=Low, 2=Medium, 3=High |
| Credit_cards | Number of credit cards held: 1=Less than five, 2=Five or more |
| Education | Level of education: 1=High school, 2=College |
| Car_loans | Number of car loans taken out: 1=None or one, 2=More than two |
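
The coded values in these fields can be translated back into readable labels before you inspect records. The following is a minimal sketch in plain Python, not part of the flow itself: the `CODEBOOK` dictionary and the `decode_record` helper are hypothetical names, and the sample record is invented for illustration rather than taken from tree_credit.csv.

```python
# Code-to-label mappings copied from the field descriptions.
CODEBOOK = {
    "Credit_rating": {0: "Bad", 1: "Good"},
    "Income": {1: "Low", 2: "Medium", 3: "High"},
    "Credit_cards": {1: "Less than five", 2: "Five or more"},
    "Education": {1: "High school", 2: "College"},
    "Car_loans": {1: "None or one", 2: "More than two"},
}

# Codes that stand for missing values (only Credit_rating defines one).
MISSING_CODES = {"Credit_rating": {9}}


def decode_record(record):
    """Return a copy of the record with coded fields replaced by labels.

    Missing codes become None; fields without a codebook entry, such as
    Age, are passed through unchanged.
    """
    decoded = {}
    for field, value in record.items():
        if value in MISSING_CODES.get(field, set()):
            decoded[field] = None
        elif field in CODEBOOK:
            decoded[field] = CODEBOOK[field][value]
        else:
            decoded[field] = value
    return decoded


# Hypothetical record, invented for illustration only.
sample = {"Credit_rating": 9, "Age": 36, "Income": 2,
          "Credit_cards": 1, "Education": 2, "Car_loans": 1}
print(decode_record(sample))
```

Treating code 9 as `None` keeps records with an unknown credit rating from being mistaken for a third rating category.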

This example uses a decision tree model, which classifies records (and predicts a response) by using a series of decision rules.

For example, this decision rule classifies a record as having a good credit rating when the income falls in the medium range and the number of credit cards is less than 5.

```
IF income = Medium
AND cards < 5
THEN -> 'Good'
```
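
To make the rule concrete, here is a minimal sketch of the same logic in Python, using the coded values from the field descriptions (Income 2=Medium, Credit_cards 1=Less than five). The function name `apply_rule` and the constants are hypothetical, and returning `None` when the rule does not fire is an assumption: a single rule says nothing about records it does not match.

```python
INCOME_MEDIUM = 2          # Income: 2=Medium
CARDS_LESS_THAN_FIVE = 1   # Credit_cards: 1=Less than five


def apply_rule(record):
    """Return 'Good' when the rule matches the record, otherwise None."""
    if (record["Income"] == INCOME_MEDIUM
            and record["Credit_cards"] == CARDS_LESS_THAN_FIVE):
        return "Good"
    # Assumption: other rules in the tree would handle this record.
    return None


# Hypothetical records, invented for illustration only.
print(apply_rule({"Income": 2, "Credit_cards": 1}))
print(apply_rule({"Income": 2, "Credit_cards": 2}))
```

A full decision tree is a collection of such rules, one per path from the root to a leaf, so every record matches exactly one of them.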

Using a decision tree model, you can analyze the characteristics of the two groups of customers (those with good and those with bad credit ratings) and predict the likelihood of loan defaults.

While this example uses a CHAID (Chi-squared Automatic Interaction Detection) model, it is intended as a general introduction, and most of the concepts apply broadly to other modeling types in SPSS Modeler.
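
As the name suggests, CHAID relies on chi-squared tests to judge how strongly a candidate field is associated with the target. The sketch below computes the Pearson chi-squared statistic for a small contingency table in plain Python. It is only an illustration of that one ingredient, not the SPSS Modeler implementation: the full algorithm does more (for instance, merging categories and adjusting significance levels), the function name `chi_squared` is hypothetical, and the counts are invented.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a table of observed counts.

    Assumes every row and every column has a nonzero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the field and the target were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are two categories of a predictor field,
# columns are Credit_rating (Bad, Good).
counts = [[10, 20], [20, 10]]
print(chi_squared(counts))
```

A larger statistic indicates a stronger association between the field and the target, which is why such a field is a better candidate for splitting the tree.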

## Sample files

This example uses the flow named Introduction to Modeling, which is available in the example project. The data file used in this example is tree_credit.csv.

To open the Introduction to Modeling flow, follow these steps:

1. Open the Example Project.
2. Scroll down to the Modeler flows section, click View all, and select the Introduction to Modeling flow.

The Introduction to Modeling flow demonstrates the basic steps for building, browsing, evaluating, and scoring the model. Read the following lessons to learn more about each step.