Data mining — Building classification models

Before you can use the Classification mining function, you must collect historical data. This data must be in a table or a view. Each record in the table or the view describes the properties of an entity for which you know the outcome. For example, an entity might be a customer who responded to a mailing campaign or a car that was produced on time.

Based on the historical data, the Classification mining function determines which properties distinguish the entities with a certain outcome (for example, having responded to a mailing campaign) from entities that have another outcome (for example, not having responded). The function might find, for instance, that young and single customers are more likely to respond to a mailing campaign than other customers.

The Classification mining function stores this information in a classification model. This step is called training the model.
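To make the idea of distinguishing properties concrete, the following Python sketch computes the share of responders for each combination of two properties. It is only an illustration under stated assumptions: the records are invented toy data, the names records and response_rates are hypothetical, and the calculation is not the algorithm that the Classification mining function uses.

```python
from collections import defaultdict

# Invented toy records for illustration only, not real customer data:
# (age_group, marital_status, responded_to_mailing)
records = [
    ("young", "single", True),
    ("young", "single", True),
    ("young", "married", False),
    ("older", "single", False),
    ("older", "married", False),
    ("older", "married", True),
]


def response_rates(rows):
    """Return the share of responders for each (age_group, marital_status) pair."""
    counts = defaultdict(lambda: [0, 0])  # key -> [responders, total]
    for age_group, marital_status, responded in rows:
        entry = counts[(age_group, marital_status)]
        entry[0] += int(responded)
        entry[1] += 1
    return {key: responders / total for key, (responders, total) in counts.items()}


rates = response_rates(records)
for key, rate in sorted(rates.items()):
    print(key, rate)
```

In this toy data, the group of young and single customers has the highest share of responders. A trained classification model captures this kind of relationship between property values and outcomes.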

You can build a classification model by using the BuildClasModel procedure.

Syntax

IDMMX.BuildClasModel(<modelName>,
                     <inputTable>,
                     <targetColumn>)

Input parameters

With the BuildClasModel procedure, you must specify the following parameters:

<modelName>

The name of the model that you want to build.

The generated model is stored in the table IDMMX.ClassifModels. If a model with the same name already exists, the previous model is replaced with the new model.

This parameter is of type VARCHAR. Its size is 240.

<inputTable>

The name of the input table or the input view.

The values in the columns of the input table are used to determine the distinguishing properties for each value of the target column.

The Easy Mining procedure ignores columns of the input table that are unlikely to be useful for creating a model, for example, key columns.

This parameter is of type VARCHAR. Its size is 240.

<targetColumn>

The name of the column whose values are to be predicted.

The target column must be categorical.

For information about the valid SQL types of categorical and numerical fields, see Mining field types.

This parameter is of type VARCHAR. Its size is 128.
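The size limits of the three parameters can be checked before the procedure is called. The following Python sketch is one possible way to do that. The helper name build_clas_model_call and the dictionary MAX_LENGTHS are hypothetical and are not part of the product. The sketch only composes the text of the CALL statement and does not connect to a database. It counts characters, which matches the VARCHAR size only for single-byte data.

```python
# Size limits taken from the parameter descriptions above.
MAX_LENGTHS = {"modelName": 240, "inputTable": 240, "targetColumn": 128}


def build_clas_model_call(model_name, input_table, target_column):
    """Check the parameter lengths and return the CALL statement text.

    Hypothetical helper for illustration. It does not run the procedure.
    """
    values = {
        "modelName": model_name,
        "inputTable": input_table,
        "targetColumn": target_column,
    }
    for name, value in values.items():
        if not value:
            raise ValueError(f"{name} must not be empty")
        if len(value) > MAX_LENGTHS[name]:
            raise ValueError(
                f"{name} has {len(value)} characters; the limit is {MAX_LENGTHS[name]}"
            )
    # Double any single quotes so that each value is a valid SQL string literal.
    quoted = ", ".join("'" + value.replace("'", "''") + "'" for value in values.values())
    return f"call IDMMX.BuildClasModel({quoted})"


statement = build_clas_model_call(
    "BANK.BANKCARD_CLASMODEL", "BANK.BANKCUSTOMERS_TRAIN", "BANKCARD"
)
print(statement)
```

The returned text can then be passed to the DB2 command line or to a database client of your choice.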

Example

You might want to build the model BANK.BANKCARD_CLASMODEL to predict the BANKCARD column of the BANK.BANKCUSTOMERS_TRAIN table.

Use the following command to run the Easy Mining procedure:

DB2 "call IDMMX.BuildClasModel('BANK.BANKCARD_CLASMODEL',
                               'BANK.BANKCUSTOMERS_TRAIN',
                               'BANKCARD')"