You can create a mining model that predicts the value of a target field.
Before you begin
- Place the Predictor operator in the canvas.
- Connect its input port to the database source table to be analyzed. This table must contain the column that is used as the target field for the prediction.
Procedure
To define a Predictor operator, complete the following steps:
- In the canvas, click the operator to select it. A black box highlights the selected operator. After you select an operator, the property pages for that operator appear in the Properties view beneath the canvas.
- Use the tabs on the left side of the Properties view to navigate the operator's property pages.
- Optional: Specify the operator's general properties.
- In the Properties view, click the General tab.
- You can modify the following fields:
- To rename the operator, enter a name in the Label field.
- To add a description for the operator, enter it in the Description field.
- Optional: Specify the model name.
- In the Properties view, click the Model Name tab.
- In the Prefix field, enter the prefix that is used as the database schema for the view created by the operator and also as the prefix for the model created by the operator.
- In the Model name field, enter the model name that is used as a key to look up the model in Db2® tables and that is also displayed when you browse the model with IM Visualization. If a model with the same name already exists in the Db2 database, the existing model is replaced.
- Optional: Specify the mining settings.
- In the Properties view, click the Mining settings tab.
- In the Target column list, select the column for which to predict values. You must select a value.
- The Optional Parameters option is for advanced users only. With optional parameter strings, you can modify default parameters of the Easy Mining procedures. For supported optional parameters, refer to the Intelligent Miner® Easy Mining Procedures documentation.
- Additionally, you can specify classification-specific or regression-specific mining settings. If a categorical target column is selected, the classification function is activated to predict target values. If the target column is numeric, the regression function is used.
- Classification-specific settings
- Tree-specific settings:
- To use the decision tree algorithm, select Tree from the Algorithm drop-down list. With the Tree algorithm, you can specify the following parameters:
- Maximum purity
- Enter the maximum purity value for the decision tree that is built by tree classification.
- Maximum depth
- Enter the maximum depth for the decision tree.
- Minimum number of records per leaf node
- Enter the minimum number of records that must be in each leaf node of the generated decision tree. You can customize the binary decision tree by specifying the minimum number of records per internal node.
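The three Tree parameters above act as limits on how far a decision tree grows. The following Python sketch shows one common way such limits are applied when deciding whether a node becomes a leaf. It is an illustration only, not the Intelligent Miner implementation: the function name `should_stop_splitting`, the definition of purity as the share of the most frequent class, and the use of a fraction between 0 and 1 for purity are all assumptions.

```python
from collections import Counter


def should_stop_splitting(labels, depth, max_purity, max_depth, min_records_per_leaf):
    """Return True if a node should become a leaf under the three limits.

    All names and units here are hypothetical; purity is assumed to be the
    fraction of records that belong to the most frequent class in the node.
    """
    if not labels:
        return True
    purity = Counter(labels).most_common(1)[0][1] / len(labels)
    if purity >= max_purity:
        return True  # the node is already pure enough
    if depth >= max_depth:
        return True  # the tree may not grow deeper
    # A binary split creates two children, and each child needs at least
    # min_records_per_leaf records, so smaller nodes cannot be split.
    if len(labels) < 2 * min_records_per_leaf:
        return True
    return False
```

For example, a node with nine records of one class and one of another has a purity of 0.9 under this definition, so a maximum purity of 0.9 would stop further splitting at that node.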
- Naive Bayes-specific settings
- To use the Naive Bayes algorithm, select Naive Bayes from the Algorithm drop-down list. With the Naive Bayes algorithm, you can specify the following parameter:
- Probability threshold
- Enter the threshold value that is to be used whenever a probability of zero is encountered in the model equation.
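To see why a probability threshold matters, consider how a Naive Bayes score is typically computed: the class prior is multiplied by one conditional probability per input field, so a single zero wipes out the whole product. The following Python sketch substitutes a threshold for each zero. It is a minimal illustration under assumptions, not the product's model equation: the function name `log_score` and the use of log probabilities are choices made for this example.

```python
import math


def log_score(class_prior, conditional_probs, threshold):
    """Return the log of prior * product(conditional_probs).

    Any conditional probability of zero is replaced by `threshold`
    (hypothetical parameter name) so that the logarithm stays defined
    and one unseen value does not eliminate the class entirely.
    `class_prior` is assumed to be greater than zero.
    """
    score = math.log(class_prior)
    for p in conditional_probs:
        score += math.log(p if p > 0.0 else threshold)
    return score
```

A smaller threshold penalizes unseen value combinations more heavily; a larger one makes the model more tolerant of them.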
- Logistic Regression-specific settings
- To use the Logistic Regression algorithm, select Logistic Regression from the Algorithm drop-down list. With the Logistic Regression algorithm, parameters cannot be specified.
- Regression-specific settings
- Algorithm
- By default, the Transform Regression algorithm is used. You can change this default setting by specifying the Linear Regression algorithm, the Polynomial Regression algorithm, or the Radial Basis Function (RBF) algorithm.
- Algorithm settings
- Specify the values for algorithm-specific parameters.
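The relationship between the target column type, the mining function, and the available algorithms can be summarized in a short Python sketch. The algorithm names come from the settings described above; the field type strings `"categorical"` and `"numeric"`, the function name `select_mining_function`, and the dictionary layout are hypothetical and exist only for this illustration.

```python
# Algorithm names are taken from the settings described above;
# the data structure itself is illustrative.
ALGORITHMS = {
    "classification": ["Tree", "Naive Bayes", "Logistic Regression"],
    "regression": [
        "Transform Regression",  # described above as the default
        "Linear Regression",
        "Polynomial Regression",
        "Radial Basis Function (RBF)",
    ],
}


def select_mining_function(field_type):
    """Map a target field type (hypothetical strings) to a mining function."""
    if field_type == "categorical":
        return "classification"
    if field_type == "numeric":
        return "regression"
    raise ValueError(f"unsupported field type: {field_type!r}")


function = select_mining_function("numeric")
print(function, ALGORITHMS[function][0])  # prints "regression Transform Regression"
```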
- By default, all classifications have an equal weight. The cost matrix is automatically adjusted so that values of the target field that appear less frequently are not discriminated against. For the tree classification algorithm, you can also manually adjust the cost matrix.
- In the Properties view, click the Cost Matrix tab.
- Click one of the following icons:
- The Add a new value icon, to add actual values to the cost matrix manually.
- The Load values from categorical column icon, to load the values from a table.
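A cost matrix assigns a cost to each combination of actual and predicted target value. The following Python sketch shows a uniform matrix, in which every misclassification costs the same, and one possible way to raise the cost of misclassifying rare values. The inverse-frequency weighting is an assumption chosen for illustration; the formula that the product uses for its automatic adjustment is not stated here, and both function names are hypothetical.

```python
from collections import Counter


def uniform_cost_matrix(classes):
    """Cost 0 for a correct prediction, 1 for any misclassification."""
    return {a: {p: (0.0 if a == p else 1.0) for p in classes} for a in classes}


def frequency_adjusted_cost_matrix(labels):
    """Scale the cost of misclassifying each actual value by its rarity.

    Assumed weighting: total / (number_of_classes * count_of_actual_value).
    This equals 1.0 for perfectly balanced classes and grows for rarer values.
    """
    counts = Counter(labels)
    classes = sorted(counts)
    total = len(labels)
    n = len(classes)
    return {
        a: {p: (0.0 if a == p else total / (n * counts[a])) for p in classes}
        for a in classes
    }
```

With two "yes" records and eight "no" records, this weighting makes misclassifying an actual "yes" four times as costly as misclassifying an actual "no", which keeps the rare value from being ignored.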
- Optional: Specify the column properties.
- In the Properties view, click the Column Properties tab.
- To modify any of the column properties, select a column and click the edit icon. The Column Property Editor window opens.
- To modify the SQL type, select a value from the SQL Type list.
- To modify the field type, select a value from the Field Type list.
- To modify the field usage type, select a value from the Field Usage Type list.
- Click Finish.