Defining a Predictor operator

You can create a mining model that predicts the value of a target field.

Before you begin

  • Place the Predictor operator in the canvas.
  • Connect its input port to the database source table that you want to analyze. The table must contain the column that is used as the target field for the prediction.

Procedure

To define a Predictor operator, complete the following steps:

  1. In the canvas, click the operator to select it. A black box highlights a selected operator. After you select an operator, the property pages for that operator appear in the Properties view beneath the canvas.
  2. Use the tabs on the left side of the Properties view to navigate each operator's property pages.
  3. Optional: Specify the operator's general properties.
    1. In the Properties view, click the General tab.
    2. You can modify the following fields:
      • To rename the operator, enter a name in the Label field.
      • You can add a description for the operator in the Description field.
  4. Optional: Specify the model name.
    1. In the Properties view, click the Model Name tab.
    2. In the Prefix field, enter the prefix that is used as the database schema for the view created by the operator and also as the prefix for the model created by the operator.
    3. In the Model name field, enter the model name that is used as a key to look up the model in Db2® tables and is also displayed when you browse the model with IM Visualization. If a model with the same name already exists in the Db2 database, the existing model is replaced.
  5. Optional: Specify the mining settings.
    1. In the Properties view, click the Mining settings tab.
    2. In the Target column list, select the column for which to predict values. You must select a value.
    3. The Optional Parameters option is for advanced users only. With optional parameter strings, you can modify default parameters of the Easy Mining procedures. For supported optional parameters, refer to the Intelligent Miner® Easy Mining Procedures documentation.
    4. Additionally, you can specify classification-specific or regression-specific mining settings:

      If a categorical target column is selected, the classification function is activated to predict target values. If the target column is numeric, the regression function is used.

      Classification-specific settings

        Tree-specific settings
          To use the decision tree algorithm, select Tree from the Algorithm drop-down list. With the Tree algorithm, you can specify the following parameters:
          • Maximum purity: Enter the maximum purity value for the decision tree that is built by tree classification.
          • Maximum depth: Enter the maximum depth for the decision tree.
          • Minimum number of records per leaf node: Enter the minimum number of records that must be in each leaf node of the generated decision tree. You can customize the binary decision tree by specifying the minimum number of records per internal node.

        Naive Bayes-specific settings
          To use the Naive Bayes algorithm, select Naive Bayes from the Algorithm drop-down list. With the Naive Bayes algorithm, you can specify the following parameter:
          • Probability threshold: Enter the threshold value that is to be used whenever a probability of zero is encountered in the model equation.

        Logistic Regression-specific settings
          To use the Logistic Regression algorithm, select Logistic Regression from the Algorithm drop-down list. With the Logistic Regression algorithm, parameters cannot be specified.

      Regression-specific settings
        • Algorithm: By default, the Transform Regression algorithm is used. You can change this default setting by specifying the Linear Regression algorithm, the Polynomial Regression algorithm, or the Radial Basis Function (RBF) algorithm.
        • Algorithm settings: Specify the values for algorithm-specific parameters.
  6. Optional: Adjust the cost matrix. By default, all classifications have an equal weight, and the cost matrix is automatically adjusted so that values of the target field that appear less frequently are not discriminated against. For the tree classification algorithm, you can also adjust the cost matrix manually.
    1. In the Properties view, click the Cost Matrix tab.
    2. Click one of the following icons:
      • Add a new value, to add actual values to the cost matrix manually.
      • Load values from categorical column, to load the values from a table.
  7. Optional: Specify the column properties.
    1. In the Properties view, click the Column Properties tab.
    2. To modify any of the column properties, select a column and click the edit icon. The Column Property Editor window opens.
      1. To modify the SQL type, select a value from the SQL Type list.
      2. To modify the field type, select a value from the Field Type list.
      3. To modify the field usage type, select a value from the Field Usage Type list.
      4. Click Finish.
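
The rule in step 5 (a categorical target activates classification, a numeric target activates regression) and the equal-weight default in step 6 can be pictured with a short Python sketch. This is illustration only, not product code: the function names, the dictionary layout, and the 0/1 cost values are assumptions made for this example, and the automatic adjustment for infrequent target values is not modeled.

```python
# Illustrative only: every name here is hypothetical and not part of any
# Intelligent Miner or Db2 API.

# Algorithms named in step 5, grouped by mining function. The first regression
# entry is the documented default; the order of the classification entries
# carries no meaning.
ALGORITHMS = {
    "classification": ["Tree", "Naive Bayes", "Logistic Regression"],
    "regression": [
        "Transform Regression",
        "Linear Regression",
        "Polynomial Regression",
        "Radial Basis Function (RBF)",
    ],
}


def mining_function_for(field_type: str) -> str:
    """Map the target column's field type to the mining function."""
    kind = field_type.strip().lower()
    if kind == "categorical":
        return "classification"
    if kind == "numeric":
        return "regression"
    raise ValueError(f"unsupported field type: {field_type!r}")


def equal_weight_cost_matrix(target_values):
    """Assumed starting point: cost 0.0 for a correct prediction, 1.0 otherwise."""
    return {
        actual: {
            predicted: 0.0 if actual == predicted else 1.0
            for predicted in target_values
        }
        for actual in target_values
    }
```

For example, `mining_function_for("numeric")` returns `"regression"`, and `equal_weight_cost_matrix(["yes", "no"])` gives every misclassification the same cost, which is the situation that manual edits on the Cost Matrix tab would then change for the Tree algorithm.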

