Specifying mining data

In this exercise, the sample script defines the mining data. You can rename columns and define aliases.

Use the following command to run the sample script bankingCreateMiningData.db2. The script creates the DM_MiningData values and inserts them into the table IDMMX.MiningData:

db2 -stf bankingCreateMiningData.db2

The first part of the sample script bankingCreateMiningData.db2 creates a DM_MiningData value and inserts it into the table IDMMX.MiningData. The column names and types are read from the DB2® catalog tables:

INSERT INTO IDMMX."MININGDATA" ("ID","MININGDATA") VALUES (
   'AliasEqualToColumn',
   IDMMX.DM_MiningData()..DM_defMiningData('BANKCUSTOMERS'));
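
To confirm that the row was stored, a plain query against the same table should be enough. The following is a minimal check that is not part of the sample script. It assumes that the script ran without errors, and it reads only the "ID" column:

```sql
-- List the mining data definitions stored so far.
-- Not part of bankingCreateMiningData.db2; shown only as a sanity check.
SELECT "ID"
  FROM IDMMX."MININGDATA"
  ORDER BY "ID";
```

After the whole script has run, the result should include 'AliasEqualToColumn' together with the IDs inserted by the later statements.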

The next part of the sample script uses the DM_setFldAlias method to define column aliases that differ from the column names in the input data source:

INSERT INTO IDMMX."MININGDATA" ("ID","MININGDATA") VALUES (
   'AliasDifferentToColumn',
   IDMMX.DM_MiningData()..DM_defMiningData('BANKCUSTOMERS')
   ..DM_setFldAlias('CLIENT_ID','CUSTNO')
   ..DM_setFldAlias('PROFESSION','JOB')
   ..DM_setFldAlias('NBR_YEARS_CLI','LOYALTY'));

The next statements create a view so that only a sample of the customers is used as input data, and then define mining data on that view. This example defines a repeatable random sample of 10 percent. The sample must be repeatable to support multiple passes during model training:

CREATE VIEW "CUSTSAMPLE"
   AS SELECT * FROM "BANKCUSTOMERS"
      TABLESAMPLE SYSTEM(10) REPEATABLE(3);

INSERT INTO IDMMX."MININGDATA" VALUES (
   'CustomerSample',
   IDMMX.DM_MiningData()..DM_defMiningData('CUSTSAMPLE'));
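
To see how large the sample actually is, the row counts of the table and the view can be compared. This is a sketch that is not part of the sample script; the column aliases ALL_ROWS and SAMPLE_ROWS are chosen here only for illustration:

```sql
-- Compare the size of the full table with the size of the sampled view.
-- SYSTEM sampling selects whole pages, so expect roughly, not exactly, 10 percent.
SELECT (SELECT COUNT(*) FROM "BANKCUSTOMERS") AS ALL_ROWS,
       (SELECT COUNT(*) FROM "CUSTSAMPLE")    AS SAMPLE_ROWS
  FROM SYSIBM.SYSDUMMY1;
```

Because the view specifies REPEATABLE(3), running the query again should return the same SAMPLE_ROWS value as long as the underlying table has not changed.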

The following statement uses a predicate (WHERE clause) to restrict the input to the records that satisfy it:

INSERT INTO IDMMX."MININGDATA" VALUES (
   'MarriedCustomers',
   IDMMX.DM_MiningData()..DM_defMiningData('CUSTSAMPLE')
   ..DM_setWhereClause('MARITAL_STATUS=''married'''));
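
The predicate is passed to DM_setWhereClause as a string, so the single quotation marks around married are doubled inside it. Assuming the predicate is applied to the source as an ordinary WHERE clause, as the description above states, the records that this definition selects can be previewed with a direct query. This query is not part of the sample script:

```sql
-- Preview the rows that the 'MarriedCustomers' definition is expected to select.
-- The literal uses ordinary single quotation marks here because it is not
-- nested inside another SQL string.
SELECT COUNT(*) AS MARRIED_ROWS
  FROM "CUSTSAMPLE"
  WHERE MARITAL_STATUS = 'married';
```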
