In this exercise, the sample script defines the mining data. You can rename columns and define aliases.
To run the sample script, enter the following command:
db2 -stf bankingCreateMiningData.db2
The first part of the sample script bankingCreateMiningData.db2 creates
a DM_MiningData value and inserts it into the table IDMMX.MiningData.
The column name and type information is extracted from DB2® catalog tables:
INSERT INTO IDMMX."MININGDATA" ("ID","MININGDATA") VALUES (
'AliasEqualToColumn',
IDMMX.DM_MiningData()..DM_defMiningData('BANKCUSTOMERS'));
The next part of the sample script uses the method DM_setFldAlias
to define column aliases that differ from the column names in the input data source.
For example, the column NBR_YEARS_CLI receives the alias LOYALTY:
INSERT INTO IDMMX."MININGDATA" ("ID","MININGDATA") VALUES (
'AliasDifferentToColumn',
IDMMX.DM_MiningData()..DM_defMiningData('BANKCUSTOMERS')
..DM_setFldAlias('CLIENT_ID','CUSTNO')
..DM_setFldAlias('PROFESSION','JOB')
..DM_setFldAlias('NBR_YEARS_CLI','LOYALTY'));
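To confirm that the definitions were stored, the table can be queried for its IDs. This is a minimal sketch that uses only the table and column names from the script above; it assumes that the INSERT statements completed without errors.

```sql
-- List the IDs of the mining data definitions stored so far.
-- The MININGDATA column itself holds DM_MiningData values,
-- so only the ID column is selected here.
SELECT "ID"
FROM IDMMX."MININGDATA"
ORDER BY "ID";
```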
The next statement creates a view to use only a sample
of customers as input data. This example defines a repeatable random
sample of 10 percent. The sample must be repeatable to support multiple
passes during model training:
CREATE VIEW "CUSTSAMPLE"
AS SELECT * FROM "BANKCUSTOMERS" TABLESAMPLE SYSTEM(10)
REPEATABLE(3);
INSERT INTO IDMMX."MININGDATA" VALUES (
'CustomerSample',
IDMMX.DM_MiningData()..DM_defMiningData('CUSTSAMPLE'));
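As a quick check of the sampling view, the two row counts below can be compared. This is a sketch that relies only on the table and view names from the script; the column labels SAMPLE_ROWS and ALL_ROWS are illustrative. Because TABLESAMPLE SYSTEM samples at the page level, the ratio is only approximately 10 percent. With REPEATABLE(3), the first query should return the same count on every run as long as the data in BANKCUSTOMERS is unchanged.

```sql
-- Number of rows visible through the sampling view.
SELECT COUNT(*) AS SAMPLE_ROWS FROM "CUSTSAMPLE";

-- Number of rows in the full input table, for comparison.
SELECT COUNT(*) AS ALL_ROWS FROM "BANKCUSTOMERS";
```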
The following statement uses a predicate (WHERE clause)
to restrict the input records:
INSERT INTO IDMMX."MININGDATA" VALUES (
'MarriedCustomers',
IDMMX.DM_MiningData()..DM_defMiningData('CUSTSAMPLE')
..DM_setWhereClause('MARITAL_STATUS=''married'''));
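In the string that is passed to DM_setWhereClause, each single quotation mark is doubled, so the predicate itself reads MARITAL_STATUS='married'. To preview which records such a predicate selects, it can be applied directly to the view. This is a sketch only; the column label MARRIED_ROWS is illustrative, and the query assumes that the view CUSTSAMPLE from the previous step exists.

```sql
-- Count the sampled customers that satisfy the predicate.
-- Outside of a string literal, the quotation marks are not doubled.
SELECT COUNT(*) AS MARRIED_ROWS
FROM "CUSTSAMPLE"
WHERE MARITAL_STATUS='married';
```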