Column properties defined in column properties tables
Instead of defining column properties in parameters, you can create a column properties table that is associated with a specified input table.
This way, you can specify the column properties for input tables that are frequently used in algorithms. The method ensures consistency in how the data is handled for each algorithmic call.
To create and manage a columns properties table, you can use the following procedures:
- COLUMN_PROPERTIES
- SET_COLUMN_PROPERTIES
- GET_COLUMN_LIST
A column properties table has the following columns:
| Column name | Column type | Description |
|---|---|---|
| COLNO | INTEGER | The index of the column in the input table |
| NAME | VARCHAR(128) | The name of column |
| SQLTYPE | VARCHAR(128) | The SQL type of the column Derived from the system catalog |
| LENGTH | INTEGER | The length of the column Derived from the system catalog |
| SCALE | SMALLINT | The scale of the column Derived from the system catalog |
| TYPE | VARCHAR(64) | The mining type, that is, nom for nominal and cont for continuousThe mining type, like nom and nominal for type nominal, and cont and continuous for type continuous |
| ROLE | VARCHAR(64) | A specific role of the column, for example, target or id |
| WEIGHT | FLOAT | The weight of the column |
| CARDINALITY | BIGINT | Column statistics Might be NULL |
| MINIMUM | DOUBLE | Column statistics Might be NULL |
| MAXIMUM | DOUBLE | Column statistics Might be NULL |
| MEAN | DOUBLE | Column statistics Might be NULL |
| VARIANCE | DOUBLE | Column statistics Might be NULL |
| NUMVALID | BIGINT | Column statistics Might be NULL |
| NUMINVALID | BIGINT | Column statistics Might be NULL |
| NUMMISSING | BIGINT | Column statistics Might be NULL |
Specifying a column properties table for an algorithm
The following example shows how to call an algorithm by supplying a specified column properties table.
CALL IDAX.COLUMN_PROPERTIES('intable=SAMPLES.CUSTOMER_CHURN, outtable=CUSTOMER_CHURN_COLPROPTABLE');
CALL IDAX.KMEANS('intable=SAMPLES.CUSTOMER_CHURN, colPropertiesTable=CUSTOMER_CHURN_COLPROPTABLE, incolumn=AVG_SPENT_RETAIN_PM:ignore; CENSOR:nom, id=cust_id, k=3, model=cc_km3, outtable=cc_km3_out');
The colPropertiesTable table contains properties for the input table columns. The incolumn parameter, however, has higher priority and overrides the settings from the column properties table. In the example, the age is treated as continuous regardless of the value that is specified in the column properties table.