Logical data specifications

You must define logical data specifications for the data that you want to use for training runs. Each field in the physical data needs a logical definition before it can be used in a training run. The logical data specification contains these logical definitions.

The logical data specification contains generic settings for the fields to be used for a training run. It does not include data from your source tables. The logical data specification is independent of the data mining function that you are using.

Defining the logical data specification is the next step in the model-building process. In this step, you create values of the data type DM_LogicalDataSpec, perform operations on them, or both.

Intelligent Miner® can create a default logical data specification from the input data definition. You can modify this default or write your own specification. For example, you might want to create generic data mining settings. Because generic field definitions and settings are independent of the physical data, you can use them for different source tables that have different layouts but similar content, such as several sources for a customer segmentation with the Clustering mining function.
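The idea of one generic specification serving several source tables can be sketched as follows. This is a conceptual model in Python, not the Intelligent Miner API: the dictionary layout, the `project` helper, and the table and column names are all hypothetical and chosen only for illustration.

```python
# Conceptual model only; this is not the Intelligent Miner API.
# A logical data specification: field name -> field type, with no table data.
customer_spec = {"AGE": "numerical", "INCOME": "numerical", "GENDER": "categorical"}


def project(spec, rows):
    """Return only the fields named in the spec; fail if a row lacks one."""
    projected = []
    for row in rows:
        missing = [name for name in spec if name not in row]
        if missing:
            raise KeyError(f"source is missing fields: {missing}")
        projected.append({name: row[name] for name in spec})
    return projected


# Two hypothetical source tables with different layouts but similar content.
retail_rows = [{"CUST_ID": 1, "AGE": 34, "INCOME": 52000, "GENDER": "F", "STORE": "N1"}]
online_rows = [{"GENDER": "M", "INCOME": 61000, "AGE": 45, "LAST_LOGIN": "2004-01-12"}]

# The same specification selects the training fields from both sources.
retail_input = project(customer_spec, retail_rows)
online_input = project(customer_spec, online_rows)
```

In this sketch the specification holds only names and types, so extra columns and a different column order in a source table do not matter as long as the specified fields are present.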

The DM_LogicalDataSpec data type defines the input fields that are used by training runs. An input field is defined by a name and a type. The type defines how a mining algorithm handles that field. You can use the following types:

DM_Categorical
DM_Numerical

By default, the type of a field is derived from the DB2® data type of the corresponding source column.
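As an illustration of fields defined by a name and a type, the following Python sketch builds a default specification from column data types and then overrides one field. It is a conceptual model, not the Intelligent Miner API. The class names, the `default_spec` helper, and in particular the rule that numeric SQL types become numerical fields and all other types become categorical are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class FieldType(Enum):
    CATEGORICAL = "DM_Categorical"
    NUMERICAL = "DM_Numerical"


@dataclass
class LogicalField:
    name: str
    field_type: FieldType


# Assumed default rule (not taken from the product documentation):
# numeric SQL column types become numerical fields, all others categorical.
NUMERIC_SQL_TYPES = {"SMALLINT", "INTEGER", "BIGINT", "DECIMAL", "REAL", "DOUBLE"}


def default_spec(columns):
    """Build a default specification from a {column name: SQL type} mapping."""
    spec = []
    for name, sql_type in columns.items():
        if sql_type.upper() in NUMERIC_SQL_TYPES:
            spec.append(LogicalField(name, FieldType.NUMERICAL))
        else:
            spec.append(LogicalField(name, FieldType.CATEGORICAL))
    return spec


# Hypothetical source table layout.
columns = {"AGE": "INTEGER", "GENDER": "CHAR", "ZIPCODE": "INTEGER"}
spec = default_spec(columns)

# Modify the default: a postal code is a label, not a quantity.
for logical_field in spec:
    if logical_field.name == "ZIPCODE":
        logical_field.field_type = FieldType.CATEGORICAL
```

The override shows why a default is only a starting point: a column stored as an integer is not necessarily a quantity that a mining algorithm should treat as numerical.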

The logical data specification can also contain name mappings or taxonomies.