Defining settings that are specific to mining functions or algorithms

Settings can be specific to mining functions or to algorithms.

Settings that are specific to mining functions include parameters that are shared by various mining algorithms of that mining function.

Settings that are specific to mining algorithms include parameters that are respected only by one specific implementation of a mining function, for example, by Distribution-based Clustering.

Defining settings that are specific to mining functions

In Intelligent Miner®, you specify settings that are specific to mining functions by calling set methods. Each parameter has its own method. For example, if you are using the Clustering mining function, you can set the maximum number of clusters that the training run produces by using the method DM_setMaxNumClus.
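As a minimal sketch in the style of the other examples in this topic, the following expression creates a clustering settings value and limits the training run to at most five clusters. The value 5 is arbitrary, and the integer argument is an assumption about the method signature:

```sql
-- Assumption: DM_setMaxNumClus takes the maximum number of clusters as an integer.
IDMMX.DM_ClusSettings()..DM_setMaxNumClus(5)
```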

Defining algorithm-specific settings

You can set algorithm-specific settings within each mining function by using the method DM_setAlgorithm. This method provides the following versions:
  DM_setAlgorithm( algorithmName )
    This version activates a particular algorithm.
  DM_setAlgorithm( algorithmName, algorithmParameters )
    This version activates a particular algorithm and specifies one or more algorithm-specific parameters.
    For example, the following command activates the clustering algorithm Kohonen and specifies that the algorithm performs three passes over the training data:
    IDMMX.DM_ClusSettings()..DM_setAlgorithm('Kohonen','<NumPasses>3</NumPasses>')
Appropriate get methods are also available.
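The following sketch shows how the one-argument version of DM_setAlgorithm might be combined with a setting that is specific to the mining function. It assumes that each set method returns the modified settings value, so that calls can be chained. The value 5 is arbitrary:

```sql
-- Assumption: set methods return the settings value, which allows chaining.
-- Function-specific setting first, then activation of the Kohonen algorithm
-- without algorithm-specific parameters.
IDMMX.DM_ClusSettings()..DM_setMaxNumClus(5)..DM_setAlgorithm('Kohonen')
```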

Specifying field definitions

In a settings object, you can also specify field definitions that are specific to mining functions or mining algorithms, such as the role of a field in a training run. For example, you might want to specify one of the following field definitions:
  • In a Clustering settings definition, you can specify whether a field is treated as active or as supplementary. In the other mining functions, you can specify whether a field is treated as active or as inactive.
  • In a Classification or Regression settings definition, you must specify which field is the target field.
  • In an Associations or Sequence Rules settings definition, you might want to specify a group field and an item field.
  • In a Sequence Rules settings definition, you must specify a sequence field and a group field. The sequence field contains the transaction group ID. The group field contains the transaction ID or the timestamp.

For the Distribution-based Clustering algorithm, you can set field definitions only by using the DM_setDClusFldPar method. This method takes a field name and a keyword that identifies the parameter to be set as its first two parameters, followed by the parameter value.

Because the field definitions are specific to a mining function or to a mining algorithm, they are part of the settings definition and not part of the logical data specification.

When you create mining settings, you must first include a logical data specification. Each field for which you create a definition in the settings specification must have a matching field name in the logical data specification. Some field definitions in the settings specification are required, for example, the target field in a Classification settings specification. Other field definitions are optional, for example, the definition of a field as active or supplementary in a Clustering settings specification. For the optional field definitions, default values are assumed.

Adding or removing fields in settings

Because the set of fields is defined in the logical data specification, you cannot directly remove fields from settings or add them to settings. Instead, you change the logical data specification and assign it to the settings again. For example, if you want to add a field to clustering settings, follow these steps:
  1. Extract the logical data specification from the clustering settings value by using the method DM_getClusDataSpec.
  2. Add the new field to the logical data specification by using the method DM_addDataSpecFld.
  3. Assign the modified logical data specification to the clustering settings again by using the method DM_useClusDataSpec.
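
The three steps can be combined into a single nested expression. The following statement is a sketch only. It assumes that the settings are stored in a table named IDMMX.ClusSettings with the columns ID and SETTINGS, and that DM_addDataSpecFld takes the field name as its only argument. The settings name MySettings and the field name AGE are hypothetical:

```sql
-- Assumptions: table IDMMX.ClusSettings with columns ID and SETTINGS;
-- 'MySettings' and 'AGE' are hypothetical names.
-- Innermost call: extract the logical data specification (step 1).
-- Middle call:    add the new field to the specification (step 2).
-- Outermost call: assign the specification to the settings again (step 3).
UPDATE IDMMX.ClusSettings
  SET SETTINGS = SETTINGS..DM_useClusDataSpec(
        SETTINGS..DM_getClusDataSpec()..DM_addDataSpecFld('AGE'))
  WHERE ID = 'MySettings';
```

Because the new field has no field definition in the settings yet, the default values for optional field definitions apply to it.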