Setting the field role

The role of a field specifies how it is used in model building, for example, whether the field is an input or a target (the thing being predicted).

Note: The Partition, Frequency, and Record ID roles can each be applied to a single field only.

The following roles are available:

Input. The field will be used as an input to machine learning (a predictor field).

Target. The field will be used as an output or target for machine learning (one of the fields that the model will try to predict).

Both. The field will be used as both an input and an output by the Apriori node. All other modeling nodes will ignore the field.

None. The field will be ignored by machine learning. Fields whose measurement level has been set to Typeless are automatically set to None in the Role column.

Partition. Indicates a field used to partition the data into separate samples for training, testing, and (optional) validation purposes. The field must be an instantiated set type with two or three possible values (as defined in the Field Values dialog box). The first value represents the training sample, the second represents the testing sample, and the third (if present) represents the validation sample. Any additional values are ignored, and flag fields cannot be used. Note that to use the partition in an analysis, partitioning must be enabled on the Model Options tab in the appropriate model-building or analysis node. Records with null values for the partition field are excluded from the analysis when partitioning is enabled. If multiple partition fields have been defined in the stream, a single partition field must be specified on the Fields tab in each applicable modeling node. If a suitable field doesn't already exist in your data, you can create one using a Partition node or Derive node. See the topic Partition Node for more information.

Split. (Nominal, ordinal, and flag fields only) Specifies that a separate model is to be built for each possible value of the field.

Frequency. (Numeric fields only) Setting this role enables the field value to be used as a frequency weighting factor for the record. This feature is supported by C&R Tree, CHAID, QUEST, and Linear models only; all other nodes ignore this role. To enable frequency weighting, select the Use frequency weight option on the Fields tab of a modeling node that supports the feature.

Record ID. The field will be used as the unique record identifier. Most nodes ignore this role; however, it is supported by Linear models and is required for the IBM Netezza in-database mining nodes.
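
The restrictions listed above can be summarized in a short sketch. This is plain Python written only for illustration; it is not the product's scripting interface. The function names check_roles and sample_for, the dictionary keys, and the sample field names are all hypothetical. It also assumes that a record whose partition value falls outside the first three defined values is left unassigned, which is one reading of "additional values are ignored."

```python
from collections import Counter

# Illustrative sketch only: not the product's scripting API.
# All names and the data layout below are assumptions made for this example.
SINGLE_FIELD_ROLES = ("Partition", "Frequency", "Record ID")
SPLIT_MEASUREMENTS = {"Nominal", "Ordinal", "Flag"}
SAMPLE_NAMES = ("training", "testing", "validation")


def check_roles(fields):
    """Return readable violations of the role rules described in this topic.

    `fields` maps a field name to a dict with the hypothetical keys
    "measurement" (str), "numeric" (bool), and "role" (str).
    """
    problems = []
    counts = Counter(f["role"] for f in fields.values())
    # Partition, Frequency, and Record ID may each be applied to one field only.
    for role in SINGLE_FIELD_ROLES:
        if counts[role] > 1:
            problems.append(f"{role} is applied to {counts[role]} fields")
    for name, f in fields.items():
        role, measurement = f["role"], f["measurement"]
        if measurement == "Typeless" and role != "None":
            problems.append(f"{name}: Typeless fields take the role None")
        if role == "Split" and measurement not in SPLIT_MEASUREMENTS:
            problems.append(f"{name}: Split needs a nominal, ordinal, or flag field")
        if role == "Frequency" and not f["numeric"]:
            problems.append(f"{name}: Frequency needs a numeric field")
        if role == "Partition" and measurement == "Flag":
            problems.append(f"{name}: flag fields cannot be used as Partition")
    return problems


def sample_for(value, partition_values):
    """Map a record's partition value to its sample name.

    `partition_values` lists the field's defined values in order: the first
    is the training sample, the second testing, the third (if present)
    validation. Null values and any other values return None (assumption:
    such records are simply left out).
    """
    if value is None:
        return None
    try:
        position = partition_values.index(value)
    except ValueError:
        return None
    return SAMPLE_NAMES[position] if position < len(SAMPLE_NAMES) else None


fields = {
    "age": {"measurement": "Continuous", "numeric": True, "role": "Input"},
    "drug": {"measurement": "Nominal", "numeric": False, "role": "Target"},
    "region": {"measurement": "Nominal", "numeric": False, "role": "Frequency"},
    "weight": {"measurement": "Continuous", "numeric": True, "role": "Frequency"},
}
problems = check_roles(fields)
for problem in problems:
    print(problem)

partition_values = ["train", "test", "valid", "extra"]
samples = [sample_for(v, partition_values) for v in ("train", "valid", "extra", None)]
print(samples)
```

In this example, two fields carry the Frequency role and one of them is not numeric, so both problems are reported. The fourth partition value and the null value are not mapped to any sample.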