IBM InfoSphere Streams Version 4.1.1

Operator Classification


The Classification operator calculates the predicted class and the confidence for each tuple in the input stream and automatically assigns those values to output stream attributes. To accommodate those values, the output stream schema must contain exactly two attributes that have no explicit assignment in the output attributes section and do not share a name with an input stream attribute. One of these attributes must be of type rstring, to receive the predicted class, and the other must be of type float64, to receive the confidence. The two attributes can be located anywhere in the output stream schema.

The Classification operator is declared as follows:
stream <stream-schema> stream-name = Classification(<input-stream>) {
  param
    model : "<PMML-document-filename>";
    <mapping-parameter_1> : <output-attribute-expr_1>;
    ...
    <mapping-parameter_n> : <output-attribute-expr_n>;
}

An example of the Classification operator is as follows:

stream <rstring client_id, int32 age, rstring gender,
rstring predictedClass, float64 confidence>
resultClassification = Classification (data){
  param
    model : "../models/naive_bayes.pmml";
    client_id : "CLIENT_ID";
    age : "AGE";
    gender : "GENDER";
}

In the example above, the predicted class is assigned to output stream attribute predictedClass and the confidence is assigned to output stream attribute confidence.
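To show where the invocation sits in an application, the following is a minimal sketch of a complete composite built around the example above. The composite name ClassificationSketch, the file names clients.csv and scored.csv, and the use of FileSource and FileSink are assumptions made for illustration only. The namespace in the use directive is taken from the operator's qualified name, com.ibm.streams.mining.scoring::Classification.

```spl
use com.ibm.streams.mining.scoring::Classification;

// Hypothetical main composite; the name is an assumption.
composite ClassificationSketch {
  graph
    // Assumed input: a CSV file with one client per line
    // (client_id, age, gender). The file name is hypothetical.
    stream<rstring client_id, int32 age, rstring gender> data = FileSource() {
      param
        file   : "clients.csv";
        format : csv;
    }

    // Same invocation as in the example above. client_id, age, and gender
    // match input attribute names; predictedClass (rstring) and
    // confidence (float64) are the two attributes left for the operator
    // to fill in.
    stream<rstring client_id, int32 age, rstring gender,
           rstring predictedClass, float64 confidence>
        resultClassification = Classification(data) {
      param
        model     : "../models/naive_bayes.pmml";
        client_id : "CLIENT_ID";
        age       : "AGE";
        gender    : "GENDER";
    }

    // Assumed sink: writes the scored tuples to a hypothetical CSV file.
    () as Sink = FileSink(resultClassification) {
      param
        file   : "scored.csv";
        format : csv;
    }
}
```

The source and sink here are placeholders; any operator that produces a stream with the mapped attributes can feed the Classification operator.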

Behavior in a consistent region

  • Use of this operator in a consistent region is not supported. If the operator is in a consistent region, it emits a warning when you compile the streams processing application.
  • The operator does not support checkpoint and reset. Therefore, the operator might produce incorrect results when the application fails.

Summary

Ports
This operator has 2 input ports and 1 output port.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports arbitrary parameters in addition to 1 specific parameter.

Required: model

Metrics
This operator does not report any metrics.

Properties

Implementation
C++
Threading
Always - Operator always provides a single threaded execution context.

Input Ports

Ports (0)
Properties

Ports (1)
Properties

Output Ports

Assignments
This operator requires that assignments made to output attributes be input stream attributes.
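As an illustration of this rule, the following sketch renames an input attribute through an explicit assignment in the output clause. It reuses the input stream data and the model file from the earlier example; the output stream name scored and the output attribute name id are assumptions introduced here.

```spl
// Fragment of a graph clause; assumes the input stream
// stream<rstring client_id, int32 age, rstring gender> data
// from the earlier example.
stream<rstring id, int32 age, rstring gender,
       rstring predictedClass, float64 confidence>
    scored = Classification(data) {
  param
    model     : "../models/naive_bayes.pmml";
    client_id : "CLIENT_ID";
    age       : "AGE";
    gender    : "GENDER";
  output
    // The right-hand side is a plain input stream attribute,
    // which is what the rule above requires.
    scored : id = client_id;
}
```

In this sketch, id has an explicit assignment, while age and gender share their names with input attributes. That leaves predictedClass and confidence as the two attributes that receive the predicted class and the confidence.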
Ports (0)

Properties

Parameters

This operator supports arbitrary parameters in addition to 1 specific parameter.
model

This mandatory parameter specifies the path name of a file that contains the PMML mining model that is used for scoring the data stream. The path name can be either absolute or relative. If it is relative, the path name is rooted in the data subdirectory of the directory where the application source code file is located. For example, the relative path ../models/naive_bayes.pmml refers to a models directory that sits next to the data subdirectory. This file must be readable both by the SPL compiler at compile time and by IBM InfoSphere Streams at run time. It must contain a valid PMML document for the operator type.

Properties

Libraries

No description for library.
Command: ../../Common/DmsLibInfo.pl