Operator Classification
Toolkit: com.ibm.streams.mining 2.0.0. Namespace: com.ibm.streams.mining.scoring.
The Classification operator calculates the predicted class and the confidence for each tuple in the input stream and automatically assigns those values to output stream attributes. To accommodate those values, the output stream schema must contain exactly two attributes that neither have an explicit assignment in the output attributes section nor share a name with an input stream attribute. One of these attributes must be of type rstring (it receives the predicted class) and the other of type float64 (it receives the confidence). The two attributes can be located anywhere in the output stream schema.
stream <stream-schema> stream-name = Classification(<input-stream>) {
    param
        model : "<PMML-document-filename>";
        <mapping-parameter_1> : <output-attribute-expr_1>;
        ...
        <mapping-parameter_n> : <output-attribute-expr_n>;
}
An example of the Classification operator is as follows:
stream <rstring client_id, int32 age, rstring gender,
        rstring predictedClass, float64 confidence>
    resultClassification = Classification (data) {
    param
        model : "../models/naive_bayes.pmml";
        client_id : "CLIENT_ID";
        age : "AGE";
        gender : "GENDER";
}
In the example above, the predicted class is assigned to output stream attribute predictedClass and the confidence is assigned to output stream attribute confidence.
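To place the example in context, the following is a minimal sketch of a complete application. The composite name Main and the file names clients.csv and scored.csv are hypothetical and do not come from the toolkit. The sketch also assumes that the columns of clients.csv match the schema of the data stream and that the PMML model defines fields named CLIENT_ID, AGE, and GENDER.

```spl
use com.ibm.streams.mining.scoring::Classification;

composite Main {
    graph
        // Read the tuples to score (hypothetical file in the data directory).
        stream<rstring client_id, int32 age, rstring gender> data = FileSource() {
            param
                file : "clients.csv";
                format : csv;
        }

        // predictedClass (rstring) and confidence (float64) are the two
        // attributes that the operator fills in automatically.
        stream<rstring client_id, int32 age, rstring gender,
               rstring predictedClass, float64 confidence>
            resultClassification = Classification(data) {
            param
                model : "../models/naive_bayes.pmml";
                client_id : "CLIENT_ID";
                age : "AGE";
                gender : "GENDER";
        }

        // Write the scored tuples (hypothetical output file).
        () as Sink = FileSink(resultClassification) {
            param
                file : "scored.csv";
                format : csv;
        }
}
```

Because the model path is relative, it is resolved against the data subdirectory, so the sketch expects a models directory next to data.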
Behavior in a consistent region
- Use of this operator in a consistent region is not supported. If the operator is in a consistent region, it emits a warning when you compile the streams processing application.
- The operator does not support checkpoint and reset. Therefore, the operator might produce incorrect results when the application fails.
Summary
- Ports: This operator has 2 input ports and 1 output port.
- Windowing: This operator does not accept any windowing configurations.
- Parameters: This operator supports arbitrary parameters in addition to 1 specific parameter.
  Required: model
- Metrics: This operator does not report any metrics.
Properties
- Implementation: C++
- Threading: Always - Operator always provides a single-threaded execution context.
Input Ports
- Ports (0)
  - Properties
    - Optional: false
    - ControlPort: false
    - TupleMutationAllowed: false
    - WindowingMode: NonWindowed
    - WindowPunctuationInputMode: Oblivious
- Ports (1)
  - Properties
    - Optional: true
    - ControlPort: false
    - TupleMutationAllowed: false
    - WindowingMode: NonWindowed
    - WindowPunctuationInputMode: Oblivious
Output Ports
- Assignments: This operator requires that assignments made to output attributes be input stream attributes.
- Ports (0)
  - Properties
    - Optional: false
    - TupleMutationAllowed: true
    - WindowPunctuationOutputMode: Preserving
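The assignment rule for the output port can be illustrated with a short sketch. It assumes the same hypothetical input (clients.csv) and PMML field names as a typical scoring application; the attribute names id, label, and score and the composite name AssignmentSketch are illustrative only. Here id is assigned explicitly from an input stream attribute, age is forwarded because it shares its name with an input attribute, and label and score are the two remaining attributes that would receive the predicted class and the confidence.

```spl
use com.ibm.streams.mining.scoring::Classification;

composite AssignmentSketch {
    graph
        // Hypothetical input; columns are assumed to match this schema.
        stream<rstring client_id, int32 age, rstring gender> data = FileSource() {
            param
                file : "clients.csv";
                format : csv;
        }

        // label (rstring) and score (float64) have no explicit assignment
        // and no matching input attribute, so they hold the scoring results.
        stream<rstring label, rstring id, int32 age, float64 score>
            scored = Classification(data) {
            param
                model : "../models/naive_bayes.pmml";
                client_id : "CLIENT_ID";
                age : "AGE";
                gender : "GENDER";
            output
                // The right-hand side is an input stream attribute.
                scored : id = client_id;
        }

        () as Sink = FileSink(scored) {
            param
                file : "scored.csv";
                format : csv;
        }
}
```

The result attributes do not need to be adjacent or appear last in the schema; only their types and the fact that they are otherwise unassigned matter.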
Parameters
- model
  This mandatory parameter specifies the path name of a file that contains the PMML mining model that is used for scoring the data stream. The path name can be either absolute or relative. A relative path name is resolved against the data subdirectory of the directory where the application source code file is located. The file must be readable both by the SPL compiler at compile time and by IBM InfoSphere Streams at run time. It must contain a valid PMML document for the operator type.
  - Properties
    - Type: rstring
    - Cardinality: 1
    - Optional: false
    - ExpressionMode: AttributeFree