IBM Streams 4.2.1

Operator Regression


The Regression operator calculates the predicted value and the predicted standard deviation for each tuple in the input stream and automatically assigns those values to output stream attributes. To receive those values, the output stream schema must contain exactly two attributes that have no explicit assignment in the output attributes section and whose names differ from the name of every input stream attribute. Both attributes must be of type float64. They can be located anywhere in the output schema. The predicted value is assigned to the first of these two attributes, and the predicted standard deviation is assigned to the second.

The Regression operator is declared as follows:
stream <stream-schema> stream-name = Regression(<input-stream>){
  param
    model : "<PMML-document-filename>";
    <mapping-parameter_1> : <output-attribute-expr_1>;
    …
    <mapping-parameter_n> : <output-attribute-expr_n>;
}
An example of the Regression operator is as follows:
stream <rstring client_id, int32 age, rstring gender,
float64 predictedVal, float64 predictedStdDev>
resultRegression = Regression (data){
  param
    model : "../models/linreg.pmml";
    client_id : "CLIENT_ID";
    age : "AGE";
    gender : "GENDER";
}

In the example above, the predicted value is assigned to the output stream attribute predictedVal and the predicted standard deviation is assigned to the output stream attribute predictedStdDev.
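
Because the two receiving attributes can be located anywhere in the output schema, the same invocation could also be written with them interleaved among the forwarded attributes. The following is a minimal sketch of a complete application that does so. The FileSource and FileSink stages, the file names clients.csv and scores.csv, and the composite name RegressionSample are assumptions made for illustration, and linreg.pmml is assumed to define the fields CLIENT_ID, AGE, and GENDER.

```spl
use com.ibm.streams.mining.scoring::Regression;

composite RegressionSample {
  graph
    // Hypothetical source: reads the input tuples from data/clients.csv.
    stream<rstring client_id, int32 age, rstring gender> data = FileSource() {
      param
        file : "clients.csv";
        format : csv;
    }

    // Of the two unassigned float64 attributes, predictedVal comes first in
    // the schema, so it receives the predicted value; predictedStdDev comes
    // second and receives the predicted standard deviation.
    stream<float64 predictedVal, rstring client_id, int32 age,
           float64 predictedStdDev, rstring gender>
    resultRegression = Regression(data) {
      param
        model : "../models/linreg.pmml";
        client_id : "CLIENT_ID";
        age : "AGE";
        gender : "GENDER";
    }

    // Hypothetical sink: writes the scored tuples to data/scores.csv.
    () as ScoreSink = FileSink(resultRegression) {
      param
        file : "scores.csv";
        format : csv;
    }
}
```

Only the relative order of the two unassigned float64 attributes matters here; swapping predictedVal and predictedStdDev in the schema would swap which value each one receives.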

Behavior in a consistent region

  • Use of this operator in a consistent region is not supported. If the operator is in a consistent region, a warning occurs when you compile the streams processing application.
  • The operator does not support checkpoint and reset. Therefore, the operator might produce incorrect results when the application fails.

Summary

Ports
This operator has 2 input ports and 1 output port.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports arbitrary parameters in addition to 1 specific parameter.

Required: model

Metrics
This operator does not report any metrics.

Properties

Implementation
C++
Threading
Always - Operator always provides a single-threaded execution context.

Input Ports

Ports (0)
Properties

Ports (1)
Properties

Output Ports

Assignments
This operator requires that assignments made to output attributes be input stream attributes.
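
As a sketch of how an explicit assignment interacts with the automatic assignment of predictions, the following fragment gives the output stream three float64 attributes but assigns one of them explicitly, which leaves exactly two for the operator to fill. The input stream clients, its extra float64 attribute income, and the output names scored and annualIncome are hypothetical; the model and mapping parameters are borrowed from the example earlier on this page.

```spl
// Assumes an upstream stream that is declared elsewhere as:
//   stream<rstring client_id, int32 age, rstring gender, float64 income> clients
stream<rstring client_id, float64 annualIncome,
       float64 predictedVal, float64 predictedStdDev>
scored = Regression(clients) {
  param
    model : "../models/linreg.pmml";
    client_id : "CLIENT_ID";
    age : "AGE";
    gender : "GENDER";
  output
    // The right-hand side is a plain input stream attribute, as the operator
    // requires. Because annualIncome is assigned explicitly, it is not one of
    // the two attributes that receive the predictions.
    scored : annualIncome = income;
}
```

Without the explicit assignment, the schema would contain three candidate float64 attributes instead of the required two.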
Ports (0)

Properties

Parameters

This operator supports arbitrary parameters in addition to 1 specific parameter.

Required: model

model

This mandatory parameter specifies the path name of the file that contains the PMML mining model that is used to score the data stream. The path name can be absolute or relative. A relative path name is rooted in the data subdirectory of the directory where the application source code file is located. The file must be readable both by the SPL compiler at compile time and by IBM InfoSphere Streams at run time. It must contain a valid PMML document for the operator type.
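
To make the path rule concrete, the following fragment annotates both forms of the model parameter. The project directory /home/user/RegressionSample and the absolute path /opt/models/linreg.pmml are hypothetical; only the resolution rule comes from the description above. The fragment assumes an input stream named data with the attributes client_id, age, and gender.

```spl
stream<rstring client_id, float64 predictedVal, float64 predictedStdDev>
scored = Regression(data) {
  param
    // Relative form: rooted in the data subdirectory next to the source file.
    // If the source file is in /home/user/RegressionSample, this path names
    //   /home/user/RegressionSample/data/../models/linreg.pmml
    // which is the same file as
    //   /home/user/RegressionSample/models/linreg.pmml
    model : "../models/linreg.pmml";
    // Absolute form (an alternative; specify the parameter only once):
    //   model : "/opt/models/linreg.pmml";
    client_id : "CLIENT_ID";
    age : "AGE";
    gender : "GENDER";
}
```

Whichever form is used, the file has to exist at that location both when the application is compiled and when it runs.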

Properties

Libraries

No description for library.
Command: ../../Common/DmsLibInfo.pl