Operator GAMScorer


The GAMScorer operator applies a generalized additive model to score the input time series values.

The GAMScorer operator applies the generalized additive model (GAM) that is specified in a Predictive Model Markup Language (PMML) file to a stream of time series data.

You can use operator parameters to specify the mapping between the attributes of the input tuples and the covariates in the generalized additive model. Each model has its own modelID (an unsigned 64-bit integer). The operator can manage several generalized additive models at the same time; input tuples are assigned to the correct model by using the modelID value.
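The routing described above can be pictured with a short Python sketch. It is not the operator's implementation: the route_and_score function, the toy models, and the covariate names are hypothetical, and each model is reduced to a plain function of its covariates.

```python
# Illustrative sketch only: mimics how tuples could be routed to a model by
# modelID and how input attributes could be renamed to model covariates.
# All names here are hypothetical; none come from the toolkit.

def route_and_score(tup, models, spl2pmml, model_id_attr="modelID"):
    """Pick the model for this tuple and score it on the mapped covariates."""
    model_id = tup[model_id_attr]
    if model_id not in models:
        raise KeyError(f"no model registered for modelID {model_id}")
    # Rename input attributes to the covariate names the model expects.
    covariates = {pmml: tup[spl] for spl, pmml in spl2pmml.items()}
    return models[model_id](covariates)


# Two toy "models", each a plain function of its covariates.
models = {
    1: lambda c: 2.0 * c["X1"] + c["X2"],
    2: lambda c: c["X1"] - c["X2"],
}
spl2pmml = {"x1": "X1", "x2": "X2"}

score_a = route_and_score({"modelID": 1, "x1": 1.5, "x2": 0.5}, models, spl2pmml)
score_b = route_and_score({"modelID": 2, "x1": 1.5, "x2": 0.5}, models, spl2pmml)
```

The same input values produce different scores because the modelID value alone decides which model sees the tuple.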

The GAMScorer operator loads its generalized additive models from Predictive Model Markup Language (PMML) files. The operator supports an optional control port that accepts control signals to change the behavior of the models, and an optional monitor output port that reports the current PMML model.

Behavior in a consistent region

  • The operator is not supported in a consistent region. A warning occurs when you compile your streams processing application.
  • The operator cannot be the start of a consistent region. A warning occurs when you compile your streams processing application.

Exceptions

The GAMScorer operator throws an exception in the following cases:

  • The attributes that are provided in the spl2pmml map do not exist in the PMML file.
  • The attributes that are provided in the spl2pmml map do not exist in the input tuple.
  • The covariates defined in the PMML file do not exist in the spl2pmml map.
  • The PMML file does not exist or is malformed.
  • The GAMScorer operator receives a tuple with a modelID attribute value that is not defined in the modelID2file parameter.
  • A Load signal is sent to the control port and the modelID and pmmlModel parameters are not specified.
  • A Load signal is sent to the control port and an invalid PMML model is specified. For example, the file that describes the model uses the wrong format, the file specifies parameters that are not defined, or the file does not contain coefficient values for the parameters of the model.
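
The three mapping checks from the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not toolkit code: check_mapping is a hypothetical helper, and it collects messages instead of throwing so that all three cases are visible at once.

```python
# Illustrative sketch only: the mapping checks listed above, written as a
# hypothetical helper. The real operator throws an exception instead.

def check_mapping(spl2pmml, tuple_attrs, model_covariates):
    """Return a list of problems; an empty list means the mapping is consistent."""
    problems = []
    for spl_attr, covariate in spl2pmml.items():
        # Every mapped input attribute must exist in the input tuple.
        if spl_attr not in tuple_attrs:
            problems.append(f"input tuple has no attribute '{spl_attr}'")
        # Every mapped covariate must exist in the model.
        if covariate not in model_covariates:
            problems.append(f"model has no covariate '{covariate}'")
    # Every covariate in the model must be covered by the map.
    mapped = set(spl2pmml.values())
    for covariate in sorted(model_covariates):
        if covariate not in mapped:
            problems.append(f"covariate '{covariate}' is not mapped")
    return problems


ok = check_mapping({"x1": "X1"}, {"modelID", "x1"}, {"X1"})
bad = check_mapping({"x1": "X1", "z": "Z"}, {"modelID", "x1"}, {"X1", "X2"})
```

Here ok is empty, while bad reports one problem of each kind: a missing input attribute, an unknown covariate, and an unmapped covariate.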

Examples

The following example uses the GAMScorer operator to predict a target time series from four input time series:


use com.ibm.streams.timeseries.modeling::GAMScorer;

composite Main {
	type Input = tuple<uint64 modelID, rstring condition, float64 x1, 
		float64 x2, float64 x3, float64 x4, float64 y, float64 filtered_y>;
	graph	
		stream<Input> In = FileSource() 
		{
			param
				file: "test_data.dat";
				format: csv;
		}
			
		stream<In, tuple<float64 y_hat>> Out = GAMScorer(In) {
			param			
				modelID2file: { 1ul : "test_pmml.xml" };				
				spl2pmml: { 
					"condition" : "Type", 
					"x1" : "X1",
					"x2" : "X2",
					"x3" : "X3",
					"x4" : "X4"
				};
				modelID: modelID;
			output
				Out:
				y_hat = predictedTimeSeries();
		}
			
		() as Sink = Custom(Out) 
		{
			logic 
				onTuple Out: println(Out);
		} 
	
		() as writer = FileSink(Out)
		{
			param
				file: "Out.csv";
				format: csv;
		}
}

Summary

Ports
This operator has 2 input ports and 2 output ports.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 6 parameters.

Required: modelID, modelID2file, spl2pmml

Optional: controlSignal, pmmlModel, quantify

Metrics
This operator does not report any metrics.

Properties

Implementation
C++
Threading
Never - Operator never provides a single threaded execution context.

Input Ports

Ports (0)

Consumes the time series data and applies the generalized additive model (GAM) to it. The spl2pmml parameter specifies the mapping between input attributes on this port and the covariates in the GAM.

Properties

Ports (1)

This port accepts control signals that change the behavior of the operator. The controlSignal parameter specifies the name of the attribute on this port that contains the control signal type. The pmmlModel parameter specifies the name of the PMML model being loaded. The modelID parameter specifies the name of the input attribute that contains the model ID.

Properties

Output Ports

Assignments
This operator allows any SPL expression of the correct type to be assigned to output attributes.
Output Functions
GAMScorerFunc
<any T> T AsIs(T v)

The default function for output attributes. By default, this function assigns the output attribute to the value of the input attribute with the same name.

float64 predictedTimeSeries()

Returns the predicted time series value.

GAMModel
rstring getModel()

Returns the PMML model of the GAM.

<any T> T AsIs(T v)

The default function for output attributes. By default, this function assigns the output attribute to the value of the input attribute with the same name.

Ports (0)

This port submits a tuple that contains the score for the time series data that is received on input port 0. This port submits a tuple whenever time series data is scored against the model. Custom output functions are used to specify the value of the output tuple attributes. The output tuple attributes whose assignments are not specified are assigned from input attributes.

Properties

Ports (1)

This port submits a tuple that contains the current PMML model. This port submits a tuple each time a monitor signal is consumed on the input control port. The getModel() output function is assigned to the output attribute that will hold the PMML model. The expected type of the attribute is rstring.

Properties

Parameters

This operator supports 6 parameters.

Required: modelID, modelID2file, spl2pmml

Optional: controlSignal, pmmlModel, quantify

controlSignal

Specifies the name of the attribute on the control port that holds the control signal. The supported control signals are TSSignal.Monitor, TSSignal.Load, TSSignal.Suspend, and TSSignal.Resume.

Properties

modelID

Specifies the name of the SPL input attribute that contains the modelID value.

Properties

modelID2file

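Specifies a map from modelID values to the PMML files that define the corresponding models, for example { 1ul : "test_pmml.xml" }. Input tuples are scored against the model whose key matches their modelID attribute value.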

Properties

pmmlModel

Specifies the name of the PMML model. If this parameter is not specified, the pmmlModel attribute is used by default. If neither the pmmlModel parameter nor the default pmmlModel attribute is provided, the operator throws an exception.

Properties

quantify

Specifies the size of the internal lookup tables that the GAMScorer operator uses to look up tuples whose attributes are mapped to the covariates in the model. The value of this parameter needs to be strictly greater than 0. If not specified, then the model is evaluated precisely and no lookup is performed.
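
One way to picture the trade-off between table lookup and precise evaluation is a precomputed table per smooth term. The Python sketch below is an assumption about the general technique, not a description of the operator's internals: the build_table and lookup helpers, the covariate range, and the nearest-entry lookup are all hypothetical.

```python
import math

# Illustrative sketch only: quantized evaluation of one smooth function.
# How GAMScorer builds and indexes its tables is not documented here.

def build_table(f, lo, hi, size):
    """Sample f at `size` evenly spaced points on [lo, hi]."""
    if size <= 0:
        raise ValueError("size must be strictly greater than 0")
    if size == 1:
        return [f((lo + hi) / 2.0)]
    step = (hi - lo) / (size - 1)
    return [f(lo + i * step) for i in range(size)]


def lookup(table, lo, hi, x):
    """Return the table entry nearest to x, clamping x to [lo, hi]."""
    if len(table) == 1:
        return table[0]
    x = min(max(x, lo), hi)
    idx = round((x - lo) / (hi - lo) * (len(table) - 1))
    return table[idx]


table = build_table(math.sin, 0.0, math.pi, 101)
approx = lookup(table, 0.0, math.pi, 1.0)  # table lookup
exact = math.sin(1.0)                      # precise evaluation
error = abs(approx - exact)
```

In this sketch a larger table shrinks the approximation error at the cost of memory, while skipping the table corresponds to evaluating the function directly.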

Properties

spl2pmml

Specifies a mapping between attributes in the input tuples and the covariates in the generalized additive model (GAM). If two input ports are provided, then the attributes in the map must exist on both input ports. If the parameter value is an empty map, then the generalized additive model does not have any covariates and therefore represents a constant function.
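
A generalized additive model predicts a value as an intercept plus a sum of one smooth function per covariate, which is why an empty map leaves only the constant. The following minimal Python sketch shows this; the gam_predict helper and the smooth terms are made up for illustration, and an identity link function is assumed.

```python
import math

# Illustrative sketch only: the additive structure of a GAM with an
# identity link. The helper and the smooth terms are hypothetical.

def gam_predict(intercept, smooth_terms, covariates):
    """Intercept plus the sum of one smooth function per covariate."""
    return intercept + sum(f(covariates[name]) for name, f in smooth_terms.items())


terms = {"X1": math.sin, "X2": lambda v: v * v}
with_covariates = gam_predict(0.5, terms, {"X1": 0.0, "X2": 2.0})
constant_only = gam_predict(0.5, {}, {})  # no covariates: only the intercept
```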

Properties

Code Templates

GAMScorer

stream<${schema}> ${outputStream} = GAMScorer(${inputStream}) 
{
	param
		modelID2file:	${modelID2fileExpression};
		spl2pmml:		${spl2pmmlExpression};
		modelID:		${modelIDExpression};

	output
		${outputStream}: ${outputExpression};
}

      

Libraries

tsaModelingLibrary
Library Name: modeling
Library Path: ../../../impl/lib
Include Path: ../../../impl/include
libxml2
Library Name: xml2 -lz -lm
Library Path: /usr/lib64
Include Path: /usr/include/libxml2