Operator GMM

The GMM operator uses a Gaussian mixture model to estimate the probability density function (a smoothed histogram) of a time series. The GMM operator is used for probability estimation and outlier or anomaly detection.
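
For orientation, a Gaussian mixture model in general represents a density as a weighted sum of Gaussian components. The formula below is the standard textbook definition for the one-dimensional case, not a statement about this operator's internals. The symbols K, w_k, mu_k, and sigma_k are introduced here for illustration only; K presumably corresponds to the mixtures parameter.

```latex
p(x) = \sum_{k=1}^{K} w_k \, \mathcal{N}\!\left(x \mid \mu_k, \sigma_k^{2}\right),
\qquad w_k \ge 0, \qquad \sum_{k=1}^{K} w_k = 1
```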

Behavior in a consistent region

  • The operator cannot be the start of a consistent region. A warning occurs when you compile your streams processing application.

Exceptions

The GMM operator throws an exception if either the trainingSize or mixtures parameter value is 0.

Dependencies

The GMM operator requires the Boost library to be installed on your operating system.

Examples
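
The following is a minimal sketch of a complete application that scores a univariate series, not an example taken from the toolkit. The composite name, stream names, attribute names, and the synthetic Beacon source are hypothetical, and the sketch assumes that a float64 input attribute yields float64 results from the output functions. Only the parameter names, the output function names, and the namespace come from this page.

```spl
use com.ibm.streams.timeseries.modeling::GMM;

composite GMMExample {
  graph
    // Hypothetical synthetic source: 1000 random float64 readings in [0, 10).
    stream<float64 reading> Readings = Beacon() {
      param
        iterations: 1000u;
      output
        Readings: reading = random() * 10.0;
    }

    // Train on the first 200 samples with 3 mixtures.
    // Both values must be nonzero (see Exceptions).
    stream<float64 reading, float64 prob, float64 outlierProb> Scored = GMM(Readings) {
      param
        inputTimeSeries: reading;
        trainingSize: 200u;
        mixtures: 3u;
      output
        // reading has no explicit assignment, so it is copied from the input tuple.
        Scored: prob = probability(), outlierProb = outlierProbability();
    }

    // Print every scored tuple to standard output.
    () as Printer = Custom(Scored) {
      logic
        onTuple Scored: println(Scored);
    }
}
```

Because mixtures is optional, omitting it falls back to the documented default of 1u.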

Summary

Ports
This operator has 1 input port and 1 output port.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 5 parameters.

Required: inputTimeSeries, trainingSize

Optional: mixtures, multivariateMode, partitionBy

Metrics
This operator does not report any metrics.

Properties

Implementation
C++
Threading
Always - Operator always provides a single threaded execution context.

Input Ports

Ports (0)

This port consumes the time series whose probability density function the GMM operator estimates. The inputTimeSeries parameter specifies the name of the attribute on this port that contains the time series data. The accepted data types are float64 and list<float64>.

Output Ports

Assignments
This operator allows any SPL expression of the correct type to be assigned to output attributes.
Output Functions
GMMCOF
<any T> T probability()

This function returns the probability of the specified input sample.

<any T> T outlierProbability()

This output function returns the probability that the current data is an outlier.

<any T> T PDFValue()

This function returns the computed probability density function (PDF) value of the given data point.

<any T> T AsIs(T v)

The default function for output attributes. By default, this function assigns to the output attribute the value of the input attribute with the same name.

Ports (0)

This port submits a tuple that contains the estimated probability density function of the time series. A tuple is submitted each time a probability density function is estimated. Custom output functions specify the submitted data. Output tuple attributes whose assignments are not specified are assigned from input attributes.
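
As a sketch of how the output functions listed above might be combined in one output clause, the fragment below assigns three of them explicitly and leaves the remaining attributes to the default pass-through. The composite name, stream names, and attribute names are hypothetical, and the float64 result types are an assumption for a float64 input attribute.

```spl
use com.ibm.streams.timeseries.modeling::GMM;

// Hypothetical reusable composite; the input schema is assumed for illustration.
composite ScoreReadings(output Scored; input stream<timestamp ts, float64 reading> Readings) {
  graph
    stream<timestamp ts, float64 reading, float64 prob, float64 pdf, float64 outlierProb>
        Scored = GMM(Readings) {
      param
        inputTimeSeries: reading;
        trainingSize: 200u;
      output
        // ts and reading are not assigned here, so they are taken
        // from the input attributes with the same names.
        Scored: prob = probability(),
                pdf = PDFValue(),
                outlierProb = outlierProbability();
    }
}
```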

Parameters

This operator supports 5 parameters.

Required: inputTimeSeries, trainingSize

Optional: mixtures, multivariateMode, partitionBy

inputTimeSeries

Specifies the time series attribute in the input tuple.

mixtures

Specifies the number of mixtures (Gaussian components) in the model, given as an attribute expression. The default value is 1u.

multivariateMode

Specifies whether the GMM operator treats the input time series as a univariate or a multivariate entity. If this parameter is set to false, the input time series is treated as univariate and the operator produces a probability for every element in the list. If this parameter is set to true, the operator treats the input time series as a multivariate entity and produces a single output probability for the entire list. The default value is false.
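
A sketch of a multivariate invocation follows. The composite name, stream names, and attribute names are hypothetical. The float64 type of the result attribute is an assumption based on the statement that a single probability is produced for the entire list; this page does not state the result type explicitly. By the same reasoning, with multivariateMode set to false the result attribute would presumably be a list<float64> with one entry per list element.

```spl
use com.ibm.streams.timeseries.modeling::GMM;

// Hypothetical composite: each input tuple carries one feature vector.
composite JointScore(output Scored; input stream<list<float64> features> Vectors) {
  graph
    stream<list<float64> features, float64 jointProb> Scored = GMM(Vectors) {
      param
        inputTimeSeries: features;
        trainingSize: 500u;
        mixtures: 2u;
        // Treat the whole list as one multivariate sample.
        multivariateMode: true;
      output
        // Assumed: one probability for the entire list.
        Scored: jointProb = probability();
    }
}
```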

partitionBy

Specifies the name of the attribute that contains the key values that are associated with the time series values in the input tuple.
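
The fragment below sketches an invocation keyed by a sensor identifier. The composite name, stream names, and the sensorId and reading attributes are hypothetical. How the operator uses the key internally (for example, whether it keeps a separate model for each key value) is not described on this page, so the comments make no claim about it.

```spl
use com.ibm.streams.timeseries.modeling::GMM;

// Hypothetical composite: readings from several sensors share one stream.
composite PerSensorScore(output Scored; input stream<rstring sensorId, float64 reading> Readings) {
  graph
    stream<rstring sensorId, float64 reading, float64 outlierProb> Scored = GMM(Readings) {
      param
        inputTimeSeries: reading;
        trainingSize: 200u;
        // Attribute that holds the key associated with each reading.
        partitionBy: sensorId;
      output
        Scored: outlierProb = outlierProbability();
    }
}
```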

trainingSize

Specifies the number of samples, counted from the beginning of the input time series, that are used to train the Gaussian mixture model.

Code Templates

GMM


stream<${schema}> ${outputStream} = GMM(${inputStream}) 
{
	param
		inputTimeSeries: ${timeSeriesExpression};
		trainingSize: ${trainingSize};
	output
		${outputStream}: ${outputExpression};
}
	
      

Libraries

No description for library.
Library Name: modeling
Library Path: ../../../impl/lib
Include Path: ../../../impl/include
No description for library.
Library Name: boost_serialization