Operator AnomalyDetector

The AnomalyDetector operator can detect anomalous subsequences in an incoming data stream.

The AnomalyDetector operator uses an incoming time series to store a reference time series in memory. The size of that reference is determined by the referenceLength parameter. The reference time series is then segmented into subsequences by using a sliding window. The length of each subsequence is determined by the patternLength parameter, and the sliding window shift is determined by the stepSize parameter. Anomaly detection uses a nearest neighbor classification technique, where the number of neighbors is specified by the patternCount parameter.
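The segmentation and nearest neighbor scoring described above can be sketched in Python as follows. This is an illustration only, not the operator's implementation: the source does not name the distance measure or how neighbor distances are combined, so Euclidean distance and the mean of the nearest distances are assumptions, and the function names segment, euclidean, and knn_score are hypothetical.

```python
import math

def segment(reference, pattern_length, step_size=1):
    """Cut the reference series into subsequences with a sliding window."""
    last_start = len(reference) - pattern_length
    return [reference[i:i + pattern_length]
            for i in range(0, last_start + 1, step_size)]

def euclidean(a, b):
    """Assumed distance measure; the source does not specify one."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_score(candidate, patterns, pattern_count=5):
    """Assumed score: mean distance to the pattern_count nearest patterns."""
    distances = sorted(euclidean(candidate, p) for p in patterns)
    nearest = distances[:pattern_count]
    return sum(nearest) / len(nearest)

# A short alternating reference series, cut into windows of 4 shifted by 2.
reference = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
patterns = segment(reference, pattern_length=4, step_size=2)

# A subsequence that matches the reference scores lower than one with a spike.
normal_score = knn_score([0.0, 1.0, 0.0, 1.0], patterns, pattern_count=2)
spike_score = knn_score([0.0, 9.0, 0.0, 1.0], patterns, pattern_count=2)
```

In this sketch a larger score means the candidate subsequence lies farther from its nearest reference patterns, which matches the direction of the score described for getScore().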

Behavior in a consistent region

  • The operator cannot be the start of a consistent region. If it is, an error occurs when you compile your stream processing application.

Summary

Ports
This operator has 1 input port and 1 output port.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 8 parameters.

Required: inputTimeseries, patternLength, referenceLength

Optional: confidence, inputTimestamp, partitionBy, patternCount, stepSize

Metrics
This operator does not report any metrics.

Properties

Implementation
C++
Threading
Never - Operator never provides a single threaded execution context.

Input Ports

Ports (0)

Consumes time series data for performing online anomaly detection. The inputTimeseries parameter specifies the name of the attribute on this port that contains the time series data. The accepted data type is float64.

Output Ports

Assignments
This operator allows any SPL expression of the correct type to be assigned to output attributes.
Output Functions
dataFcns
float64 getScore()

Returns a value that reflects how anomalous the current input data pattern is relative to the reference (or historical) data. The larger the score, the more anomalous the current subsequence. The range of values that the score can take depends on the values of the input data.

list<float64> getSubsequence()

Returns the anomalous subsequence. This output function will only return a value if the calculated score is greater than the value set by the confidence parameter.

<any T> T getStartTime()

Returns the starting time of the anomalous subsequence. This output function can only be used if the inputTimestamp parameter is defined. This output function will only return a value if the calculated score is greater than the value set by the confidence parameter. The supported return types are SPL::timestamp and uint64, based on the type of the inputTimestamp parameter.

<any T> T getEndTime()

Returns the end time of the anomalous subsequence. This output function can only be used if the inputTimestamp parameter is defined. This output function will only return a value if the calculated score is greater than the value set by the confidence parameter. The supported return types are SPL::timestamp and uint64, based on the type of the inputTimestamp parameter.

<any T> T AsIs()

The default function for output attributes. By default, this function assigns to the output attribute the value of the input attribute with the same name.

Ports (0)

Submits a tuple each time the anomaly score exceeds the value set by the confidence parameter. Custom output functions are used to specify the submitted data. Output tuple attributes whose assignments are not specified are assigned from the input attributes with the same names.
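The submission rule above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the function name anomalies and the dictionary layout are hypothetical, and the strict comparison follows the wording "exceeds" rather than any confirmed implementation detail.

```python
def anomalies(scored_subsequences, confidence=0.0):
    """Yield a record for each subsequence whose score exceeds confidence."""
    for score, subsequence in scored_subsequences:
        if score > confidence:  # strictly greater, following "exceeds"
            yield {"score": score, "subsequence": subsequence}

# Hypothetical (score, subsequence) pairs.
scored = [(0.2, [1.0, 1.1]), (3.5, [1.0, 9.0]), (1.0, [1.0, 2.0])]
emitted = list(anomalies(scored, confidence=1.0))
```

With a confidence of 1.0, only the pair with score 3.5 produces an output record; the pair with score 1.0 does not, because it equals the threshold instead of exceeding it.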

Parameters

Required: inputTimeseries, patternLength, referenceLength

Optional: confidence, inputTimestamp, partitionBy, patternCount, stepSize

confidence

The confidence threshold used to detect anomalous time series segments. Only time series segments whose anomaly score is above the confidence value are reported. The default value is 0f.

inputTimeseries

Specifies the input attribute containing the time series data.

inputTimestamp

Specifies the input attribute containing timestamp data. If this parameter is not specified then no timestamp information will be available. The supported data types are SPL::timestamp and uint64.

partitionBy

Specifies the name of the attribute that contains the key values that are associated with the time series values in the input tuple.

patternCount

Determines the number of time series patterns (nearest neighbors) that are used to score time series segments. The maximum allowable value of patternCount is (referenceLength - patternLength)/stepSize. The default value is 5u.
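The upper bound on patternCount is simple arithmetic and can be checked with a short Python helper. The helper name max_pattern_count is hypothetical, and the use of integer division is an assumption: the source gives the formula but does not say how a fractional result is rounded.

```python
def max_pattern_count(reference_length, pattern_length, step_size=1):
    """Upper bound on patternCount: (referenceLength - patternLength)/stepSize."""
    # Integer division is an assumption; rounding is not documented.
    return (reference_length - pattern_length) // step_size

# Example values chosen for illustration only.
limit = max_pattern_count(reference_length=100, pattern_length=10, step_size=5)
```

For a referenceLength of 100, a patternLength of 10, and a stepSize of 5, the helper gives a limit of 18, so the default patternCount of 5 would be within range.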

patternLength

The length of a time series pattern that is the basis of the analysis. The length determines the shape of the basic pattern of analysis. The value of patternLength should be smaller than referenceLength. Time series patterns are subsets (segments) of the reference time series, taken at various positions along it.

referenceLength

The length of the reference time series stored in memory from which time series patterns are extracted.

stepSize

The length of the sliding window shift along the reference time series to select time series patterns. The stepSize should be strictly smaller than referenceLength - patternLength. The default value is 1u.
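The length constraints stated for patternLength and stepSize can be collected into one pre-flight check. The Python sketch below is a convenience illustration, not part of the toolkit: the function name check_lengths is hypothetical, and the source does not say whether the operator itself enforces these recommendations.

```python
def check_lengths(reference_length, pattern_length, step_size=1):
    """Return a list of documented recommendations that the values violate."""
    problems = []
    if pattern_length >= reference_length:
        problems.append(
            "patternLength should be smaller than referenceLength")
    if step_size >= reference_length - pattern_length:
        problems.append(
            "stepSize should be strictly smaller than "
            "referenceLength - patternLength")
    return problems

# Example values chosen for illustration only.
ok = check_lengths(reference_length=100, pattern_length=10, step_size=1)
bad = check_lengths(reference_length=100, pattern_length=10, step_size=90)
```

An empty list means the values satisfy both documented recommendations; in the second call a stepSize of 90 equals referenceLength - patternLength, so it is flagged.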

Libraries

No description for library.
Library Name: analytics
Library Path: ../../../impl/lib
Include Path: ../../../impl/include
No description for library.
Library Name: tsatapi
Library Path: ../../../impl/lib
Include Path: ../../../impl/include