IBM InfoSphere Streams Version 4.1.1

Operator AnomalyDetector

The AnomalyDetector operator can detect anomalous subsequences in an incoming data stream.

The AnomalyDetector operator uses the incoming time series to build a reference time series that it stores in memory. The size of that reference is determined by the referenceLength parameter. The reference time series is then segmented into subsequences by using a sliding window. The length of each subsequence is determined by the patternLength parameter, and the sliding window shift is determined by the stepSize parameter. Anomaly detection uses a nearest neighbor classification technique, where the number of neighbors is specified by the patternCount parameter.
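
The segmentation and nearest neighbor steps described above can be illustrated with a minimal Python sketch. This is not the operator's implementation: the function names are hypothetical, and the scoring rule (mean Euclidean distance to the nearest patterns) is an assumption chosen for illustration, because this page does not document how the operator computes its score.

```python
import math


def segment(reference, pattern_length, step_size=1):
    """Cut the reference series into subsequences with a sliding window."""
    last_start = len(reference) - pattern_length
    return [reference[i:i + pattern_length]
            for i in range(0, last_start + 1, step_size)]


def anomaly_score(candidate, patterns, pattern_count=5):
    """Mean Euclidean distance from the candidate to its nearest patterns.

    Assumed scoring rule, for illustration only.
    """
    distances = sorted(math.dist(candidate, p) for p in patterns)
    nearest = distances[:pattern_count]
    return sum(nearest) / len(nearest)


# A short alternating reference series (referenceLength = 8).
reference = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
patterns = segment(reference, pattern_length=4, step_size=2)

# A candidate that matches the reference scores low; one that does not scores high.
normal = anomaly_score([0.0, 1.0, 0.0, 1.0], patterns, pattern_count=2)
unusual = anomaly_score([5.0, 5.0, 5.0, 5.0], patterns, pattern_count=2)
```

In this sketch a larger score means the candidate subsequence is farther from everything seen in the reference, which is the intuition behind comparing the score against a threshold.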

Behavior in a consistent region

  • The operator cannot be the start of a consistent region. If it is configured as the start of a consistent region, an error occurs when you compile your stream processing application.

Summary

Ports
This operator has 1 input port and 1 output port.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 7 parameters.

Required: referenceLength, inputTimeseries, patternLength

Optional: stepSize, confidence, patternCount, inputTimestamp

Metrics
This operator does not report any metrics.

Properties

Implementation
C++
Threading
Never - Operator never provides a single threaded execution context.

Input Ports

Ports (0)

Consumes time series data for performing online anomaly detection. The inputTimeseries parameter specifies the name of the attribute on this port that contains the time series data. The accepted data type is float64.

Output Ports

Assignments
This operator allows any SPL expression of the correct type to be assigned to output attributes.
Output Functions
dataFcns
float64 getScore()

Returns the anomaly score of the pattern.

list<float64> getSubsequence()

Returns the anomalous subsequence.

<any T> T getStartTime()

Returns the starting time of the anomalous subsequence. This output function can only be used if the inputTimestamp parameter is defined. The supported return types are SPL::timestamp and uint64, based on the type of the inputTimestamp parameter.

<any T> T getEndTime()

Returns the end time of the anomalous subsequence. This output function can only be used if the inputTimestamp parameter is defined. The supported return types are SPL::timestamp and uint64, based on the type of the inputTimestamp parameter.

<any T> T AsIs()

The default function for output attributes. By default, this function assigns to the output attribute the value of the input attribute with the same name.

Ports (0)

Submits a tuple each time an anomaly score exceeds the value set by the confidence parameter. Custom output functions are used to specify the submitted data. Output tuple attributes whose assignments are not specified are assigned from the input attributes with the same name.
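
The submission rule above can be sketched in Python as a simple filter. The function name and the record layout are hypothetical. The sketch only mirrors the documented behavior that a tuple is submitted when the score exceeds the confidence value, and the dictionary keys loosely stand in for the getScore() and getSubsequence() output functions.

```python
def submit_anomalies(scored_segments, confidence=0.0):
    """Yield one output record per segment whose score exceeds confidence."""
    for score, subsequence in scored_segments:
        if score > confidence:  # strictly greater, as in "exceeds"
            yield {"score": score, "subsequence": subsequence}


# Hypothetical (score, subsequence) pairs produced by an earlier scoring step.
scored = [(0.2, [1.0, 1.1]), (3.5, [9.0, 0.1]), (0.0, [1.0, 1.0])]

# With a threshold of 1.0, only the second segment is reported.
results = list(submit_anomalies(scored, confidence=1.0))

# With the default threshold of 0.0, every segment with a positive score is reported.
default_results = list(submit_anomalies(scored))
```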

Parameters

stepSize

The distance that the sliding window shifts along the reference time series when selecting time series patterns. The value of stepSize should be strictly smaller than referenceLength - patternLength. The default value is 1u.

referenceLength

The length of the reference time series stored in memory from which time series patterns are extracted.

confidence

The confidence threshold used to detect anomalous time series segments. Only time series segments whose anomaly score is above the confidence value are reported. The default value is 0f.

patternCount

The number of time series patterns used as nearest neighbors when detecting anomalous time series segments. The maximum allowable value of patternCount is (referenceLength - patternLength)/stepSize. The default value is 5u.

inputTimeseries

Specifies the input attribute containing the time series data.

inputTimestamp

Specifies the input attribute containing timestamp data. If this parameter is not specified, no timestamp information is available. The supported data types are SPL::timestamp and uint64.

patternLength

The length of a time series pattern, which is the basic unit of the analysis. The value of patternLength should be smaller than referenceLength. Time series patterns are subsets (segments) of the reference time series, taken at various positions along it.
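
The constraints that the parameter descriptions place on referenceLength, patternLength, stepSize, and patternCount can be checked together. The following Python sketch is a hypothetical helper, not part of the toolkit. It assumes that the division in (referenceLength - patternLength)/stepSize is integer division and that stepSize is at least 1. Neither assumption is stated on this page.

```python
def check_parameters(reference_length, pattern_length, step_size=1, pattern_count=5):
    """Return a list of violated constraints (empty when all of them hold)."""
    if step_size < 1:
        # Assumption: a zero shift is not meaningful for a sliding window.
        return ["stepSize must be at least 1"]
    if pattern_length >= reference_length:
        return ["patternLength must be smaller than referenceLength"]

    problems = []
    span = reference_length - pattern_length
    if step_size >= span:
        problems.append(
            "stepSize must be strictly smaller than referenceLength - patternLength")
    max_count = span // step_size  # assumed integer division
    if pattern_count > max_count:
        problems.append(f"patternCount must not exceed {max_count}")
    return problems


# Defaults (stepSize 1, patternCount 5) with a reference of 100 and patterns of 10.
ok = check_parameters(100, 10)

# A shift of 30 leaves (100 - 10) // 30 = 3 patterns, so patternCount 5 is too large.
too_many = check_parameters(100, 10, step_size=30, pattern_count=5)

# A pattern as long as the reference leaves nothing to slide over.
too_long = check_parameters(10, 10)
```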

Libraries

No description for library.
Library Name: analytics
Library Path: ../../../impl/lib
Include Path: ../../../impl/include
No description for library.
Library Name: tsatapi
Library Path: ../../../impl/lib
Include Path: ../../../impl/include