Operator IncrementalInterpolate

The IncrementalInterpolate operator calculates missing values in a time series.

A designated value, the missing value code, marks the missing entries in the time series data. You can configure this value or use the default value. You can also specify the algorithm that the IncrementalInterpolate operator uses to calculate the missing values.

For example, if you use -1111111111.1 to indicate missing values in the following time series data and use the last algorithm to calculate the missing values, the IncrementalInterpolate operator replaces each occurrence of -1111111111.1 with the value that is at the same position in the previous time series.

Original time series data:

1.1,2.2,3.3,4.4
5.5,-1111111111.1,7.7,8.8

Interpolated time series data:

1.1,2.2,3.3,4.4
5.5,2.2,7.7,8.8 
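
The replacement rule in this example can be sketched in Python. This is only an illustration of the behavior described above, not the operator's implementation. The function name interpolate_last is hypothetical, and leaving a missing value unchanged when no earlier value exists at that position is an assumption that the text does not confirm.

```python
MISSING = -1111111111.1  # missing-value code used in the example above


def interpolate_last(rows, missing=MISSING):
    """Replace each missing value with the last observed value at the same index."""
    last_seen = {}  # index -> most recent non-missing value
    result = []
    for row in rows:
        filled = []
        for i, value in enumerate(row):
            if value == missing and i in last_seen:
                value = last_seen[i]  # reuse the previous observation
            elif value != missing:
                last_seen[i] = value  # remember the newest observation
            # Assumption: with no earlier observation, the code is left as-is.
            filled.append(value)
        result.append(filled)
    return result


rows = [[1.1, 2.2, 3.3, 4.4], [5.5, -1111111111.1, 7.7, 8.8]]
print(interpolate_last(rows))  # [[1.1, 2.2, 3.3, 4.4], [5.5, 2.2, 7.7, 8.8]]
```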

Behavior in a consistent region

  • The IncrementalInterpolate operator is not supported in a consistent region. A warning occurs when you compile your streams processing application.
  • The operator cannot be the start of a consistent region. An error occurs when you compile your streams processing application.

Exceptions

The IncrementalInterpolate operator does not throw any exceptions.

Examples

The following example shows how the IncrementalInterpolate operator uses the last algorithm to calculate the missing data values in time series data:


composite Main {
  graph
    stream <int64 stampTime, list<float64> KPIs> TimeSeries  = FileSource()
    {
      param
        file   : "KPIStream.csv";
        format : csv;
    } 
    stream <list<float64> KPIs, list<float64> missingRatio>
            interpolatedTimeSeriesOut = IncrementalInterpolate (TimeSeries)
    {
      param
        inputTimeSeries  : KPIs;
        algorithm        : last;
        missingValueCode : -99999999.9;
      output
        interpolatedTimeSeriesOut : KPIs = interpolatedTimeSeries(),
        missingRatio = missingDataRatio();
    }
    () as Snk1 = FileSink(interpolatedTimeSeriesOut)
    {
      param
        file   : "iKPIStream.csv";
        format : csv;
    }
}

The following example shows a sample input file:

22.4,33.3,44.3
21.4,-99999999.9,11.3

The following example shows the time series that is generated for the preceding sample input file:

22.4,33.3,44.3
21.4,33.3,11.3

Summary

Ports
This operator has 1 input port and 1 output port.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 3 parameters.

Required: inputTimeSeries

Optional: algorithm, missingValueCode

Metrics
This operator reports 1 metric.

Properties

Implementation
C++
Threading
Always - Operator always provides a single-threaded execution context.

Input Ports

Ports (0)

This port consumes data for training and interpolating. The inputTimeSeries parameter specifies the name of the attribute on this port that contains the time series data. The accepted data types are float64 and list<float64>.

Output Ports

Assignments
This operator allows any SPL expression of the correct type to be assigned to output attributes.
Output Functions
InterpolateFunctions
<any T> T interpolatedTimeSeries()

Returns the interpolated time series as a sequence of multivariate (list<float64>) values or as a sequence of univariate (float64) values. If the return type is float64, the tuple contains either the original time series value or the interpolated value, depending on whether interpolation was performed. If the return type is list<float64>, the entire time series is returned with the missing values replaced by the interpolated values.

<any T> T missingDataRatio()

Returns the ratio of the number of missing values in the input time series to the number of elements in the input time series.
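
The ratio described above can be sketched in Python. This is an illustration only: the function name missing_data_ratio is hypothetical, the sketch treats one input time series as a flat list, and returning 0.0 for an empty series is an assumption.

```python
def missing_data_ratio(series, missing_code):
    """Fraction of elements in one input time series equal to the missing-value code."""
    if not series:
        return 0.0  # assumption: an empty series has no missing values
    missing = sum(1 for value in series if value == missing_code)
    return missing / len(series)


# One of the three elements is missing in this series.
print(missing_data_ratio([21.4, -99999999.9, 11.3], -99999999.9))
```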

<any T> T AsIs(T v)

The default function for output attributes. By default, this function assigns the value of the input attribute with the same name to the output attribute.

Ports (0)

This port submits a tuple that contains the original time series with missing values replaced by interpolated values. This port submits a tuple for each input tuple. Custom output functions are used to specify the value of the output tuple attributes. The output tuple attributes whose assignments are not specified are assigned from input attributes.

Parameters

This operator supports 3 parameters.

Required: inputTimeSeries

Optional: algorithm, missingValueCode

algorithm

Specifies the algorithm that the IncrementalInterpolate operator uses to calculate the missing data values in the time series data. The following options are supported. Each option uses training data as a reference to predict missing values in a time series. The default value is last.

  • last: This option replicates the most recently observed value of the input time series. If the input is a multivariate time series, the missing value is replaced with the value that was last observed at the corresponding index. The minimum training data size (the number of input time series without a missing value in the corresponding dimension) is one.
  • average: This option calculates the missing value in the time series by using the moving average method. The training data for this algorithm is the list of all the elements of the input time series that the operator has received so far. The minimum size of the training data is five, which means that the operator must have received at least five time series with no missing values.
  • predictive: This option calculates the missing value by using the prediction algorithm. The training data is the list of all the input time series that the operator has received so far. The minimum size of the training data is five, which means that the operator must have received at least five time series with no missing values.
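
The minimum training size that the average and predictive options require can be illustrated with a short Python sketch. It is not the operator's algorithm. The class name AverageInterpolator is hypothetical, a plain mean over all training series stands in for the moving average method (the window is not documented here), and leaving the missing value code unchanged until five complete series have arrived is an assumption.

```python
MIN_TRAINING = 5  # minimum number of complete series stated for average/predictive


class AverageInterpolator:
    """Illustrative stand-in for the average option; not the operator's algorithm."""

    def __init__(self, missing_code):
        self.missing_code = missing_code
        self.training = []  # complete series (no missing values) seen so far

    def process(self, series):
        complete = self.missing_code not in series
        ready = len(self.training) >= MIN_TRAINING
        filled = []
        for i, value in enumerate(series):
            if value == self.missing_code and ready:
                # Assumption: average the same index over all training series.
                value = sum(row[i] for row in self.training) / len(self.training)
            filled.append(value)
        if complete:
            self.training.append(list(series))  # only complete series train
        return filled


interp = AverageInterpolator(-99999999.9)
for k in range(1, 6):
    interp.process([float(k), 10.0 * k])  # five complete series
print(interp.process([6.0, -99999999.9]))  # [6.0, 30.0]
```

In this sketch a series that contains a missing value never joins the training data, which mirrors the statement that the minimum is counted in time series with no missing values.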

inputTimeSeries

Specifies the name of the attribute that contains the time series data in the input tuple. The supported data types are float64 and list<float64>.

missingValueCode

Specifies the data value that marks missing data values in the time series data. The default value is -1000000000.0. If the value specified in the missingValueCode parameter occurs in the input time series, the IncrementalInterpolate operator interpolates the data values by using the algorithm that is specified in the algorithm parameter.

For example, if the missingValueCode parameter is set to -99999999.9, the IncrementalInterpolate operator calculates the data values for all occurrences of -99999999.9 in the following time series data:


    21.4,-99999999.9,11.3,22.6,-99999999.9,5.2
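
As a small illustration, the following Python sketch locates the positions that this setting marks as missing in the series above. The variable names are hypothetical, and the exact equality comparison is an assumption about how the code is matched.

```python
MISSING_VALUE_CODE = -99999999.9

series = [21.4, -99999999.9, 11.3, 22.6, -99999999.9, 5.2]

# Positions whose value equals the missing-value code (exact comparison assumed).
missing_positions = [i for i, value in enumerate(series) if value == MISSING_VALUE_CODE]
print(missing_positions)  # [1, 4]
```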

Metrics

nValuesInterpolated - Counter

The total number of missing data values that have been replaced.

Libraries

No description for library.
Library Name: tsatapi
Library Path: ../../../impl/lib
Include Path: ../../../impl/include