Operator ARIMA
DEPRECATED: The com.ibm.streams.timeseries.modeling.ARIMA operator is deprecated and is replaced by the com.ibm.streams.timeseries.modeling.ARIMA2 operator. The deprecated operator might be removed in a future release.
The ARIMA operator implements the autoregressive integrated moving average (ARIMA) modeling algorithm. It is a widely used algorithm for time series forecasting. It can be used for short-term or long-term forecasting.
The autoregressive integrated moving average algorithm consists of an autoregressive (AR) component, an integrator (I) component, and a moving average (MA) component. The operator initializes in real time, based on the operator parameter values. You can specify constraints on the model explicitly as parameter values, or let the operator estimate the model from early data by specifying only initSamples.
If you specify the model's coefficients explicitly, also provide the historical data and the residual values. The historical data is the most recent data (in temporal order) that was used to train the model. The residuals are the differences between the training data and the forecasted data. The historical data and the residuals must each contain at least as many values as the order of the model, which is the larger of the number of autoregressive (AR) coefficients and the number of moving average (MA) coefficients.
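The length rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the operator's own validation code; the function names model_order and check_initial_state are hypothetical.

```python
def model_order(ar_coeffs, ma_coeffs):
    """Order as described above: the larger of the AR and MA coefficient counts."""
    return max(len(ar_coeffs), len(ma_coeffs))


def check_initial_state(ar_coeffs, ma_coeffs, history, residuals):
    """Raise ValueError if history or residuals are shorter than the model order.

    Hypothetical helper: the real operator performs its own checks.
    """
    order = model_order(ar_coeffs, ma_coeffs)
    for name, values in (("historyData", history), ("residuals", residuals)):
        if len(values) < order:
            raise ValueError(
                f"{name} needs at least {order} values, got {len(values)}"
            )
    return order


# Two AR coefficients and one MA coefficient give a model of order 2,
# so at least two history values and two residuals are needed.
order = check_initial_state(
    [0.5, 0.2], [0.3], history=[1.0, 1.2], residuals=[0.1, -0.1]
)
```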
ARIMA is a univariate operator that can ingest both univariate and vector time series. ARIMA also supports expanding time series, where the number of components in the input list can increase over time. The operator accepts time series in the following formats:
- A univariate time series as a tuple<float64> or tuple<timestamp timestamp, float64 value>.
- A vector time series as a tuple<list<float64>> or tuple<list<timestamp> timestamps, list<float64> values>.
The ARIMA operator provides two ways of getting the forecasted time series values and timestamp values. It can forecast a value at a single point in the future, or it can provide a range of forecasts up to a point in the future.
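The difference between a single-point forecast and a range of forecasts can be illustrated with a textbook ARMA recursion. The sketch below is an assumption-laden stand-in, not the operator's implementation: it ignores the integrator (I) component, treats unknown future residuals as 0, and uses hypothetical function names.

```python
def forecast_range(ar, ma, mean, history, residuals, steps):
    """Return forecasts for steps 1..steps using a textbook ARMA recursion.

    history and residuals are in temporal order (oldest first) and must be
    at least as long as ar and ma respectively.
    """
    values = [x - mean for x in history]  # work with mean-removed data
    errors = list(residuals)
    forecasts = []
    for _ in range(steps):
        # Coefficient i applies to the value (or residual) i + 1 steps back.
        ar_part = sum(c * values[-(i + 1)] for i, c in enumerate(ar))
        ma_part = sum(c * errors[-(j + 1)] for j, c in enumerate(ma))
        nxt = ar_part + ma_part
        values.append(nxt)   # the forecast feeds later steps
        errors.append(0.0)   # future residuals are unknown, assume 0
        forecasts.append(nxt + mean)
    return forecasts


def forecast_step(ar, ma, mean, history, residuals, step):
    """Return only the forecast at the given step ahead."""
    return forecast_range(ar, ma, mean, history, residuals, step)[-1]


all_steps = forecast_range([0.5], [0.4], 10.0, [11.0, 12.0], [0.0, 1.0], steps=3)
single_step = forecast_step([0.5], [0.4], 10.0, [11.0, 12.0], [0.0, 1.0], step=3)
```

In this sketch the single-point forecast is simply the last element of the range, which mirrors how the two output styles relate for one time series.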
The ARIMA operator supports an optional control port that you can use to retrain the model so that the model adapts to changing trends in the data and predicts time series data more accurately. You can provide the AR and MA coefficients that the model uses for initialization. The specification of the parameters for the ARIMA operator follows a particular format. Parameters with irregular index-to-value mappings, like the AR and MA parameters, use the following format:
<Parameter-name>: {<dimension-index>: { sequence of <param-index>: <value>}, ... };
For parameters with a regular index to value mapping, like means, historyData, and residuals, use the following format:
<Parameter-name>: { <dimension-index>: [ sequence of values ], ... };
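The two formats carry the same kind of information: the irregular form names each index explicitly, and the regular form lists values in order. A minimal Python sketch of the relationship follows, assuming indexes start at 0 and that unspecified indexes are zero (as the AR and MA parameter descriptions state). The function name dense_coefficients is hypothetical.

```python
def dense_coefficients(sparse):
    """Expand {dimension: {index: value}} into {dimension: [values]}.

    Indexes that are not specified are filled with 0.0.
    Assumption: indexes are 0-based.
    """
    dense = {}
    for dim, by_index in sparse.items():
        size = max(by_index) + 1 if by_index else 0
        dense[dim] = [by_index.get(i, 0.0) for i in range(size)]
    return dense


# Dimension 0 skips index 1, dimension 1 skips index 0.
ar_sparse = {0: {0: 0.6, 2: 0.1}, 1: {1: 0.3}}
ar_dense = dense_coefficients(ar_sparse)
```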
Behavior in a consistent region
- The ARIMA operator is not supported in a consistent region. A warning occurs when you compile your streams processing application.
- The operator cannot be the start of a consistent region. An error occurs when you compile your streams processing application.
Exceptions
The ARIMA operator throws an exception in the following cases:
- The stepAhead, maxDimension, or MAOrder parameter value is 0.
Dependencies
The ARIMA operator requires that the Boost library is installed on your operating system.
- Examples
- These examples demonstrate how to use the ARIMA operator.
Summary
- Ports
- This operator has 2 input ports and 2 output ports.
- Windowing
- This operator does not accept any windowing configurations.
- Parameters
- This operator supports 17 parameters.
Required: inputTimeSeries, stepAhead
Optional: AR, MA, inputTimestamp, residuals, historyData, means, AROrder, MAOrder, initSamples, coefficient, differentiator, partitionBy, controlSignal, inputCoefficient, retrainingConfig
- Metrics
- This operator does not report any metrics.
Properties
- Implementation
- C++
- Threading
- Never - Operator never provides a single threaded execution context.
Input Ports
- Ports (0)
Consumes data for training and scoring against the model. The inputTimeSeries parameter specifies the name of the attribute on this port that contains the time series data. The accepted data types are float64 and list<float64>.
- Properties
-
- Optional: false
- ControlPort: false
- TupleMutationAllowed: false
- WindowingMode: NonWindowed
- WindowPunctuationInputMode: Oblivious
- Ports (1)
Accepts control signals to control the behavior of the operator. The controlSignal parameter specifies the name of the attribute on this port that contains the control signal type. The inputCoefficient parameter specifies the attribute name on this port that contains the new input coefficients. The retrainingConfig parameter specifies the attribute name on this port that contains the configuration values used for retraining the model.
- Properties
-
- Optional: true
- ControlPort: true
- TupleMutationAllowed: false
- WindowingMode: NonWindowed
- WindowPunctuationInputMode: Oblivious
Output Ports
- Assignments
- This operator allows any SPL expression of the correct type to be assigned to output attributes.
- Output Functions
-
- ARIMAFunctions
-
- list<float64> forecastedTimeSeriesStep()
-
This function returns a list<float64> value, which holds the values of the forecasted time series data at step n in the future, where current time is step '0'. n is specified by the stepAhead parameter.
- list<list<float64> > forecastedAllTimeSeriesSteps()
-
This function returns a list<list<float64>> value, which is a list of forecasted time series values up to step n. n is specified by the stepAhead parameter. The size of the output time series is the same as the size of the input time series multiplied by the stepAhead parameter value.
- <any T> T AsIs(T v)
-
The default function for output attributes. By default, this function assigns the output attribute to the value of the input attribute with the same name.
- <any T> list<T> forecastedTimestamps()
-
This function returns the predicted timestamps that correspond to the forecasted values. The supported return types are list<timestamp> and list<uint64>.
- Getcoeff
-
- map<rstring,map<uint32,map<uint32,float64> > > coefficients()
-
Returns the AR and MA coefficients of the model as a nested map.
- <any T> T AsIs(T v)
-
The default function for output attributes. By default, this function assigns the output attribute to the value of the input attribute with the same name.
- Ports (0)
-
Submits a tuple containing the forecasted value for the time series. This port submits a tuple each time a forecast is calculated. Custom output functions are used to specify the value of the output tuple attributes. The output tuple attributes whose assignments are not specified are automatically assigned from input attributes.
- Properties
-
- Optional: false
- TupleMutationAllowed: false
- WindowPunctuationOutputMode: Preserving
- Ports (1)
-
Submits a tuple containing the coefficients used by the model. This port submits a tuple each time a Monitor signal is consumed on the input control port. The coefficients() output function is used to assign the value of the coefficients to an attribute. The expected type of the attribute is map<rstring,map<uint32,map<uint32,float64> > >.
- Properties
-
- Optional: true
- TupleMutationAllowed: false
- WindowPunctuationOutputMode: Preserving
Parameters
- inputTimeSeries
Specifies the name of the attribute that contains the time series data in the input tuple. The supported types are float64 and list<float64>.
- Properties
-
- Cardinality: 1
- Optional: false
- ExpressionMode: Attribute
- stepAhead
Specifies the forecast horizon, in samples. The ARIMA operator produces a single-point forecast at the specified step. The value is specified as Xu, where X is the number of steps ahead (for example, 3u for three steps ahead).
- Properties
-
- Type: uint32
- Cardinality: 1
- Optional: false
- ExpressionMode: Constant
- AR
This parameter of type map<uint32,map<uint32,float64>> specifies the autoregressive coefficients as a series of {dimension: {index: value}} entries, separated by commas and enclosed in curly braces. For example: AR: { 0u: { 0u: 2.0, 1u: 0.6 } }. Coefficients at unspecified indexes are assumed to be zero. This parameter is mandatory if initSamples is not specified.
- Properties
-
- Type: map<uint32,map<uint32,float64> >
- Cardinality: 1
- Optional: true
- ExpressionMode: Constant
- MA
This parameter of type map<uint32,map<uint32,float64>> specifies the moving average coefficients as a series of {dimension: {index: value}} entries, separated by commas and enclosed in curly braces. For example: MA: { 0u: { 0u: 2.0, 1u: 0.6 } }. Coefficients at unspecified indexes are assumed to be zero. This parameter is mandatory if initSamples is not specified.
- Properties
-
- Type: map<uint32,map<uint32,float64> >
- Cardinality: 1
- Optional: true
- ExpressionMode: Constant
- inputTimestamp
Specifies the name of the attribute in the input stream that contains the timestamp values. The supported types are uint64 and timestamp. If the type is uint64, then the parameter value represents the number of nanoseconds since UNIX epoch.
- Properties
-
- Type
- Cardinality: 1
- Optional: true
- ExpressionMode: Attribute
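To make the uint64 timestamp convention concrete, the following Python sketch converts a count of nanoseconds since the UNIX epoch into a calendar time. It is an illustration only; the function names are hypothetical and the sample value is arbitrary.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def split_epoch_nanos(nanos):
    """Split nanoseconds since the UNIX epoch into (seconds, nanoseconds)."""
    return divmod(nanos, NANOS_PER_SECOND)


def to_utc_datetime(nanos):
    """Convert nanoseconds since the UNIX epoch to a UTC datetime.

    datetime has microsecond resolution, so sub-microsecond digits are dropped.
    """
    seconds, remainder = split_epoch_nanos(nanos)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=remainder // 1000)


stamp = to_utc_datetime(1_500_000_000_123_456_789)
```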
- residuals
This parameter of type map<uint32, list<float64>> specifies the residual values for each input dimension. This parameter is mandatory if initSamples is not specified.
- Properties
-
- Type: map<uint32,list<float64> >
- Cardinality: 1
- Optional: true
- ExpressionMode: Constant
- historyData
This parameter of type map<uint32, list<float64>> specifies the history values for each input dimension. This parameter is mandatory if initSamples is not specified.
- Properties
-
- Type: map<uint32, list<float64> >
- Cardinality: 1
- Optional: true
- ExpressionMode: Constant
- means
This parameter of type map<uint32, float64> specifies the mean values for each input dimension. This parameter is mandatory if initSamples is not specified.
- Properties
-
- Type: map<uint32, float64>
- Cardinality: 1
- Optional: true
- ExpressionMode: Constant
- AROrder
This parameter of type uint32 specifies the order that the autoregressive model uses to make the prediction. The default value is 0u.
- Properties
-
- Type: uint32
- Cardinality: 1
- Optional: true
- ExpressionMode: Constant
- MAOrder
This parameter of type uint32 specifies the order that the moving average model uses to make the prediction. The default value is 1u. This parameter is valid only when the initSamples parameter is specified.
- Properties
-
- Type: uint32
- Cardinality: 1
- Optional: true
- ExpressionMode: Constant
- initSamples
This parameter of type uint32 triggers the autoregressive parameter estimation mode. The parameter value specifies the number of input time series values that are used to initialize the model. The parameter is mandatory if the following parameters are not jointly specified: AR, MA, historyData, and means.
- Properties
-
- Type: uint32
- Cardinality: 1
- Optional: true
- ExpressionMode: Constant
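The idea of estimating autoregressive coefficients from the first initSamples values can be sketched as follows. This page does not say which estimator the operator uses, so the Yule-Walker method (solved with the Levinson-Durbin recursion) is used here purely as a common stand-in. The function names, the simulated data, and the sample count are all assumptions for illustration.

```python
import random


def autocovariance(series, lag):
    """Biased sample autocovariance of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    return sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    ) / n


def estimate_ar(series, order):
    """Estimate AR coefficients by solving the Yule-Walker equations
    with the Levinson-Durbin recursion (a stand-in estimator)."""
    r = [autocovariance(series, k) for k in range(order + 1)]
    phi = []
    error = r[0]
    for k in range(1, order + 1):
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        reflection = acc / error
        phi = [
            phi[j] - reflection * phi[k - 2 - j] for j in range(k - 1)
        ] + [reflection]
        error *= 1.0 - reflection * reflection
    return phi


# Simulate an AR(1) series with coefficient 0.6, then estimate the
# coefficient from the first init_samples values only.
random.seed(7)
samples = [0.0]
for _ in range(4999):
    samples.append(0.6 * samples[-1] + random.gauss(0.0, 1.0))

init_samples = 2000
ar_estimate = estimate_ar(samples[:init_samples], 1)
```

A larger sample count generally gives a tighter estimate, at the cost of a longer delay before the first forecast can be produced.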
- coefficient
Returns the coefficients of the model. The returned type is map<rstring,map<uint32, float64>>
- Properties
-
- Type: map<rstring,map<uint32,float64>>
- Cardinality: 1
- Optional: true
- ExpressionMode: Expression
- differentiator
This parameter of type map<uint32, float64> specifies the differentiator value for each input dimension. If this parameter is not specified, then a value of 0 is used for all dimensions.
- Properties
-
- Type: map<uint32, float64>
- Cardinality: 1
- Optional: true
- ExpressionMode: Constant
- partitionBy
Specifies the name of the attribute that contains the key values that are associated with the time series values in the input tuple.
- Properties
-
- Type: rstring
- Optional: true
- ExpressionMode: Expression
- PortScope: 0, 1
- controlSignal
Specifies the name of the attribute on the control port that holds the control signal. The supported control signals are: TSSignal.Retrain, TSSignal.Monitor, TSSignal.Load, TSSignal.Suspend, TSSignal.Resume.
- Properties
-
- Type: enum{Monitor,Load,Retrain,RetrainAll,Suspend,Resume,UpdateParamsAll}
- Cardinality: 1
- Optional: true
- ExpressionMode: Expression
- PortScope: 1
- inputCoefficient
Specifies the name of the attribute on the control port that carries the coefficients used for loading the model. If this parameter is not specified, the attribute named inputCoefficient is used by default. If neither the inputCoefficient parameter nor the default attribute is provided, the operator throws an exception. If the attribute does not contain valid coefficients, the load operation fails and the operator logs a warning message for each failed operation. The operator continues to predict values by using the older coefficients. The supported type is map<rstring,map<uint32,map<uint32,float64>>>.
- Properties
-
- Type: map<rstring,map<uint32,map<uint32,float64>>>
- Cardinality: 1
- Optional: true
- ExpressionMode: Expression
- PortScope: 1
- retrainingConfig
Specifies the name of the attribute on the control port that carries the configuration used for retraining the model. If this parameter is not specified, the attribute named retrainingConfig is used by default. If neither the retrainingConfig parameter nor the default attribute is provided, the operator throws an exception. If the attribute does not contain a valid configuration, the retrain operation fails and the operator logs a warning message for each failed operation. The operator continues to predict values by using the older configuration. The supported type is map<rstring,uint32>.
- Properties
-
- Type: map<rstring,uint32>
- Cardinality: 1
- Optional: true
- ExpressionMode: Expression
- PortScope: 1
Code Templates
- ARIMA
-
stream<${schema}> ${outputStream} = ARIMA(${inputStream}) {
  param
    inputTimeSeries: ${timeSeriesExpression};
    initSamples: ${initSamplesExpression};
    AROrder: ${AROrderExpression};
    stepAhead: ${stepAheadExpression};
  output
    ${outputStream}: ${outputExpression};
}
Libraries
- No description for library.
- Library Name: utils, modeling
- Library Path: ../../../impl/lib
- Include Path: ../../../impl/include
- No description for library.
- Library Name: watfore, modeling
- Library Path: ../../../impl/lib
- Include Path: ../../../impl/include