Overview (ARIMA command)

ARIMA estimates nonseasonal and seasonal univariate ARIMA models with or without fixed regressor variables. The procedure uses a subroutine library written by Craig Ansley that produces maximum-likelihood estimates and can process time series with missing observations.

Options

Model Specification. The traditional ARIMA (p,d,q)(sp,sd,sq) model incorporates nonseasonal and seasonal parameters multiplicatively and can be specified on the MODEL subcommand. You can also specify ARIMA models and constrained ARIMA models by using the separate parameter-order subcommands P, D, Q, SP, SD, and SQ.
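To illustrate what "multiplicatively" means here, the following Python sketch multiplies a nonseasonal and a seasonal autoregressive polynomial in the backshift operator. It is an illustration of the arithmetic only, not the procedure's code; the helper names and the coefficient values (0.5, 0.3, period 12) are hypothetical.

```python
def multiply_lag_polynomials(a, b):
    """Multiply two polynomials in the backshift operator B.

    Each list holds coefficients indexed by lag: a[k] multiplies B**k.
    """
    product = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            product[i + j] += a_i * b_j
    return product


def ar_polynomial(coefficients, step=1):
    """Build 1 - c1*B**step - c2*B**(2*step) - ... as a coefficient list."""
    poly = [0.0] * (len(coefficients) * step + 1)
    poly[0] = 1.0
    for k, c in enumerate(coefficients, start=1):
        poly[k * step] = -c
    return poly


# Hypothetical values: one nonseasonal AR term, one seasonal AR term, period 12.
phi, seasonal_phi, period = 0.5, 0.3, 12
combined = multiply_lag_polynomials(
    ar_polynomial([phi]), ar_polynomial([seasonal_phi], step=period)
)

# Lags that carry a nonzero coefficient in the combined polynomial.
nonzero_lags = {lag: c for lag, c in enumerate(combined) if c != 0.0}
print(nonzero_lags)
```

The product has terms at lags 1 and 12 and also a cross term at lag 13 (period + 1), whose coefficient is the product of the two parameters. That cross term is what distinguishes a multiplicative seasonal model from one that simply lists lags 1 and 12.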

Parameter Specification. If you specify the model in the traditional (p,d,q)(sp,sd,sq) format on the MODEL subcommand, you can additionally specify the period length, whether a constant should be included in the model (using the keyword CONSTANT or NOCONSTANT), and whether the series should first be log transformed (using the keyword NOLOG, LG10 (or LOG), or LN). You can fit single or nonsequential parameters by using the separate parameter-order subcommands to specify the exact lags. You can also specify initial values for any of the parameters using the AR, MA, SAR, SMA, REG, and CON subcommands.

Iterations. You can specify termination criteria using the MXITER, MXLAMB, SSQPCT, and PAREPS subcommands.

Confidence Intervals. You can control the size of the confidence interval using the CINPCT subcommand.

Statistical Output. To display only the final parameter statistics, specify TSET PRINT=BRIEF before ARIMA. To include parameter estimates at each iteration in addition to the default output, specify TSET PRINT=DETAILED.

New Variables. To evaluate model statistics without creating new variables, specify TSET NEWVAR=NONE prior to ARIMA. This can reduce processing time. To add new variables without erasing the values of variables previously generated by Forecasting procedures, specify TSET NEWVAR=ALL. This saves all new variables generated during the current session to the active dataset and may require extra processing time.

Forecasting. When used with the PREDICT command, an ARIMA model with no regressor variables can produce forecasts and confidence limits beyond the end of the series (see PREDICT for more information).

Basic Specification

The basic specification is the dependent series name. To estimate an ARIMA model, the MODEL subcommand and/or separate parameter-order subcommands (or the APPLY subcommand) must also be specified. Otherwise, only the constant will be estimated.

  • ARIMA estimates the parameter values of a model using the parameter specifications on the MODEL subcommand and/or the separate parameter-order subcommands P, D, Q, SP, SD, and SQ.
  • A 95% confidence interval is used unless it is changed by a TSET CIN command prior to the ARIMA procedure.
  • Unless the default on TSET NEWVAR is changed prior to ARIMA, five variables are automatically created, labeled, and added to the active dataset: fitted values (FIT#1), residuals (ERR#1), lower confidence limits (LCL#1), upper confidence limits (UCL#1), and standard errors of prediction (SEP#1).
  • By default, ARIMA performs up to 10 iterations unless one of three termination criteria is met first: the change in all parameters is less than the TSET CNVERGE value (the default value is 0.001); the sum-of-squares percentage change is less than 0.001%; or the Marquardt constant exceeds 10^9 (1.0E9).
  • At each iteration, the Marquardt constant and adjusted sum of squares are displayed. For the final estimates, the displayed results include the parameter estimates, standard errors, t ratios, estimate of residual variance, standard error of the estimate, log likelihood, Akaike’s information criterion (AIC; see note 1), Schwarz’s Bayesian criterion (SBC; see note 2), and covariance and correlation matrices.
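The default stopping rules described above can be summarized in a short Python sketch. This is only a reading of the text, not the procedure's implementation: the function name, the argument names, and the order in which the checks are made are assumptions, and "change in all parameters" is interpreted as the largest absolute change across parameters.

```python
def stop_reason(iteration, max_param_change, ssq_pct_change, marquardt,
                max_iterations=10, param_eps=0.001, ssq_pct=0.001,
                max_lambda=1.0e9):
    """Return why estimation would stop, or None to keep iterating.

    ssq_pct_change and ssq_pct are expressed in percent (0.001 means 0.001%).
    All names and the check order are hypothetical.
    """
    if max_param_change < param_eps:
        return "parameter change below tolerance"
    if abs(ssq_pct_change) < ssq_pct:
        return "sum-of-squares percentage change below tolerance"
    if marquardt > max_lambda:
        return "Marquardt constant too large"
    if iteration >= max_iterations:
        return "maximum iterations reached"
    return None


# Iteration 3: parameters still moving, so estimation continues.
print(stop_reason(3, max_param_change=0.02, ssq_pct_change=0.5, marquardt=0.001))

# Iteration 4: every parameter changed by less than 0.001, so estimation stops.
print(stop_reason(4, max_param_change=0.0004, ssq_pct_change=0.2, marquardt=0.001))
```

The default arguments mirror the values stated in the text (10 iterations, 0.001, 0.001%, and 1.0E9).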

Subcommand Order

  • Subcommands can be specified in any order.

Syntax Rules

  • VARIABLES can be specified only once.
  • Other subcommands can be specified more than once, but only the last specification of each one is executed.
  • The CONSTANT, NOCONSTANT, NOLOG, LN, and LOG specifications are optional keywords on the MODEL subcommand and are not independent subcommands.

Operations

  • If differencing is specified in models with regressors, both the dependent series and the regressors are differenced. To difference only the dependent series, use the DIFF or SDIFF function on CREATE to create a new series (see CREATE for more information).
  • When ARIMA is used with the PREDICT command to forecast values beyond the end of the series, the original series and residual variable are assigned the system-missing value after the last case in the original series.
  • The USE and PREDICT ranges cannot be exactly the same; at least one case from the USE period must precede the PREDICT period. (See USE and PREDICT for more information.)
  • If a LOG or LN transformation is specified, the residual (error) series is reported in the logged metric; it is not transformed back to the original metric. This is so the proper diagnostic checks can be done on the residuals. However, the predicted (forecast) values are transformed back to the original metric. Thus, the observed value minus the predicted value will not equal the residual value. A new residual variable in the original metric can be computed by subtracting the predicted value from the observed value.
  • Specifications on the P, D, Q, SP, SD, and SQ subcommands override specifications on the MODEL subcommand.
  • For ARIMA models with a fixed regressor, the number of forecasts and confidence intervals produced cannot exceed the number of observations for the regressor (independent) variable. Regressor series cannot be extended.
  • Models of series with embedded missing observations can take longer to estimate.
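The point about log-transformed series can be made concrete with a small Python sketch. The data values are invented for illustration, and the back-transformation is assumed to be a plain exponential; the procedure's actual back-transformation may differ (for example, by including an adjustment term).

```python
import math

# Hypothetical observed series and fitted values on the natural-log scale.
observed = [120.0, 135.0, 150.0]
fit_log = [math.log(118.0), math.log(138.0), math.log(149.0)]

# Residuals as reported with LN: computed in the logged metric.
residual_log = [math.log(y) - f for y, f in zip(observed, fit_log)]

# Predicted values transformed back to the original metric (assumed: exp).
predicted = [math.exp(f) for f in fit_log]

# A residual in the original metric: observed minus predicted.
residual_original = [y - p for y, p in zip(observed, predicted)]

for r_log, r_orig in zip(residual_log, residual_original):
    print(f"log metric: {r_log:+.4f}   original metric: {r_orig:+.4f}")
```

The two residual columns differ in scale, which is why observed minus predicted does not reproduce the reported residual when a log transformation is in effect.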

Limitations

  • Maximum 1 VARIABLES subcommand.
  • Maximum 1 dependent series. There is no limit on the number of independent series.
  • Maximum 1 model specification.

1 Akaike, H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control, AC-19, 716-723.
2 Schwarz, G. 1978. Estimating the dimension of a model. Annals of Statistics, 6, 461-464.