In multiple linear regression, the output is a linear combination of several input variables. In an autoregressive (AR) model, the output is the next data point, expressed as a linear combination of the past $p$ data points, where $p$ is the number of lags included in the equation. An AR(1) model is defined mathematically as:

$$x_t = \delta + \phi x_{t-1} + \alpha_t$$
$x_{t-1}$ is the series value one lag back
$\phi$ is the estimated coefficient for that lag
$\alpha_t$ is white noise (a random error term)
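$\delta$ is defined as

$$\delta = \left(1 - \sum_{i=1}^{p} \phi_i\right)\mu$$

for an autoregressive model of order $p$, where $p$ is the number of lag terms in the model, $\phi_i$ is the coefficient of the $i$-th lag, and $\mu$ is the process mean.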
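As a quick check on the role of the constant term, the sketch below assumes the usual convention that the intercept equals one minus the sum of the lag coefficients, multiplied by the process mean. The function name `ar_intercept` and the numeric values are illustrative assumptions, not taken from the text.

```python
def ar_intercept(mu, phis):
    """Return delta = (1 - sum(phis)) * mu for an AR(p) process with mean mu."""
    return (1.0 - sum(phis)) * mu


mu = 10.0           # assumed process mean (illustrative value)
phis = [0.5, 0.3]   # assumed AR(2) lag coefficients (illustrative values)
delta = ar_intercept(mu, phis)

# With the noise term set to zero, a series sitting at the mean stays there:
# mu = delta + phi_1 * mu + phi_2 * mu
fixed_point = delta + sum(phi * mu for phi in phis)
print(delta, fixed_point)
```

The second computation shows why the intercept takes this form: it is the constant that makes the process mean a fixed point of the noise-free recursion.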
When more lags are added to the model, we add more coefficients and lag variables to the equation:

$$x_t = \delta + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \alpha_t$$
The preceding model is a second-order autoregression since it contains two lags.
The general form of an autoregressive equation of order $p$ is

$$x_t = \delta + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + \alpha_t$$
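To make the order-$p$ recursion concrete, here is a minimal simulation sketch using NumPy. It assumes Gaussian white noise and starts the series from zeros, so the first few values are a warm-up rather than draws from the stationary process. The function name `simulate_ar` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def simulate_ar(phis, delta, n_steps, noise_std=1.0, seed=0):
    """Simulate n_steps values of an AR(p) process with Gaussian white noise."""
    rng = np.random.default_rng(seed)
    phis = np.asarray(phis, dtype=float)
    p = len(phis)
    # Pad the front with p zeros to act as the initial lag values (an assumption).
    x = np.zeros(n_steps + p)
    noise = rng.normal(0.0, noise_std, size=n_steps + p)
    for t in range(p, n_steps + p):
        # x[t - p:t][::-1] is [x_{t-1}, x_{t-2}, ..., x_{t-p}]
        x[t] = delta + phis @ x[t - p:t][::-1] + noise[t]
    return x[p:]


# Illustrative AR(2) series; the coefficients are assumed values.
series = simulate_ar(phis=[0.5, 0.3], delta=2.0, n_steps=500)
print(series[:5])
```

Passing a list with one coefficient gives an AR(1) series, two coefficients an AR(2) series, and so on, which mirrors how the equation grows as lags are added.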
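To use an autoregressive model for time-series forecasting, we use the current value and the preceding historical values to predict the next time step. For instance, an AR model with 2 lags predicts a single step ahead from the two most recent observations, with the unknown future noise term replaced by its expected value of zero:

$$\hat{x}_{t+1} = \delta + \phi_1 x_t + \phi_2 x_{t-1}$$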
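A one-step forecast with two lags can be sketched as follows. This is only one possible approach: it estimates the intercept and the two lag coefficients by ordinary least squares with NumPy, then applies them to the two most recent observations. The helper names `fit_ar2` and `forecast_next` and the simulated data are assumptions made for illustration.

```python
import numpy as np


def fit_ar2(x):
    """Estimate (delta, phi1, phi2) of an AR(2) model by ordinary least squares."""
    x = np.asarray(x, dtype=float)
    # Each row holds [1, x_{t-1}, x_{t-2}]; the target is x_t.
    design = np.column_stack([np.ones(len(x) - 2), x[1:-1], x[:-2]])
    target = x[2:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def forecast_next(x, delta, phi1, phi2):
    """Predict the next value from the two most recent observations."""
    return delta + phi1 * x[-1] + phi2 * x[-2]


# Illustrative data: a noisy AR(2) series with assumed coefficients.
rng = np.random.default_rng(0)
x = [10.0, 10.0]
for _ in range(300):
    x.append(2.0 + 0.5 * x[-1] + 0.3 * x[-2] + rng.normal())

delta, phi1, phi2 = fit_ar2(x)
next_value = forecast_next(x, delta, phi1, phi2)
print(delta, phi1, phi2, next_value)
```

To forecast several steps ahead, the predicted value can be appended to the series and the same function applied again, at the cost of compounding error.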