Date: 2024-05-24

Once the time series has been made stationary and the nature of its autocorrelations has been determined, it is possible to fit an ARIMA model. An ARIMA model has three key parameters, typically referred to as p, d, and q.

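p: the order of the autoregressive (AR) part, i.e. how many lagged values of the series are used as predictors

d: the degree of differencing, i.e. how many times the series is differenced to make it stationary

q: the order of the moving average (MA) part, i.e. how many lagged forecast errors are used as predictors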

These are typically written in the following order: ARIMA(p, d, q). Many programming languages and packages provide an ARIMA function that takes the time series to be analyzed along with these three parameters. Most often the data is split chronologically into a training set and a test set, so that the accuracy of the model can be checked on data it was not trained on. It is usually not possible to tell from a time plot alone which values of p and q are most appropriate for the data. However, the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots can often be used to choose appropriate values for p and q, which makes those plots important tools for working with ARIMA.
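To make the ACF and PACF concrete, here is a minimal sketch of how the two quantities can be computed from a series using only the Python standard library. The function names `sample_acf` and `sample_pacf` are illustrative, not taken from any package, and the PACF uses the Durbin-Levinson recursion, which is one common way to obtain it. A plotting library would normally draw these values as bar charts with confidence bands.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (index = lag)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / variance
        for k in range(max_lag + 1)
    ]


def sample_pacf(series, max_lag):
    """Partial autocorrelations for lags 1..max_lag (Durbin-Levinson recursion)."""
    r = sample_acf(series, max_lag)
    pacf = []
    phi_prev = []  # coefficients of the AR fit of order k - 1
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den  # partial autocorrelation at lag k
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        phi_prev = phi_new
        pacf.append(phi_kk)
    return pacf


toy_series = [1.0, 2.0, 3.0, 4.0, 5.0]
acf = sample_acf(toy_series, 2)    # acf[0] is always 1
pacf = sample_pacf(toy_series, 2)  # pacf[0] is the lag-1 value
print("ACF :", acf)
print("PACF:", pacf)
```

The lag-1 partial autocorrelation always equals the lag-1 autocorrelation; the two only diverge from lag 2 onward, which is where they become useful for telling AR and MA behavior apart.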

A rough rubric for when to use AR terms in the model is when:

The ACF plot shows autocorrelation decaying gradually towards zero

The PACF plot cuts off sharply after a few lags (the lag at which it cuts off suggests p)

The ACF of the stationary series is positive at lag 1

A rough rubric for when to use MA terms in the model is when:

The series is negatively autocorrelated at lag 1

The ACF drops off sharply after a few lags (the lag at which it cuts off suggests q)

The PACF decreases gradually rather than suddenly
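The two rubrics can be checked on simulated data. The sketch below generates an AR(1) series and an MA(1) series with known coefficients and prints the first few sample autocorrelations. Everything here is an assumption made for illustration: the coefficients 0.7 and -0.6, the series length, the seed, and the sign convention x_t = e_t + theta * e_{t-1} for the MA term. Only the ACF is computed, to keep the sketch short.

```python
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (index = lag)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / variance
        for k in range(max_lag + 1)
    ]


def simulate_ar1(phi, n, rng):
    """x_t = phi * x_{t-1} + e_t with standard normal noise e_t."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def simulate_ma1(theta, n, rng):
    """x_t = e_t + theta * e_{t-1} with standard normal noise e_t."""
    prev_e, out = rng.gauss(0.0, 1.0), []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        out.append(e + theta * prev_e)
        prev_e = e
    return out


rng = random.Random(0)
ar_acf = sample_acf(simulate_ar1(0.7, 5000, rng), 3)
ma_acf = sample_acf(simulate_ma1(-0.6, 5000, rng), 3)

print("AR(1) ACF, lags 1-3:", [round(r, 2) for r in ar_acf[1:]])
print("MA(1) ACF, lags 1-3:", [round(r, 2) for r in ma_acf[1:]])
```

In theory the AR(1) autocorrelations are 0.7, 0.49, 0.343 (positive at lag 1 and decaying gradually), while the MA(1) series has a lag-1 autocorrelation of -0.6 / (1 + 0.36), about -0.44, and zero autocorrelation beyond lag 1. The sample values should land close to these.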

There are a few classic ARIMA model types that you may encounter.

ARIMA(1,0,0) = first-order autoregressive model: If the series is stationary and autocorrelated, it can perhaps be predicted as a multiple of its own previous value plus a constant. For example, if tomorrow's ice cream sales can be predicted using only today's ice cream sales, that is a first-order autoregressive model.
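As a rough sketch of what fitting such a model involves, the constant and the multiplier can be estimated by ordinary least squares, regressing each value on the one before it. The helper `fit_ar1` and the synthetic sales figures are hypothetical; real ARIMA routines usually use maximum likelihood instead, which gives similar estimates on long series.

```python
import random


def fit_ar1(series):
    """Least-squares fit of y_t = c + phi * y_{t-1}; returns (c, phi)."""
    x = series[:-1]  # previous values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


# Synthetic daily sales (assumed values): true c = 40, true phi = 0.6,
# which gives a long-run mean of 40 / (1 - 0.6) = 100.
rng = random.Random(1)
sales = [100.0]
for _ in range(2000):
    sales.append(40.0 + 0.6 * sales[-1] + rng.gauss(0.0, 5.0))

c_hat, phi_hat = fit_ar1(sales)
tomorrow = c_hat + phi_hat * sales[-1]  # one-step-ahead forecast
print(f"c = {c_hat:.2f}, phi = {phi_hat:.3f}, forecast = {tomorrow:.2f}")
```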

ARIMA(0,1,0) = random walk: If the time series is not stationary, the simplest possible model for it is a random walk. A random walk differs from a list of random numbers because each value in the sequence is the previous value plus a random step. Stock prices are often modeled this way: the first differences of the price (the day-to-day changes) are treated as random noise.
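The relationship between a random walk and its differences can be sketched in a few lines. The example below builds a walk from random steps of +1 or -1 (an arbitrary choice that keeps the arithmetic exact) and shows that first differencing, the d = 1 in ARIMA(0,1,0), recovers those steps. The helper name `difference` is illustrative.

```python
import random

rng = random.Random(42)

# Integer steps keep the arithmetic exact, so differencing recovers them exactly.
steps = [rng.choice([-1, 1]) for _ in range(500)]

walk = [0]
for step in steps:
    walk.append(walk[-1] + step)  # next value = previous value + random step


def difference(series, d=1):
    """Apply first differencing d times (the d in ARIMA(p, d, q))."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


recovered = difference(walk)  # the original random steps
naive_forecast = walk[-1]     # random walk without drift: forecast = last value
print(recovered == steps, naive_forecast)
```

Because the steps are unpredictable, the best forecast of the next value under this model is simply the most recent value.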

ARIMA(1,1,0) = differenced first-order autoregressive model: If the errors of a random walk model are autocorrelated, perhaps the problem can be fixed by adding one lag of the dependent variable to the prediction equation--i.e., by regressing the first difference of Y on itself lagged by one period.
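The regression described above can be sketched directly: difference the series once, regress each change on the previous change, and add the predicted change to the last observed level. The helper names and the simulated series (changes following an AR(1) with coefficient 0.5 and no constant) are assumptions for illustration, and least squares stands in for the maximum likelihood fit that ARIMA software typically performs.

```python
import random


def fit_arima_110(series):
    """Regress the first difference on its own lag: dy_t = c + phi * dy_{t-1}."""
    dy = [b - a for a, b in zip(series, series[1:])]
    x, y = dy[:-1], dy[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_next(series, c, phi):
    """Forecast the next level: last level plus the predicted change."""
    last_change = series[-1] - series[-2]
    return series[-1] + c + phi * last_change


# Synthetic series whose changes follow an AR(1) with phi = 0.5 and no constant.
rng = random.Random(7)
level, change, y = 0.0, 0.0, [0.0]
for _ in range(3000):
    change = 0.5 * change + rng.gauss(0.0, 1.0)
    level += change
    y.append(level)

c_hat, phi_hat = fit_arima_110(y)
next_value = forecast_next(y, c_hat, phi_hat)
print(f"c = {c_hat:.3f}, phi = {phi_hat:.3f}, next value = {next_value:.2f}")
```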

ARIMA(0,1,1) without constant = simple exponential smoothing model: This is used for time series data with no seasonality or trend. It requires a single smoothing parameter, a coefficient between 0 and 1, that controls how quickly the influence of past observations decays. Values closer to 1 mean that the forecast relies mostly on the most recent observation and pays little attention to older ones, while smaller values mean that more of the history is taken into account.
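The equivalence can be checked numerically. The sketch below writes the forecast recursion in two ways: as exponential smoothing with parameter alpha, and in ARIMA(0,1,1) form, where the forecast is the last value minus theta times the last forecast error. The two agree when alpha = 1 - theta. This assumes the sign convention y_t = y_{t-1} + e_t - theta * e_{t-1}; some software writes the MA term with the opposite sign. The function names, the data, and the choice to start from the first observation are all illustrative.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead forecasts; forecasts[t] is the prediction for series[t].

    The final element is the forecast for the next, not yet observed, value.
    """
    forecasts = [series[0]]  # assumption: initialise with the first observation
    for value in series:
        forecasts.append(alpha * value + (1 - alpha) * forecasts[-1])
    return forecasts


def arima_011_forecasts(series, theta):
    """The same recursion in ARIMA(0,1,1) form, without a constant."""
    forecasts = [series[0]]
    for value in series:
        error = value - forecasts[-1]            # last one-step forecast error
        forecasts.append(value - theta * error)  # last value minus theta * error
    return forecasts


data = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0]
alpha = 0.3
smooth = ses_forecasts(data, alpha)
arima = arima_011_forecasts(data, theta=1 - alpha)

print([round(f, 3) for f in smooth])
print([round(f, 3) for f in arima])
```

Setting alpha to 1 reduces the forecast to the last observed value, which is the random walk forecast; that is the sense in which values near 1 ignore older history.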

ARIMA(0,1,1) with constant = simple exponential smoothing model with growth: This is the same as simple exponential smoothing, except that an additive constant term gives the forecasts a steady trend, so the Y value of the time series grows (or shrinks, if the constant is negative) as it progresses.
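A small sketch shows the effect of the constant. Here `mu` is a hypothetical name for the constant (the average change per period), the smoothing recursion simply adds it to every forecast, and the test series is a perfectly linear trend chosen so the result is easy to verify by hand.

```python
def ses_with_growth(series, alpha, mu):
    """One-step-ahead forecasts with a constant growth term mu.

    forecasts[t] predicts series[t]; the final element predicts the next value.
    Setting mu = 0 gives plain simple exponential smoothing.
    """
    forecasts = [series[0]]  # assumption: initialise with the first observation
    for value in series:
        forecasts.append(mu + alpha * value + (1 - alpha) * forecasts[-1])
    return forecasts


trend = [3.0 * t for t in range(10)]  # grows by exactly 3 per period

with_growth = ses_with_growth(trend, alpha=0.5, mu=3.0)
without_growth = ses_with_growth(trend, alpha=0.5, mu=0.0)

print("with growth   :", with_growth[-1])
print("without growth:", without_growth[-1])
```

With the constant matched to the slope, the forecasts track the trend exactly, whereas plain smoothing lags behind a trending series because every forecast is an average of past values.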

There are of course many other ARIMA specifications, which is why it is common to fit several candidate models and compare them to see which one fits the data best. All of the models above are first-order models, meaning that each uses at most one autoregressive lag, one moving average lag, and one round of differencing. Higher-order models use more lags (larger p or q) or more differencing (larger d) to capture more complex autocorrelation patterns; for example, differencing twice (d = 2) can handle a series whose trend itself drifts over time.
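One simple way to compare candidates is to hold out the last part of the series and score each model's one-step-ahead forecasts on it. The sketch below does this for three simple candidates (a constant mean, a random walk, and a least-squares AR(1)) using mean absolute error. The data is simulated from an AR(1) process, so the AR(1) candidate is expected to win; the helper names, the split point, and the choice of error metric are assumptions for illustration, and information criteria such as AIC are a common alternative.

```python
import random


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fit_ar1(series):
    """Least-squares fit of y_t = c + phi * y_{t-1}; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def walk_forward(train, test, predict):
    """Predict each test value from the history available just before it."""
    history = list(train)
    preds = []
    for value in test:
        preds.append(predict(history))
        history.append(value)
    return preds


# Simulated AR(1) data (assumed parameters: c = 25, phi = 0.5, mean 50).
rng = random.Random(3)
series = [50.0]
for _ in range(2999):
    series.append(25.0 + 0.5 * series[-1] + rng.gauss(0.0, 1.0))

split = 2000  # chronological split: earlier values train, later values test
train, test = series[:split], series[split:]

train_mean = sum(train) / len(train)
c_hat, phi_hat = fit_ar1(train)  # parameters come from the training set only

candidates = {
    "mean": lambda history: train_mean,
    "random walk": lambda history: history[-1],
    "AR(1)": lambda history: c_hat + phi_hat * history[-1],
}
scores = {name: mae(test, walk_forward(train, test, f)) for name, f in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name:12s} test MAE = {score:.3f}")
print("best:", best)
```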