How seasonality is detected

Seasonality detection in IBM® Planning Analytics Workspace forecasting is a multi-step process that evaluates a wide range of possible seasonal periods by using efficient, high-performance algorithms.

The following steps are a summary of the seasonality detection process:
  1. Subtract a moving average trend from the data to remove the trend.
  2. Estimate the autocorrelation function (ACF) for this untrended data.

    The algorithm requires that the seasonal period is at most half the length of the data, so the upper limit on seasonality detection is half the data length.

  3. Use a customized peak-finding algorithm to find up to eight candidate seasonal periods from the ACF.
  4. Fit a standard seasonal model to the original data for each candidate and use the seasonality that results in the best model.

Estimate the Autocorrelation Function

Autocorrelation for a time series measures the similarity between observations as a function of the time lag between them. For a lag of k time steps, it measures the correlation between observations at time t and observations at time t-k. The correlation is computed over all time periods t that are defined in the series.
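As a reference point, the usual textbook sample estimator of the autocorrelation at lag k can be written as follows. This is the standard definition, not a formula confirmed for Planning Analytics Workspace. The symbols n (series length), y_t (observation at time t), and \bar{y} (series mean) are notation assumed here for illustration.

```latex
\mathrm{ACF}(k) \;=\;
  \frac{\sum_{t=k+1}^{n} \left(y_t - \bar{y}\right)\left(y_{t-k} - \bar{y}\right)}
       {\sum_{t=1}^{n} \left(y_t - \bar{y}\right)^{2}},
\qquad
\bar{y} \;=\; \frac{1}{n}\sum_{t=1}^{n} y_t
```

With this definition, ACF(0) is always 1, and values close to 1 at some lag k > 0 indicate that observations k steps apart tend to move together.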

Autocorrelation is the fundamental tool that Planning Analytics Workspace uses to work out the best seasonality. The basic idea is that the values of k that give the largest autocorrelations are good candidates for the seasonality in forecasting. A large autocorrelation at lag k shows that the values at one time are similar to the values k steps earlier.

However, running a simple autocorrelation on the raw data works poorly. When the data has a trend, as most data does, the best simple predictor of the value at time t is the value immediately before it, so the autocorrelation is dominated by the trend and is not useful for finding seasonality. Planning Analytics Workspace therefore removes the trend first. Because time series data often has abrupt changes, an adaptive trend estimator is needed, so Planning Analytics Workspace uses a simple moving average smooth as the trend estimator. The autocorrelation function is calculated in the following steps:

  1. For a series y(t), create a simple moving average estimate of the trend m(t). The moving average requires a parameter that determines the degree of smoothing. Planning Analytics Workspace chooses this parameter automatically by using a custom algorithm that was developed from an analysis of 10,000 typical business time series.
  2. Subtract the trend from the data to create untrended data u(t) = y(t) – m(t).
  3. Create the autocorrelation function ACF(k) for the untrended data u(t) with standard methodology.
  4. ACF(k) is calculated for all k up to half the length of the series, subject to a large upper limit that is greater than 1000. For example, if the data has length 22, Planning Analytics Workspace looks for k in the range 0..11.
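
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the function names are hypothetical, the smoothing window is passed in as a fixed value (Planning Analytics Workspace chooses it with its own algorithm, which is not described here), and the ACF uses the usual textbook sample estimator.

```python
def moving_average(y, window):
    """Centered simple moving average; the window shrinks near the edges."""
    half = window // 2
    trend = []
    for t in range(len(y)):
        lo, hi = max(0, t - half), min(len(y), t + half + 1)
        trend.append(sum(y[lo:hi]) / (hi - lo))
    return trend


def detrended_acf(y, window):
    """Return ACF(k) of u(t) = y(t) - m(t) for k = 0 .. len(y) // 2."""
    m = moving_average(y, window)
    u = [yt - mt for yt, mt in zip(y, m)]          # untrended data u(t)
    mean = sum(u) / len(u)
    d = [ut - mean for ut in u]
    denom = sum(x * x for x in d)
    max_lag = len(y) // 2                          # upper limit: half the data length
    return [
        sum(d[t] * d[t - k] for t in range(k, len(d))) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical data: a linear trend plus a repeating pattern of period 4.
pattern = [5.0, -2.0, -4.0, 1.0]
y = [0.5 * t + pattern[t % 4] for t in range(40)]
acf = detrended_acf(y, window=5)                   # window=5 is an arbitrary choice
strongest_lag = max(range(1, 8), key=lambda k: acf[k])
```

For this synthetic series, the lag with the largest autocorrelation among the first few lags is the period of the repeating pattern, which is the behavior that the detection process relies on.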

Use the Autocorrelation Function to determine candidate seasons

The ACF is noisy. If the ACF is strong for k=4, then it is also strong for all multiples of 4: 8, 12, 16, and so on. In addition, random variation is stronger for higher values of k. Therefore, Planning Analytics Workspace uses a modified peak-hunting algorithm to find the values of k that are highest relative to their neighbors, with a bias toward smaller values of k.
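
The peak-hunting algorithm itself is not documented, but the idea can be illustrated with a simple stand-in. In the following sketch, the function name, the local-maximum test, and the linear penalty that favors smaller lags are all assumptions made for illustration only.

```python
def candidate_periods(acf, max_candidates=8, penalty=0.002):
    """Return lags that are local ACF maxima, ranked with a bias to small k."""
    peaks = []
    for k in range(2, len(acf) - 1):
        is_peak = acf[k] > acf[k - 1] and acf[k] >= acf[k + 1]
        if is_peak and acf[k] > 0:
            # Hypothetical scoring: subtract a penalty that grows with k,
            # so that a multiple such as 8 does not outrank the base period 4.
            peaks.append((acf[k] - penalty * k, k))
    peaks.sort(reverse=True)
    return [k for _, k in peaks[:max_candidates]]


# Hypothetical ACF values with peaks at lags 4, 8, and 12.
acf = [1.0, 0.1, -0.6, 0.1, 0.8, 0.0, -0.5, 0.1, 0.79, 0.0, -0.4, 0.0, 0.5, 0.1]
print(candidate_periods(acf))  # → [4, 8, 12]
```

The default of eight candidates mirrors the limit stated in this topic; the penalty value is arbitrary.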

Planning Analytics Workspace then fits one seasonal exponential smoothing model for each of the detected candidate values of k, and chooses the best one in the following way:

  1. Planning Analytics Workspace uses a specialized peak-finding algorithm to determine the best candidates from the ACF. Currently, Planning Analytics Workspace chooses at most eight candidates.
  2. For each k in this critical set, fit a Seasonal Holt-Winters model with period k.
  3. Pick the model that has the best (lowest) Akaike Information Criterion (AIC) and use that k as the seasonality.
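
The selection step can be sketched as follows. This is a deliberately simplified stand-in and not the product's fitting procedure: it uses an additive Holt-Winters recursion with fixed smoothing parameters instead of optimized ones, and it scores each candidate with a Gaussian AIC-style value, n * ln(MSE) + 2 * (number of parameters). The function names, the parameter values, and the parameter count are assumptions.

```python
import math


def holt_winters_aic(y, k, alpha=0.3, beta=0.1, gamma=0.3):
    """AIC-style score of an additive Holt-Winters model with period k.

    Simplification: smoothing parameters are fixed, not optimized.
    """
    n = len(y)
    if n < 2 * k:
        raise ValueError("need at least two full periods of data")
    first, second = sum(y[:k]) / k, sum(y[k:2 * k]) / k
    trend = (second - first) / k
    level = first + trend * (k - 1) / 2            # level at time k - 1
    season = [y[i] - (first + trend * (i - (k - 1) / 2)) for i in range(k)]
    sse = 0.0
    for t in range(k, n):
        s = season[t % k]
        error = y[t] - (level + trend + s)         # one-step-ahead error
        sse += error * error
        new_level = alpha * (y[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        season[t % k] = gamma * (y[t] - new_level) + (1 - gamma) * s
        level = new_level
    mse = max(sse / (n - k), 1e-12)                # guard against log(0)
    n_params = k + 2                               # k seasonal states, level, trend
    return n * math.log(mse) + 2 * n_params


def best_period(y, candidates):
    """Return the candidate period whose model has the lowest AIC."""
    return min(candidates, key=lambda k: holt_winters_aic(y, k))


# Hypothetical data: trend, a period-4 pattern, and a small deterministic wobble.
pattern = [5.0, -2.0, -4.0, 1.0]
y = [0.5 * t + pattern[t % 4] + 0.2 * math.sin(t * t) for t in range(40)]
best = best_period(y, [3, 4, 5])
```

Because each extra seasonal state adds to the penalty term, a longer period is selected only when it reduces the forecast error enough to justify the additional parameters.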

This value is then used in the automatic fitting procedure of Planning Analytics Workspace, which fits various models. Each model that uses seasonality uses this value for the seasonality. The best model is returned. For many data sets, even the best seasonality does not provide any additional predictive power. In that case, Planning Analytics Workspace returns a non-seasonal model.