Time fields in forecasting data

A time field is identified by a time icon in front of the field label in the Data pane.

You can specify time fields by using one of the following properties: Data type or Represents Time.

Data type

A field is recognized as a time field if it has one of the following data types: Date, Time, or Timestamp. Data type is inherited from the data source and cannot be changed.

The Date, Time, and Timestamp data types support the full range of date and time representations covered by the ISO 8601 basic and extended formats. The following table lists each supported data type with a format example and a data example.

Data type	Format example	Data example
Date	yyyy-mm-dd	2019-07-01
Time	hh:mm:ss	12:34:56
Timestamp	yyyy-mm-dd'T'hh:mm:ss	2019-07-01T12:34:56
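
To check whether values match the extended formats in the table before import, a short script can help. The following is a minimal sketch in Python that uses only the standard library; it is not part of the product, and the variable names are illustrative.

```python
from datetime import date, time, datetime

# Parse one sample value per data type from the table above.
# Each fromisoformat call raises ValueError if the text is not
# in the expected ISO 8601 extended format.
sample_date = date.fromisoformat("2019-07-01")
sample_time = time.fromisoformat("12:34:56")
sample_timestamp = datetime.fromisoformat("2019-07-01T12:34:56")

print(sample_date.year, sample_time.minute, sample_timestamp.hour)
```

A value that raises ValueError here is a likely candidate for an Unsupported date format error, although the product's own parser might accept a wider or narrower set of formats.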

Represents Time

A field is recognized as a time field if the data property Represents is set to Time. Text and Integer fields that contain time data are also recognized as time fields. Time fields are defined automatically during data import or enrichment. The possible definitions are Date, Year, Quarter, Season, Month, Week, Day, Hour, Minute, or Second.

If time fields are not recognized automatically, you can specify them as time fields manually. Ensure that the field values are in one of the supported formats; otherwise, you might receive an Unsupported date format error.

Nested time fields

You can drag multiple time fields into the same visualization slot to specify a nested time field. For example, a field that represents Week can be dragged into the slot along with a field that represents Day to create a forecast by Days of the Week.

Week and day fields in one visualization slot

Nested fields in the slot must be in time hierarchy order. For example, Week must be placed above Day.

Nested fields cannot skip levels in the time hierarchy if doing so would result in ambiguity. The following table describes acceptable hierarchies.

Time field	Acceptable lower fields
Year	Quarter, Month, Week, Day
Quarter	Month
Month	Day
Week	Day
Day (of Year, Month, or Week)	Hour, Time
Hour	Minute
Minute	Second
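
The table can be read as a lookup from each time field to the fields that are allowed directly below it. The following Python sketch encodes it that way. It assumes that the rule applies to each adjacent pair of fields in the slot, and the names ACCEPTABLE_LOWER and is_valid_nesting are hypothetical, not product identifiers.

```python
# Acceptable lower fields, transcribed from the table above.
ACCEPTABLE_LOWER = {
    "Year": {"Quarter", "Month", "Week", "Day"},
    "Quarter": {"Month"},
    "Month": {"Day"},
    "Week": {"Day"},
    "Day": {"Hour", "Time"},
    "Hour": {"Minute"},
    "Minute": {"Second"},
}


def is_valid_nesting(fields):
    """Return True if every field may sit directly above the next one.

    Assumption: the table is checked pairwise, from the top of the
    slot to the bottom.
    """
    return all(
        lower in ACCEPTABLE_LOWER.get(upper, set())
        for upper, lower in zip(fields, fields[1:])
    )


print(is_valid_nesting(["Week", "Day"]))   # Week above Day
print(is_valid_nesting(["Day", "Week"]))   # wrong order
print(is_valid_nesting(["Year", "Hour"]))  # skips levels
```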

If Year is absent in the time hierarchy, then the system defaults to the current year. This can cause issues due to differences between leap and non-leap years. Consider providing the Year in such instances.
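
The leap-year issue can be illustrated with a month and day that exist only in some years. The following Python sketch shows the general effect of attaching a default year; the function with_default_year is hypothetical and does not describe how the product resolves dates internally.

```python
from datetime import date


def with_default_year(month, day, default_year):
    """Combine a month and day with an assumed year.

    Returns None when the combination is not a real calendar date,
    for example February 29 in a non-leap year.
    """
    try:
        return date(default_year, month, day)
    except ValueError:
        return None


print(with_default_year(2, 29, 2020))  # 2020 is a leap year
print(with_default_year(2, 29, 2019))  # 2019 is not
```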

Chronological data order

Specified time fields define a chronological order for the time points in the visualization. They are used to sort the points on the visualization in chronological order when forecasting is enabled. The chronological order includes the historical points, along with the new forecasted points. Any other sorting criteria that are specified for the visualization are ignored when forecasting is enabled. For example, the first day of the week is always Sunday even if you specify Monday as the first day of the week in the custom sort order.

Any invalid time labels are moved to the beginning of the sequence and excluded from building the model and computing the forecast.
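
The ordering rule for invalid labels can be sketched as follows. This Python example assumes that labels are text in the yyyy-mm-dd format and treats anything that fails to parse as invalid; the function name order_time_labels is hypothetical.

```python
from datetime import date


def order_time_labels(labels):
    """Split text labels into a display order and a modeling set.

    Invalid labels go to the beginning of the display order and are
    left out of the list that would be used to build the model.
    """
    valid, invalid = [], []
    for label in labels:
        try:
            valid.append((date.fromisoformat(label), label))
        except ValueError:
            invalid.append(label)
    valid.sort()  # chronological order by parsed date
    model_labels = [label for _, label in valid]
    return invalid + model_labels, model_labels


display, model = order_time_labels(["2019-07-03", "n/a", "2019-07-01"])
print(display)
print(model)
```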

Time interval detection

Time interval detection is possible when the data is ordered chronologically. The time interval is the smallest interval between any two adjacent time points, such as “2 weeks”. If varying time intervals are detected, they must all be integer multiples of the smallest interval. Otherwise, the data is deemed irregular and cannot be forecast. Where the gap between two adjacent time points is a multiple of the detected interval, the missing time points in between are filled in at the detected interval, and the corresponding measure values are set to missing. If the number of missing values is larger than 33% of the series length, a Too many missing values error is reported.
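
The detection steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's algorithm: it handles only fixed-length intervals (not calendar months or years), it measures the 33% threshold against the length of the series after the gaps are filled, and the function name detect_interval is hypothetical.

```python
from datetime import datetime, timedelta


def detect_interval(points, max_missing_ratio=0.33):
    """Detect the interval of sorted, distinct datetimes and fill gaps.

    Returns the detected interval and the full list of time points,
    including the ones that were missing from the input.
    """
    if len(points) < 2:
        raise ValueError("at least two time points are required")
    gaps = [b - a for a, b in zip(points, points[1:])]
    step = min(gaps)  # the smallest gap is the candidate interval
    # Every gap must be an integer multiple of the smallest one.
    if any(gap % step != timedelta(0) for gap in gaps):
        raise ValueError("irregular time series")
    full_length = (points[-1] - points[0]) // step + 1
    missing = full_length - len(points)
    if missing / full_length > max_missing_ratio:
        raise ValueError("Too many missing values")
    filled = [points[0] + i * step for i in range(full_length)]
    return step, filled


step, filled = detect_interval(
    [datetime(2019, 1, 1), datetime(2019, 1, 15), datetime(2019, 2, 12)]
)
print(step.days, [p.date().isoformat() for p in filled])
```

In this example the gaps are 14 and 28 days, so the detected interval is 2 weeks and one time point (2019-01-29) is filled in.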

Measure fields

One or more fields of any type can be specified as measure fields for forecasting analysis by adding them to a corresponding visualization slot. Each measure field is analyzed separately. Multiple time series can also be specified by adding a field to the Color slot, splitting the measure values by the categories of the specified field.

All measure field values that correspond to the same time point are summarized by using one of the following summarization levels: Sum, Minimum, Maximum, Average, Count, or Count distinct. The field must be numeric to support Sum, Minimum, Maximum, or Average summarization. All data types and summarization levels are supported for forecasting. However, consider the following points:

A small number of distinct measure values can result in unexpected or uninformative forecasts, for example, when the Count distinct summary is used.
Zero measure values can unduly influence results, especially when they represent missing measurements.
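
Summarization by time point can be pictured as a group-by operation. The following Python sketch maps the six summarization levels to standard functions. The SUMMARIES table and the summarize function are hypothetical names for illustration, and the row layout of (time point, value) pairs is an assumption.

```python
from collections import defaultdict
from statistics import mean

# One function per summarization level named in the text above.
SUMMARIES = {
    "Sum": sum,
    "Minimum": min,
    "Maximum": max,
    "Average": mean,
    "Count": len,
    "Count distinct": lambda values: len(set(values)),
}


def summarize(rows, level):
    """Collapse (time_point, value) rows to one value per time point."""
    groups = defaultdict(list)
    for time_point, value in rows:
        groups[time_point].append(value)
    summary = SUMMARIES[level]
    return {tp: summary(values) for tp, values in sorted(groups.items())}


rows = [("2019-07-01", 2), ("2019-07-01", 4), ("2019-07-02", 4)]
print(summarize(rows, "Sum"))
print(summarize(rows, "Count distinct"))
```

With Count distinct, the resulting series contains only small integers, which shows why such a summary can lead to an uninformative forecast.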

Interpolating missing values

Missing values are computed and filled in by the Linear Interpolation algorithm. The computation is based on the nearest neighbors in a chronologically ordered time series with a detected time interval. For a single missing value, the new value is (previous value + next value) / 2. For example, with the values [3, 6, missing, 12], the interpolated value that replaces the missing value is (6 + 12) / 2, or 9. The interpolation algorithm can also handle contiguous missing values.
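
The following Python sketch illustrates linear interpolation over a regularly spaced series. For a single missing value it reproduces the (previous value + next value) / 2 rule. For contiguous missing values it assumes a straight line between the nearest known neighbors, which is the usual meaning of linear interpolation but is not confirmed here as the product's exact behavior. The function name interpolate is hypothetical, and missing values are represented as None.

```python
def interpolate(values):
    """Fill None entries that lie between two known values.

    Leading and trailing None entries are left untouched because
    they have no neighbor on one side.
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            # Fraction of the way from the left neighbor to the right one.
            weight = (i - left) / span
            result[i] = result[left] + weight * (result[right] - result[left])
    return result


print(interpolate([3, 6, None, 12]))
```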

Data points with missing values at the first or the last historical time points are excluded from the series before the model is built. Missing values at the last historical time points are also forecast.
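
The edge handling can be sketched as a trimming step that runs before the model is built. In this Python illustration, the function trim_missing_edges is a hypothetical name, missing values are represented as None, and the second return value counts the trailing time points that would be forecast along with the new points.

```python
def trim_missing_edges(values):
    """Drop leading and trailing None entries from a series.

    Returns the trimmed series and the number of trailing time
    points that were dropped.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        return [], 0
    first, last = known[0], known[-1]
    return values[first:last + 1], len(values) - 1 - last


series, trailing = trim_missing_edges([None, 5, None, 7, None, None])
print(series, trailing)
```

Missing values that remain inside the trimmed series are the ones handled by interpolation.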