Time Series node - missing value options
Use the settings in this pane to specify how missing values in the input data are replaced with imputed values. The following replacement methods are available:
- Linear interpolation
- Replaces missing values by using linear interpolation. The last valid value before the missing value and the first valid value after the missing value are used for the interpolation. If the first or last observation in the series is missing, then the two nearest non-missing values at the beginning or end of the series are used.
- Series mean
- Replaces missing values with the mean for the entire series.
- Mean of nearby points
- Replaces missing values with the mean of valid surrounding values. The span of nearby points is the number of valid values before and after the missing value that are used to compute the mean.
- Median of nearby points
- Replaces missing values with the median of valid surrounding values. The span of nearby points is the number of valid values before and after the missing value that are used to compute the median.
- Linear trend
- Fits a simple linear regression model to all non-missing observations in the series, and then uses that model to impute the missing values.
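The replacement methods above can be sketched in a few lines of plain Python. This is only an illustration of the descriptions in this list, not the node's actual implementation: the function names, the use of `None` to mark a missing value, the use of the observation index as the time axis, and the handling of gaps at the start or end of a series (extending the line through the two nearest valid values) are all assumptions made for the example.

```python
from statistics import mean, median


def _valid(series):
    """Return (index, value) pairs for the non-missing observations."""
    return [(i, v) for i, v in enumerate(series) if v is not None]


def linear_interpolation(series):
    """Fill gaps on the line through the neighbouring valid values."""
    pts = _valid(series)
    if len(pts) < 2:
        raise ValueError("at least two valid values are required")
    out = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        before = [p for p in pts if p[0] < i]
        after = [p for p in pts if p[0] > i]
        if before and after:
            (x0, y0), (x1, y1) = before[-1], after[0]
        elif after:
            # Gap at the start: assumed to use the first two valid values.
            (x0, y0), (x1, y1) = after[0], after[1]
        else:
            # Gap at the end: assumed to use the last two valid values.
            (x0, y0), (x1, y1) = before[-2], before[-1]
        out[i] = y0 + (y1 - y0) * (i - x0) / (x1 - x0)
    return out


def series_mean(series):
    """Fill gaps with the mean of all valid values in the series."""
    m = mean([v for _, v in _valid(series)])
    return [m if v is None else v for v in series]


def nearby_points(series, span, stat):
    """Fill gaps with stat() of up to `span` valid values on each side."""
    if span < 1:
        raise ValueError("span must be at least 1")
    pts = _valid(series)
    if not pts:
        raise ValueError("series has no valid values")
    out = list(series)
    for i, v in enumerate(series):
        if v is None:
            before = [y for x, y in pts if x < i][-span:]
            after = [y for x, y in pts if x > i][:span]
            out[i] = stat(before + after)
    return out


def linear_trend(series):
    """Fill gaps from a least-squares line fitted to all valid values."""
    pts = _valid(series)
    if len(pts) < 2:
        raise ValueError("at least two valid values are required")
    x_bar = mean([x for x, _ in pts])
    y_bar = mean([y for _, y in pts])
    sxx = sum((x - x_bar) ** 2 for x, _ in pts)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in pts) / sxx
    intercept = y_bar - slope * x_bar
    return [intercept + slope * i if v is None else v
            for i, v in enumerate(series)]


sales = [None, 2.0, None, 6.0, 8.0, None]
print(linear_interpolation(sales))
print(series_mean(sales))
print(nearby_points(sales, span=1, stat=mean))    # mean of nearby points
print(nearby_points(sales, span=2, stat=median))  # median of nearby points
print(linear_trend(sales))
```

Passing `mean` or `median` as the `stat` argument covers both nearby-point methods with one function; the span plays the same role in each.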
Other settings:
- Lowest data quality score (%)
- Computes data quality measures for the time variable and for input data corresponding to each time series. If the data quality score is lower than this threshold, the corresponding time series is discarded.
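The thresholding behavior can be illustrated as follows. The formula behind the data quality score is not given here, so the hypothetical `quality_score` function below uses the percentage of non-missing observations purely as a stand-in; only the comparison against the threshold reflects the setting described above.

```python
def quality_score(series):
    """Hypothetical stand-in score: percentage of non-missing values.

    The node's real data quality measure is not specified in this topic.
    """
    return 100.0 * sum(v is not None for v in series) / len(series)


def keep_series(named_series, lowest_score):
    """Discard every series whose score is lower than the threshold."""
    return {name: s for name, s in named_series.items()
            if quality_score(s) >= lowest_score}


data = {
    "store_a": [1.0, None, 3.0, 4.0],    # 75% of values present
    "store_b": [None, None, None, 4.0],  # 25% of values present
}
print(list(keep_series(data, lowest_score=50)))
```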