Application peak value forecast stream

The application peak value forecast stream (appl_peak_daily_forecast_timeseries.str) analyzes the history data for CPU seconds over a defined time range, calculates the peak MIPS values, and forecasts one year of peak values at the application level.

CPU seconds and MIPS

The CPU seconds refer to three types of processor seconds: CP_SECONDS, IFA_SECONDS, and IIP_SECONDS. MIPS is calculated from the CPU seconds, so there are three types of MIPS, one derived from each type of CPU seconds.

Application level

The application level is defined by the field APPL_NAME. A different value in the field results in a different application level.

Peak values

The peak values are defined as a series formed by the Top-N MIPS values in each day, where N is specified as the Peak Highest Rank within the range [1,24]. You can also input multiple N values to forecast multiple peak value series in a single run. The maximum value of N is 5.
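The definition above can be illustrated with a minimal Python sketch. It assumes that Peak Highest Rank N selects the Nth highest hourly MIPS value within each day; the function name daily_peak_series and the sample data are hypothetical and are not part of the stream.

```python
from collections import defaultdict


def daily_peak_series(hourly_mips, rank):
    """Return {day: Nth-highest hourly MIPS value} for the given rank N.

    hourly_mips maps (day, hour) to a MIPS value. This layout is an
    assumption made for illustration, not the stream's actual schema.
    """
    by_day = defaultdict(list)
    for (day, _hour), mips in hourly_mips.items():
        by_day[day].append(mips)

    series = {}
    for day in sorted(by_day):
        ordered = sorted(by_day[day], reverse=True)
        # Skip days that have fewer than N hourly values.
        if rank <= len(ordered):
            series[day] = ordered[rank - 1]
    return series


# Hypothetical hourly MIPS values for one application over two days.
hourly = {
    ("2024-01-01", 9): 120.0,
    ("2024-01-01", 10): 150.0,
    ("2024-01-01", 11): 135.0,
    ("2024-01-02", 9): 110.0,
    ("2024-01-02", 10): 100.0,
    ("2024-01-02", 11): 140.0,
}

peak_1 = daily_peak_series(hourly, 1)
peak_2 = daily_peak_series(hourly, 2)
```

Running the function once per rank yields one series per Peak Highest Rank, which corresponds to forecasting multiple peak value series in a single run.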

Input tables

  • APPL_MIPS_UTIL_ZOS_VIEW: the source of history data for CPU seconds in each measurement interval.
  • MIPS_CAPACITY: this table is used to look up the MIPS_TOTAL_CAPACITY to calculate the MIPS.
  • APPL_MAPPING: this table is used to derive the value of MAPPING_TIME.

Output tables

  • APPL_ZOS_FORECAST: the forecasted data and the related history data for MIPS, for each day, each application level, and each Peak Highest Rank, are stored in this table.
  • FORECAST_METADATA: the execution information of the peak forecasting stream for each Peak Highest Rank is stored in this table.

Input parameters

  • CMDW: the CMA database connection.
  • UID: the database user.
  • PWD: the database user's password.
  • CMASCHEMA: the schema for CMA.
  • OUTPUTMODELPATH: the model output path.
  • LOGPATH: the logging path.
  • INPUTHOLIDAYFILE: the input holiday file.
  • TMPPATH: the path where temporary data is saved.
  • DATE_START: the start date for the data that is used.
  • DATE_END: the end date for the data that is used.
  • HOUR_START: the start hour for the data that is used.
  • HOUR_END: the end hour for the data that is used.
  • PEAKRANK: the Nth highest peak rank values. Valid values are in the range [1,5]. You can also input multiple rank values.

    Note: For the DATE_END and HOUR_END parameters, the value that you enter must be greater than the DATE_START and HOUR_START values, respectively.

    For PEAKRANK, use a comma (,) to separate rank values if you are using multiple rank values.
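The parameter rules above can be sketched in Python as follows. This is an illustration only: the function names parse_peak_ranks and check_range are hypothetical, and the date type used here is an assumption because the parameter formats are not specified in this topic.

```python
from datetime import date


def parse_peak_ranks(raw, lowest=1, highest=5):
    """Split a comma-separated PEAKRANK string into a list of valid ranks."""
    ranks = []
    for token in raw.split(","):
        token = token.strip()
        if not token.isdigit():
            raise ValueError(f"PEAKRANK value is not a whole number: {token!r}")
        rank = int(token)
        if not lowest <= rank <= highest:
            raise ValueError(f"PEAKRANK value {rank} is outside [{lowest},{highest}]")
        if rank not in ranks:  # ignore repeated ranks
            ranks.append(rank)
    return ranks


def check_range(date_start, date_end, hour_start, hour_end):
    """Raise ValueError unless each end value is greater than its start value."""
    if not date_end > date_start:
        raise ValueError("DATE_END must be greater than DATE_START")
    if not hour_end > hour_start:
        raise ValueError("HOUR_END must be greater than HOUR_START")


ranks = parse_peak_ranks("1, 3,5")
check_range(date(2024, 1, 1), date(2024, 6, 30), 8, 18)
```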

Validating inputs

The stream verifies that input values are valid. For example, the stream verifies whether the path and database are writable. If not, it stops and an error is written to the log file. The log file is stored in LOGPATH.
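A check of this kind might look like the following Python sketch. It covers only the directory check and is an assumption about how such validation could be written, not the stream's actual code; the function name check_writable_dir is hypothetical.

```python
import logging
import os


def check_writable_dir(path, logger):
    """Return True if path is an existing, writable directory.

    An error is logged and False is returned otherwise, so the caller
    can stop processing.
    """
    if not os.path.isdir(path):
        logger.error("Path does not exist or is not a directory: %s", path)
        return False
    if not os.access(path, os.W_OK):
        logger.error("Path is not writable: %s", path)
        return False
    return True
```

A caller could run this check for OUTPUTMODELPATH, LOGPATH, and TMPPATH before any forecasting starts and stop on the first failure.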

Forecast data generation and output

A Python script in the peak forecast stream drives the process. The stream finds the peak values for each type of MIPS, for each Peak Highest Rank N, and for each application level within each day. The peak MIPS values are selected from the hourly MIPS values within a day. After a series of peak values is produced, the stream analyzes the data by using the Time Series model, stores each trained model in OUTPUTMODELPATH, and uses the models to forecast one year of future peak values for each type of MIPS at each application level. For each related Peak Highest Rank N of the forecasted data, peak_n is labeled in the field FORECAST_LEVEL. The saved models are named APPL_NAME_<timestamp>.gm.

After the forecast values are produced, they are gathered in a temporary file that is stored in TMPPATH and output to the database. If the database already contains MIPS forecasted data with the same “peak_n”, the new values overwrite the existing values. Values that do not have the same “peak_n” as the new values are kept in the database.
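The overwrite rule can be sketched in Python with in-memory rows. Only the field FORECAST_LEVEL comes from this topic; the MIPS key and the function name merge_forecasts are hypothetical, and the real stream works against the database rather than Python lists.

```python
def merge_forecasts(existing, new):
    """Replace existing rows whose FORECAST_LEVEL appears in the new rows.

    Rows with a FORECAST_LEVEL that the new rows do not contain are kept.
    """
    new_levels = {row["FORECAST_LEVEL"] for row in new}
    kept = [row for row in existing if row["FORECAST_LEVEL"] not in new_levels]
    return kept + list(new)


existing_rows = [
    {"FORECAST_LEVEL": "peak_1", "MIPS": 150.0},
    {"FORECAST_LEVEL": "peak_2", "MIPS": 135.0},
]
new_rows = [{"FORECAST_LEVEL": "peak_1", "MIPS": 160.0}]

merged = merge_forecasts(existing_rows, new_rows)
```

In this example the old peak_1 row is replaced by the new one, and the peak_2 row is kept unchanged.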

Execution data generation and output

The latest stream execution time and the data time ranges for the history data and forecast data, along with the related Peak Highest Rank “peak_n” (represented by the field AGGREGATION in the table), are gathered when the forecasting runs. This information is written to the database after the forecasting finishes. Each run for each Peak Highest Rank creates one record in the table. If the database contains old records, records with the same Peak Highest Rank are overwritten when new values are produced. Existing records that do not have the same Peak Highest Rank are kept in the database.

Log file and error handling

You can find the IBM® SPSS® Modeler stream log files in the log folder that you specify in the LOGPATH parameter. The log file is named appl_peak_daily_forecast_timeseries_<timestamp>.log.

For information about error and warning messages, see Error, warning, and installation messages.