KPIs for IBM Workload Scheduler

Find out the IBM Workload Scheduler KPIs managed by AIDA.

IBM Workload Scheduler and IBM® Z Workload Scheduler expose metrics and KPI definitions according to the OpenMetrics standard.
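As an illustration of what OpenMetrics exposition looks like to a consumer, the following minimal Python sketch parses a few sample lines in the OpenMetrics text format. The sample values are invented, and the label name jobstatus is an assumption borrowed from the keyprop value in the KPIs json file; the actual exposition produced by the product may differ.

```python
import re

# Hypothetical sample in OpenMetrics text format; the values are made up
# and the "jobstatus" label name is an assumption.
SAMPLE = """\
application_wa_JobsInPlanCount_job{jobstatus="READY"} 12
application_wa_JobsInPlanCount_job{jobstatus="RUNNING"} 3
# EOF
"""

# metric_name, optional {labels}, then the sample value
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment/metadata lines such as "# EOF".
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


parsed = parse_samples(SAMPLE)
```

This simplified parser ignores timestamps, escaping inside label values, and metadata lines; a production consumer would normally rely on an existing OpenMetrics parsing library.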

KPI definitions and the data retrieval frequency are defined in a JSON file inside IBM Workload Scheduler. The AIDA Exporter component retrieves this file once a day.

According to the data retrieval frequency defined in the JSON file, the AIDA Exporter component retrieves the metrics through ad hoc APIs and stores them in the AIDA OpenSearch database.
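The frequency-driven retrieval can be pictured with a small Python sketch. It is not the Exporter's implementation: the function name due_kpis and the bookkeeping dictionaries are hypothetical, and only the two frequency values (240 and 86400 seconds) come from this page.

```python
def due_kpis(frequencies, last_fetch, now):
    """Return the metric names whose retrieval interval has elapsed.

    frequencies: metric name -> retrieval frequency in seconds
    last_fetch:  metric name -> time of the last retrieval (seconds)
    now:         current time (seconds)
    """
    due = []
    for name, frequency in frequencies.items():
        last = last_fetch.get(name)
        # A KPI that was never fetched is always due.
        if last is None or now - last >= frequency:
            due.append(name)
    return due


# Frequencies taken from the KPI definitions on this page.
frequencies = {
    "job_history": 86400,
    "application_wa_JobsByFolder_jobs": 240,
}
last_fetch = {"job_history": 0, "application_wa_JobsByFolder_jobs": 0}

after_four_minutes = due_kpis(frequencies, last_fetch, now=240)
after_one_day = due_kpis(frequencies, last_fetch, now=86400)
```

With both KPIs last retrieved at time 0, only the 240-second KPI is due after four minutes, while both are due after a full day.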

KPI definitions and KPI metrics cannot be modified by AIDA users.

For details about IBM Workload Scheduler exposed metrics, see Exposing metrics to monitor your workload in the IBM Workload Scheduler User's Guide and Reference.

For details about IBM Z Workload Scheduler exposed metrics, see Exposing metrics to monitor your workload in the IBM Z Workload Scheduler Managing the Workload.

AIDA also collects a special KPI named Job history, which contains the duration of each job that is defined in IBM Workload Scheduler with the advanced analytics option enabled, and of all its predecessor jobs. Every day, this KPI generates one data point for each job execution (KPI frequency = 86400 seconds).

On a daily basis, starting from the KPI time series, AIDA uses machine learning algorithms to predict the KPI trends.

According to the Timerange parameter in the common.env configuration file (or the values.yaml file for Kubernetes deployments), current KPI values are compared with their predicted values. Alerts can be generated, based on alert definition rules. For details, see Alert Definitions.
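The comparison between current and predicted values can be illustrated with a purely hypothetical Python sketch. AIDA's actual prediction model, the role of the Timerange parameter, and the alert rules are not described on this page; the predicted lower and upper bounds used here are an assumption chosen only to mirror the "higher/lower" anomaly types listed for the KPIs.

```python
def classify_anomaly(observed, predicted_lower, predicted_upper):
    """Classify an observed KPI value against an assumed predicted band.

    Returns "higher" or "lower" when the value falls outside the band,
    and None when it lies within the band (no anomaly).
    """
    if predicted_lower > predicted_upper:
        raise ValueError("predicted_lower must not exceed predicted_upper")
    if observed > predicted_upper:
        return "higher"
    if observed < predicted_lower:
        return "lower"
    return None


# Invented numbers: a predicted band of 100 to 150 jobs in plan.
result_high = classify_anomaly(180, 100, 150)
result_low = classify_anomaly(40, 100, 150)
result_ok = classify_anomaly(120, 100, 150)
```

Whether an anomaly of this kind turns into an alert depends on the alert definition rules, which this sketch does not model.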

IBM Workload Scheduler KPIs are grouped in the following categories:

Category | KPI name | HCL Workload Automation KPI metric string | Metric for instances division | Anomaly type | Object monitored | Data frequency
--- | --- | --- | --- | --- | --- | ---
Jobs | Number of jobs in plan by folder (by status) | application_wa_JobsByFolder_jobs | Job status (10) | Higher/lower jobs number | Folder | 1 data point every 4 minutes (240 seconds)
Jobs | Number of jobs in plan by workstation (by status) | application_wa_JobsByWorkstation_jobs | Job status (10) | Higher/lower jobs number | Workstation | 1 data point every 4 minutes (240 seconds)
Jobs | Number of jobs in plan by status | application_wa_JobsInPlanCount_job | Job status (10) | Higher/lower jobs number | All jobs in plan | 1 data point every 4 minutes (240 seconds)
Jobs | Number of total jobs in plan | application_wa_JobsInPlanCount_job_total | / | Higher/lower jobs number | All jobs in plan | 1 data point every 4 minutes (240 seconds)
Jobs | Job history (start time & duration) | job_history | Start time; Duration | Earlier/later start time; Longer/shorter duration | Job | 1 data point for each daily job execution (86400 seconds)
Queue | Available space for WA message files | application_wa_msgFileFill_percent | Queues (12) | Finishing space for queue | All queues | 1 data point every 4 minutes (240 seconds)
Note:
  • Job status: WAITING, READY, RUNNING, SUCCESSFUL, ERROR, CANCELED, HELD, UNDECIDED, BLOCKED, SUPPRESS.
  • Queues: Appserverbox.msg, Courier.msg, mirrorbox.msg, Mailbox.msg, Monbox.msg, Moncmd.msg, auditbox.msg, clbox.msg, planbox.msg, Intercom.msg, pobox messages, server.ms

IBM Workload Scheduler KPIs JSON file

In the KPIs JSON file inside IBM Workload Scheduler, each entry defines a KPI. The frequency parameter specifies how often the KPI data is retrieved, expressed in seconds. This file cannot be modified by users.

[
  {
    "name": "Job history KPI",
    "metric_name": "job_history",
    "frequency": 86400,
    "category": "Jobs",
    "subcategory": "history",
    "labels": [
      "job"
    ],
    "keyprop": "attributes",
    "keyPropValues": ["duration"],
    "type": "records"
  },
  {
    "name": "Total jobs in plan",
    "metric_name": "application_wa_JobsInPlanCount_job",
    "frequency": 240,
    "category": "Jobs",
    "subcategory": "Trend",
    "type":"total"
  },
  {
    "name": "Jobs in plan by status",
    "metric_name": "application_wa_JobsInPlanCount_job",
    "frequency": 240,
    "category": "Jobs",
    "subcategory": "Trend",
    "keyprop": "jobstatus"
  },
  {
    "name": "Jobs in plan by workstation",
    "metric_name": "application_wa_JobsByWorkstation_jobs",
    "frequency": 240,
    "category": "Jobs",
    "subcategory": "Trend_by_wks",
    "keyprop": "jobstatus",
    "labels": [
      "workstation"
    ]
  },
  {
    "name": "Jobs in plan by folder",
    "metric_name": "application_wa_JobsByFolder_jobs",
    "frequency": 240,
    "category": "Jobs",
    "subcategory": "Trend_by_folder",
    "keyprop": "jobstatus",
    "labels": [
      "folder"
    ]
  },
  {
    "name": "WA Message files fill percentile",
    "metric_name": "application_wa_msgFileFill_percent",
    "frequency": 240,
    "category": "Queue",
    "subcategory": "Msg file fill",
    "keyprop": "msgfile"
  }
]
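To make the meaning of the frequency parameter concrete, the following Python sketch loads a trimmed copy of the entries above and computes how many retrievals per day each frequency implies. The helper name retrievals_per_day is hypothetical, and the embedded JSON keeps only a subset of the fields shown above.

```python
import json

# Trimmed copy of three entries from the KPIs JSON file shown above.
KPI_JSON = """
[
  {"name": "Job history KPI", "metric_name": "job_history",
   "frequency": 86400, "category": "Jobs"},
  {"name": "Total jobs in plan",
   "metric_name": "application_wa_JobsInPlanCount_job",
   "frequency": 240, "category": "Jobs"},
  {"name": "WA Message files fill percentile",
   "metric_name": "application_wa_msgFileFill_percent",
   "frequency": 240, "category": "Queue"}
]
"""

SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def retrievals_per_day(kpis):
    """Map each KPI name to the number of retrievals in 24 hours."""
    return {kpi["name"]: SECONDS_PER_DAY // kpi["frequency"] for kpi in kpis}


kpis = json.loads(KPI_JSON)
summary = retrievals_per_day(kpis)

for name, count in summary.items():
    print(f"{name}: {count} retrieval(s) per day")
```

A frequency of 240 seconds corresponds to 360 retrievals per day, while 86400 seconds corresponds to a single daily retrieval, which for Job history returns one data point for each job execution.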