Runtime Diagnostics Now Does More When You Need It Most

  

Authors: Karla Arndt and Susan Demkowicz

 

Help! Something seems very wrong with the system and I just spilled coffee all over my cheat sheet! I flipped through all my yellow stickies, but the ones I need are missing because they were so old that they lost their stickiness, floated away, and got stuck to the bottom of somebody's shoe. I need help fast! What can I use to quickly diagnose the problem and get suggested next steps to resolve it? The answer is simple: Runtime Diagnostics!

 

Runtime Diagnostics is a “point-in-time” diagnostics tool that detects problems that could be contributing to system issues RIGHT NOW, with the goal of finding them in sixty seconds or less. It has been available since V1.12 and, as of V1.13, diagnoses seven types of problems, which are documented in the IBM Redbook at http://www.redbooks.ibm.com/abstracts/sg248070.html?Open and in z/OS Problem Management.

 

But that's not all! We are pleased to announce two new diagnostic events: the JES2 Health Exception event and the Server Health event. Even better, the JES2 Health Exception event is available NOW, starting in V2.1 with APAR OA46531. You'll have to wait just a bit for the Server Health event, which is planned to be available starting in V2.2.

 

With this APAR installed, when you invoke Runtime Diagnostics through the existing F HZR,ANALYZE command, it gathers information about the JES2 subsystem from the JES2 subsystem interface (SSI). Runtime Diagnostics analyzes the information it receives, determines a possible corrective action, and presents the result on the system console, in the hardcopy log, and, optionally, in a sequential data set. An example of a possible JES2 event appears below:

 

HZR0200I RUNTIME DIAGNOSTICS RESULT
SUMMARY: SUCCESS
REQ: 004 TARGET SYSTEM: SY1 HOME: SY1 2015/01/12
INTERVAL: 60 MINUTES
EVENTS:
FOUND: 01 - PRIORITIES: HIGH:01 MED:00 LOW:00
TYPES: JES2:01
----------------------------------------------------------------
EVENT 01: HIGH - JES2 - SYSTEM: SY1 2015/01/12
$HASP9158 JES2 PROCESSING STOPPED, $S NEEDED
ERROR: JES2 CANNOT PROCESS NEW WORK.
ACTION: $S TO ENABLE JES2 TO START PROCESSING NEW WORK.
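
If you capture this output in a data set and want to pick the events out of it programmatically, a small parser is enough. The Python sketch below is only an illustration: it assumes the report layout is exactly what the sample above shows, and the names Event and parse_events are hypothetical helpers, not part of Runtime Diagnostics or any IBM tooling.

```python
import re
from dataclasses import dataclass, field

# The sample report from this article, copied verbatim.
SAMPLE_REPORT = """\
HZR0200I RUNTIME DIAGNOSTICS RESULT
SUMMARY: SUCCESS
REQ: 004 TARGET SYSTEM: SY1 HOME: SY1 2015/01/12
INTERVAL: 60 MINUTES
EVENTS:
FOUND: 01 - PRIORITIES: HIGH:01 MED:00 LOW:00
TYPES: JES2:01
----------------------------------------------------------------
EVENT 01: HIGH - JES2 - SYSTEM: SY1 2015/01/12
$HASP9158 JES2 PROCESSING STOPPED, $S NEEDED
ERROR: JES2 CANNOT PROCESS NEW WORK.
ACTION: $S TO ENABLE JES2 TO START PROCESSING NEW WORK.
"""

# Matches an event header such as
# "EVENT 01: HIGH - JES2 - SYSTEM: SY1 2015/01/12".
# The pattern is an assumption derived from the sample only.
EVENT_HEADER = re.compile(
    r"^EVENT\s+(?P<number>\d+):\s+(?P<priority>\w+)\s+-\s+"
    r"(?P<type>\w+)\s+-\s+SYSTEM:\s+(?P<system>\S+)"
)


@dataclass
class Event:
    """Hypothetical container for one event in the report."""
    number: int
    priority: str
    type: str
    system: str
    details: list = field(default_factory=list)
    error: str = ""
    action: str = ""


def parse_events(report: str) -> list:
    """Collect the events that follow the summary section of a report."""
    events = []
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        # Skip blank lines and the dashed separator lines.
        if not line or set(line) == {"-"}:
            continue
        match = EVENT_HEADER.match(line)
        if match:
            current = Event(
                int(match["number"]),
                match["priority"],
                match["type"],
                match["system"],
            )
            events.append(current)
        elif current is None:
            continue  # still in the summary section
        elif line.startswith("ERROR:"):
            current.error = line[len("ERROR:"):].strip()
        elif line.startswith("ACTION:"):
            current.action = line[len("ACTION:"):].strip()
        else:
            current.details.append(line)  # e.g. the $HASP9158 message
    return events


if __name__ == "__main__":
    for event in parse_events(SAMPLE_REPORT):
        print(event.priority, event.type, "->", event.action)
```

For the sample, the sketch yields a single HIGH event of type JES2 with the $HASP9158 message as its detail line and the $S recommendation as its action. Real reports can contain other event types with additional fields, which this sketch simply collects as detail lines.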

 

z/OS V2.2 is also planned to provide improved autonomics for health-based workload routing in a Parallel Sysplex, with new z/OS Workload Manager (WLM) and XCF functions to improve availability. As part of that z/OS V2.2 feature, Runtime Diagnostics will invoke a new WLM server health query service whenever you request an analysis. If any server has a current health value below 100, Runtime Diagnostics will display a SERVERHEALTH event in its output along with any other events it finds. More information on this new event, including example event output, will be documented in z/OS V2R2 Problem Management when it is available.
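
The reporting rule itself is simple: a server with a health value of 100 stays quiet, and anything lower produces an event. The Python sketch below illustrates only that rule. ServerHealth, server_health_events, and the server names are hypothetical; the sketch does not call or model the real WLM query service, whose interface is not described here.

```python
from dataclasses import dataclass

# Per the article, servers with a health value below 100 are reported.
FULL_HEALTH = 100


@dataclass
class ServerHealth:
    """Hypothetical record: a server name and its current health value."""
    name: str
    health: int


def server_health_events(servers):
    """Return one SERVERHEALTH entry for each server below full health."""
    return [
        {"type": "SERVERHEALTH", "server": s.name, "health": s.health}
        for s in servers
        if s.health < FULL_HEALTH
    ]


if __name__ == "__main__":
    # Made-up server names and values, for illustration only.
    sample = [
        ServerHealth("SRVA", 100),
        ServerHealth("SRVB", 40),
        ServerHealth("SRVC", 99),
    ]
    for event in server_health_events(sample):
        print(event)
```

With the made-up input, only SRVB and SRVC produce entries, because SRVA is at full health.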

As an additional feature, some of the Predictive Failure Analysis (PFA) checks invoke Runtime Diagnostics when PFA determines that the metric value is too low. The new JES2 and SERVERHEALTH event types are planned to be integrated into that feature automatically: when Runtime Diagnostics finds such events, it will return them to PFA, and PFA will then include them in the PFA check's exception report.

Runtime Diagnostics is easy to use and there when you need it, but since it isn't needed very often, it's also easy to forget! Don't be one of those people. Take a yellow sticky, write “Use Runtime Diagnostics” on it, and permanently attach it to a place where it cannot be forgotten and where there's no possibility it can lose its stickiness, float away, and get stuck to the bottom of somebody's shoe.