Exception, Threshold, and User Monitoring

The Performance Toolkit performance monitor collects and displays detailed information on the current load of the CPU and I/O equipment, and on the users who have caused it. This is, of course, vital information for an experienced operator or systems programmer who is trying to analyze an existing bottleneck. The same information is, on the other hand, not really suited for detecting the beginning of a performance problem: one cannot seriously expect an operator or systems programmer to continuously watch a monitor screen and look for values that indicate performance degradation.

This problem is solved by special routines included in the monitor which check the collected data for indications of problems, and which generate alert messages if such indications are found:
  • Exception monitoring informs the operator if unusual conditions are found which should be investigated.
  • Threshold monitoring generates alert messages if the values of selected performance indicators for the whole system exceed pre-set limits.
  • User monitoring can generate alert messages for users who either:
    • Have exceeded certain limits in their CPU load or I/O rates
    • Appear to be in a CPU loop, an I/O loop, or a 'constant WSS' loop, or have been idle for a long time. Looping or idle users can optionally also be forced off the system.
The alert messages generated by these functions are displayed on the basic mode screen, informing the user that there is a potential problem to be investigated. All these functions are operative only while permanent data collection is active (that is, after the command FC MONCOLL ON has been entered). Their operation is explained in more detail in the following paragraphs.
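
To make the idea of limit checking and idle-user detection concrete, the following Python sketch shows one way such checks could be expressed over periodically collected samples. It is an illustration only and is not the Performance Toolkit's own logic: the Sample record, the function names, the limit values, and the rule that "idle" means no CPU or I/O activity over a number of consecutive intervals are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One hypothetical per-user measurement for a single collection interval."""
    user: str
    cpu_pct: float   # CPU load in percent (assumed unit)
    io_rate: float   # I/O operations per second (assumed unit)


def limit_alerts(samples, cpu_limit, io_limit):
    """Return one alert message for each pre-set limit that a sample exceeds."""
    alerts = []
    for s in samples:
        if s.cpu_pct > cpu_limit:
            alerts.append(
                f"{s.user}: CPU load {s.cpu_pct:.1f}% exceeds limit {cpu_limit:.1f}%"
            )
        if s.io_rate > io_limit:
            alerts.append(
                f"{s.user}: I/O rate {s.io_rate:.1f}/s exceeds limit {io_limit:.1f}/s"
            )
    return alerts


def idle_users(history, min_intervals):
    """Return users with no CPU or I/O activity in their last min_intervals samples.

    history maps a user ID to that user's samples in chronological order.
    Users with fewer than min_intervals samples are never reported.
    """
    idle = []
    for user, samples in history.items():
        recent = samples[-min_intervals:]
        if len(recent) == min_intervals and all(
            s.cpu_pct == 0 and s.io_rate == 0 for s in recent
        ):
            idle.append(user)
    return idle
```

In this sketch an operator would see only the returned messages rather than the raw samples, which mirrors the purpose of the alerting functions described above: the limits are checked on every collection interval so that nobody has to watch the detailed displays continuously.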