Example Scenario

To illustrate how SA z/OS and OMEGAMON operate together, consider the following scenario.

Suppose there is a DB2® application that should be continuously monitored. Of particular interest is the availability of primary active logs. The LOGN exception indicates that fewer primary active logs exist than specified by the respective threshold value. This is considered a critical health indicator because it can cause a DB2 hang situation if the last primary active log becomes 100% full. Such a situation can only be resolved by making one or more additional primary active logs available again.

To monitor this situation and react accordingly, the automation policy has to be changed. First, define the session attributes for the OMEGAMON for DB2 monitor, if they do not yet exist, so that a VTAM® connection can be established. The OMEGAMON session is referred to by its session name. Then review the number of session operators (automation operators) that are started to handle the VTAM session traffic, and add another one if a higher degree of parallelism is required. Ensure that the number of session operators matches the number of predefined NetView tasks.

Next, add a new monitor resource (MTR) that periodically requests exception information from this OMEGAMON session. Link the MTR to the DB2 subsystem to be monitored by means of a HasParent relationship. This ensures that the MTR is activated when the DB2 subsystem is started, and deactivated when the DB2 subsystem is stopped. Also associate the MTR with the DB2 subsystem by means of a HasMonitor relationship so that the monitor's health status can be propagated to the application.
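The effect of the two relationships can be pictured with a small model. This is only an illustrative sketch in Python, not SA z/OS policy syntax: the class names, attribute names, and the resource name LOGN_MTR are hypothetical and were chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Hypothetical stand-in for a monitor resource (MTR)."""
    name: str
    active: bool = False
    health: str = "UNKNOWN"


@dataclass
class Subsystem:
    """Hypothetical stand-in for the monitored DB2 subsystem."""
    name: str
    running: bool = False
    # HasParent: monitors whose activation follows this subsystem
    dependents: list = field(default_factory=list)
    # HasMonitor: monitors whose health is reported for this subsystem
    monitors: list = field(default_factory=list)

    def start(self):
        self.running = True
        for mtr in self.dependents:
            mtr.active = True   # MTR is activated with its parent

    def stop(self):
        for mtr in self.dependents:
            mtr.active = False  # MTR is deactivated with its parent
        self.running = False

    def health_view(self):
        # Health of each associated monitor, as seen from the subsystem
        return {mtr.name: mtr.health for mtr in self.monitors}


db2 = Subsystem("DB2")
mtr = Monitor("LOGN_MTR")
db2.dependents.append(mtr)  # models the HasParent relationship
db2.monitors.append(mtr)    # models the HasMonitor relationship
```

In this model, starting or stopping the subsystem toggles the monitor, and any health status recorded on the monitor becomes visible through the subsystem.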

While the MTR is active, it uses the monitor command, INGMTRAP, to gather the OMEGAMON exceptions that currently exist, based on the thresholds that are defined in the OMEGAMON for DB2 installation profile. INGMTRAP analyzes all exceptions returned by OMEGAMON and selects only those that the MTR is interested in, which in this example is LOGN. SA z/OS subsequently issues message ING080I to initiate exception processing.
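The selection step can be sketched as a simple filter. The following Python fragment is an illustration only and does not reflect how INGMTRAP is implemented; the function name and the exception names EXC1 and EXC2 are placeholders invented for this example.

```python
def select_exceptions(reported, of_interest):
    """Keep only the reported exceptions that the monitor is interested in.

    reported    -- exception names returned by the monitoring session
    of_interest -- exception names the monitor resource watches for
    """
    wanted = {name.upper() for name in of_interest}
    return [name for name in reported if name.upper() in wanted]


# EXC1 and EXC2 are placeholders for unrelated exceptions
tripped = select_exceptions(["EXC1", "LOGN", "EXC2"], ["LOGN"])
```

Only a non-empty result would lead to further exception processing; an empty result means that none of the watched exceptions is currently reported.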

Finally, add a new rule to the NetView automation table (using the SA z/OS policy) that runs a REXX automation procedure to add a new log data set to the pool of primary active data sets whenever the LOGN exception is reported and the health status is CRITICAL (6). The MTR's health status is considered CRITICAL if the number of available primary active logs is equal to 1. If the LOGN exception is reported again in the next monitor interval, a second rule in the automation table sets the MTR's health status to FATAL (7), which triggers an application move because normal recovery handling appears to have failed. In addition, an alert is sent to the operator to report the situation. If the LOGN exception is no longer reported, the MTR's health status is set to NORMAL (3).
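The escalation logic of these rules can be summarized as a small state transition. The sketch below is written in Python for illustration; it is neither automation table syntax nor the REXX procedure itself. The status codes 3, 6, and 7 come from the text, whereas the function name, the action names, and the simplification that the first LOGN report both sets CRITICAL and starts recovery are assumptions made for this example.

```python
NORMAL, CRITICAL, FATAL = 3, 6, 7  # health status codes quoted in the text


def next_health(current, logn_reported):
    """Return (new_status, actions) for one monitor interval.

    Assumption: the first LOGN report sets CRITICAL and starts recovery;
    a report in a later interval while already CRITICAL escalates to FATAL.
    """
    if not logn_reported:
        return NORMAL, []                       # exception has cleared
    if current in (CRITICAL, FATAL):
        # Recovery did not help: escalate, move the application, alert
        return FATAL, ["move_application", "alert_operator"]
    # First report: try to recover by adding a log data set
    return CRITICAL, ["add_log_data_set"]


status, actions = next_health(NORMAL, True)     # first interval with LOGN
```

Calling the function once per monitor interval with the previous status reproduces the sequence described above: recovery attempt, then escalation if the exception persists, and a return to NORMAL once it clears.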

The health status assigned to the MTR by means of the automation table is propagated to the DB2 application that owns this MTR. Thus, you can see at a glance whether the DB2 subsystem is healthy.