Dataset I/O collection tuning
High CPU usage can be an issue in the OMEGAMON subsystem address space, specifically with the KDFSCOL and KDFSMIG modules.
Excessive CPU usage is likely when both the following conditions exist:
- The KDF_FM01_VOL parameter is set to "*", and
- The KDF_FM01_SAM_CNT and KDF_MSR_TRIP_CNT parameters have been set to a very small value
You can reduce CPU usage by tuning dataset I/O collection, which allows you to get millisecond response time information at the dataset level. Use one or both of the following methods:
- Enable or disable dataset I/O collection at the volume level. In general, you should monitor only volumes for which dataset response time is a critical issue or volumes that are known to have problems.
- Regulate dataset I/O collection by using parameters that specify when dataset level I/O monitoring starts for a volume. Apply these parameters whenever you need to monitor a large number of volumes.
Use PARMGEN to set the parameters that control the collection of dataset level I/O statistics for a device. Choosing values that limit the scope of dataset I/O monitoring helps reduce CPU usage.
You can reduce CPU usage by limiting your dataset I/O monitoring to specific critical volumes or jobs or by setting up monitoring to trigger only when response time is poor on a particular volume.
You can limit the volumes by specifying a volser (or volser mask) for parameter KDF_FM01_VOL, or a range of device addresses by using KDF_FM01_FIRST_DEV and KDF_FM01_LAST_DEV. If your volumes do not all fit under a single volser mask or device range, you can specify multiple rows of definitions, as in the following example:
KDF_FM BEGIN
KDF_FM01_ROW 01
KDF_FM01_VOL "TSO*"
KDF_FM01_SAM_CNT 1
KDF_FM02_ROW 02
KDF_FM02_VOL "DVP101"
KDF_FM02_SAM_CNT 1
KDF_FM03_ROW 03
KDF_FM03_VOL "BVG288"
KDF_FM03_SAM_CNT 1
KDF_FM END
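To check which volumes a set of rows like the one above would select, the following minimal Python sketch models the matching. It is an illustration only, not product code: the names `ROWS`, `volser_matches`, and `is_monitored` are hypothetical, and the mask semantics (a trailing `*` matches any suffix, anything else must match exactly) are an assumption based on the example values.

```python
# Hypothetical model of the three KDF_FMnn_VOL rows in the example above.
ROWS = ["TSO*", "DVP101", "BVG288"]


def volser_matches(mask: str, volser: str) -> bool:
    """Assumed semantics: a trailing '*' matches any suffix; otherwise exact match."""
    if mask.endswith("*"):
        return volser.startswith(mask[:-1])
    return volser == mask


def is_monitored(volser: str, rows=ROWS) -> bool:
    """Return True if any row's mask selects this volser."""
    return any(volser_matches(mask, volser) for mask in rows)


if __name__ == "__main__":
    for volser in ["TSO001", "DVP101", "DVP102", "BVG288"]:
        print(volser, is_monitored(volser))
```

Under these assumptions, TSO001, DVP101, and BVG288 are selected, while DVP102 is not, because no row covers it.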
To establish a triggering response time threshold that will cause monitoring to begin on any volume that exceeds it, you must use a combination of the MSR parameter (to specify the threshold millisecond response time) and MSRTARG (the number of times a volume must exceed the MSR value in 100 consecutive samples for monitoring to be turned on). Once monitoring begins for a volume, it will continue until 100 consecutive samples are taken in which the volume does not exceed the threshold. The following example will cause monitoring to begin on a volume when its response time exceeds 20 milliseconds 51 times in 100 consecutive samples:
KDF_FM BEGIN
KDF_FM01_ROW 01
KDF_FM01_VOL "*"
KDF_FM01_MON_STAT MSR
KDF_FM01_SAM_CNT 20
KDF_MSR_TRIP_CNT 51
KDF_FM END
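The start and stop behavior described above can be modeled with a short Python sketch. This is a simplified illustration under stated assumptions, not the product's implementation: the class `MsrTrigger` and its attribute names are hypothetical, and the "100 consecutive samples" are treated as a sliding window over the most recent samples, which the text does not confirm.

```python
from collections import deque


class MsrTrigger:
    """Hypothetical model of threshold-triggered dataset I/O monitoring."""

    def __init__(self, threshold_ms=20, trip_count=51, window=100):
        self.threshold_ms = threshold_ms    # threshold response time (ms)
        self.trip_count = trip_count        # exceedances needed to start
        self.window = window                # samples considered
        self.recent = deque(maxlen=window)  # True where a sample exceeded
        self.monitoring = False

    def sample(self, response_ms: float) -> bool:
        """Record one sample and return the resulting monitoring state."""
        self.recent.append(response_ms > self.threshold_ms)
        exceeded = sum(self.recent)
        if not self.monitoring:
            # Start once the threshold was exceeded often enough.
            if exceeded >= self.trip_count:
                self.monitoring = True
        elif len(self.recent) == self.window and exceeded == 0:
            # Stop after a full window with no exceedance at all.
            self.monitoring = False
        return self.monitoring


if __name__ == "__main__":
    trigger = MsrTrigger()
    states = [trigger.sample(30.0) for _ in range(51)]
    print(states[49], states[50])
```

With the values from the example (20 milliseconds, 51 times), the model turns monitoring on at the 51st slow sample and turns it off again only after 100 consecutive samples at or below the threshold.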
To specify dataset I/O monitoring only for specific jobs, define these jobs in the “Application Summary” workspace in the TEP and specify "I/O Monitor Status" = "Start" in the definition dialog. This saves resources by monitoring only the datasets used by the job (or jobs, if the definition uses a job name mask).