Typical performance review questions

Use the following questions as a basis for your own checklist when carrying out a review of performance data. Many of these questions can be answered by performance reporting packages such as CICS® Performance Analyzer or IBM Z® Decision Support for z/OS®.

Some of the questions are not strictly about performance. For instance, if the transaction statistics show a high frequency of transaction abends with usage of the abnormal condition program, there might be sign-on errors and, therefore, a lack of terminal operator training. This situation is not a performance problem, but it is an example of the additional information that monitoring can provide.

What are the characteristics of your transaction workload?
1. Has the frequency of use of each transaction identifier altered?
2. Does the mix vary from one time of the day to another?
3. Should statistics be requested more frequently during the day to verify this?
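A different approach must be taken in the following cases, where the transaction identifier alone does not identify the function being performed: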
- In systems where all messages are channeled through the same initial task and program (for user security routines, initial editing or formatting, statistical analysis, and so on)
- For conversational transactions, where a long series of message pairs is reflected by a single transaction
- In transactions where the amount of work done relies heavily on the input data.
In these cases, you must identify the function by program or data set usage, with appropriate reference to the CICS program statistics, file statistics, or other statistics. In addition, you might be able to put user tags into the monitoring data (for example, a user character field in the case of the CICS monitoring facility), which can be used as a basis for analysis by products such as CICS Performance Analyzer for z/OS, or IBM Z Decision Support.
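As an illustration of the workload-mix questions above, the following minimal Python sketch compares each key's share of total executions between two measurement intervals and reports the keys whose share moved by more than a chosen threshold. The key can be a transaction identifier or, in the cases described above, a program name or user tag. The function names, the 5% threshold, and the sample counts are hypothetical; in practice the counts would come from your own statistics or monitoring reports.

```python
def mix_shares(counts):
    """Return each key's share of the total number of executions."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}


def mix_changes(previous, current, threshold=0.05):
    """Return keys whose workload share changed by more than `threshold`.

    `previous` and `current` map a key (transaction ID, program name,
    or user tag) to an execution count for one interval.
    """
    prev_shares = mix_shares(previous)
    curr_shares = mix_shares(current)
    changes = {}
    for key in sorted(set(prev_shares) | set(curr_shares)):
        delta = curr_shares.get(key, 0.0) - prev_shares.get(key, 0.0)
        if abs(delta) > threshold:
            changes[key] = round(delta, 3)
    return changes


# Hypothetical counts for two intervals of the same day.
morning = {"ORD1": 600, "INQ1": 300, "UPD1": 100}
afternoon = {"ORD1": 300, "INQ1": 600, "UPD1": 100}

print(mix_changes(morning, afternoon))
```

Comparing shares rather than raw counts keeps the check meaningful when the overall volume differs between the two intervals.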
What is the usage of the telecommunication lines?
1. Do the CICS terminal statistics indicate any increase in the number of messages on the terminals on each of the lines?
2. Does the average message length on the CICS performance class monitor reports vary for any transaction type? This can easily happen with an application where the number of lines or fields output depends on the input data.
3. Is the number of terminal errors acceptable? If you are using a terminal error program or node error program, are there any line problems?
What is the DASD usage?
1. Is the number of requests to file control increasing? Remember that CICS records the number of logical requests made. The number of physical I/O operations depends on the configuration of indexes, and on the data records per control interval and the buffer allocations.
2. Is intrapartition transient data usage increasing? Transient data involves a number of I/O operations depending on the queue mix. Review the number of requests made to see how it compares with previous runs.
3. Is auxiliary temporary storage usage increasing? Temporary storage uses control interval access, but writes the control interval out only at sync point or when the buffer is full.
What is the virtual storage usage?
1. How large are the dynamic storage areas?
2. Is the number of GETMAIN requests consistent with the number and types of tasks?
3. Is the short-on-storage (SOS) condition being reached often?
4. Have any incidents been reported of tasks being purged after deadlock timeout interval (DTIMOUT) expiry?
5. How much program loading activity is there?
6. From the monitor report data, is the use of dynamic storage by task type as expected?
7. Is storage usage similar at each execution of CICS?
8. Are there any incident reports showing that the first invocation of a function takes much longer than subsequent ones? This situation can occur if programs are loaded that then need to open data sets, particularly in IMS. Can a change in application design rectify the problem?
What is the processor usage?
1. Is the processor usage as measured by the monitor report consistent with previous observations?
2. Are the batch jobs that are planned to run able to run successfully?
3. Is there any increase in usage of functions running at a higher priority than CICS? Include MVS readers and writers, MVS JES, and z/OS Communications Server if they run above CICS, and include overall I/O, because of the lower-priority regions.
What is the coupling facility usage?
1. What is the average storage usage?
2. What is the link utilization?
Do any figures indicate design, coding, or operational errors?
1. Are any of the resources heavily used? If so, was this situation expected at design time? If not, can the heavy usage be explained in terms of heavier usage of transactions?
2. Is the heavy usage associated with a particular application? If so, is there evidence of planned growth or peak periods?
3. Are browse transactions issuing more than the expected number of requests? In other words, is the count of browse requests issued by a transaction greater than what you expected users to cause?
4. Is the CICS CSAC transaction (provided by the DFHACP abnormal condition program) being used frequently? If so, is this occurring because invalid transaction identifiers are being entered? For example, errors are signaled if transaction identifiers are entered in lowercase on IBM® 3270 terminals but automatic translation of input to uppercase has not been specified.
  A high use of the DFHACP program without a corresponding count of CSAC could indicate that transactions are being entered without correct operator signon. This situation might indicate that some terminal operators need more training in using the system.
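The browse question in item 3 above can be checked mechanically. The following minimal Python sketch divides the number of browse requests by the number of tasks for each transaction and flags any transaction whose average exceeds the design expectation by more than a tolerance factor. The function name, the transaction identifiers, the expected values, and the 1.5 tolerance are all hypothetical.

```python
def browse_outliers(observed, expected_per_task, tolerance=1.5):
    """Flag transactions that issue more browse requests than expected.

    `observed` maps a transaction ID to (task_count, browse_requests).
    `expected_per_task` maps a transaction ID to the number of browse
    requests per task that was assumed at design time.
    """
    outliers = {}
    for tranid, (tasks, browses) in observed.items():
        if tasks == 0 or tranid not in expected_per_task:
            continue  # nothing to compare against
        per_task = browses / tasks
        if per_task > expected_per_task[tranid] * tolerance:
            outliers[tranid] = per_task
    return outliers


# Hypothetical figures for two browse transactions.
observed = {"BRW1": (200, 9000), "BRW2": (100, 1100)}
expected = {"BRW1": 20, "BRW2": 10}

print(browse_outliers(observed, expected))
```

With these sample figures, only BRW1 is reported, because it averages 45 browse requests per task against an expected 20.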

In addition, regularly review certain items in the CICS statistics, such as:

- Times the MAXTASK limit is reached (transaction manager statistics)
- Peak tasks (transaction class statistics)
- Times cushion is released (storage manager statistics)
- Storage violations (storage manager statistics)
- Maximum number of RPLs posted (z/OS Communications Server statistics)
- Short-on-storage count (storage manager statistics)
- Wait on string total (file control statistics)
- Use of DFHSHUNT log streams
- Times auxiliary storage is exhausted (temporary storage statistics)
- Buffer waits (temporary storage statistics)
- Times string wait occurred (temporary storage statistics)
- Times NOSPACE occurred (transient data global statistics)
- Intrapartition buffer waits (transient data global statistics)
- Intrapartition string waits (transient data global statistics)
- Times the MAXSOCKETS limit is reached (TCP/IP statistics)
- Pool thread waits (Db2® connection statistics)
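A regular review of these items can be partly automated. The following minimal Python sketch compares a set of counters against review limits chosen by the installation and returns the ones that exceed their limit. The counter names are hypothetical labels for the items listed above, not actual CICS statistics field names, and the limits and values are invented for illustration.

```python
def flag_counters(stats, limits):
    """Return (name, value, limit) for each counter above its limit.

    `stats` maps a counter name to the value from the latest interval.
    `limits` maps a counter name to the highest value that is
    acceptable without further review.
    """
    flagged = []
    for name, limit in limits.items():
        value = stats.get(name)
        if value is not None and value > limit:
            flagged.append((name, value, limit))
    return flagged


# Hypothetical review limits and one interval of hypothetical values.
limits = {
    "times_maxtask_reached": 0,
    "storage_violations": 0,
    "short_on_storage_count": 0,
    "file_string_waits": 50,
}
stats = {
    "times_maxtask_reached": 3,
    "storage_violations": 0,
    "short_on_storage_count": 1,
    "file_string_waits": 12,
}

for name, value, limit in flag_counters(stats, limits):
    print(f"{name}: {value} exceeds review limit {limit}")
```

A flagged counter is a prompt for investigation rather than proof of a problem; appropriate limits depend on the workload and on previous observations.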

Review the effects of and reasons for system outages and their duration. If there is a series of outages, there might be a common cause.