IBM Transaction Analysis Workbench for z/OS Can Help Identify Delays and Problems
'Dear Rocky' answers your System z monitoring and performance tuning questions.
Rocky McMahan, a Senior Software Performance Engineer in z System Software R&D, offers valuable tips to help you get more out of your z System monitoring software.
IBM Transaction Analysis Workbench for z/OS is a tool for investigating transaction problems on z/OS, especially transactions that use multiple subsystems including CICS, Db2, IMS and MQ. It exploits the historical log data collected during normal transaction processing. Transaction Analysis Workbench can correlate the log and trace data from each subsystem to provide an end-to-end lifecycle view of the transaction, to help identify where delays and problems occur.
Transaction Analysis Workbench takes a collaborative, multi-faceted approach to problem solving. The session manager is the focal point for problem determination and is where each new problem is registered. All activity associated with the problem is attached to the session, giving all stakeholders access to the same information.
The session manager can be customized to perform most activities associated with problem determination:
A Typical Problem
The help desk has opened a problem ticket because customers are complaining about a slowdown in online shopping. The CICS-Db2 programs at the heart of the distributed web application form part of the investigation into this problem. The systems programmer must determine if there was a problem processing transaction and database requests on z/OS and, if so, identify where the problem occurred. A Transaction Analysis Workbench session is registered and the SMF and Db2 log files are located for the time period of the problem. The investigation begins.
Batch reporting remains an important first step in problem determination. Transaction Analysis Workbench provides some basic SMF reporting to complement your favorite specialist reporting tools. For example, CICS Performance Analyzer reports can be requested from the session and the output saved in the session, making it available to everyone with an interest in the problem. Reports tailored to the types of workloads you run in production can quickly pinpoint potential problems. Figure 1 is a CICS PA report that shows the five worst-performing CICS transactions that use Db2 and IMS.
Program  #Tasks  Response       CPU       CPU       CPU   Elapsed   Suspend  Calls  Calls  ..   Elapsed
SHOP       1658  17.57423    .06942   .029315   .040108  17.47884  1.110752    950     48  ..  462.1441
SEARCH      168  91.22778  10.47707  1.477533  9.780253  81.95536   .182136   6551  69867  ..  359.0706
BASKET      198  20.91302    .21291   .003037   .210384  20.73922   .089030     27      3  ..  329.1382
ORDER       568  40.67416   2.00776   .185389  1.823447  15.11097   .155891    352  14396  ..  250.6128
COMPARE     280  92.00324    .54857   .028909   .522165   4.84645   .003765      2   3462  ..  170.1749
Figure 1: CICS PA Report
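To show how a "top N" report like this singles out candidates, here is a minimal Python sketch that ranks the Figure 1 rows. It is not how CICS PA produces the report. The field names (tasks, response, last_elapsed) are assumptions, because the figure's column headings are abbreviated; last_elapsed simply means the final Elapsed column.

```python
# Rows transcribed from Figure 1. Only the columns used here are kept.
# Field names are assumptions: the figure's headings are abbreviated.
ROWS = [
    # (program, tasks, response, last_elapsed)
    ("SHOP", 1658, 17.57423, 462.1441),
    ("SEARCH", 168, 91.22778, 359.0706),
    ("BASKET", 198, 20.91302, 329.1382),
    ("ORDER", 568, 40.67416, 250.6128),
    ("COMPARE", 280, 92.00324, 170.1749),
]


def top_n(rows, column, n=5):
    """Return the n rows with the largest value in the given column index."""
    return sorted(rows, key=lambda row: row[column], reverse=True)[:n]


# Rank by the final Elapsed column, then by the Response column.
by_last_elapsed = [row[0] for row in top_n(ROWS, column=3)]
by_response = [row[0] for row in top_n(ROWS, column=2, n=2)]

print(by_last_elapsed)
print(by_response)
```

The choice of ranking column matters: SHOP leads on the final Elapsed column, while COMPARE and SEARCH have the highest Response values.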
Analytics is an ideal method for visualizing problems over long or short periods of time—trends and exceptional conditions are easily identified. The log data collected for problem analysis can be fed into analytics, where charts often provide clues for identifying the problem. Dashboards provide the perfect mechanism for analyzing the various logs from all participating subsystems, often allowing you to answer the perennial question very quickly: which subsystem is the likely cause of the problem? Transaction Analysis Workbench provides log forwarding to analytics platforms including Splunk and Elastic, as well as a series of dashboards that can help you analyze the data.
The following Splunk dashboard provides a combined CICS and Db2 perspective of transaction activity. This single dashboard consists of five charts:
Figure 2: CICS-Db2 Transaction Performance Dashboard
As Figure 2 above shows, clicking and dragging over the offending time period in a chart lets you quickly drill down to locate the offending transactions. The drill-down can be seen in Figure 3 below.
Figure 3: CICS-Db2 Transaction Table Row
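Conceptually, the drill-down amounts to a time-window filter followed by a sort. The Python sketch below illustrates that idea only; it is not the Splunk search behind the dashboard. The record layout, the timestamps, and the response times are all invented for illustration.

```python
from datetime import datetime

# Hypothetical transaction summary records:
# (end time, transaction, response time in seconds). All values are invented.
RECORDS = [
    (datetime(2024, 1, 1, 10, 0, 5), "SHOP", 0.21),
    (datetime(2024, 1, 1, 10, 1, 10), "SEARCH", 0.35),
    (datetime(2024, 1, 1, 10, 2, 30), "ORDER", 12.80),
    (datetime(2024, 1, 1, 10, 2, 45), "SHOP", 17.40),
    (datetime(2024, 1, 1, 10, 3, 5), "BASKET", 0.18),
    (datetime(2024, 1, 1, 10, 5, 0), "SEARCH", 0.40),
]


def drill_down(records, start, end, limit=3):
    """Keep records inside [start, end], slowest first, at most `limit` rows."""
    selected = [rec for rec in records if start <= rec[0] <= end]
    return sorted(selected, key=lambda rec: rec[2], reverse=True)[:limit]


# The "click and drag" selection becomes a start and end time.
worst = drill_down(
    RECORDS,
    start=datetime(2024, 1, 1, 10, 2, 0),
    end=datetime(2024, 1, 1, 10, 4, 0),
    limit=2,
)
for end_time, tran, response in worst:
    print(f"{end_time:%H:%M:%S} {tran:<8} {response:.2f}s")
```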
Analytics also provides a modern graphical means for sharing information about transaction problems on z/OS with your colleagues, including application developers and managers, who may not have a deep understanding or appreciation of the complexity of transaction activity on z/OS. Operations analytics helps bring what happens on the mainframe to a wider audience in your enterprise.
Interactive Problem Determination
Perhaps the most important feature of Transaction Analysis Workbench is its interactive log browser. All logs and traces associated with the problem can be merged and viewed as a single logical browse session. Productivity aids for time navigation and record filtering help to identify rogue transactions, and once identified, “transaction tracking” then correlates and displays only those records associated with the selected transaction.
The offending CICS-Db2 transaction is quickly identified. Tracking the transaction reveals that it processed across four MRO CICS regions, and one of the regions used Db2 (see Figure 4 below).
Figure 4: CICS-Db2 Transaction Tracking Interface
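The merged browse session and transaction tracking described above can be pictured as two steps: interleave the per-subsystem logs by timestamp, then keep only the records that carry the selected transaction's identifier. The Python sketch below assumes a made-up record layout and a made-up tracking token ("T1", "T2"). The real product correlates records using subsystem-specific identifiers, which are not modeled here.

```python
import heapq

# Hypothetical, simplified records: (timestamp, source, token, description).
# Each per-subsystem log is assumed to be sorted by timestamp already.
CICS_LOG = [
    (1.000, "CICS", "T1", "task start"),
    (1.002, "CICS", "T2", "task start"),
    (1.950, "CICS", "T1", "task end"),
]
DB2_LOG = [
    (1.010, "Db2", "T1", "SQL call begins"),
    (1.900, "Db2", "T1", "SQL call ends"),
]


def merged_view(*logs):
    """Lazily interleave several time-sorted logs into one time-ordered stream."""
    return heapq.merge(*logs, key=lambda rec: rec[0])


def track(records, token):
    """Keep only the records that belong to the selected transaction."""
    return [rec for rec in records if rec[2] == token]


lifecycle = track(merged_view(CICS_LOG, DB2_LOG), "T1")
for ts, source, _, text in lifecycle:
    print(f"{ts:.3f} {source:<4} {text}")

# The largest gap between consecutive events hints at where the time went.
gaps = [(b[0] - a[0], a[3], b[3]) for a, b in zip(lifecycle, lifecycle[1:])]
longest = max(gaps)
print(f"longest gap: {longest[0]:.3f}s between '{longest[1]}' and '{longest[2]}'")
```

In this invented data, the longest gap falls between the start and end of the SQL call. That is the kind of clue the lifecycle view is meant to surface.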
Typically, traces such as the CICS auxiliary trace and Db2 performance trace aren’t activated in production. However, sometimes the problem is too complex to diagnose without them.
Traces can be activated for short periods of time while attempting to reproduce the problem.
When the traces are available, the interactive session is greatly enhanced. Transaction tracking incorporates the trace events with the normal log events to provide a more complete lifecycle picture of what occurred. The SQL call that is causing the delay is easily identifiable (see Figure 5 below).
Figure 5: End-to-End View of Transaction Lifecycle
Find the Problem
Transaction Analysis Workbench can analyze many types of log and trace data that are generated by the major z/OS subsystems. Information is presented in friendlier ways that make problem determination easier and relevant to more people in your enterprise.
The log browser is also an educational tool, a teaching aid for understanding how your applications work across the various subsystems.
Finally, Operations Analytics offers a new perspective on performance data, enabling you to quickly and more confidently answer the age-old question: where is the problem?