Welcome to the System z Management Blog, where you can read perspectives from System z experts. This blog provides insights into the System z solution, as well as technical details about specific IBM products.
Many systems programmers spend hours debugging problems that involve looping or hung transactions. By working together, z/OS and CICS systems programmers can resolve these problems much faster by combining the information available from OMEGAMON XE on z/OS and OMEGAMON XE for CICS on z/OS.
A typical scenario is that the z/OS systems programmer identifies an address space with a high (or low) CPU Percent. Most often, this is in response to a situation alert or a reported problem. Starting with the Address Space Overview workspace in OMEGAMON... [More]
Deadlocks and timeouts are unresolved contentions in DB2. These unresolved contentions might cause incomplete transactions and might degrade the overall performance of your system, so it is important to monitor and resolve them. Tivoli OMEGAMON XE for DB2 Performance Expert on z/OS (OMPE) is invaluable in detecting and analyzing DB2 for z/OS deadlocks and timeouts. The following link takes you to some of IBM's recommended best practices for using OMPE to identify where these are occurring and what steps to take to resolve the... [More]
The following lists my recommendations for monitoring WebSphere Message Broker or IBM Integration Bus with OMEGAMON XE for Messaging:
For monitoring WebSphere Message Broker v7 and v8, and IBM Integration Bus v9, use v7.1.0 of the WebSphere Message Broker monitoring agent. That is FMID HKQI710 on z/OS. You'll need the latest PTF levels for later broker releases, which include PTFs UA71110 and UA69753. On distributed platforms for the same agent in ITCAM for Applications, this level is equivalent to the... [More]
ITCAM for Transactions data collectors on z/OS have filtering mechanisms for selecting portions of workloads for tracking and ignoring the rest. The overhead incurred by ITCAM for Transactions in a z/OS address space can be reduced, often significantly, through workload filtering where only units-of-work that are of interest are selected for tracking and the rest are excluded. This can be especially important where the overhead for ITCAM for Transactions to track an entire workload is considered too costly.
With ITCAM for Transactions... [More]
'Dear Rocky' answers your System z monitoring and performance tuning questions.
Rocky McMahan, a Senior Software Performance Engineer in Tivoli R&D, offers valuable tips to help you get more out of your Tivoli monitoring software.
My company is a large healthcare organization with ~5,000 employees that provides patient care to over 7,500 people and had revenue of 8 billion dollars last year. We are in the process of migrating several of our large patient... [More]
Why monitor CSA and ECSA? It's simple: when common storage (CSA, ECSA, SQA, and ESQA) is exhausted, a system outage will occur. When ESQA is exhausted, further requests for ESQA are allocated from ECSA. When ECSA is exhausted, further requests are allocated from the much smaller CSA storage. The main reasons why common storage shortages occur:
Not enough common storage is allocated
Activity on the system exceeds previous levels, resulting in additional common storage being allocated
An address space allocates... [More]
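The overflow order described in this post (ESQA requests spilling into ECSA, then ECSA requests spilling into CSA) can be pictured with a toy model. This is only a sketch of the ordering: it is not a z/OS or OMEGAMON interface, the `allocate` helper is hypothetical, and the sizes are made-up values in KB.

```python
# Toy model of the overflow order described above. Not a z/OS API.
# All sizes are invented for illustration, in KB.
OVERFLOW = {"ESQA": "ECSA", "ECSA": "CSA"}  # where a request goes when an area is full


def allocate(free_kb, area, size_kb):
    """Satisfy a request from `area`, following the overflow chain if needed.

    Returns the name of the area that satisfied the request, or None when
    every area in the chain is exhausted.
    """
    current = area
    while current is not None:
        if free_kb[current] >= size_kb:
            free_kb[current] -= size_kb
            return current
        current = OVERFLOW.get(current)  # None once the chain ends at CSA
    return None


free_kb = {"ESQA": 100, "ECSA": 300, "CSA": 50}
first = allocate(free_kb, "ESQA", 80)    # fits in ESQA
second = allocate(free_kb, "ESQA", 80)   # ESQA too small, spills to ECSA
third = allocate(free_kb, "ESQA", 200)   # still satisfied from ECSA
fourth = allocate(free_kb, "ESQA", 40)   # ECSA too small, spills to CSA
fifth = allocate(free_kb, "ESQA", 40)    # nothing left anywhere in the chain
print(first, second, third, fourth, fifth)
```

The last request fails because the much smaller CSA has already absorbed the spillover, which is the condition the post warns leads to an outage.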
In this scenario, OMEGAMON XE on z/OS 5.1.1 alerts when SP13 is not connected to the zAware appliance. Without that connection, zAware could miss an anomaly on SP13 that could cause a major issue. The subject matter expert (SME) is alerted when OMEGAMON sees that zAware is not connected for SP13. The SME checks and sees that the log stream was not connected to zAware at the latest IPL, as it should have been. The SME connects SP13 to zAware, and now OMEGAMON can alert if zAware finds an anomaly.
Now that OMEGAMON XE on z/OS V5.1.1 has been announced, you can be aware when your zAware has found an issue. The US announcement letter is here:
IBM Tivoli OMEGAMON XE on z/OS V5.1.1 offers new visibility and proactive notifications of IBM System z Advanced Workload Analysis Reporter (zAware), which is an integrated, self-learning, analytics solution for IBM z/OS that helps identify unusual... [More]
For OMEGAMON and ITM customers, we have provided a new sample workspace to use in the e3270ui. The KOBSITEC workspace allows ITM Situation Events to show on the e3270ui. This workspace is part of PTF UA67643. Once it is installed, call the workspace to see the current ITM Situation status on your e3270ui. Check out the example display. Invoke the workspace with "=KOBSITEC" on the command line. Proactive monitoring is so important to allow situations to run on each system and alert you... [More]
With OMEGAMON XE on z/OS, the KM5_CPU_Loop_Warn situation alerts you to address spaces on an LPAR that may be in a CPU loop. This unique function allows you to find these jobs before they use lots of your valuable CPU processor time. With the default settings of this situation, I have seen too many false-positive results while monitoring our LPARs. To greatly reduce these false positives, I edited the situation and changed its formula, raising the warning limit from 95% to 98%. ... [More]
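To see why raising the warning limit trims the alert list, here is a minimal sketch. It does not reproduce the actual KM5_CPU_Loop_Warn formula, which is not shown in this post; the job names, the CPU percentages, and the assumption that the situation fires at or above the limit are all hypothetical.

```python
# Hypothetical CPU-percent samples per address space. These names and
# numbers are invented; they are not output from OMEGAMON.
samples = {"JOBA": 96.5, "JOBB": 99.2, "JOBC": 95.0, "JOBD": 80.0}


def flagged(cpu_by_job, limit):
    """Return the jobs whose CPU percent is at or above `limit`, sorted by name."""
    return sorted(job for job, cpu in cpu_by_job.items() if cpu >= limit)


at_95 = flagged(samples, 95.0)  # behaviour with the default limit
at_98 = flagged(samples, 98.0)  # behaviour with the raised limit
print(at_95)
print(at_98)
```

With these invented samples, three jobs trip the 95% limit but only one trips 98%. Busy jobs that are not actually looping drop out of the alert list, at the cost of warning slightly later on a genuine loop.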
Here is an issue I ran into today. Let's look at the attached PDF file:
Process Gone Wild PDF
On page 2, I saw two alerts this morning: one for a possible CPU looping address space on LPAR SYS and a second one for a USS process using a large amount of CPU time. Let's look at these issues on LPAR SYS.
On page 3, we go to OMEGAMON XE on z/OS, look at LPAR SYS, and ask which address space may be looping. It is CTG90GP. Now I can cancel this right here but I... [More]
In the attached PDF we show how OMEGAMON XE on z/OS 5.1.0 can save a z/OS LPAR from doom by warning that the USS process pool is close to full.
On slide 2 the e3270ui Situation Event Console workspace alerts me that the SP14 USS Process pool is over 90% full. If it gets to 100% the LPAR is in trouble and may not be able to continue to process work.
On slide 3 you can see that I then went to look at the USS processes in use on SP14. Who is using most of the 282 processes out of the maximum of 300?
On slide 4 we... [More]
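The numbers quoted above (282 processes in use out of a maximum of 300, with an alert at over 90% full) work out as follows. This is just the arithmetic as a small sketch; the function and variable names are illustrative and are not OMEGAMON attributes.

```python
def pool_utilization(in_use, maximum):
    """Return how full the pool is, as a percentage."""
    return 100.0 * in_use / maximum


WARN_PERCENT = 90.0      # alert level mentioned in the post
used, limit = 282, 300   # values quoted from the SP14 example

percent = pool_utilization(used, limit)
alert = percent > WARN_PERCENT
headroom = limit - used  # processes left before the pool is full

print(f"{percent:.1f}% full, alert={alert}, {headroom} processes left")
# prints "94.0% full, alert=True, 18 processes left"
```

At 94% full, only 18 more processes can start before the pool reaches the 100% mark that the post says puts the LPAR in trouble.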
Check out the OMEGAMON XE on z/OS 5.1.0 video showing a problem-solving scenario, "Resolving a Looping CPU Job issue on zOS Issue Using OMEGAMON XE 5.1.0 Enhanced 3270 Interface Workspaces"
Video Resolve CPU Issue
Do you ever wonder who is using the most resources in your SYSPLEX right now? With OMEGAMON XE on z/OS 5.1.0, the Enhanced 3270 interface has a workspace that shows the top 10 consumers in your SYSPLEX. It lists the top 10 jobs (address spaces) currently using CPU, Real Storage, Virtual Storage, CSA, ECSA, and I/O, as well as those with Enqueue conflicts and the worst Service Class Performance indexes. Here is the first page of the workspace; you can scroll down for the rest or scroll down in each sub panel to see all 10. Quick view of that... [More]