Historical data collection causes CPU spikes
Article Date: 01 Nov 2009
'Dear Rocky' answers your System z monitoring and performance tuning questions.
Rocky McMahan, a Senior Software Performance Engineer in Tivoli R&D, offers valuable tips to help you get more out of your Tivoli monitoring software.
Our IBM® Tivoli® OMEGAMON® XE for Mainframe Networks V4.10 monitoring agent spikes in CPU usage once a day, and the spike lasts for over 10 minutes! Any idea what could be the problem? - Spike
Dear Spike: It appears you are collecting a large amount of historical data using the Tivoli OMEGAMON XE for Mainframe Networks V4.10 agent. As you can imagine, especially in a large IT shop, collecting data every five minutes (the default collection interval for the OMEGAMON XE for Mainframe Networks agent) on each of your TCP/IP connections adds up quickly. Collecting data for each TN3270 server session and for TCP/IP details on top of that produces a very large amount of data.
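To get a feel for the volumes involved, here is a rough back-of-the-envelope sketch. The figure of 10,000 connections is made up for illustration, the function name is hypothetical, and the model assumes one history row per monitored resource per collection interval, which is a simplification of what the agent actually writes.

```python
MINUTES_PER_DAY = 24 * 60


def rows_per_day(monitored_resources: int, interval_minutes: int) -> int:
    """Estimate history rows written per day.

    Simplifying assumption: one row per resource per collection interval.
    """
    samples_per_day = MINUTES_PER_DAY // interval_minutes
    return monitored_resources * samples_per_day


if __name__ == "__main__":
    connections = 10_000  # hypothetical number of TCP/IP connections
    for interval in (5, 15, 60):  # collection intervals in minutes
        total = rows_per_day(connections, interval)
        print(f"{interval:>2}-minute interval: {total:,} rows per day")
```

At the five-minute default, each monitored resource contributes 288 samples a day, so the row count grows quickly with the number of connections.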
Although your example applies to Tivoli OMEGAMON XE for Mainframe Networks, the following solution applies to all IBM z/OS®-based OMEGAMON agents.
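Historical data collection can be configured at the Tivoli Enterprise Portal. The IBM Tivoli Monitoring-based OMEGAMON platform provides two types of historical data collection: short-term history, which is kept in the persistent data store, and long-term history, which is stored in the Tivoli Data Warehouse.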
The collection interval for historical data can be configured differently for short- and long-term history in Tivoli Monitoring V6.2. You can configure a short-term historical data collection interval of 1, 5, 15, 30, or 60 minutes, or 1 day. The interval for sending the data to the Tivoli Data Warehouse for long-term storage can be configured for 24 hours, one hour, or off.
If you have configured long-term history to write every 24 hours, that’s likely the cause of your CPU spike: a full day of collected data, often from very large tables, is read and sent to the warehouse in a single pass. Change to a warehousing interval of one hour to spread the writing across 24 hours. Each write then handles about one hour of data instead of a full day, which will reduce the duration of each CPU spike.
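The effect of the warehousing interval can be sketched with some simple arithmetic. This is only an illustration: the function name and the 10,000-connection figure are hypothetical, and it assumes one row per resource per collection interval. Actual CPU cost depends on your system and is not modeled here.

```python
def upload_profile(resources: int, collection_minutes: int, warehouse_hours: int):
    """Return (uploads per day, rows per upload).

    Simplifying assumption: one row per resource per collection interval,
    and every upload carries all rows collected since the previous one.
    """
    uploads_per_day = 24 // warehouse_hours
    samples_per_upload = warehouse_hours * 60 // collection_minutes
    rows_per_upload = resources * samples_per_upload
    return uploads_per_day, rows_per_upload


if __name__ == "__main__":
    connections = 10_000  # hypothetical number of TCP/IP connections
    for hours in (24, 1):  # the two warehousing intervals discussed above
        uploads, rows = upload_profile(connections, 5, hours)
        print(f"every {hours:>2} h: {uploads:>2} uploads/day, {rows:,} rows each")
```

The total written per day is the same either way. The hourly setting simply trades one long burst for 24 short ones, each carrying 1/24 of the rows.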
In addition, there is a fix available (APAR OA25646) that can significantly reduce CPU usage when writing data from a persistent data store. Our performance team has seen CPU reductions of up to 70 percent when large attribute groups in the persistent data store are collected and written to the Tivoli Data Warehouse.
Computer science always seems to involve tradeoffs – and historical data collection is no different. In this case, collecting a large amount of data results in higher CPU usage. The key is to find the correct balance for your situation.