Technical Blog Post
Abstract
A common performance problem of IBM Tivoli Monitoring (ITM) Data Warehousing and how to tune for it
Body
The short-term history files stored in the local file system grow continuously because the data cannot be warehoused fast enough. The checklist and tuning guidelines below address this problem.

Efficient data warehousing question list

What data need to be collected and used? Predict and control the size of historical data:
1. Start with your use cases. Consider the use cases for each attribute group when configuring historical collection.
2. Gather ONLY the data you need from ONLY the systems you need it from (a sizing note follows this list).
   Example: 1000 identical web servers in a farm may need historical data, for performance planning purposes, from only a few systems.
   Example: Filter the history collection for PROCESS with VSIZE > 50 MB.
3. Process the data only as you need it.
   - Standard TCR reports and ITPA use only hourly and daily summarization.
   - Adaptive/dynamic thresholding uses only detailed data.
   - Never summarize high-volume, low-repeatability data unless you have a use case that demands it.
4. Keep the data for ONLY as long as you need it.
5. Manage the Warehouse database.
   - There is no substitute for DBA input.
   - Use the Load Projections spreadsheet and keep it up to date as requirements and the environment change.
   - Monitor the Warehouse database for problems.
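As a hedged illustration of how much filtering can save (all numbers here are hypothetical): if each of 1000 servers runs about 200 processes and only about 20 of them exceed 50 MB of virtual size, the VSIZE filter cuts the collected PROCESS rows by roughly 90 percent.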
Warehouse load projections
1. Get the ITM Data Warehouse Load Projections spreadsheet and its documentation.
2. Estimate the amount of disk space and network throughput required for historical data collection and warehousing (a worked example follows this list):
   - Total TDW inserts per hour
   - Total TDW MB of data inserted per hour
   - Total GB of data in the TDW
3. Estimate the disk space required for short-term historical data on the agent.
4. Find the largest tables, which make up most of the warehouse data.
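As a hedged, purely illustrative calculation (every figure is hypothetical): 1000 agents that each write 50 rows per 15-minute collection interval generate 1000 × 50 × 4 = 200,000 TDW inserts per hour; at roughly 500 bytes per row, that is about 100 MB of inserted data per hour, or about 2.4 GB per day before summarization and pruning.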
Historical data location
1. Collect short-term historical data at the agent (TEMA) rather than at the TEMS.
2. Performance impacts when collecting data at the TEMS:
   - The data gathering step suffers in large-scale environments with several hundred agents connected to one TEMS.
   - One large chunk of data is warehoused from a single box instead of much smaller chunks from multiple boxes.
   - All agents write to one file per attribute group, which slows workspace response time for short-term historical data.
   - It demands large disk storage, CPU, and memory on the TEMS.
   - It reduces the number of agents the TEMS can manage.
   - Access to the short-term historical data is lost during failover to a secondary TEMS.

WPA placement
1. Typically place the Warehouse Proxy agent (WPA) on the same LAN segment as the Warehouse database to allow the best throughput to the database.
2. Use multiple WPAs when the number of agents collecting historical data exceeds 1500.
3. The recommended deployment is one WPA on each remote TEMS (RTEMS).
   - If the primary RTEMS goes down and the agent fails over to the secondary RTEMS, the agent can upload historical data through the WPA on the secondary RTEMS.
   - This limits the number of agents that upload historical data through a single WPA.

Multiple WPAs
1. Multiple WPAs provide greater scalability and performance, as well as a failover mechanism.
2. All WPAs must be configured to connect to the hub TEMS, not an RTEMS.
3. Use KHD_WAREHOUSE_TEMS_LIST in the configuration file of each WPA to specify the list of TEMS it serves; the agents directly connected to those TEMSes send historical data to that WPA (a sketch follows this list).
4. A TEMS name must be configured in only one KHD_WAREHOUSE_TEMS_LIST.
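A minimal sketch of the relevant line in a WPA environment file (hd.ini on UNIX and Linux, KHDENV on Windows). The TEMS names RTEMS_A and RTEMS_B are hypothetical, and the list delimiter can vary by ITM version, so verify it against the Installation Guide:

   KHD_WAREHOUSE_TEMS_LIST=RTEMS_A RTEMS_B

With this setting, only agents that report to RTEMS_A or RTEMS_B export their historical data through this proxy.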
How the WPA collects and transfers data to the warehouse: WPA tuning
1. Exporter threads
   Exporter threads remove export buffers from the work queue, prepare SQL statements, and insert the data into the warehouse through JDBC/ODBC.
   KHD_EXPORT_THREADS – number of exporter threads
   KHD_CNX_POOL_SIZE – number of database connections
   - Both default to 10.
   - Configure these two variables to the same value.
   - If you have multiple WPAs, the default values can cause performance bottlenecks at the warehouse; consider reducing the number of exporter threads at each proxy.
2. Work queue
   KHD_QUEUE_LENGTH – size of the work queue
   - Default is 1000.
   - Set it to at least the number of clients that regularly upload data to the WPA.
   - If the Warehouse Proxy log files show a significant number of rejected requests, consider increasing the value.
3. NCS listen threads
   CTIRA_NCSLISTEN – number of Network Computing System (NCS) listen threads allocated to process incoming RPCs.
   - Default is 10; maximum is 256.
   - Increase this value to improve concurrency, starting from 10 NCS listen threads per exporter thread.
   - Increase the value when numerous RPC errors appear in the WPA's log file.
4. Batch inserts
   KHD_BATCH_USE=Y enables the WPA to submit multiple execute statements to the warehouse database at once.
   - Using batch inserts is recommended and is the default setting.
5. File descriptors (AIX or Linux)
   Make sure nofiles is set higher than the number of agents that upload data through the WPA:
   # ulimit -n
6. Java heap size (AIX or Linux)
   Extend the Java heap, for example:
   KHD_JAVA_ARGS=-Xms256m -Xmx768m
7. Logging
   Disable logging to the WAREHOUSELOG table by setting KHD_WHLOG_ENABLE=N. Enable the following tracing instead; it records the agent, attribute group, and number of rows exported, which provides useful information for performance analysis with negligible overhead:
   KBB_RAS1=ERROR (UNIT:khdxdbex OUTPUT)
A consolidated example of these settings follows.
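A consolidated sketch of the tuning variables above, with hypothetical values sized for a single proxy serving roughly 2000 agents (tune against your own measurements rather than copying these numbers):

   KHD_EXPORT_THREADS=10
   KHD_CNX_POOL_SIZE=10
   KHD_QUEUE_LENGTH=2000
   CTIRA_NCSLISTEN=100
   KHD_BATCH_USE=Y
   KHD_JAVA_ARGS=-Xms256m -Xmx768m
   KHD_WHLOG_ENABLE=N

KHD_EXPORT_THREADS and KHD_CNX_POOL_SIZE are kept equal, KHD_QUEUE_LENGTH matches the number of uploading clients, and CTIRA_NCSLISTEN follows the guideline of 10 listen threads per exporter thread while staying under the maximum of 256.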
Data Warehouse tuning
1. Separate the largest tables into their own table spaces or data files.
2. See the "Tivoli Data Warehouse tuning" chapter of the Tivoli Management Services Warehouse and Reporting Redbook.
3. See the "Optimizing and performance" chapter of the IBM Tivoli Monitoring: Implementation and Performance Optimization for Large Scale Environments Redbook.

RAS1 tracing
- Update the S&P (Summarization and Pruning agent) configuration file to set RAS1 tracing.
- Update the WPA configuration file to set RAS1 tracing (a sketch follows).
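For the WPA environment file, a minimal sketch reuses the trace entry given in the tuning list above:

   KBB_RAS1=ERROR (UNIT:khdxdbex OUTPUT)

The S&P configuration file takes a KBB_RAS1 entry of the same form, with the trace units appropriate to that agent.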
Reference Materials
- "Managing historical data" – IBM Tivoli Monitoring Administrator's Guide
- ITM Installation Guide, chapter 18, "Performance tuning – Tivoli Data Warehouse"
UID
ibm11278352

