Real-time monitoring and more cool features in the new InfoSphere Optim Performance Manager V5.1 for DB2 for Linux, UNIX, and Windows

IBM® InfoSphere® Optim® Performance Manager V5.1 for Linux®, UNIX® and Windows® is a major step forward in bringing the ability to monitor very large workloads spanning 100 plus partitions. Key to these enhancements are architectural changes including collecting state-of-the-art in-memory metrics for DB2® V9.7 or higher databases instead of relying on snapshots. Additionally InfoSphere Optim Performance Manager V5.1 provides a rich set of new monitoring features that are surfaced in the Web-based user interface like real-time monitoring, in-depth connection analysis and enhanced SQL analysis including static SQL and cross-baseline comparisons. Integration with InfoSphere Optim Query Workload Tuner assists in tuning multiple SQL statements at the same time. Enriched reporting on detailed recent performance data or on aggregated long-term historical data provide the capabilities you need for proactive tuning and capacity planning. For pureScale® or DPF performance analysis all dashboards now facilitate partition-level or member-level drill downs enabling easy viewing and comparing. Optim Performance Manager provides easy configuration of DB2 Workload Manager by deploying the newest best practices template to get up and running quickly with DB2 Workload Manager.

Ute Baumbach (bmb@de.ibm.com), Software Developer, IBM

Author photoUte Baumbach has been a software developer at the IBM lab in Germany for 18 years, where she has worked in various software development projects and roles. Most of her projects were based on DB2. For five years, she has worked as a member of the DB2 Performance Expert development team, now the Data Studio Performance Management development team. Ute is an IBM Certified Database Administrator and a Certified Application Developer for DB2 for Linux, UNIX, and Windows.



15 December 2011

Introduction

IBM InfoSphere Optim Performance Manager V5.1 helps businesses improve performance and reduce costs by giving database administrators (DBAs) and other IT staff the information needed to manage performance pro-actively to help do the following.

  • Prevent problems before they impact the business.
  • Save hours of staff time and stress.
  • Align monitoring objectives with business objectives.

Its Extended Insight capability, available as part of the InfoSphere Optim Performance Extended Edition or InfoSphere Optim Performance Manager Extended Insight, is unique in database performance monitoring, giving database professionals the visibility across the application stack to where database workloads are spending their time. It provides problem isolation to the appropriate layer of the stack, extensive introspection into database bottlenecks, and the ability to monitor response time objectives for the highest priority workloads.

InfoSphere Optim Performance Manager V5.1 was announced on 06 December 2011 and supports monitoring of DB2 for Linux, UNIX, and Windows V9 databases, including single partition, multi-partition, and pureScale databases. It is the follow-on version of Optim Performance Manager V4.1 which was introduced in the article "What's new in Optim Performance Manager Extended Edition for DB2 for Linux, UNIX, and Windows", available in the Resources section at the end of this article.

InfoSphere Optim Performance Manager offers a rich set of new monitoring features and delivers the flexibility and efficiency required to keep up with large transaction volumes from SAP accounts, to queries spanning 100 plus partitions. Central to the capabilities of V5.1 are key internal enhancements to how InfoSphere Optim Performance Manager accesses DB2 monitoring data and stores its data in the repository. When monitoring DB2 9.7 or later databases, InfoSphere Optim Performance Manager uses state-of-the-art DB2 in-memory metrics for accessing the majority of the performance data. This provides a lighter weight monitoring infrastructure than the snapshot monitoring still used on DB2 9.1 and 9.5 databases.

In addition, movement to the in-memory metrics has facilitated real-time data collection and sub-minute refreshes for live problem diagnosis.

When monitoring DB2 9.7, or later databases, InfoSphere Optim Performance Manager also uses a more efficient schema for storing performance data in the repository. This schema has been optimized to reduce the disk footprint for performance data, improve retrieval times, and facilitate rolling aggregations. The result is a single schema used for all performance data that feed the performance dashboards and reports alike.

To further reduce the disk footprint, the use of row compression is enabled and included in entitlement when InfoSphere Optim Performance Manager and DB2 Storage Optimization Feature are purchased together as part of a single offering such as DB2 Advanced Enterprise Server Edition, or InfoSphere Warehouse Advanced Enterprise Edition.

New features available in V5.1

In this article, you'll take a tour of the new and enhanced capabilities in InfoSphere Optim Performance Manager V5.1, including the following.


Automatic performance history aggregation and retention

Retaining long-term performance data is important for proactive tuning, trend detection, and capacity planning. Now it is easier than ever to retain performance data at 1 minute, 15 minute, 1 hour, and 1 day rolling aggregations. When you configure a database for monitoring, you choose how long to retain each aggregation level, as shown in Figure 1. Aggregation level 1 consists of the data as collected from the monitored database in the specified sampling interval. The sampling interval is set to 1 minute, therefore aggregation level 1 consists of data in 1 minute intervals.

Figure 1. Retention times and sampling intervals specification
screen shows inflight performance data collection settings and data retention settings

Figure 2 shows how the rolling aggregation works. The aggregation to the next level takes place as soon as the intervals are reached. For example, aggregation to level 2 takes place at 0, 15, 30, and 45 minutes past the hour. It takes all entries of aggregation level 1 within the 15 minutes and aggregates them to one single entry in aggregation level 2.

Figure 2. Rolling aggregation concept
shows how data is aggregated at various levels

The full range of retained data is viewable directly from the dashboards and from the reports. Depending on the reporting time frame you choose, InfoSphere Optim Performance Manager determines the optimal aggregation level to display on the dashboard or in the report. As shown in Figure 3, the dashboards in the time selection control let you can see the aggregation level that InfoSphere Optim Performance Manager is displaying.

Figure 3. Aggregation level of displayed data in time selection control
shows time selection

Seamless analysis of performance data from real time to any time

All inflight dashboards monitoring of DB2 9.7 or later databases that are using in-memory metrics now feature the ability to display data collected at any point in time, whether in real time, yesterday, last week, or last year. All dashboards give you access to the full range of collected information.

For example, you can switch to real-time monitoring if you want to analyze current performance problems and need to see the current activity on your database in sub-minute refresh rates. Both manual and automatic refreshes are possible on each inflight dashboard via the time selection control as shown in Figure 4. With each refresh, new monitoring data is collected from the monitored database and displayed, but not saved in the repository database.

Figure 4. Time selection control for real time mode
screen shows real-time data settings

In real-time mode, the monitored database is accessed in addition to the regular history collection. This increases the workload on the monitored database. InfoSphere Optim Performance Manger allows you to control the number of users that are allowed to do real-time monitoring of a database by introducing the new canMonitorInRealTime privilege that you can grant to users.

Access to long term historical data gives you access to performance metric trends. Expanding the duration of the viewed time frame, you can immediately see how a performance metric is affected by a configuration change or change in workload.


In-depth connection analysis

Use the new Connection Dashboard to identify and analyze your top connections and their activities and resource consumption either in real time or in history. You can identify the top connections by your selected top metric, for example CPU time.

Previously, figure 4 showed the list of the 100 highest connections by CPU time. In addition to CPU time, other key performance indicators are displayed in the grid. You can use the Choose Columns dialog to customize the grid to your needs. For example, if you list the top connections by Lock wait time, then you might be interested in other lock specific performance indicators as well and could add them to the grid, such as the number of Lock escalations, Deadlocks, Lock timeouts or Lock waits.

Select one of your top connections to analyze further. The lower part of the dashboard displays all the details of the connection arranged in different tabs, including application details, workload and service class information, server times, I/O details, row and transaction details, locking information, and utility execution. Most of the details are shown in graphical format for easy insight. Figure 5 shows you the Server Times details of the selected top CPU time connection. The different time distributions for processing times and wait times are shown in pie charts.

Figure 5. Connection dashboard
shows filter and action controls, button to launch a connection report, and displays details of top connections

(View a larger version of Figure 5.)

You can force a connection directly from the dashboard by clicking the Force button. Or you can create an in-context connection report directly from the dashboard by clicking the Connection Report link to share it with colleagues.

Click the Show Executed SQL button to analyze the top SQL statements that this connection is either executing now in real time, or has executed in the selected history time frame. The SQL Dashboard opens and lists the top individual SQL executions of the selected connection as shown in Figure 6.

Figure 6. Top SQL executions of selected connection
Statements of the selection connection only are displayed

(View a larger version of Figure 6.)


Enhanced SQL analysis

The new SQL Statements Dashboard makes it easy to identify the SQL taking the most resources, whether it is repeated short running queries or high cost analytical queries. You can view top individual executions of SQL statements (Top Individual Executions view) or aggregated metrics across executions (Execution Summary view). For both views you can identify the top SQL statements by your selected top metric, for example Execution Elapsed Time. Figure 7 shows the execution summary of the 20 highest statements by Execution Elapsed Time. In addition to the Execution Elapsed Time other key performance indicators are displayed in the grid. You can use the Choose Columns dialog to customize the grid to your needs.

Figure 7. Execution Summary view on SQL Statements Dashboard
screen includes view selection, filter and action controls, details of statements, and in-context switching between views

(View a larger version of Figure 7.)

If you monitor a DB2 V9.7 databases and therefore use the in-memory metrics for data collection, then both views collect and display static and dynamic SQL performance metrics. For the Top Individual Executions view, the data is collected from in-memory metrics functions that provide detailed information about currently executing statements. InfoSphere Optim Performance Manager calls these functions in the specified sampling rate or in real time. For the Execution Summary view, the data is collected from in-memory metrics functions that provide the statements and their aggregated execution details from the package cache.

Select one of your top SQL statements to analyze further. The lower part of the dashboard displays all the details of the statement arranged in different tabs, including the complete statement text, workload and service class information, server times, I/O details, row activities or locking information. As an example, shown previously, Figure 7 displays in the details section the statement text, the most recent identification of the statement in the package cache, and details about the most recent compilation of the statement, such as cost estimation (timerons).

You can quickly switch in context between the Top Individual Executions and Execution Summary. For example, from the Execution Summary view, select a statement and click the Show top individual executions of the selected statement link to analyze resource consumption differences across distinct executions based on different parameter values. Or from the Top Individual Executions view, select a statement and click the Show the execution summary of the selected statement link to analyze the complete resource consumption of this statement over all executions and to analyze the fraction of resource consumption of this statement compared to other statements.

You can tune your SQL statements by using the integration with InfoSphere Optim Query Workload Tuner 3.1. If you have Optim Query Workload Tuner installed on the machine from which you use the Performance Manager web interface, then you can click the Tune and Tune all buttons to either transfer a selected SQL statement or a complete list of SQL statements to Query Workload Tuner. For tuning single statements, it is still possible to use Optim Query Tuner V2.2 with fix pack 2 or later. For single statement formatting, access plan visualization, and statistics advice, you can use IBM Data Studio 3.1.

Integration with IBM InfoSphere Optim Configuration Manager lets DBAs quickly answer the question "What changed?" when there is a performance degradation. Configuration Manager helps organizations discover, manage and track their database and client inventory, configuration, and deployments. Click View Configuration Changes to launch Configuration Manager in context to view changes relevant changes so you can quickly determine whether recent changes may be the underlying cause of performance issues.

In addition to the SQL Dashboard enhancements, new reports are available to analyze static and dynamic SQL. The Top Package Report identifies the top resource consuming packages. The SQL Comparison Report lets you compare SQL performance across two time periods to identify SQL improvements and regressions across change deployments, tuning activities, or version migrations.

Drill down from a table to the SQL that uses the table

If you see hot tables in the Buffer Pool and I/O Dashboard that, for example, show a high number of rows read, then you can now select such a table and drill down to the SQL statements that use that table. Figure 8 shows the table view on the Buffer Pool and I/O Dashboard. Select the table with the highest number of rows read and click the Show SQL button.

Figure 8. Table view on the Buffer Pool and I/O Dashboard
Show SQL button in header of screen

(View a larger version of Figure 8.)

The Execution Summary view of the SQL Dashboard opens when a filter is set using the table name that you selected before. All SQL statements having the table name in the statement text are shown on the SQL Dashboard as shown in Figure 9 and can be analyzed further.

Figure 9. SQL statement using a selected table
shows list of SQL statements

(View a larger version of Figure 9.)


Enriched reporting for proactive tuning and capacity planning

InfoSphere Optim Performance Manager 5.1 introduces a set of new predefined reports. In addition, the reports available in 4.1 have been enhanced to exploit the in-memory metrics data. Most predefined reports are interactive and therefore are like multiple reports in one. For example, from the new Performance Overview report, you can launch detail reports for a package, an SQL statement, a table space or a table.

You can report over any time frame of your available historical data and you can now create the reports not just from the InfoSphere Optim Performance Manager web user interface, but also directly from the command line from any machine. This allows scheduled creation of reports. Reports can be printed or exported to xls, ppt or PDF formats.

The following are new and enhanced reports.

  • The Database Connection Report is an overview of the active database connections for a given time frame. The report displays key performance indicators, such as lock wait times, physical and logical reads and writes, and other connection statistics. This report can identify applications that are not performing well or applications that are causing problems in other database applications. You can drill down into a specific connection to view complete identification details, timing information, SQL activity, locks, cache, buffer pool, sorts, and agent-related activity.
  • The Performance Overview Report shows the overall health of the monitored database for all major metrics. You can view the complete database system or only parts of it.
  • The SQL Baseline Comparison Report lets you compare top SQL statements in the same database regarding the maximum improvement or regression, or both, within two time frames. You can drill down for a detailed analysis of a specific SQL statement. The report includes the complete statement text, general statement information, response time analysis, sort performance, I/O activity, and buffer pool activity. You can identify problems with insufficient system resources, or run InfoSphere Optim Query Workload Tuner to analyze problems with the access plan, or to evaluate inadequate optimizer statistics or indexes.
  • The Table Usage Report lets you identify hot or fast-growing tables that might cause disk contention or that might need reorganization.
  • The Top SQL Package Report allows you to identify top resource consumers by package name in a specified time interval. You can look at a graphic representation of the workload by day to identify heavy duty, critical, or rogue packages.
  • The Workload Manager Overview Report is enhanced with additional drill down into Service SuperClass, SubClass, and Work Action Set reports including details on queue time, execution time, lifetime, and histograms for advanced tuning of Workload Manager configurations. The Service SuperClass, SubClass, and Work Action Set reports can now also be launched from the Report launchpad.
  • The Top SQL and Disk Consumption reports contain additional enhancements requested by users. For example if you monitor a DB2 V9.7 database using in-memory metrics, the Top SQL report now shows static and dynamic SQL.

Figure 10 shows an example of the SQL Baseline Comparison Report. In the first part, it lists the five statements most regressed by Average CPU Time in the report interval compared to the baseline interval. The average CPU times per statement execution are shown for the report interval, and the comparing baseline interval and the changes are indicated by red bars. The second part lists the five statements most improved by Average CPU Time, and the changes are indicated by green bars. Additionally the report lists the other statements that were executed only either in the reporting interval or in the comparing baseline interval.

Figure 10. SQL Baseline Comparison report
shows areas of regression and improvements

(View a larger version of Figure 10.)

Click a statement for a detailed comparison of the performance metrics to understand why the statement regressed or improved. The details report for this statement opens and compares each single execution metric within the two intervals. As an example, Figure 11 shows the part of the details report where the execution times of a regressed statement are compared within the two intervals.

Figure 11. Baseline comparison details per statement
report interval pie chart shows statement processing and other processing time

(View a larger version of Figure 11.)

To tune a regressed statement or view the access plan for an improved statement, you can directly launch InfoSphere Optim Query Tuner or InfoSphere Optim Query Workload tuner from the details report for this statement.


Enhanced partition-level and pureScale performance analysis

All inflight dashboards facilitate partition-level analysis, enabling easy viewing of performance metrics across all partitions, a single partition, or a partition role. You can configure partition categories called roles for ease of viewing metrics across similar partitions such as coordinator, data, and Extraction, Transformation, and Loading (ETL). Similarly, if you monitor a pureScale database, you can view details across a data sharing group or for specific members.

Figure 12 shows you the partition/member picker control that you find on each inflight dashboard to select the partition, partition role or member of interest. By default the dashboards display performance metrics aggregated across all partitions or members.

Figure 12. Partition or member picker on inflight dashboard
Member picker lists all members, role: data partition, role: coordinator partition, and members 1, 2, and 0

If you use the default view and display the performance metrics across all partitions it is easy to identify skews. For any graph you can open the partition details and see the average value of this metric over the specified interval for each partition. This way you can identify the partition with the highest average value and then select this partition in the partition picker for further analysis. An example is shown in figure 13. The Transaction Throughput graph shows the transactions across all partitions for the specified interval. In the partition details you see the average value for Transactions and Failing Transactions for each partition.

Figure 13. Partition details for skew identification
shows transaction throughput graph and transaction throughput partition details

Performance metrics are displayed in grids by object on the various inflight dashboards. An object can be an SQL statement on the SQL Dashboard, a connection on the Connection Dashboard, or a buffer pool, table space or table on the Buffer Pool and I/O Dashboard. You can drill down to see the selected object across all partitions and therefore can compare the performance metrics for this object across partitions easily to identify skews. Figure 14 shows a drill-down example for a single buffer pool. The grid shows the buffer pool activity for the buffer pool IBMDEFAULTBP on each active partition. In this example more logical and physical reads take place on partition 0 than on the other partitions.

Figure 14. Buffer pool activity partition comparison
table shows buffer pool name, partition ID, page size, I/O time, pool hit ratio, logical page I/O, physical pages read

For pureScale performance analysis, new pureScale metrics are introduced in InfoSphere Optim Performance Manager 5.1. These include Cluster Caching Facility (CF) monitoring metrics like the following.

  • Group Buffer Pool Hit Ratio per connection, statement, buffer pool or table space
  • CF locking information, CF requests/time on connection or statement level
  • Page reclaim information
  • CF configuration parameters in database and database manager reports

New health alerts can notify DBAs or others of CF or member failures.

Optim Performance Manager V4.1.1 had previously introduced global monitoring metrics for pureScale CFs including the following.

  • CP CPU and memory utilization
  • Group Buffer Pool Hit Ratio per database
  • CF lock timeouts, lock escalations and transaction lock wait time per database

Easy configuration of DB2 Workload Manager

See the Resources section at the end of this article for information about how to use InfoSphere Optim Performance Manager to deploy the enhanced DB2 Workload Manager best practices template to get up and running quickly with DB2 Workload Manager for warehouse environments. Results have shown that the majority of performance improvements result from straight-forward deployments of DB2 Workload Manager. Deploy the default template, tune service class boundaries based on histograms of activity mappings, and enable governance to prevent rogue queries from taking over systems. For advanced users, the Workload Manager configuration interface now accepts any existing Workload Manager configuration without modification, and exposes all features and options for modification.


Operation and usability improvements

From up and running improvements, to new operational alerts, to internal efficiency improvements, to dashboard customizations and usability improvements, to compression enablement, this release is packed with features that you will value. Two examples of these improvements are illustrated in the following section.

Operational alerts are created if InfoSphere Optim Performance Manager detects issues during its operation that would lead to incomplete or not continuous collection of data. Operational issues are now surfaced in the web user interface on the Alert List, and can therefore be detected and solved fast. The alert details and descriptions help you to either solve the issue yourself or to give the details to the IBM support team. Figure 15 shows an example of an operational alert. In this example a statistics event monitor is already running on the monitored database which prevents InfoSphere Optim Performance Manager from starting another one.

Figure 15. Operational alert on the alert list
alert description highlighted

(View a larger version of Figure 15.)

On the Extended Insight Dashboard, as shown in Figure 16, bar charts are now available that graphically show you the distribution of the average response time over the client, network and data server layers. This allows you to identify, at a glance, where most of the time is spent for your database transactions workload. If you specify response time alert thresholds, then you also see the percentage of normal, warning alerts, and critical alerts as bar chart as well, which allows you to quickly identify workloads that don't meet service level objectives, as well as the ability to drill down into further detail.

Figure 16. Usability enhancement on Extended Insight Analysis Dashboard
response time alerts highlighted

(View a larger version of Figure 16.)


Packaging

InfoSphere Optim Performance Manager is available in the following five offerings.

  • InfoSphere Optim Performance Manager Enterprise Edition, Workgroup Edition and Content Manager Edition provide data server monitoring with integrated problem identification and notification, problem diagnosis, performance reporting, and DB2 Workload Manager configuration.
  • InfoSphere Optim Performance Manager Extended Edition provides all the capabilities of Enterprise Edition plus Extended Insight monitoring which includes the following.
    • Database transaction response time monitoring: Extended Insight captures the transaction when it begins at the database client and tracks its response time characteristics across the transaction.
    • Automated DB2 Workload Manager configuration adjustments based on response time objectives: Extended Insight adds new configuration options for DB2 Workload Manager. It enables users to set response time objectives for workloads and let InfoSphere Optim Performance Manager adjust resource allocation to achieve those objectives.
    • Integration with Tivoli offerings: Extended Insight also provides the correlation needed to integrate InfoSphere Optim Performance Manager with Tivoli offerings, specifically IBM Tivoli Composite Application Manager for Transactions, IBM Tivoli Composite Application Manager for Application Diagnostics, and IBM Tivoli Manager for Servers. Integration enables InfoSphere Optim Performance Manager contextual launching within the Tivoli Enterprise Portal.
  • InfoSphere Optim Performance Manager Extended Insight provides the Extended Insight feature capabilities as an add-on to InfoSphere Optim Performance Manager Enterprise Edition when purchased at a later time.

Conclusion

This article described the new enhancements in InfoSphere Optim Performance Manager V5.1 that help businesses to improve performance and reduce costs by giving database administrators (DBAs) and other IT staff the information needed to manage performance proactively to help with the following.

  • Prevent problems before they impact the business
  • Save hours of staff time and stress
  • Align monitoring objectives with business objectives

This version is packed with value for all our customers improving monitoring scalability for multi-partition and pureScale customers, real-time monitoring and historical analysis, DB2 Workload Manager deployment support, in depth connection and SQL analysis, and more.

Resources

Learn

Get products and technologies

  • For a look at health monitoring capability included in Optim Performance Manager, download the no-charge Data Studio Health Monitor.
  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=780949
ArticleTitle=Real-time monitoring and more cool features in the new InfoSphere Optim Performance Manager V5.1 for DB2 for Linux, UNIX, and Windows
publish-date=12152011