The InfoSphere Optim Query Capture and Replay reports help in comparing and analyzing how accurately a workload is replayed with respect to a baseline (original capture). During replay, it is possible for the underlying data in the database to be different than that of the baseline workload. The quality of replay data can have a major impact on how representative a workload replay is to that of the baseline workload. InfoSphere Optim Query Capture and Replay reports provide two major sections to validate and analyze the accuracy of a replayed workload.
- The Replay Results provides the accuracy of SQL and Transactions that are replayed. This gives indicators on how closely the replay transactions and SQL were executed compared to the baseline by keeping the Return code, and Rows Affected in context.
- The Response Time provides the performance indicators of replay run against the baseline.
The capture and replay statements are aggregated for execution count, total response time, and average response time by grouping them together as unique statements using the Statement Grouping attributes as shown on Figure 27. These settings are managed from the Report options screen which can be accessed by clicking the Options button, as shown in Figure 26, while generating a report.
The Statement text for dynamic statements (or Collection Package Section for static statements) is used as the default grouping attributes. Apart from these, the Report Options provides client info attributes, namely: Current Schema, Client Application Name, Client Accounting String, Client Workstation, and Client User ID, which can be opted to be used for Statement Grouping while aggregating the statements. If required, the client info attributes can be enabled for masking using the Report options screen. The client info values will be masked with an asterisk ‘*’ using the user provided Begin and End indexes, as shown in Figures 26 and 27.
Figure 26. Create report with options
Figure 27. Statement grouping and masking options for report
After a baseline and replay workload is identified for report generation, click Create Report to start the process. Prior to that, use Options and Advanced Options to configure the reporting parameters such as Statement grouping and Failure thresholds. These options are available in the Create Report dashboard as shown previously in Figure 26.
Generating a report using Create Report opens the Report Progress Details page. This page presents the comparison phases and provides the comparison status, metrics and logs. The report comparison consists of three steps.
During step 1, the captured and replayed records are read for comparison. This step can take time depending on the number of captured and replayed statements.
During step 2, captured and replayed records are compared. The comparison logic reads a set of captured statements and then starts comparing the replayed statements against the captured set. The total number of statements that are read or compared is indicated in the Status table. As the replayed statements are traversed and compared, they are marked as matched or unmatched. The unmatched statements are used to compute the failure rate and the status of this indicated in the status table.
After the comparison is completed, during step 3, the comparison statistics and status are persisted in the repository and a report is rendered.
The details of these steps are provided in more informational messages in the Report Log table, as shown in Figure 28.
Figure 28. Report operation progress bar with log
Clicking on a completed validation report will bring up the accuracy report page with the following three tabs.
- Replay Results
- Response Time
In the following sections you will learn about each of these report tabs in detail.
The Details tab displays the progress information for the validation process.
If the validation process is in progress, then the Process Status bar is updated dynamically to reflect the current validation status at regular time intervals, along with the number of statements compared and failure rate statistics at the given point of time.
The Report Log section displays the log messages that are generated during validation process.
This report also contains information about the Baseline and Replayed workloads, such as workload start time, workload end time, database details on which the workload was run, and so on.
The Replay Results tab, shown in Figure 29, displays the overview of comparison results between Baseline and Replay workloads for different categories of SQL statements and transactions.
Figure 29. Replay result summary report
(View a larger version of Figure 29.)
SQL and Transactions from replay workload are categorized into different types based on the status of their comparison with baseline. Comparison is based on the return code, number of rows returned, number of rows updated for a SQL statement or transaction during its replay execution when compared to that of baseline.
The report contains a separate metrics table for SQL statements and for transactions. The first row in each table represents the statistics for baseline workload.
This report also contains information about the Baseline and Replayed workloads such as workload start time, workload end time, database details on which the workload was run, and so on.
- The Unique column in the table shows the count of unique SQLs or Transactions encountered during workload comparison.
- The Executions column shows the total number of SQL or Transaction executions encountered under a particular category.
- The Percentage column indicates the percentage of executions in replay in comparison to total number of baseline executions.
Replay Results is a high-level summary report from which you can drill down into more granular reports for SQL statements and transactions. SQL statements and transaction metrics tables contain hyper links to drill down into detailed reports for each of the following categories.
- Matched SQL Replays: SQL statements with same return code, rows returned count, and rows updated count in the baseline and replay workload.
- Unmatched SQL Replays: SQL statements with different return code, rows updated count, and rows returned count in the baseline and replay workload.
- Different return code: SQL statements with different return codes in baseline and replay workload.
- Different number of rows returned: SQL statements with different number of rows returned in baseline and replay workload.
- Different number of rows updated: SQL statements with different number of rows updated in baseline and replay workload.
- SQL statements that did not replay: SQL statements which are part of baseline workload, but did not replay.
- New SQL Statements: SQL Statements that were not part of baseline workload, but are found in the replay.
Similarly, the transaction table will have hyper links specific to transactions, like Matched transaction replays, Unmatched transaction replays, Transaction that did not replay, New transactions, and so on.
The following sections cover the SQL statement and transaction drill down reports in detail.
In this section you will learn about drill down reports for matched SQL category. You can also drill down on other SQL categories, but this is not be covered in this article.
From the Replay Results tab, click the Matched SQL Replays hyper link to launch the aggregated SQL report in a new tab, as shown in Figure 30.
Figure 30. SQL drill down report
(View a larger version of Figure 30.)
This report shows the execution statistics for all the aggregated SQL statements which are matched between baseline and replay, in tabular format. Each aggregated SQL represents the multiple executions of the same statement in baseline versus replay runs. The statistics include number of baseline executions, number of replay executions, total and average response time details, total rows returned, total rows updated, and so on, along with package and client information. Aggregation of SQL statements might vary based on the grouping options you specified.
Each aggregated SQL listed in the table will have a statement identifier which is hyper linked. Clicking an identifier opens the Top-N executions report for a given aggregated statement, as shown in Figure 31.
Figure 31. SQL details report with top and bottom executions
(View a larger version of Figure 31.)
Note: Top-N denotes number of fastest/slowest instances for a statement/transaction found during the comparison process. N is the value of the third parameter, described under Advanced Options shown in Figure 36.
The TOP-N report primarily shows the longest and shortest executions of an aggregated statement in terms of replay response time. Each table contains n number of top executions displayed with statistics like Replay response time, Rows returned, and Rows updated. The number of top executions to be displayed in this report is decided by values set by you in Report Settings tab in Advanced Options.
This report also shows the literal replaced statement text in a text area at the top of the page.
Click Execution Identifier of any of the top-n executions in TOP-N tables to open the SQL Execution details report for that particular instance of statement execution, as shown in Figure 32.
Figure 32. SQL execution detail
(View a larger version of Figure 32.)
This report contains the detailed account of a statement execution in baseline and replay, compared side by side.
The top section of the report shows the original statement text that was used during baseline and replay SQL executions. The Host variables section lists out the host variables that were used during statement execution with details like host variable format, SQL Type, and host variable values.
The Execution Information table shows the comparison between baseline and replay for execution statistics such as response time, rows returned, rows updated, and return code. Similarly, the Application Information table shows the package and application information comparison between baseline and replay for the given statement execution.
In the Replay Results tab, Click the hyper links in the Transaction Replay Comparison Metric table to open the Aggregated Transactions report in a new tab. In this section you will go through the drill down reports for only matched Transactions. Other transaction reports can be drilled down in similar way.
As shown in Figure 33, you can click the Matched transaction replays hyper link to open the report that shows the list of aggregated transactions that are matched successfully between the baseline workload and the replay workload along with execution statistics.
Figure 33. Matched transaction report
(View a larger version of Figure 33.)
The table contains details such as number of statements in a transaction, number of baseline executions, number of replay executions, aggregated replay response time, average replay response time, and package and client information. Each aggregated transaction represents multiple executions of a transaction, grouped by options specified by you as explained in the previous section of this article.
You can click the hyper links in the Transaction Identifier column to open the TOP-n executions report for a particular aggregated transaction in a separate tab, as shown in Figure 34.
Figure 34. Transaction detail report
(View a larger version of Figure 34.)
This report contains the longest and shortest executions times of an aggregated transaction in terms of replay response time. The number of top executions to be displayed in this report is controlled by values that you set in the Report Settings tab in Advanced Options. The default value is 5 transactions.
The Transaction Details table contains information about transactions, such as first statement text, number of statements, and package and client information.
Click an Execution Identifier hyper link in longest or shortest response time tables to open the Transaction Details report for a particular transaction execution instance, as shown in Figure 35.
Figure 35. Specific transaction execution detail report
This report mainly contains the details of the individual SQL executions that are part of the transaction. It shows details such as replay response time, rows returned, rows updated, and return code for each SQL execution in a transaction. The Execution Information table at the bottom of the report shows the transaction execution statistics for baseline and replay runs.
You can drill down into a particular statement execution by clicking on a Statement Identifier hyper link in the SQL table. This opens the SQL Execution details report in a new tab, as shown previously in Figure 32.
The Response time report shown in Figure 36 contains different performance metrics and graphs which will help you to identify the performance improvements and regressions between the baseline and replay workloads.
Figure 36. Response time report
(View a larger version of Figure 36.)
The Cumulative Statement Response Time chart shows the comparison of cumulated response time for all the SQL statements executed in the baseline and replay workloads. The cumulative time does not include the wait time that is elapsed between any two statement executions. Thus it helps you in analyzing the overall performance of replay workload in comparison to that of baseline in terms of execution response time only.
The SQL Executions Over Time graph plots the trend of SQL execution count over elapsed time for the baseline and replay workloads. The SQL execution count samples are collected for each 30-second interval over the duration of workload elapsed time. Similarly, the Rows Returned Over Time graph plots the rows returned count over elapsed time for the baseline and replay workloads. These graphs are useful in debugging performance issues encountered in a workload by comparing the pattern of SQL execution rate and rows returned rate between baseline and replay.
The Metric table in the bottom left part of the report contains different performance statistics based on response time and workload elapsed time. The Total improvement metric shows the cumulative total of improvement time for all the replay SQL executions over baseline. Similarly Total regression shows the cumulative total of regression time for all the SQL that are regressed when compared to baseline.
The Response time difference shows the difference between baseline and replay cumulative response times for all SQL executions.
The next four rows in the table show the number of SQL statements and Transactions which have improved or regressed by a certain percentage value in replay execution over baseline execution. From the Report Settings tab in Advanced Options, you define the percentage value before using the Threshold for performance improvement or regression option, as shown in Figure 37.
Figure 37. Advanced options
You can change this option value and re-open the report to get the SQL and Transaction performance numbers reflected according to the new value.
Baseline elapsed time and Replay elapsed time metrics show the total time taken in executing the baseline and replay workloads. Unlike response time, elapsed time includes the wait time between any two SQL executions.
The Value column in the metric table shows the total number of SQL statements or Transactions that are regressed or improved in a workload replay. By clicking the hyper links, you can drill down into multi-level reports which provide insight into SQL statement and Transaction executions at aggregated and individual execution level. This includes the list of Top-N improved and regressed executions in terms of execution response time. These reports are useful in identifying sporadic execution issues in workloads, where the executions are off the mark during the workload replay.
Clicking the SQL Statements Improvement hyper link opens the SQL Improvements report which shows the list of improved aggregated SQL statements in a table, as shown in Figure 38.
Figure 38. SQL improvement report
(View a larger version of Figure 38.)
The table contains details such as baseline and replay total response time, number of executions, rows returned, and rows updated for each aggregated SQL statement.
Clicking a Statement Identifier for an aggregated statement opens the SQL Details report, as shown in Figure 39.
Figure 39. SQL details report
(View a larger version of Figure 39.)
This report primarily shows the Top-N improved and regressed SQL statement executions for that particular aggregated record. The Statement Text section at the top of the report shows the literal replaced statement text for the statement executions. The Application Information table contains client-related information.
Clicking on a particular execution identifier opens the SQL Execution details report in a new tab.
As shown previously in Figure 36, the performance metric table contains hyper links for drilling down into transaction improvements and regressions reports. The improvement hyper link opens the Transaction Improvements report. This report lists aggregated transactions that are improved in replay executions when compared to that of baseline, as shown in Figure 40.
Figure 40. Transaction improvement report
(View a larger version of Figure 40.)
The transaction table contains details related to response time differences between the baseline workload and the replayed workload, such as total response time change, percentage total response time change, and average response time change.
You can click the Transaction Identifier hyper link for an aggregated transaction to open a Transaction Details report in a new tab, as shown in Figure 41.
Figure 41. Transaction details report
This report shows the Top-N improved and regressed executions of that particular aggregated transaction in two separate tables. This report also contains a section which shows the details related to aggregated transactions such as first statement text, number of statements in the transaction, and client information.
You can drill down to the next levels of a report by clicking the hyper links provided for each transaction execution instance in the Top-N tables, which is similar to the transaction drill down reports that show the details related to all SQL executions in a particular transaction. From this report you can drill down into the SQL Details report where all information about an individual SQL execution is be displayed, as shown previously in Figure 32.
By following the steps outlined in this article, you can get familiar with IBM InfoSphere Optim Query Capture and Replay 1.1 for Linux, UNIX and Windows in a step-by-step manner, and compare workload characteristics between different environments. This will help you to have confidence in the changes made in a test environment before moving to production. You can also run repeatable test cases in a controlled and predictable manner.