IBM InfoSphere Optim Query Capture and Replay 1.1 for Linux, UNIX, and Windows, Part 1: Introduction to OQCR

IBM® InfoSphere® Optim™ Query Capture and Replay (IOQCR) 1.1 for Linux®, UNIX® and Windows® enable an organization to create a production-like data application test environment where changes can be tested and tuned before being deployed into production. InfoSphere Optim Query Capture and Replay captures all of the application workload running against a production database and replays it against a test database without the need to replicate the entire application infrastructure. It not only replays both dynamic and static SQL, but also reproduces the number of client connections and their properties, the timing and order of execution, transaction boundaries and isolation levels, and many other critical features of workload execution. The result is a much closer approximation of production workloads and greater confidence that when changes are deployed, they will not disrupt production.

Share:

Amitava Kundu (amitavakundu@in.ibm.com), Senior Software Engineer, IBM China

Author photo of AmitavaAmitava Kundu is a Senior Software Engineer working in the IBM Information Management division at India Software Lab. He’s the architect and lead for InfoSphere Optim Query Capture and Replay reporting area. Amitava is also responsible for IBM InfoSphere Optim Performance Manager reporting solution. Prior to that he was part of IBM Rational team leading its Project Management tooling and solutions. Overall, Amitava has 19 years of IT related experience in different industry segments and functional areas. Amitava is a PMP from PMI since 2005.



Rajesh Sambandhan (rasamban@in.ibm.com), Advisory Software Engineer , IBM

Rajesh SambandhanRajesh Sambandhan is an advisory software engineer on the IBM Information Management team, India Software Labs. He has been in the industry for more than seven years, working primarily in J2EE and Test Driven Development.



Anantanag Bhat (anantbha@in.ibm.com), Test Specialist, IBM China

Author photo of AnantanagAnantanag Bhat is a test specialist on the IBM Information Management team, India Software Labs. He has been in the industry for more than nine years, working primarily on automation and manual testing. He has been involved in InfoSphere Optim Query Capture and Replay FVT extensively.



Mangesh Shanbhag (manshanb@in.ibm.com), Staff Software Engineer, IBM

Mangesh Shanbhag photoMangesh Shanbhag is a staff software engineer working with IBM India for past five and a half years. Since then he has been associated with Information Management tools development in India Software Lab. He has seven years of total experience in software product development and life cycle. Currently, he is working with the Common Application Development team in ISL. He has a B.E. in computer Science from VTU Karnataka (India). In his free time, he loves to read books, watch movies, and is an avid sports fan.



Sangeeta Ravindran (saravind@us.ibm.com), Staff Software Engineer, IBM China

Author photo of SangeetaSangeeta Ravindran is a Staff Software Engineer working at the IBM Silicon Valley Lab in San Jose, CA. She has been with IBM for seven years, working mainly on building application development tools for seamless integration of IBM data servers with Visual Studio .Net. Currently, she is the development lead for the InfoSphere Optim Query Capture and Replay web user interface.



20 September 2012

About this tutorial

Organizations continuously go through hardware/platform/software version upgrades. These requirements come from many different sources and affect stakeholders such as Line of business executives, application owners, IT managers, and testing teams. Often these kinds of upgrades are not managed well in a sense that many problems due to system in-correctness and/or performance issues are found post roll-out, and are expensive and disruptive to fix.

InfoSphere Optim Query Capture and Replay 1.1 for Linux, UNIX, and Windows lets you capture a traces of a workload that is running on a production database, or on any other target database. This is achieved in a very unobtrusive fashion with ultra-low overhead. You can then replay the workload to mimic the characteristic of the original workload on a different database that has the same or similar schemas and data volume. Before replaying the workload, you have the option to change the values for schema names and user names as these might differ between the capture database and the replay database for security reasons.

Once the workload has been replayed, you can run reports to validate and compare the results between the captured workload and the replayed workload. Reports provide the following kinds of information.

  • The first part of a report shows you a measure of the accuracy between two executions, for example how many SQL statements are identical in terms of return code and number of rows affected. It shows missing SQL statements, and statements that are new in the replayed workload. Similar information is shown for transactions where all SQL signatures and sequences are considered for comparison.
  • In the second part of a report, the relative performance of the captured and replayed SQL statements and transactions are compared in terms of total elapsed time. Only the successful (matching) SQL statements and transactions are considered for comparison. A user can view the improvement and regression of SQL statements and transactions by a predefined percentage.

You can remove failed SQL statements from the report and then replay the modified workload again and compare the results. The report feature lets you remove some or all SQL statements by error code, and create a new replay-ready workload to compare the workloads, as shown in Figure 1.

Figure 1. InfoSphere Optim Query Capture and Replay workflow
InfoSphere Optim Query Capture and Replay workflow showing the different steps a user needs to perform

Objectives

Part 1 of this tutorial provides an overview of the InfoSphere Optim Query Capture and Replay 1.1 for Linux, UNIX, and Windows, along with a sample step-by-step approach on how to configure, capture, transform, replay, and generate a report for the same workload in two different environments.

Use cases

There are many use cases for InfoSphere Optim Query Capture and Replay 1.1 for Linux, UNIX, and Windows. A few of the major ones are listed as follows.

  • Migration: Upgrade the database operating system or apply a fix pack. Understand potential performance impacts, identify bottlenecks, or rebind impacts.
  • More data: Test the impact of increased data volume to maintain stable performance as the data increases.
  • More users: Speed up replay to simulate increase in concurrent users. Continue to meet Service Level Agreements as the number of users increases.
  • Tuning: Physical database design changes, for example new indexes. Identify how a new index will impact existing access plans and overall performance.

Components and deployment architecture

InfoSphere Optim Query Capture and Replay 1.1 for Linux, UNIX, and Windows consists of an InfoSphere Guardium appliance that hosts the InfoSphere Capture Replay server. In addition, Guardium provides the S-TAP component which sits on the monitored database and collects inbound and outbound SQL-related data to and from the database server.

The diagram in Figure 2 shows a high-level overview of the InfoSphere Optim Query Capture and Replay 1.1 components.

Figure 2. InfoSphere Optim Query Capture and Replay architecture
InfoSphere Optim Query Capture and Replay architecture, showing different components and their interactions

Note: In Figure 2, the steps for Data Privacy, Test Data Subset, and Catalog and Configuration replication from production to test environment are manual processes. They are not part of the current product features, and are included here as part of the overall solution.

Configuration

InfoSphere Optim Query Capture and Replay for Linux, UNIX, and Windows can be installed and configured by performing the following tasks, which are not covered in this tutorial.

  • Installation of IBM InfoSphere Guardium Version 8.2: The first step is the installation of the InfoSphere Guardium appliance which provides the base for capturing and replaying workloads. The detailed steps can be found in the Installing IBM InfoSphere Guardium Version 8.2 topic.
  • Installation of InfoSphere Optim Query Capture and Replay license: To enable capture and replay, the InfoSphere Optim Query Capture and Replay license needs to be activated. The detailed steps can be found at: Installing the IBM InfoSphere Optim Query Capture and Replay license topic.
  • Installation of InfoSphere Capture and Replay: The next step is installation of the InfoSphere Capture Replay patch and other required software patches on the InfoSphere Guardium appliance by using the InfoSphere Guardium command-line interface (CLI). The detailed steps can be found at: Installing InfoSphere Capture Replay topic.
  • Installation of a capture and replay policy: To capture and replay DB2-to-DB2 workloads by using the InfoSphere Capture Replay web console, the Capture and Replay - DB2-to-DB2 policy must be installed and activated. The detailed steps can be found at: Installing a capture and replay policy for your environment topic.
  • Installation of IBM InfoSphere Guardium S-Tap: In order to use a database server for capturing or replaying workloads, InfoSphere Guardium S-TAP must be installed on it. InfoSphere Guardium S-TAP is a lightweight software agent that monitors database traffic and sends data access information to an InfoSphere Guardium appliance. The detailed steps can be found at: Installing and configuring IBM InfoSphere Guardium S-TAP on your database servers topic.
  • Configuring your environment to capture and replay DB2 for Linux, UNIX, and Windows SQL workloads: After installing InfoSphere Optim Query Capture and Replay (Listed previously), the following steps must be completed to configure the environment for capture and replay:
    • Add inspection engines for databases. Inspection engines make the SQL statements and transactions that are captured by InfoSphere Guardium S-TAP available for the capture-replay engine. The InfoSphere Guardium web console can be used to add an inspection engine for each database instance that will be used to capture or replay workloads.
    • Add database connections. To use the InfoSphere Capture Replay web console to capture and replay workloads, you must add database connections to the DB2 databases that have been configured for capture and replay.
    • Configure InfoSphere Capture Replay for web console authentication. To support multiple users, InfoSphere Capture Replay must be configured for web console authentication by adding a repository database and configuring authentication for the users and groups of that database. The detailed steps can be found at: Configuring IBM InfoSphere Optim Query Capture and Replay topic.

Once these tasks have been completed, you can use the InfoSphere Optim Query Capture and Replay for Linux, UNIX, and Windows web console to capture workloads from a database, replay them on other databases, and compare the results of the execution.

Supported browsers

You can open InfoSphere Optim Query Capture and Replay web console using one of the following browsers.

  • Firefox (v 3.5 and above)
  • Internet Explorer (v 8.0 and above)

More information about resource requirements and supported platform is given in the Resources section at the end of the article.


The InfoSphere Optim Query Capture and Replay web console user interface navigation structure is shown in Figure 3.

Figure 3. User interface navigation diagram
User interface navigation diagram

Web console

By default, no login is required to access the InfoSphere Optim Query Capture and Replay web console. However to support multiple users and configure the console for user authentication, a repository database must be defined, as shown in Figure 4.

Figure 4. InfoSphere Optim Query Capture and Replay web console
InfoSphere Optim Query Capture and Replay web console

Task launcher

The Task launcher is the first page that is displayed. This page provides quick launch links for all the stages of the capture replay workload process stages. The key tasks and the getting started tasks can also be accessed from the Openmenu.

Key tasks

This provides links to all the main tasks to manage a capture and replay workload life-cycle as shown previously in Figure 4.

  • Capture: Capture a DB2 workload opens the Capture dialog where required information can be specified to create a new workload to capture traffic on a DB2 database.
  • Transform: Transform a captured DB2 workload opens the Transform Dialog which enables transformation of a captured workload to a replay-ready workload.
  • Replay: Replay a DB2 workload opens the Replay dialog. Here you can select a replay-ready workload and replay it on a DB2 database.
  • Guardium web console: Open the Guardium web console link will allow you to log on to the Guardium web console. Here you can configure the S-TAP agent for the capture and replay database.

Getting started

The Databases tab provides links to all the administration/setup tasks, as shown in Figure 5.

Figure 5. Monitored database screen
Monitored Database Screen
  • Help: "Steps for capturing and replaying DB2 workloads" opens help with detailed instruction on how to use InfoSphere Optim Query Capture and Replay web console.
  • Add Databases: “Add database connections” opens the databases tab which allows you to configure database connections. All the databases used during the workload life-cycle are defined here. An option to add a new database is also available in the Capture and Transform dialogs.
  • Manage Privileges: Manage capture and replay database privileges opens the Manage Privileges tab. In order to perform the various tasks in the workload life-cycle, a user ID with privileges to perform the corresponding task must be specified.

    The Enable and Disable tab as shown in the Manage Privileges page in Figure 6 lists the privileges that are available for a selected database.

    Figure 6. Manage privileges – enable and disable
    Manage Privileges – Enable and Disable

    The privileges of database users/groups/roles for the selected database can be managed in the Grant and Revoke tab as shown in the Figure 7. Users with appropriate privileges can grant/revoke permissions to other users.

    Figure 7. Manage privileges – grant and revoke
    Manage Privileges – Grant and Revoke

    Manage access to web console opens the Console Security tab where the desired security mechanism for authenticating users can be selected, as shown in Figure 8.

    Figure 8. Security management
    Security Management

    Note: To support multiple users, a repository database must be configured first.

Learn more

This panel provides links to resources that provide more information about the InfoSphere Optim Query Capture and Replay product shown previously in Figure 4 in the section marked as 3 on the right bottom corner.

You can click Open Menu at the upper left corner of the web console to access links to all the key tasks that are accessible from the Task Launcher. The menu provides a few additional links, including the following.

  • Capture and Replay: This opens the Capture and Replay page which lists existing workloads in the SQL Workloads tab and provides buttons to open the dialogs that are used to create the stages of the capture and replay process life-cycle, including Capture, Transform, and Replay and Report, as shown in Figure 9.
    Figure 9. Capture and replay
    Capture and Replay
  • Configuration Repository: The Configuration Repository tab lets you select a database as a repository which can later be used to configure web console authentication, as shown in Figure 10.
    Figure 10. Select or edit a repository database connection
    Select or Edit a repository database connection

Capture, transform and replay

Capture

This is the first stage in the capture and replay workload life-cycle. A capture can be started from the Task Launcher link or from the SQL Workloads tab in the Capture and Replay page.

In the Capture an SQL Workload dialog that is opened, details of the workload to be captured are shown in Figure 11.

Figure 11. Capture an SQL workload
Capture an SQL Workload dialog
  • Workload name: A unique identifier for the workload. for example: Sample_Workload2.
  • Database type: DB2 for Linux, UNIX, and Windows.
  • Databases to capture: The capture database can be specified by selecting a previously created database connection or by adding a new connection.
  • Alias: If applicable, an alias for the database can be specified.
  • Start time: By default, the workload is captured immediately. A workload might be scheduled for a future date and time as shown in Figure 12.
    Figure 12. Schedule a capture
    Schedule a capture
  • Duration: The duration in minutes for which the SQL workload is captured.
  • Notes: Comments about the captured workload.

Note: At any time, you can save a draft of the workload to be run later.

Once the necessary information has been specified, run or schedule the workload capture by clicking Capture.

To capture a workload, you must specify a user ID that has Can Capture Workload privileges on the Capture database, as shown in Figure 13.

Figure 13. User credential input screen for capture operation
User credential input screen for Capture Operation

After the user is authenticated, a row is added in the grid, and the progress page which provides details of the capture progress status is opened in a new tab. The progress page shown in Figure 14 can also be opened by clicking the Status column of the workload.

Figure 14. Capture progress page
Capture progress page

When the capture log indicates that the workload capture has started, the workload/SQL statements to be captured can be run. The progress of the capture can be studied in the capture progress page, which includes the following sections:

  • Capture Workload Details: This section contains the details of the captured workload.
  • Capture Process Status: This section provides the status of the capture process, including the progress percentage, the approximate number of statements captured, and the time remaining for the workload to complete, as shown in Figure 15.
    Figure 15. Initial capture process status
    Initial Capture Process Status
    When the workload capture has run for the duration specified, the status changes to Completed. Note that the workload capture process can be stopped at any time before it completes. In that case, the status changes to Stopped. A stopped workload can be transformed, if workload data traffic has been captured.
  • Capture Log: Figure 16 shows the capture log.
    Figure 16. Capture log
    Capture Log

The First, Previous, Next and Last buttons at the bottom of the sections let you page through the log statements when the log contains a large number of messages.

Click Open Full Logto view the log in a separate browser tab or window.

Transform

Once a workload has been captured, the next step involves transforming it to a replay-ready workload. This is a preparatory step that allows selection of a replay database and specification of user ID mapping, and optionally, schema mapping information.

A workload can be transformed by opening the Transform a Workload for Replay from the Task Launcher link or from the SQL Workloads tab in the Capture and Replay page, as shown in Figure 17.

Figure 17. Transform a workload for replay
Transform a Workload for Replay dialog
  • Captured Workload: The workload to be transformed can be selected from the list of captured workloads.
  • Replay Database: The replay database can be specified by selecting a previously created database connection or adding a new connection.
  • Map database schema: If the schemas on the replay database do not match those on the capture database, and the workload cannot replay successfully unless a suitable mapping is be provided.

    Note: The mapping is used for dynamic SQL statements with fully qualified table references, and are not applied to static statements.

  • Default replay user ID: The default replay user ID to be used for mapping all captured user's needs to be specified.

    In order for replay to be successful, every user ID in the captured workload must be mapped to a user ID on the replay database.

    In order for replay to be successful, every user ID in the captured workload must be mapped to a user ID on the replay database.

    A captured workload includes SQL statements that make connections to the capture database. For these to be replayed correctly, the connection user ID and password must be valid for the replay database. Hence it is important to provide accurate user ID mappings so that captured statements do not fail when executed on the replay database and subsequently cause SQL statement mismatches.

The Test button can be used to verify that the specified user ID has the required privileges on the replay database.

  • Map captured user IDs to replay database user IDs: Optionally, mappings of all/some captured user IDs to unique replay database user IDs can be specified.

    You can click Test All to verify that the specified user ID has the required privileges on the replay database.

Note: At any time, you can save the workload as a draft to be run later.

Once the necessary information has been specified, the workload can be transformed by clicking Transform.

To transform a workload, you must specify a user ID that has Can Replay Workload privileges on the Replay database.

After the user is authenticated, a row is added in the main page grid and the progress page, as shown in Figure 18, which provides details of the transformation progress status in a new tab. The progress page can also be opened by clicking the Status column of the workload.

Figure 18. Transformation progress page
Transformation progress page

The transform progress page has the following sections:

  • Replay-Ready Workload Details: This section contains the details of the transformed workload.
  • Transform Process Status: This section provides the percentage progress of the transform process.

    When the transform process completes, the status changes to Completed. The transform process can be canceled at any time, in which case, the status changes to Canceled. A canceled replay-ready workload cannot be replayed.

  • Schema Mapping: This section displays the schema mapping that was provided when the replay-ready workload was created, if mappings were specified.
  • Schema Mapping: This section lists the replay user ID mappings that were specified when the replay-ready workload was created.
  • Transform log: This section displays the Transform log. Like the capture log, the First, Previous, Next and Last buttons at the bottom of the sections let you page through the log statements when the log contains a large number of messages.

    You can then click Open Full Log to view the log in a separate browser tab or window.

After creating the replay-ready workload, the next step involves preparing the replay database for the workload replay by creating database objects and by granting privileges so that the replay database sufficiently matches the capture database.

Replay

A workload replay executes all the originally captured inbound and outbound database traffic against the target database that was specified during the transform. The S-TAP component will capture the replay database traffic running on the replay database to facilitate the comparison between the original capture and the replayed capture, as shown in Figure 19.

Figure 19. Replay launch page with options
Replay Launch page with options

A transformed workload can be replayed by opening the Replay a Replay-Ready Workload dialog shown in Figure 19 from the Task Launcher link, or from the SQL Workloads tab in the Capture and Replay page.

In the dialog, details of the workload to be transformed are specified.

  • Replay-Ready workload: Select the replay-ready workload to be replayed. The replay-ready workload name will be automatically populated if you have selected a replay-ready workload when launching the dialog.
  • Replay database: The target database specified during the transform will be displayed in this read-only field.
  • Start time: Specify the start time for the workload replay.
  • Replay rate: You have the option to specify the speed at which the replay should run. There are four available options, as shown in Figure 20.
    Figure 20. Replay speed option
    Replay speed option dropdown

    This option allows you to increase, decrease, or keep the same wait time between SQL statements or transactions as the originally captured workload.

    When the options faster or slower rate is selected, the rate percentage by which the wait time should increase/decrease must be specified, as shown in Figure 21.

    Figure 21. Replay speed option with percentage
    Replay speed option with percentage

    If no wait time is desired, select the option to replay as fast as possible.

  • Default replay user ID: This displays the default user ID for mapping captured users, specified during transform. The password field is editable and can be modified if required.
  • Enter comments for the workload if required.

The Replay options allows you to select the number of Java Virtual Memory(JVM) processes that can be utilized for running a replay workload, as shown in Figure 22.

Figure 22. Replay process options
Replay Process options

You can click Replay to start the replay progress. This page displays real-time information when the replay is in progress. It has the following sections.

  • Replayed Workload Details are shown in Figure 23.
    Figure 23. Details
    Replay Workload details
  • Replay Process Status indicates the progress of the workload being replayed, as shown in Figure 24.
    Figure 24. Progress
    Indicates the progress of the workload being replayed
  • Replay Workload Success contains replay accuracy data. More information is available under the Replay Results section of the Report topic, as shown in Figure 25.
    Figure 25. Replay success summary
    Replay success summary
  • Replay Log displays the replay log messages in a user interface, similar to how the capture log is displayed.
  • Capture log for the replayed workload displays the capture log messages for the replayed workload in a user interface similar to how the capture log is displayed.

Reports

The InfoSphere Optim Query Capture and Replay reports help in comparing and analyzing how accurately a workload is replayed with respect to a baseline (original capture). During replay, it is possible for the underlying data in the database to be different than that of the baseline workload. The quality of replay data can have a major impact on how representative a workload replay is to that of the baseline workload. InfoSphere Optim Query Capture and Replay reports provide two major sections to validate and analyze the accuracy of a replayed workload.

  • The Replay Results provides the accuracy of SQL and Transactions that are replayed. This gives indicators on how closely the replay transactions and SQL were executed compared to the baseline by keeping the Return code, and Rows Affected in context.
  • The Response Time provides the performance indicators of replay run against the baseline.

Report options – statement grouping

The capture and replay statements are aggregated for execution count, total response time, and average response time by grouping them together as unique statements using the Statement Grouping attributes as shown on Figure 27. These settings are managed from the Report options screen which can be accessed by clicking the Options button, as shown in Figure 26, while generating a report.

The Statement text for dynamic statements (or Collection Package Section for static statements) is used as the default grouping attributes. Apart from these, the Report Options provides client info attributes, namely: Current Schema, Client Application Name, Client Accounting String, Client Workstation, and Client User ID, which can be opted to be used for Statement Grouping while aggregating the statements. If required, the client info attributes can be enabled for masking using the Report options screen. The client info values will be masked with an asterisk ‘*’ using the user provided Begin and End indexes, as shown in Figures 26 and 27.

Figure 26. Create report with options
Create Report Dialog with options
Figure 27. Statement grouping and masking options for report
Statement grouping and masking options for report

Different stages of report generation and status page

After a baseline and replay workload is identified for report generation, click Create Report to start the process. Prior to that, use Options and Advanced Options to configure the reporting parameters such as Statement grouping and Failure thresholds. These options are available in the Create Report dashboard as shown previously in Figure 26.

Generating a report using Create Report opens the Report Progress Details page. This page presents the comparison phases and provides the comparison status, metrics and logs. The report comparison consists of three steps.

During step 1, the captured and replayed records are read for comparison. This step can take time depending on the number of captured and replayed statements.

During step 2, captured and replayed records are compared. The comparison logic reads a set of captured statements and then starts comparing the replayed statements against the captured set. The total number of statements that are read or compared is indicated in the Status table. As the replayed statements are traversed and compared, they are marked as matched or unmatched. The unmatched statements are used to compute the failure rate and the status of this indicated in the status table.

After the comparison is completed, during step 3, the comparison statistics and status are persisted in the repository and a report is rendered.

The details of these steps are provided in more informational messages in the Report Log table, as shown in Figure 28.

Figure 28. Report operation progress bar with log
Report operation progress bar with log

Clicking on a completed validation report will bring up the accuracy report page with the following three tabs.

  • Details
  • Replay Results
  • Response Time

In the following sections you will learn about each of these report tabs in detail.

Details

The Details tab displays the progress information for the validation process.

If the validation process is in progress, then the Process Status bar is updated dynamically to reflect the current validation status at regular time intervals, along with the number of statements compared and failure rate statistics at the given point of time.

The Report Log section displays the log messages that are generated during validation process.

This report also contains information about the Baseline and Replayed workloads, such as workload start time, workload end time, database details on which the workload was run, and so on.

Replay results

The Replay Results tab, shown in Figure 29, displays the overview of comparison results between Baseline and Replay workloads for different categories of SQL statements and transactions.

Figure 29. Replay result summary report
Replay Result summary report

(View a larger version of Figure 29.)

SQL and Transactions from replay workload are categorized into different types based on the status of their comparison with baseline. Comparison is based on the return code, number of rows returned, number of rows updated for a SQL statement or transaction during its replay execution when compared to that of baseline.

The report contains a separate metrics table for SQL statements and for transactions. The first row in each table represents the statistics for baseline workload.

This report also contains information about the Baseline and Replayed workloads such as workload start time, workload end time, database details on which the workload was run, and so on.

  • The Unique column in the table shows the count of unique SQLs or Transactions encountered during workload comparison.
  • The Executions column shows the total number of SQL or Transaction executions encountered under a particular category.
  • The Percentage column indicates the percentage of executions in replay in comparison to total number of baseline executions.

Replay Results is a high-level summary report from which you can drill down into more granular reports for SQL statements and transactions. SQL statements and transaction metrics tables contain hyper links to drill down into detailed reports for each of the following categories.

  • Matched SQL Replays: SQL statements with same return code, rows returned count, and rows updated count in the baseline and replay workload.
  • Unmatched SQL Replays: SQL statements with different return code, rows updated count, and rows returned count in the baseline and replay workload.
  • Different return code: SQL statements with different return codes in baseline and replay workload.
  • Different number of rows returned: SQL statements with different number of rows returned in baseline and replay workload.
  • Different number of rows updated: SQL statements with different number of rows updated in baseline and replay workload.
  • SQL statements that did not replay: SQL statements which are part of baseline workload, but did not replay.
  • New SQL Statements: SQL Statements that were not part of baseline workload, but are found in the replay.

Similarly, the transaction table will have hyper links specific to transactions, like Matched transaction replays, Unmatched transaction replays, Transaction that did not replay, New transactions, and so on.

The following sections cover the SQL statement and transaction drill down reports in detail.

SQL drill down reports for matched SQL category

In this section you will learn about drill down reports for matched SQL category. You can also drill down on other SQL categories, but this is not be covered in this article.

From the Replay Results tab, click the Matched SQL Replays hyper link to launch the aggregated SQL report in a new tab, as shown in Figure 30.

Figure 30. SQL drill down report
SQL Drilldown report

(View a larger version of Figure 30.)

This report shows the execution statistics for all the aggregated SQL statements which are matched between baseline and replay, in tabular format. Each aggregated SQL represents the multiple executions of the same statement in baseline versus replay runs. The statistics include number of baseline executions, number of replay executions, total and average response time details, total rows returned, total rows updated, and so on, along with package and client information. Aggregation of SQL statements might vary based on the grouping options you specified.

Each aggregated SQL listed in the table will have a statement identifier which is hyper linked. Clicking an identifier opens the Top-N executions report for a given aggregated statement, as shown in Figure 31.

Figure 31. SQL details report with top and bottom executions
SQL Details report with Top and Bottom executions

(View a larger version of Figure 31.)

Note: Top-N denotes number of fastest/slowest instances for a statement/transaction found during the comparison process. N is the value of the third parameter, described under Advanced Options shown in Figure 36.

The TOP-N report primarily shows the longest and shortest executions of an aggregated statement in terms of replay response time. Each table contains n number of top executions displayed with statistics like Replay response time, Rows returned, and Rows updated. The number of top executions to be displayed in this report is decided by values set by you in Report Settings tab in Advanced Options.

This report also shows the literal replaced statement text in a text area at the top of the page.

Click Execution Identifier of any of the top-n executions in TOP-N tables to open the SQL Execution details report for that particular instance of statement execution, as shown in Figure 32.

Figure 32. SQL execution detail
SQL Execution Detail

(View a larger version of Figure 32.)

This report contains the detailed account of a statement execution in baseline and replay, compared side by side.

The top section of the report shows the original statement text that was used during baseline and replay SQL executions. The Host variables section lists out the host variables that were used during statement execution with details like host variable format, SQL Type, and host variable values.

The Execution Information table shows the comparison between baseline and replay for execution statistics such as response time, rows returned, rows updated, and return code. Similarly, the Application Information table shows the package and application information comparison between baseline and replay for the given statement execution.

Transaction drill down reports

In the Replay Results tab, Click the hyper links in the Transaction Replay Comparison Metric table to open the Aggregated Transactions report in a new tab. In this section you will go through the drill down reports for only matched Transactions. Other transaction reports can be drilled down in similar way.

As shown in Figure 33, you can click the Matched transaction replays hyper link to open the report that shows the list of aggregated transactions that are matched successfully between the baseline workload and the replay workload along with execution statistics.

Figure 33. Matched transaction report
Matched Transaction report

(View a larger version of Figure 33.)

The table contains details such as number of statements in a transaction, number of baseline executions, number of replay executions, aggregated replay response time, average replay response time, and package and client information. Each aggregated transaction represents multiple executions of a transaction, grouped by options specified by you as explained in the previous section of this article.

You can click the hyper links in the Transaction Identifier column to open the TOP-n executions report for a particular aggregated transaction in a separate tab, as shown in Figure 34.

Figure 34. Transaction detail report
Transaction Detail report

(View a larger version of Figure 34.)

This report contains the longest and shortest executions times of an aggregated transaction in terms of replay response time. The number of top executions to be displayed in this report is controlled by values that you set in the Report Settings tab in Advanced Options. The default value is 5 transactions.

The Transaction Details table contains information about transactions, such as first statement text, number of statements, and package and client information.

Click an Execution Identifier hyper link in longest or shortest response time tables to open the Transaction Details report for a particular transaction execution instance, as shown in Figure 35.

Figure 35. Specific transaction execution detail report
Specific Transaction Execution detail report

This report mainly contains the details of the individual SQL executions that are part of the transaction. It shows details such as replay response time, rows returned, rows updated, and return code for each SQL execution in a transaction. The Execution Information table at the bottom of the report shows the transaction execution statistics for baseline and replay runs.

You can drill down into a particular statement execution by clicking on a Statement Identifier hyper link in the SQL table. This opens the SQL Execution details report in a new tab, as shown previously in Figure 32.

Response time report

The Response time report shown in Figure 36 contains different performance metrics and graphs which will help you to identify the performance improvements and regressions between the baseline and replay workloads.

Figure 36. Response time report
Response Time report

(View a larger version of Figure 36.)

The Cumulative Statement Response Time chart shows the comparison of cumulated response time for all the SQL statements executed in the baseline and replay workloads. The cumulative time does not include the wait time that is elapsed between any two statement executions. Thus it helps you in analyzing the overall performance of replay workload in comparison to that of baseline in terms of execution response time only.

The SQL Executions Over Time graph plots the trend of SQL execution count over elapsed time for the baseline and replay workloads. The SQL execution count samples are collected for each 30-second interval over the duration of workload elapsed time. Similarly, the Rows Returned Over Time graph plots the rows returned count over elapsed time for the baseline and replay workloads. These graphs are useful in debugging performance issues encountered in a workload by comparing the pattern of SQL execution rate and rows returned rate between baseline and replay.

The Metric table in the bottom left part of the report contains different performance statistics based on response time and workload elapsed time. The Total improvement metric shows the cumulative total of improvement time for all the replay SQL executions over baseline. Similarly Total regression shows the cumulative total of regression time for all the SQL that are regressed when compared to baseline.

The Response time difference shows the difference between baseline and replay cumulative response times for all SQL executions.

The next four rows in the table show the number of SQL statements and Transactions which have improved or regressed by a certain percentage value in replay execution over baseline execution. From the Report Settings tab in Advanced Options, you define the percentage value before using the Threshold for performance improvement or regression option, as shown in Figure 37.

Figure 37. Advanced options
Advanced Options Dialog

You can change this option value and re-open the report to get the SQL and Transaction performance numbers reflected according to the new value.

Baseline elapsed time and Replay elapsed time metrics show the total time taken in executing the baseline and replay workloads. Unlike response time, elapsed time includes the wait time between any two SQL executions.

Performance drill down reports

The Value column in the metric table shows the total number of SQL statements or Transactions that are regressed or improved in a workload replay. By clicking the hyper links, you can drill down into multi-level reports which provide insight into SQL statement and Transaction executions at aggregated and individual execution level. This includes the list of Top-N improved and regressed executions in terms of execution response time. These reports are useful in identifying sporadic execution issues in workloads, where the executions are off the mark during the workload replay.

SQL drill down reports

Clicking the SQL Statements Improvement hyper link opens the SQL Improvements report which shows the list of improved aggregated SQL statements in a table, as shown in Figure 38.

Figure 38. SQL improvement report
SQL Improvement Report

(View a larger version of Figure 38.)

The table contains details such as baseline and replay total response time, number of executions, rows returned, and rows updated for each aggregated SQL statement.

Clicking a Statement Identifier for an aggregated statement opens the SQL Details report, as shown in Figure 39.

Figure 39. SQL details report
SQL Details Report

(View a larger version of Figure 39.)

This report primarily shows the Top-N improved and regressed SQL statement executions for that particular aggregated record. The Statement Text section at the top of the report shows the literal replaced statement text for the statement executions. The Application Information table contains client-related information.

Clicking on a particular execution identifier opens the SQL Execution details report in a new tab.

Transaction drill down reports

As shown previously in Figure 36, the performance metric table contains hyper links for drilling down into transaction improvements and regressions reports. The improvement hyper link opens the Transaction Improvements report. This report lists aggregated transactions that are improved in replay executions when compared to that of baseline, as shown in Figure 40.

Figure 40. Transaction improvement report
Transaction Improvement Report

(View a larger version of Figure 40.)

The transaction table contains details related to response time differences between the baseline workload and the replayed workload, such as total response time change, percentage total response time change, and average response time change.

You can click the Transaction Identifier hyper link for an aggregated transaction to open a Transaction Details report in a new tab, as shown in Figure 41.

Figure 41. Transaction details report
Transaction Details Report

This report shows the Top-N improved and regressed executions of that particular aggregated transaction in two separate tables. This report also contains a section which shows the details related to aggregated transactions such as first statement text, number of statements in the transaction, and client information.

You can drill down to the next levels of a report by clicking the hyper links provided for each transaction execution instance in the Top-N tables, which is similar to the transaction drill down reports that show the details related to all SQL executions in a particular transaction. From this report you can drill down into the SQL Details report where all information about an individual SQL execution is be displayed, as shown previously in Figure 32.

Conclusion

By following the steps outlined in this article, you can get familiar with IBM InfoSphere Optim Query Capture and Replay 1.1 for Linux, UNIX and Windows in a step-by-step manner, and compare workload characteristics between different environments. This will help you to have confidence in the changes made in a test environment before moving to production. You can also run repeatable test cases in a controlled and predictable manner.

Resources

Learn

Get products and technologies

  • Build your next development project with IBM trial software, available for download directly from developerWorks.
  • Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the SOA Sandbox learning how to implement Service Oriented Architecture efficiently.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=835512
ArticleTitle=IBM InfoSphere Optim Query Capture and Replay 1.1 for Linux, UNIX, and Windows, Part 1: Introduction to OQCR
publish-date=09202012