Use Optim Data Tools to get the most out of BLU Acceleration

BLU Acceleration is an exciting new feature in IBM® DB2® 10.5 that can speed up complex analytic workloads tremendously. Workloads run faster without the need for tuning. At the heart of this new technology is the column-organized format for user tables. It's easy to create and load tables in the new column-organized format, as is on-going maintenance because there is no need for indexes or tuning materialized query tables (MQTs). This article walks you through three scenarios that illustrate how to use Data Studio and IBM® InfoSphere® Optim™ Query Workload Tuner (OQWT) with the new BLU Acceleration feature.

Share:

Loic Julien (lj@us.ibm.com), Software Architect, IBM

Loic JulienLoic Julien is a senior software engineer at IBM and is currently working on the architecture and the development of all core components within IBM Data Studio. Loic has previously contributed to the development of InfoSphere Data Architect, Rational XDE, and Rational Rose. Loic has a master's degree from San Jose State University in client/server computing.



Daniel Zilio (zilio@ca.ibm.com), Senior Software Engineer, IBM

Author photo of Daniel ZilioDaniel Zilio is a senior software developer in the IBM InfoSphere Optim Query Workload Tuner team. Previously, he was on the IBM DB2 Linux, UNIX and Windows Optimizer and Autonomic Computing teams. He also worked on DB design algorithms, explain, DB simulation, self-tuning memory management, XML design selection, automatic statistics collection, the data mart advisor, workload statistical views advisor, collect actuals feature, workload index advisor, workload statistics advisor, workload table organization advisor, access plan comparison, and what-if index analysis.



Leila Johannesen (leilaj@us.ibm.com), User Experience Engineer, IBM

Leila Johannesen photoLeila Johannesen, PhD, is a user experience engineer at the IBM Silicon Valley Lab. She works on making Information Management products easier to use. Most recently, she has worked on the Optim Data Tools and IBM PureApplication System.



08 August 2013

Also available in Chinese Russian

Overview

IBM Data Studio 4.1 provides useful GUI capabilities to exploit BLU Acceleration, convert existing tables, and determine which tables to convert to a column-organized format. You can use BLU Acceleration on all of the tables for workloads that execute deep analytic queries. If your workload is somewhat mixed, then the Workload Table Organization Advisor in OQWT 4.1 offered with IBM Data Studio 4.1 can analyze your workload and make recommendations on how to exploit the new technology for some tables and not others.

Optim Data Tools synergy

Other products in the Optim suite provide synergistic capabilities for BLU. For example, you can use InfoSphere Optim Query Capture Replay (IOQCR) to capture workload that you analyze within OQWT. Also, you can use Optim Performance Manager to monitor the workload performance.

This article illustrates how to use IBM Data Studio for getting the most out of BLU Acceleration. We cover the following three scenarios:

  • Determining which tables to convert to column-organized format
  • Converting tables to column-organized format
  • Creating a new column-organized table

Prerequisites

These scenarios assume the following product versions:

  • DB2® for Linux®, UNIX® and Windows® 10.5
  • IBM Data Studio 4.1
  • InfoSphere Optim Query Workload Tuner 4.1

Enabling BLU Acceleration

First, you need to configure BLU Acceleration. Within Data Studio, you can click Configure BLU Acceleration for a particular database. (The database needs to be at the 10.5 release level or higher for this menu item to be visible.) See Figure 1:

Figure 1. Configuring BLU Acceleration
Figure shows the Configure BLU Acceleration menu item.

This menu item opens the editor shown in Figure 2, where you can run the command db2set DB2_WORKLOAD=ANALYTICS, which enables automatic workload management:

Figure 2. Setting automatic workload management
This fgure explains setting up automatic workload management

Setting DB2_WORKLOAD to ANALYTICS also configures the default type for new tables in the database. This default type is driven from a new database configuration parameter named DFT_TABLE_ORG. See the section Setting the default table organization for a database for more information about this configuration parameter.

When you select Command Line Processor as the Run method (as shown in Figure 2), Data Studio requires a running SSH server on the remote system to execute commands. (Starting with Data Studio 4.1, the IBM SSH Server is recommended.)


Scenario 1: Determining which tables to convert to column-organized format

You can get a performance benefit by converting tables involved in analytic workloads to the new column-organized format. But if you have a mixed workload, you need to figure out which tables to convert and how much of a performance gain you'll get. This scenario illustrates how you can use OQWT to find out.

Steps to follow

  1. Start tuning

    From within OQWT, select a particular database connection, then click Analyze and Tune and then Start Tuning, as shown in Figure 3:

    Figure 3. Starting the Query Tuner analysis
    Figure shows the Start Query Tuning menu item.

    This opens the Query Tuner Workflow Assistant, as shown in Figure 4:

    Figure 4. The Query Tuner Workflow Assistant Capture view for capturing statements from the package cache
    Figure shows the Query Tuner Workflow Assistant Capture view for capturing statements from the package cache
  2. Capture the workload

    You can capture a workload from a variety of sources, such as the package cache, an SQL procedure, and the Optim Performance Manager repository. In this scenario, we have captured a workload from the package cache. You can filter the statement collection from the package cache, as shown in Figure 4. You can set a variety of filters, such as selecting statements that execute above a certain number of seconds or capturing statements that are estimated to run above a certain number of timerons.

    In this example, select the Default_all filter, which captures all the statements. Clicking Capture Now shows the results of capturing the workload (see Figure 5). The set of statements for the database connection can be saved under a unique workload name by clicking Save All to Workload....

    Figure 5. Results of capturing statements from the package cache
    Figure shows the results of capturing statements from the package cache
  3. Invoke the Workload Table Organization Advisor

    After the workload is saved, you can go to the Manage view and right-click the workload and select the Invoke Workload Advisors and Tools menu item as shown in Figure 6. (Alternatively, you can select the workload and then click Invoke Advisors.)

    Figure 6. Invoking the workload advisors from the workload name in the Manage view
    Figure shows invoking the workload advisors from the workload name in the Manage view

    The Invoke view opens. At this point, you can select the Run the Workload Advisors option under the Workload heading on the left, as shown in Figure 7. Next, click Select What to Run..., which opens the Select Activities dialog. You need to select the Table Organization check box. This runs the Workload Table Organization Advisor, which gives recommendations on which tables to convert to a column-organized format.

    Best practice

    Run and implement the statistics recommendations before getting table organization recommendations. You might also want to run the statistics advisor again after you implement the table organization recommendations to make sure that the statistics after the conversion are up to date.

    Notice that some recommendation categories (called "advisors") are disabled when you select Table organization. Some advisors should not be run at the same time because their recommendations might conflict. For example, the Index advisor cannot be run at the same time as the Workload Table Organization advisor because a table that is recommended for column-organization cannot have indexes on it.

    Figure 7. Invoking the Workload Table Organization Advisor
    Figure shows invoking the Workload Table Organization Advisor

    To generate the recommendations, the Workload Table Organization Advisor filters out tables that violate column organization restrictions, such as tables with LOB columns. For the remaining tables in the workload, the advisor virtually converts the tables to column organization format and uses these virtual tables to obtain the DB2 Linux, UNIX and Windows Optimizer cost estimates on the workload statements. This cost estimate is compared with the performance from using the row-organized format, and an estimated workload performance improvement is obtained.

    The advisor selects the subset of the tables whose virtual conversion leads to a performance improvement above some threshold, which defaults to 30%. You can change this threshold by selecting the Set Advisor Options in Figure 7 and changing the Minimum threshold for estimated performance improvement value under the Table Organization Advisor tab as shown in Figure 8:

    Figure 8. Changing the minimum threshold for estimated performance improvement for the Workload Table Organization Advisor
    Figure shows changing the minimum threshold for estimated performance improvement for the Workload Table Organization Advisor
  4. View the list of recommended tables to convert to column-based format

    Select the Review view of the Query Tuner Workflow Assistant. You see summary information, which indicates the status of the recommendations that were provided. In our scenario, Table organization is shown under the Item Analyzed column along with the result of New recommendations were generated. (See Figure 9.)

    Figure 9. Review summary after completing the Workload Table Organization Advisor
    Figure shows the review summary after completing the Workload Table Organization Advisor

    You can click the Table Organization tab from this view to see the details for the recommendations, as shown in Figure 10:

    Figure 10. Table organization recommendation details
    Figure shows the table organization recommendation details

    At the top of the page, the estimated workload performance improvement is provided. In this case, it is 85.24%. This is the cost improvement over the whole workload when the workload estimated cost with the conversion recommendations is compared to the cost of the workload before the recommended conversion.

    The number of recommended tables to convert in this case is 3 out of the 4 tables found in the workload.

    The top grid shows the list of recommended tables to convert from row to column organization. (If you wanted to see the information for all of the tables in the workload, you could change the Filter by option, which defaults to showing the Tables to be converted, to instead view All tables or Tables not to be converted). In this scenario, because we are filtering by Tables to be converted we see that the Current Organization column values only shows "Row," and the Recommended Organization column only shows Column.

    The Finding column lists why the table should be converted (to improve performance) or not converted. The reasons to not convert a table can vary: it could be that the table cannot be converted, (for example, a table might have a LOB column or be used in a statement with a cursor in the workload), or it could be that converting the table does not provide a sufficient improvement in performance.

    The Conversion Warning column lets you know when conversion will result in removing or changing certain table properties. For example, because indexes are not used in column-organized tables, any secondary indexes that currently exist on a row-organized table are removed when that table is converted to column organization. The properties that are removed during a conversion, if they exist, are: indexes, data partitioning, multidimensional clustering, and MQTs. In addition, enforced referential integrity or check constraints change to "not enforced." The table space might be recommended to be changed if the row-organized table was not defined on a table space with AUTOMATIC STORAGE. The table space that is recommended is found in the DDL script, which is described later. You can view all the warnings by right-clicking the table row in the grid and selecting the View Warnings option in the drop-down menu.

    The References to Table column indicates the number of references to the table in the workload. The Cumulative Total Cost column shows the estimated cost of the statements in the workload that reference the table.The IUDM Statements column shows the number of times the table is used in insert, update, delete, or merge statements.

    The lower grid, labeled SQL Statements Affected, shows all the SQL statements that are affected by the tables that are recommended to be converted. In this second grid, you can see how many times the statement was executed in the workload, along with the statement text and the estimated performance gain for the statement with the virtual conversion compared to the cost before the conversion. The grid also shows the estimated cost in timerons before the conversion and after the virtual conversion. The lower grid also shows a Weight column. The weight of a statement is defined to be the proportion of the total cost of the workload that the cost of the statement takes, including the execution count. (That is, the weight is the execution count of the statement times the “cost before” the recommendation divided by the total cost of the workload.) The tables in the grid are shown in order of the values in the Weight column.

  5. View and run the script

    Clicking Show DDL Script shows the DDL for the script to convert all the recommended tables, as shown in Figure 11. You can save the script and run it when you're ready. You could also copy the script to the clipboard and then paste it within the Data Studio SQL editor and run it with the job scheduler.

    Another method to convert the tables is to note the tables to be converted along with the targeted tablespaces and select those tables in the Database Administration perspective in IBM Data Studio. The Convert to Column Organization menu creates the ADMIN_MOVE_TABLE necessary syntax, as shown later in this article.

    Note: Conversion works only in one direction from row to column-organized format.

    Figure 11. Recommended conversion (DDL) script from the Workload Table Organization Advisor
    Figure shows the recommended conversion (DDL) script from the Workload Table Organization Advisor

What If Analysis

You can explore what the performance improvement would be if you were to select a subset of the tables that are recommended for conversion, rather than all of the tables. You can do this by clicking Test Candidate Table Organization from the recommendation details, as shown in Figure 10. The Test Candidate Table Organization page opens, as shown in Figure 12, and lists the tables in the workload, with check marks next to those tables that are recommended for conversion. You can try those tables or select a different set of tables. Then, when you click Test Candidate Table Organization, the feature tries to virtually convert the selected tables to column-organization and explain the workload using that conversion.

Figure 12. Invoking the Workload Test Candidate Table Organization feature
Figure shows invoking the Workload Test Candidate Table Organization feature

You can see the results in the Review view on the Candidate Table Organization tab, as shown in Figure 13. You can see the performance improvement associated with converting the candidate tables and the grid that shows information for each table in the workload and the SQL statements that would be affected by the conversion.

Note that the Test Candidate feature might not have been able to virtually convert some of the user selected tables for various reasons. The reason a table cannot be converted is shown in the Finding column. Some possible reasons: the table violates column organization restrictions or the table might already be a column-organized table. Only tables that are able to be virtually converted would be considered in the cost estimates by the test candidate feature.

Figure 13. Output details from the Test Candidate Table Organization feature
Figure shows the output details from the Test Candidate Table Organization feature

Scenario 2: Converting tables to column organization in Data Studio

If you already know what tables you want to convert to column organization, you can convert them by using the Database Administration perspective. The grid view lists the tables and identifies their organization (whether row or column organized). The table icon also provides a visual indication of the table organization.

You can select one or more row-organized tables, and then click the Convert to Column Organization menu item.

Figure 14. Convert to Column Organization menu choice for a table
Figure shows the convert to Column Organization menu choice for a table

The conversion editor offers an option to back up the database during the conversion process if archive logging is used. Data Studio uses the ADMIN_MOVE_TABLE stored procedure as the conversion utility, as shown in Figure 15:

Figure 15. Conversion script generated by IBM Data Studio
Figure shows the conversion script generated by IBM Data Studio

Setting the default table organization for a database

When you configure automatic workload management, the database configuration parameter DFT_TABLE_ORG is automatically configured to default new tables to the column-organized type. This setting applies to new tables that are created, but has no effect on existing tables. You can override this setting by directly providing the ORGANIZE BY syntax a CREATE statement.

To review or modify this database configuration parameter, select a database and click the Configure menu, shown in Figure 16:

Figure 16. Configure database parameters
Figure shows the configure database parameters

Note that the following table types always default to row-organization:

  • MQTs
  • Clustered tables (range, insert time)
  • Range partitioned tables
  • Created and declared global temporary tables
  • Temporal tables
  • Typed tables

Scenario 3: Creating a new column-organized table in Data Studio

Data Studio aggregates proposed objects to create into a change plan. A change plan allows for batch creation with a customized execution approach (for example, run methods, rollback, scheduled, or immediate). Past or current change plans can be accessed under the Change Plans folder.

The default change plan, where new objects are automatically added, is associated with a bar at the top of the grid view in the editor, shown in Figure 17:

Figure 17. IBM Data Studio change plans list
Figure shows the IBM Data Studio change plans list

The contents of the properties view of a table are tailored for the kind of table that it is, whether column or row organized. For example, the temporal or cluster definition tabs does not display if DFT_TABLE_ORG is set to Column.

Figure 18. Properties view for new table
Figure shows the properties view for new table

Executed DDL is displayed in the SQL Results view of Data Studio, along with the status and time of the execution.


Conclusion

In this article, we covered three scenarios that illustrate how to use Data Studio 4.1 and OQWT 4.1 to get the most out of BLU Acceleration.

Resources

Learn

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=940126
ArticleTitle=Use Optim Data Tools to get the most out of BLU Acceleration
publish-date=08082013