IBM Data Studio 4.1 provides useful GUI capabilities to exploit BLU Acceleration, convert existing tables, and determine which tables to convert to a column-organized format. You can use BLU Acceleration on all of the tables for workloads that execute deep analytic queries. If your workload is somewhat mixed, then the Workload Table Organization Advisor in OQWT 4.1 offered with IBM Data Studio 4.1 can analyze your workload and make recommendations on how to exploit the new technology for some tables and not others.
This article illustrates how to use IBM Data Studio for getting the most out of BLU Acceleration. We cover the following three scenarios:
- Determining which tables to convert to column-organized format
- Converting tables to column-organized format
- Creating a new column-organized table
These scenarios assume the following product versions:
- DB2® for Linux®, UNIX® and Windows® 10.5
- IBM Data Studio 4.1
- InfoSphere Optim Query Workload Tuner 4.1
Enabling BLU Acceleration
First, you need to configure BLU Acceleration. Within Data Studio, you can click Configure BLU Acceleration for a particular database. (The database needs to be at the 10.5 release level or higher for this menu item to be visible.) See Figure 1:
Figure 1. Configuring BLU Acceleration
This menu item opens the editor shown in Figure 2, where you can run the command db2set DB2_WORKLOAD=ANALYTICS, which enables automatic workload management:
Figure 2. Setting automatic workload management
Setting DB2_WORKLOAD to ANALYTICS also configures the default type for new tables in the database. This default type is driven from a new database configuration parameter named DFT_TABLE_ORG. See the section Setting the default table organization for a database for more information about this configuration parameter.
When you select Command Line Processor as the Run method (as shown in Figure 2), Data Studio requires a running SSH server on the remote system to execute commands. (Starting with Data Studio 4.1, the IBM SSH Server is recommended.)
Scenario 1: Determining which tables to convert to column-organized format
You can get a performance benefit by converting tables involved in analytic workloads to the new column-organized format. But if you have a mixed workload, you need to figure out which tables to convert and how much of a performance gain you'll get. This scenario illustrates how you can use OQWT to find out.
Steps to follow
From within OQWT, select a particular database connection, then click Analyze and Tune and then Start Tuning, as shown in Figure 3:
Figure 3. Starting the Query Tuner analysis
This opens the Query Tuner Workflow Assistant, as shown in Figure 4:
Figure 4. The Query Tuner Workflow Assistant Capture view for capturing statements from the package cache
Capture the workload
You can capture a workload from a variety of sources, such as the package cache, an SQL procedure, and the Optim Performance Manager repository. In this scenario, we have captured a workload from the package cache. You can filter the statement collection from the package cache, as shown in Figure 4. You can set a variety of filters, such as selecting statements that execute above a certain number of seconds or capturing statements that are estimated to run above a certain number of timerons.
In this example, select the Default_all filter, which captures all the statements. Clicking Capture Now shows the results of capturing the workload (see Figure 5). The set of statements for the database connection can be saved under a unique workload name by clicking Save All to Workload....
Figure 5. Results of capturing statements from the package cache
Invoke the Workload Table Organization Advisor
After the workload is saved, you can go to the Manage view and right-click the workload and select the Invoke Workload Advisors and Tools menu item as shown in Figure 6. (Alternatively, you can select the workload and then click Invoke Advisors.)
Figure 6. Invoking the workload advisors from the workload name in the Manage view
The Invoke view opens. At this point, you can select the Run the Workload Advisors option under the Workload heading on the left, as shown in Figure 7. Next, click Select What to Run..., which opens the Select Activities dialog. You need to select the Table Organization check box. This runs the Workload Table Organization Advisor, which gives recommendations on which tables to convert to a column-organized format.
Notice that some recommendation categories (called "advisors") are disabled when you select Table organization. Some advisors should not be run at the same time because their recommendations might conflict. For example, the Index advisor cannot be run at the same time as the Workload Table Organization advisor because a table that is recommended for column-organization cannot have indexes on it.
Figure 7. Invoking the Workload Table Organization Advisor
To generate the recommendations, the Workload Table Organization Advisor filters out tables that violate column organization restrictions, such as tables with LOB columns. For the remaining tables in the workload, the advisor virtually converts the tables to column organization format and uses these virtual tables to obtain the DB2 Linux, UNIX and Windows Optimizer cost estimates on the workload statements. This cost estimate is compared with the performance from using the row-organized format, and an estimated workload performance improvement is obtained.
The advisor selects the subset of the tables whose virtual conversion leads to a performance improvement above some threshold, which defaults to 30%. You can change this threshold by selecting the Set Advisor Options in Figure 7 and changing the Minimum threshold for estimated performance improvement value under the Table Organization Advisor tab as shown in Figure 8:
Figure 8. Changing the minimum threshold for estimated performance improvement for the Workload Table Organization Advisor
View the list of recommended tables to convert to column-based format
Select the Review view of the Query Tuner Workflow Assistant. You see summary information, which indicates the status of the recommendations that were provided. In our scenario, Table organization is shown under the Item Analyzed column along with the result of New recommendations were generated. (See Figure 9.)
Figure 9. Review summary after completing the Workload Table Organization Advisor
You can click the Table Organization tab from this view to see the details for the recommendations, as shown in Figure 10:
Figure 10. Table organization recommendation details
At the top of the page, the estimated workload performance improvement is provided. In this case, it is 85.24%. This is the cost improvement over the whole workload when the workload estimated cost with the conversion recommendations is compared to the cost of the workload before the recommended conversion.
The number of recommended tables to convert in this case is 3 out of the 4 tables found in the workload.
The top grid shows the list of recommended tables to convert from row to column organization. (If you wanted to see the information for all of the tables in the workload, you could change the Filter by option, which defaults to showing the Tables to be converted, to instead view All tables or Tables not to be converted). In this scenario, because we are filtering by Tables to be converted we see that the Current Organization column values only shows "Row," and the Recommended Organization column only shows Column.
The Finding column lists why the table should be converted (to improve performance) or not converted. The reasons to not convert a table can vary: it could be that the table cannot be converted, (for example, a table might have a LOB column or be used in a statement with a cursor in the workload), or it could be that converting the table does not provide a sufficient improvement in performance.
The Conversion Warning column lets you know when conversion will result in removing or changing certain table properties. For example, because indexes are not used in column-organized tables, any secondary indexes that currently exist on a row-organized table are removed when that table is converted to column organization. The properties that are removed during a conversion, if they exist, are: indexes, data partitioning, multidimensional clustering, and MQTs. In addition, enforced referential integrity or check constraints change to "not enforced." The table space might be recommended to be changed if the row-organized table was not defined on a table space with AUTOMATIC STORAGE. The table space that is recommended is found in the DDL script, which is described later. You can view all the warnings by right-clicking the table row in the grid and selecting the View Warnings option in the drop-down menu.
The References to Table column indicates the number of references to the table in the workload. The Cumulative Total Cost column shows the estimated cost of the statements in the workload that reference the table.The IUDM Statements column shows the number of times the table is used in insert, update, delete, or merge statements.
The lower grid, labeled SQL Statements Affected, shows all the SQL statements that are affected by the tables that are recommended to be converted. In this second grid, you can see how many times the statement was executed in the workload, along with the statement text and the estimated performance gain for the statement with the virtual conversion compared to the cost before the conversion. The grid also shows the estimated cost in timerons before the conversion and after the virtual conversion. The lower grid also shows a Weight column. The weight of a statement is defined to be the proportion of the total cost of the workload that the cost of the statement takes, including the execution count. (That is, the weight is the execution count of the statement times the “cost before” the recommendation divided by the total cost of the workload.) The tables in the grid are shown in order of the values in the Weight column.
View and run the script
Clicking Show DDL Script shows the DDL for the script to convert all the recommended tables, as shown in Figure 11. You can save the script and run it when you're ready. You could also copy the script to the clipboard and then paste it within the Data Studio SQL editor and run it with the job scheduler.
Another method to convert the tables is to note the tables to be converted along with the targeted tablespaces and select those tables in the Database Administration perspective in IBM Data Studio. The Convert to Column Organization menu creates the ADMIN_MOVE_TABLE necessary syntax, as shown later in this article.
Note: Conversion works only in one direction from row to column-organized format.
Figure 11. Recommended conversion (DDL) script from the Workload Table Organization Advisor
What If Analysis
You can explore what the performance improvement would be if you were to select a subset of the tables that are recommended for conversion, rather than all of the tables. You can do this by clicking Test Candidate Table Organization from the recommendation details, as shown in Figure 10. The Test Candidate Table Organization page opens, as shown in Figure 12, and lists the tables in the workload, with check marks next to those tables that are recommended for conversion. You can try those tables or select a different set of tables. Then, when you click Test Candidate Table Organization, the feature tries to virtually convert the selected tables to column-organization and explain the workload using that conversion.
Figure 12. Invoking the Workload Test Candidate Table Organization feature
You can see the results in the Review view on the Candidate Table Organization tab, as shown in Figure 13. You can see the performance improvement associated with converting the candidate tables and the grid that shows information for each table in the workload and the SQL statements that would be affected by the conversion.
Note that the Test Candidate feature might not have been able to virtually convert some of the user selected tables for various reasons. The reason a table cannot be converted is shown in the Finding column. Some possible reasons: the table violates column organization restrictions or the table might already be a column-organized table. Only tables that are able to be virtually converted would be considered in the cost estimates by the test candidate feature.
Figure 13. Output details from the Test Candidate Table Organization feature
Scenario 2: Converting tables to column organization in Data Studio
If you already know what tables you want to convert to column organization, you can convert them by using the Database Administration perspective. The grid view lists the tables and identifies their organization (whether row or column organized). The table icon also provides a visual indication of the table organization.
You can select one or more row-organized tables, and then click the Convert to Column Organization menu item.
Figure 14. Convert to Column Organization menu choice for a table
The conversion editor offers an option to back up the database during the conversion process if archive logging is used. Data Studio uses the ADMIN_MOVE_TABLE stored procedure as the conversion utility, as shown in Figure 15:
Figure 15. Conversion script generated by IBM Data Studio
Setting the default table organization for a database
When you configure automatic workload management, the database configuration parameter DFT_TABLE_ORG is automatically configured to default new tables to the column-organized type. This setting applies to new tables that are created, but has no effect on existing tables. You can override this setting by directly providing the ORGANIZE BY syntax a CREATE statement.
To review or modify this database configuration parameter, select a database and click the Configure menu, shown in Figure 16:
Figure 16. Configure database parameters
Note that the following table types always default to row-organization:
- Clustered tables (range, insert time)
- Range partitioned tables
- Created and declared global temporary tables
- Temporal tables
- Typed tables
Scenario 3: Creating a new column-organized table in Data Studio
Data Studio aggregates proposed objects to create into a change plan. A change plan allows for batch creation with a customized execution approach (for example, run methods, rollback, scheduled, or immediate). Past or current change plans can be accessed under the Change Plans folder.
The default change plan, where new objects are automatically added, is associated with a bar at the top of the grid view in the editor, shown in Figure 17:
Figure 17. IBM Data Studio change plans list
The contents of the properties view of a table are tailored for the kind of table that it is, whether column or row organized. For example, the temporal or cluster definition tabs does not display if DFT_TABLE_ORG is set to Column.
Figure 18. Properties view for new table
Executed DDL is displayed in the SQL Results view of Data Studio, along with the status and time of the execution.
In this article, we covered three scenarios that illustrate how to use Data Studio 4.1 and OQWT 4.1 to get the most out of BLU Acceleration.
- Take a look at the DB2 with BLU Acceleration microsite.
- Read more about DB2 with BLU Acceleration.
- DB2 Magazine article on BLU Acceleration BLU Acceleration: Super Analytics, Super Easy.
- Video on BLU Acceleration: Deep Dive on BLU Acceleration in DB2 10.5, Super Analytics, Super Easy.
- Participate in the discussion forum.
- Get involved in the My developerWorks community. Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.