Skip to main content

By clicking Submit, you agree to the developerWorks terms of use.

The first time you sign into developerWorks, a profile is created for you. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

All information submitted is secure.

  • Close [x]

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerworks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

By clicking Submit, you agree to the developerWorks terms of use.

All information submitted is secure.

  • Close [x]

Cover Story: Maintenance Makes Money

Five ways to cut database costs - just by paying attention

Sheryl M. Larsen, Consultant, Sheryl M. Larsen, Inc.
Sheryl M. Larsen (www.smlsql.com) is an internationally recognized researcher, consultant, and lecturer, specializing in DB2. She has more than 20 years’ experience in DB2, is known for her extensive expertise in SQL, and performs detailed DB2 performance reviews for many clients. Larsen was voted an IBM Information Champion in 2009 and 2010 and the IDUG Speaker Hall of Fame in 2001, and has been a member of IBM’s DB2 Gold Consultants program since 1994.

Summary:  Here are five ways to reduce costs and improve performance by paying attention to database detail. Database maintenance means more than just cleaning up—it's about understanding the specific platform that your database runs on, and taking advantage of new technologies.

IBM Data Management magazine table of contents

View more content in this series

Date:  30 Jul 2010
Level:  Introductory
Also available in:   Chinese

Activity:  4403 views
Comments:  

Read this article in our interactive digital edition format!

Few things do well without proper care. Houses, cars, pets, friends, and family all need our attention to stay happy and healthy, and database management systems are no different.

Without care and attention, a DBMS will break down, slow down, and eventually have to be shut down because it costs too much to operate. Keeping maintenance levels current to take advantage of changes in technology is one thing; applying the correct technology is another. This article points out five ways to reduce the cost of running DB2 by providing proper care to the system, using the right tool for the job at hand, and paying attention to detail.

1. Use the right SQL for the job--and make sure you know what it is

There are many ways to code a query that will yield the desired results; the trick is finding the method that plays to the strengths of your DBMS. That may seem obvious, but it's not as easy as it sounds because the tools and technology available to you change over time.

A few years ago, one of my clients inherited a customer relationship management (CRM) application that had been ported to its DB2 for z/OS Version 8 subsystem. One program was bringing the company's new IBM System z9 to its knees, consuming 25 percent of all CPU cycles on a daily basis. The program had 10 CREATED Global Temporary Tables, and most were embedded in recursive SQL statements. Six of the temporary tables could contain between 10,000 and 100,000 rows, depending on the depth of the sales chain, and were accessed repetitively.

But there is a second type of Global Temporary Table on DB2 for z/OS: DECLARED. And although CREATED temporary tables could not use any index technique until DB2 9, DECLARED tables could. So, I recommended converting six CREATED temporary tables to DECLARED temporary tables with clustering indexes. The remaining four CREATED temporary tables were appropriate for the job since they contained a small number of rows and were accessed only once. This change reduced the time required to complete each transaction from 1.78 seconds to 0.07 seconds, which doesn't look that impressive until you know that this program is executed a minimum of 6,000 times per day. Therefore, the one small change saved the client 2.8 CPU hours every business day. The results were so dramatic that the senior DBA on site even joked about returning the new z9!

All of that occurred three years ago. Today, the client has DB2 9, which automatically generates sparse indexes on repetitively accessed work files, including temporary tables. But this sort of situation occurs all the time. SQL is a moving target: DB2 SQL had very humble beginnings as a query language when it started out in 1983—at that time it supported only inner joins, subqueries, the GROUP BY clause, the HAVING clause, the ORDER BY clause, and approximately 21 built-in functions. Now, 27 years later, the SQL available with DB2 9 for z/OS has exploded (see sidebar, "Look How You've Grown").

Some SQL features are synonymous, but most provide unique functionality. And it is important to fully exploit new features as they become available, because many can increase performance and cut the cost of running your applications.

Look How You've Grown

When DB2 SQL was first introduced, you could practically count the functions on one hand. These days, there's a lot more to choose from:

TABLE EXPRESSIONS, COMPLEX CORRELATION, GLOBAL TEMPORARY TABLES, CASE, 100+ BUILT-IN FUNCTIONS, LIMITED FETCH, SCROLLABLE CURSORS, UNION EVERYWHERE, MIN/MAX SINGLE INDEX SUPPORT, SELF REFERENCING UPDATES WITH SUBQUERIES, SORT AVOIDANCE FOR ORDER BY, AND ROW EXPRESSIONS, 2M STATEMENT LENGTH, GROUP BY EXPRESSION, SEQUENCES, SC ALAR FULLSELECT, MATERIALIZED QUERY TABLES, COMMON TABLE EXPRESSIONS, RECURSIVE SQL, CURENT PA CKAGE PA TH, VOLATILE TABLE SUPPORT, STAR JOIN SPARSE INDEX, QUALIFIED COLUMN NAMES, MULTIPLE DISTINCT CLAUSES, IS NOT DISTINCT FROM, ON COMIT DROP, TRANSPARENT ROWID COLUMN, GET DIAGNOSTICS, STAGE1 UNLIKE DATA TYPES, MULTI-ROW INSERT, MULTI-ROW FETCH, DYNAMIC SCROLLABLE CURSORS, MULTIPLE CCSIDS PER STATEMENT, ENHANCED UNICODE, AND PARALLEL SORT, TRUN CATE, DECIMAL FLOAT, VARBINARY , OPTIMISTIC LOCKING, FETCH CONTINUE, MERGE, CALL FROM TRIGG ER, STATEMENT ISOLATION, FOR READ ONLY KEEP UPDATE LOCKS, SET CURENT SCHEMA, CLIENT SPECIAL REGISTERS, LONG SQL OBJECT NAMES, SELECT FROM INSERT, UPDATE, DELETE, MERGE, INSTEAD OF TRIGGER, NATIVE SQL PROCEDURE LANGUAGE, BIGINT, FILE REFERENCE VARIABLES, XML, FETCH FIRST & ORDER BY IN SUBSELECT AND FULLSELECT, CASELESS COMPARISONS, INTERSECT, EXCEPT, NOT LOGGED TABLES

Searching static and dynamic SQL for keywords is an effective way to gauge how advanced your production SQL portfolio is. Figures 1 and 2 are SQL report cards prepared for clients by searching based on what was running in their current production environments. Client XYZ used very little SQL technology within its 3,000 financial reporting applications. This translates to almost all relational business rules and filtering performed in the application program, instead of taking advantage of high-performance SQL functions that could carry that load.


Figure 1. Client XYZ's SQL report card as of DB2 7 for z/OS.
Figure 1 client XYZ's SQL report card

Figure 2. Client ABC's SQL report card as of DB2 8 for z/OS.
Figure 2 Client ABC's SQL report card

Client ABC's report card was much better but still lacked implementation of SELECT INTO with ORDER BY and multi-row fetch. These features typically reduce CPU demand by 30 percent and 50 percent, respectively. In thousands of occurrences and millions of executions, Client ABC was using obsolete single-row fetches instead of the current technology. Current SQL coding skills need to be cultivated in today's programming pool to transfer the technology into the production environment.

By moving to DB2 9, you get access to many new features, including:

  • TRUNCATE
  • MERGE
  • SELECT FROM UPDATE/DELETE/MERGE
  • EXCEPT, INTERSECT
  • RANK, DENSE_RANK, ROW_NUM
  • ORDER BY / FETCH FIRST in Subselect

Not taking advantage of these features, especially in a data warehousing environment, means that you need external tools to provide the functionality desired. This requirement drives up cost in terms of licensing fees and can have a significant impact on performance, since many of these tools require data to be transferred out of DB2 for external processing. (The searches used to gauge your SQL portfolio can also be used to determine whether a soon-to-be-purchased application package uses advanced features.)


2. Create optimal index design

Base table indexes are the first line of defense for performance-related issues. The most popular or critical transactions and services should have indexes to support all of the joining, filtering, and sequencing tasks required. Their omission can result in seriously delayed filtering or joining, all of which requires extra resources.

Consider the following scenario: A company had a Web page that its customers could access, and the company used the common tactic of executing a series of services when the customer logged in and storing the results so they could readily be displayed when the customer clicked on the associated tabs. This popular "P-Search" architecture let the customer quickly access commonly used functions, but such a strategy can be costly if the services being executed aren't indexed properly.

In this case, the P-Search service was consuming 9.9 CPU seconds of "in DB2" time on IBM System z10; during peak season, the customers waited between 10 and 40 seconds to see the Web page. The P-Search service contained a dynamic query that was responsible for the CPU consumption: a simple, three-table join on TABLEA, TABLEB, and TABLEC. All tables were partitioned and clustered by a single column SEQEN_NR; however, SEQEN_NR was not added to any other index. This forced the optimizer to choose between filtering and joining.


Figure 3. The first table accessed, TABLEC, applied the LAST_NM filter on the non-clustering index, delaying SEQEN_NR pickup. The second table accessed, TABLEB, applied the ROLE_CD filter, delaying the join. The third table accessed, TABLEA, applied the join filter on the clustering index, delaying the GUAR_DT filter.
Figure 3 diagram

Figure 3 shows what happened:

Step 1. TABLEC is the first table accessed. DB2 Optimizer chooses to apply the local filter LAST_NM, but because SEQEN_NR is not included in the chosen index, DB2 Optimizer incurs random I/O to retrieve all SEQEN_NRs needed for the join to table TABLEB. Because the join is not performed using an index structure including LAST_NM and SEQEN_NR, it's delayed until data pages are retrieved. (See No. 1 in Figure 3.)

Step 2. The second table accessed, TABLEB, suffers from the same index design and applies the local filter ROLE_CD, again incurring random I/O before applying the join predicate. Because the join was not performed using an index structure, it is also delayed until the data pages are retrieved. (See No. 2 in Figure 3.)

Step 3. TABLEA, the last table accessed, also suffers but this time uses the primary/clustering index for the join, uses sequential I/O for the data pages, applies the local filter GUAR_DT, and then sorts for the ORDER BY. (See No. 3 in Figure 3.)

All three tables accessed had filtering or joining delayed until data pages were accessed. This is because DB2 Optimizer had to choose between joining or filtering, as no composite indexes satisfied both.

Solutions for this frequent, business-critical service consisted of the following:

  • Altering the LAST_NM.FIRST_NM index to add SEQEN_NR and SSN_NR to achieve index-only access to completely eliminate all random I/O to the data pages
  • Altering the ROLE_CD index to add SEQEN_NR to combine the filtering and the joining
  • Adding a fifth index on TABLEA of GUAR_DT.SEQEN_NR to combine the filtering and joining and eliminate the sort for the ORDER BY

The new index solutions brought the CPU time down to 0.02 seconds by moving all filtering, joining, and sequencing to composite index structures.


3. Monitor RID pool failures

An expensive access path is one that has many steps that must be completed before a single result row is returned. These access paths usually involve row identifier (RID) sorting to eliminate random I/O and are called List Prefetch, Multiple Index Access, and Hybrid Join - Type N. All of these access paths use the RID pool to pre-sequence data page numbers prior to access. This RID pool resource has limits and, when those limits are exceeded, a table scan occurs.

The more RIDs being poured into the RID pool by any number of programs, the greater the chance of a failure, which forces a table scan. One symptom of RID pool failure is unexpected spikes in sequential prefetch counts. Tracking sequential prefetch counts is one way to spot failures. A better way is through an online monitor. Figure 4 shows the data from an online monitor of one program executing in a one-hour period using Hybrid Join - Type N that observed 61 RID pool failures.


Figure 4. A program with Hybrid Join - Type N access path gets 61 RID pool failures between 12 p.m. and 1 p.m.
Figure 4 number of RID pool failures chart

One solution to the problem is to increase the size of the RID pool, but that may only temporarily suppress failures; worse, it may have no impact at all. A better solution is to identify queries that rely heavily on the RID pool by using DB2 traces or an online monitor. Each query needs to justify its RID pool use by fetching through to the end of the entire result set. Any queries not passing this test need to turn off heroic access paths that use the RID pool by adding either the OPTIMIZE FOR n ROWS or the FETCH FIRST n ROWS ONLY clause to the end of the query. DB2 10 will provide relief from these failures by utilizing external work files for the overflow, thus masking the need to analyze. However, it is best to find and resolve such issues before migrating to DB2 10, or you will have work file madness.


4. Neutralize frozen or poorly performing SQL

Frozen SQL is SQL embedded in a purchased application that cannot be touched. The queries inside frozen SQL sometimes do not follow any performance guidelines and you are stuck with them. You need these queries to execute using as few resources as possible. For example, WHERE DATE(col_TS) BETWEEN :date1 AND :date2 is a Stage 2/Residual condition. This predicate should be rewritten to WHERE col_TS BETWEEN TIMESTAMP(:date1, '00:00:00')AND TIMESTAMP(:date2, '59:59:99'), which is indexable Stage 1/Sargable. But you are forbidden to make that change.

Getting the work done ahead of time and storing the result on disc is one option. For DB2 8 sites, a solution is to create a materialized query table (MQT) that SELECTS DATE(col_TS) as NEW_COL FROM TABLE and to create an index on NEW_COL column. Now all dynamic queries will be redirected to use the index. Another disc-based solution for DB2 9 sites is to use the feature called Index on Expression—CREATE INDEX on DATE(col_TS). This will yield the same result with only one structure and can be used by both static and dynamic queries.

Both techniques are for queries that are ultra expensive to run, ultra complex to compute, or just plain ultra popular and may not be frozen. If your site is missing either type of structure, you are missing out on CPU demand reductions.

Another solution not yet available is memory based and can accelerate longer-running queries by 5 to 10 times without requiring changes to the application, according to IBM lab and customer beta experiences. Here's how: Using IBM Data Studio (available at no charge), select a fact table and dimension tables to be preloaded in memory. This will copy and highly compress portions of your data warehouse into an all-memory, network-attached appliance, forming an accelerated query table (AQT). The appliance is called the IBM Smart Analytics Optimizer and is currently in beta. The DB2 9 Query Router used in redirecting queries from base tables/indexes to MQTs/indexes will be able to transparently redirect INNER/LEFT JOIN queries to use AQTs. This memory-based appliance will gain functionality over time to redirect a larger portion of queries for even more mind-bending query speed-up. Scalability requires a smart performance layer; MQTs, Index on Expression, and AQTs make up this performance layer.


5. Create the optimal schema design for your workload pattern

Scalability also requires a smart schema design that efficiently accommodates processing data in a workload pattern. For example, online transaction processing (OLTP) and intranet applications weather best with a third normal-form table design to mitigate update/delete anomalies and when heavily indexed for probing. This normalized design is well suited for static and focused business questions. Online analytical processing (OLAP) and business intelligence (BI) weather best with star schema and denormalization. The fact table and dimension table design are well suited for ad hoc and wide business questions that span vast amounts of data. It makes sense that these environments would have disparate base table designs, index designs, MQTs, Indexes on Expressions, and in the future, AQTs, to take advantage of all DB2 has to offer for the workload pattern (see Figure 5).


Figure 5. Web and OLTP applications need smart index strategies, including Index on Expression, to handle large numbers of short-answer queries for probing and scrolling. OLAP and BI applications benefit from MQTs and AQTs to handle broad business questions that require vast amounts of data.
Figure 5 chart

Show your database you care

Organizations need to be intimate with their DBMS of choice—attending to its proper care and paying attention to detail. To scale well, packaged DB2 applications also need to adhere to performance rules. By following simple rules—such as using the right SQL, indexes, access paths, MQTs, table design, and soon-to-come AQTs for the job—you can reduce the cost of data management. These rules require some thought and time beyond today's rapid application development cycle. But without this extra effort, your DBMS may start looking worse for wear.


Resources

About the author

Sheryl M. Larsen (www.smlsql.com) is an internationally recognized researcher, consultant, and lecturer, specializing in DB2. She has more than 20 years’ experience in DB2, is known for her extensive expertise in SQL, and performs detailed DB2 performance reviews for many clients. Larsen was voted an IBM Information Champion in 2009 and 2010 and the IDUG Speaker Hall of Fame in 2001, and has been a member of IBM’s DB2 Gold Consultants program since 1994.

Report abuse help

Report abuse

Thank you. This entry has been flagged for moderator attention.


Report abuse help

Report abuse

Report abuse submission failed. Please try again later.


developerWorks: Sign in


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

Choose your display name

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


Rate this article

Comments

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=503452
ArticleTitle=Cover Story: Maintenance Makes Money
publish-date=07302010
author1-email=editor@tdagroup.com
author1-email-cc=editor@tdagroup.com

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

For articles in technology zones (such as Java technology, Linux, Open source, XML), Popular tags shows the top tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), Popular tags shows the top tags for just that product zone.

For articles in technology zones (such as Java technology, Linux, Open source, XML), My tags shows your tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), My tags shows your tags for just that product zone.

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Try IBM PureSystems. No charge.

Special offers