Skip to main content

The Performance and Cost Optimization Ripple Effect

What every manager needs to know to make database optimization pay off

Scott Hayes (Scott.Hayes@dbisoftware.com), President and CEO, DBI Software
Scott Hayes is president and CEO of DBI Software. He is an IBM Data Champion and Gold Consultant; a frequent speaker at IDUG, IBM, ISACA, and ISSA conferences; a published author; and a regular blogger on DB2 for Linux, UNIX, and Windows performance.

Summary:  A look at database optimization and what every manager needs to know to make it pay off

Date:  30 Jun 2009
Level:  Introductory
Activity:  1053 views
Comments:  


Data Manager column from print edition of IBM Database magazine

Your database is probably the last place you would look for IT cost savings. After all, what's most important about a database is that it delivers the correct information on time, every time. But the beauty of database optimization is that it both cuts costs and improves performance. Optimizing your organization's databases reduces unnecessary and wasteful CPU consumption, generating a ripple effect of benefits: lower CPU consumption, lower energy costs for hardware and cooling, improved server consolidation and virtualization, lower hardware costs, lower software licensing costs, and improved organizational productivity through better, more reliable, and more predictable response times.

But to get the most out of your database optimization efforts, there are two things that you need to know. The first thing is: The physical design of your database is critically important, especially the indexes. Indexes are important to databases for much the same reasons that they're important to people: they make it possible to find things faster with less effort.

Imagine you have a bowl of blueberries in your refrigerator and you want to use them in a recipe. You get out your favorite cookbook to find recipes that use blueberries, but it's pretty big: 592 pages. You could scan through every page of the cookbook and scan ingredients lists looking for blueberries, but scans are costly. They take time, energy, and might even make your eyes hurt. More likely, your cookbook has a good index in the back and you can look up "blueberries," find the recipes, and go straight to those pages to pick the one you want.

Databases work the same way. Your 592-page cookbook wouldn't make for a very large database table-almost certainly small enough to fit on a 32 MB USB memory stick. But if you asked a cookbook application to search for recipes with blueberries in the ingredient list, the application would send a query to the database, and the database would scan recipe entries. The scan might take 592 CPU instructions or more if the cookbook is not already in database memory, and it might take five seconds to complete. Now, if a useful index on INGREDIENT is available, the same search could be completed using only six CPU instructions and would be finished in a fraction of a second.

Your business system applications probably don't do frequent searches for blueberries, but the same principles apply. Just about every business system database and data warehouse has a large number of frequently accessed small tables that would fit on USB jump drives. Ask your DBAs how many tables exist in your databases that are smaller than 64 MB-the answer may surprise you. Small database tables are important because of the second thing you need to know: Maximum performance optimization is possible only when you consider the entire system.

DBAs use a utility provided by database vendors called EXPLAIN to review the database's access strategy to satisfy queries. The EXPLAIN utility reports the anticipated computing resource cost of executing the query. DBAs normally review the anticipated costs before putting queries into production. If the anticipated cost is high, they tune the statement and the physical design of the database to reduce its anticipated cost, often by adding an index.

If the anticipated cost of a query is relatively low-say a five-second query about blueberries-a DBA will often pay no attention to it. The problem is that EXPLAIN provides estimates for only a single execution of a given query. If your application frequently accesses smaller tables without proper indexes, the database will load them into memory each time and scan them.

Processing waste like this is prolific in at least 9 out of 10 production databases. At a large bank I worked with recently, a multi-terabyte OLTP banking application spent 34 percent of all its CPU time running queries against a table with only 32 rows. At a large retailer, adding a missing index to a small, 2,000-row table in an 8 TB warehouse reduced the elapsed time of a critical decision support query from three hours to two minutes. At another online retailer, 97 percent of the CPU cost on a system with eight CPUs was attributable to a single query accessing a relatively small table. As we say in Texas, "I'm going to shoot you straight"-wasteful, high-cost statements like these exist in almost every database, but they are often hidden from plain view because they access smaller tables.

Databases with self-tuning memory capabilities will detect costly queries and attempt to keep small tables in memory, mitigating I/O costs and reducing response time somewhat. But these queries are still running at a CPU cost that is 99 percent higher than it should be. Self-tuning memory simply reacts to observed performance characteristics and does its best to compensate for physical design flaws.

In the current economic climate, the immediate need for and value of database performance cost optimization and tuning cannot be overstated. Unless your organization is flush with cash and has no regard for energy and the environment, it is past time to give your databases a proper tune-up. Ignoring this call to action makes no more sense than driving your car around with underinflated tires and a trunk full of rocks. Consider: If you could cut your number of database servers in half, your license costs in half, your energy costs in half, and simultaneously double or triple performance and improve reliability, would you? And what would you be willing to invest to obtain these savings and improvements?


About the author

Scott Hayes is president and CEO of DBI Software. He is an IBM Data Champion and Gold Consultant; a frequent speaker at IDUG, IBM, ISACA, and ISSA conferences; a published author; and a regular blogger on DB2 for Linux, UNIX, and Windows performance.

Comments



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=405390
ArticleTitle=The Performance and Cost Optimization Ripple Effect
publish-date=06302009
author1-email=Scott.Hayes@dbisoftware.com
author1-email-cc=Author1 cc address

Table of contents

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers