Your database is probably the last place you would look for IT cost savings. After all, what's most important about a database is that it delivers the correct information on time, every time. But the beauty of database optimization is that it both cuts costs and improves performance. Optimizing your organization's databases reduces unnecessary and wasteful CPU consumption, generating a ripple effect of benefits: lower CPU consumption, lower energy costs for hardware and cooling, improved server consolidation and virtualization, lower hardware costs, lower software licensing costs, and improved organizational productivity through better, more reliable, and more predictable response times.
But to get the most out of your database optimization efforts, there are two things that you need to know. The first thing is: The physical design of your database is critically important, especially the indexes. Indexes are important to databases for much the same reasons that they're important to people: they make it possible to find things faster with less effort.
Imagine you have a bowl of blueberries in your refrigerator and you want to use them in a recipe. You get out your favorite cookbook to find recipes that use blueberries, but it's pretty big: 592 pages. You could scan through every page of the cookbook and scan ingredients lists looking for blueberries, but scans are costly. They take time, energy, and might even make your eyes hurt. More likely, your cookbook has a good index in the back and you can look up "blueberries," find the recipes, and go straight to those pages to pick the one you want.
Databases work the same way. Your 592-page cookbook wouldn't make for a very large database table-almost certainly small enough to fit on a 32 MB USB memory stick. But if you asked a cookbook application to search for recipes with blueberries in the ingredient list, the application would send a query to the database, and the database would scan recipe entries. The scan might take 592 CPU instructions or more if the cookbook is not already in database memory, and it might take five seconds to complete. Now, if a useful index on INGREDIENT is available, the same search could be completed using only six CPU instructions and would be finished in a fraction of a second.
Your business system applications probably don't do frequent searches for blueberries, but the same principles apply. Just about every business system database and data warehouse has a large number of frequently accessed small tables that would fit on USB jump drives. Ask your DBAs how many tables exist in your databases that are smaller than 64 MB-the answer may surprise you. Small database tables are important because of the second thing you need to know: Maximum performance optimization is possible only when you consider the entire system.
DBAs use a utility provided by database vendors called EXPLAIN to review the database's access strategy to satisfy queries. The EXPLAIN utility reports the anticipated computing resource cost of executing the query. DBAs normally review the anticipated costs before putting queries into production. If the anticipated cost is high, they tune the statement and the physical design of the database to reduce its anticipated cost, often by adding an index.
If the anticipated cost of a query is relatively low-say a five-second query about blueberries-a DBA will often pay no attention to it. The problem is that EXPLAIN provides estimates for only a single execution of a given query. If your application frequently accesses smaller tables without proper indexes, the database will load them into memory each time and scan them.
Processing waste like this is prolific in at least 9 out of 10 production databases. At a large bank I worked with recently, a multi-terabyte OLTP banking application spent 34 percent of all its CPU time running queries against a table with only 32 rows. At a large retailer, adding a missing index to a small, 2,000-row table in an 8 TB warehouse reduced the elapsed time of a critical decision support query from three hours to two minutes. At another online retailer, 97 percent of the CPU cost on a system with eight CPUs was attributable to a single query accessing a relatively small table. As we say in Texas, "I'm going to shoot you straight"-wasteful, high-cost statements like these exist in almost every database, but they are often hidden from plain view because they access smaller tables.
Databases with self-tuning memory capabilities will detect costly queries and attempt to keep small tables in memory, mitigating I/O costs and reducing response time somewhat. But these queries are still running at a CPU cost that is 99 percent higher than it should be. Self-tuning memory simply reacts to observed performance characteristics and does its best to compensate for physical design flaws.
In the current economic climate, the immediate need for and value of database performance cost optimization and tuning cannot be overstated. Unless your organization is flush with cash and has no regard for energy and the environment, it is past time to give your databases a proper tune-up. Ignoring this call to action makes no more sense than driving your car around with underinflated tires and a trunk full of rocks. Consider: If you could cut your number of database servers in half, your license costs in half, your energy costs in half, and simultaneously double or triple performance and improve reliability, would you? And what would you be willing to invest to obtain these savings and improvements?





