Did you know….. you can reduce the CPU consumption of RUNSTATS by using the Statistics Advisor of InfoSphere Optim Query Workload Tuner for DB2 for z/OS
IBM_Optim 27000269HS Comment (1) Visits (4096)
…and at the same time maintain or improve query performance.
Note from the editor: This blog entry has been updated to reflect the current product name, InfoSphere Optim Query Workload Tuner for DB2 for z/OS.
However, there’s another way that this solution can help drive down CPU costs , and it has to do with the general maintenance that DBAs are tasked to perform.
A common practice in many DB2 for z/OS shops is to execute RUNSTATS with the TABLESPACE <dat
Why is this option expensive? The TABLE ALL option gathers statistics on all columns of the table(s) in the named table space. The CPU resources required increases as the table size increases and is also dependent on the number of columns defined. Are all of the column statistics really needed? Probably not, because if the column is not referenced in the WHERE clause of a query, then unneeded statistics have been gathered and unnecessary CPU resources have been consumed. Also the TABLE ALL option does not gather COLGROUP or histogram statistics, which might improve filter factor estimates and improve query performance.
The DB2 for z/OS V9.1 Utilities Guide has a paragraph in the Improving RUNSTATS performance section:
“Run RUNSTATS on only the columns or column groups that might be used as search conditions in a WHERE clause of queries. Use the COLGROUP option to identify the column groups. Collecting additional statistics on groups or columns that are used as predicates improves the accuracy of the filter factor estimate and leads to improved query performance. Collecting statistics on all columns of a table is costly and might not be necessary.”
Easier said than done. How does one perform the analysis required to only gather the statistics that are truly required to maintain query performance and reduce the CPU requirements of RUNSTATS? Without a tool, it’s a very manually intensive process and can be even more of a challenge when working with dynamic SQL. Just to give you a feel for this effort, here are the steps you might need to do:
Fortunately, the Statistics Advisor in InfoSphere Optim Query Workload Tuner for DB2 for z/OS can make this analysis less daunting – a lot less daunting. Statistics Advisor will analyze a single statement or group of statements (aka workload) and provide a set of recommended RUNSTATS statements.
To illustrate how easy it is to gather the necessary statistics in OE, I captured a workload from the DB2 catalog for a specific collection. I was presented with 852 queries and then invoked Workload Statistics Advisor on this set of queries to receive the recommended RUNSTATS statements for objects in the workload. The process was completed in less than thirty minutes. For dynamic SQL, I would follow basically the same process, but the workload would have been captured from the dynamic statement cache, or from the profile monitor for DB2 for z/OS V9.1 subsystems.
To validate the CPU usage assertions, I actually executed RUNSTATS twice on one of the table spaces used in the above workload. The first execution was with the commonly used TABLE ALL option:
RUNSTATS TABLESPACE DBAS
SHRLEVEL CHANGE REPORT YES
The second execution used the statement recommended by the Query Workload Tuner Statistics Advisor:
RUNSTATS TABLESPACE DBASE1.TSPACE1
INDEX(QUAL1.TB1X3 HISTOGRAM NUMCOLS 1 NUMQUANTILES 20,
SHRLEVEL CHANGE REPORT YES
There was a 28% reduction in CPU time between the two runs. Not only has the CPU consumption been reduced, but additional filter factor statistics have been gathered that should improve query performance.
One topic I have not covered is SAMPLING. It is a technique that can be used to further reduce RUNSTATS CPU consumption by limiting the number of rows evaluated. Statistics Advisor supports SAMPLING and its use is controlled via user-managed preferences.
So in summary, InfoSphere Optim Query Workload Tuner for DB2 for z/OSs can help DBAs gather the right statistics resulting in less CPU consumption and improved maintenance window throughput. And the additional stats gathered may improve query performance, which is the ultimate goal of running RUNSTATS after all.