Coding queries with aggregate functions efficiently

If your query involves aggregate functions, you can take measure to increase the chances that they are evaluated when the data is retrieved, rather than afterward. Doing that can improve the performance of the query.

About this task

Begin program-specific programming interface information. In general, a aggregate function performs best when evaluated during data access and next best when evaluated during Db2 sort. Least preferable is to have a aggregate function evaluated after the data has been retrieved. You can use EXPLAIN to determine when Db2 evaluates the aggregate functions.

Queries that involve the functions MAX or MIN might be able to take advantage of one-fetch access.

Procedure

To ensure that an aggregate function is evaluated when Db2 retrieves the data:

Code the query so that every aggregate function that it contains meets the following criteria:
  • No sort is needed for GROUP BY. Check this in the EXPLAIN output.
  • No stage 2 (residual) predicates exist. Check this in your application.
  • No distinct set functions exist, such as COUNT(DISTINCT C1).
  • If the query is a join, all set functions must be on the last table joined. Check this by looking at the EXPLAIN output.
  • All aggregate functions must be on single columns with no arithmetic expressions.
  • The aggregate function is not one of the following aggregate functions:
    • STDDEV
    • STDDEV_SAMP
    • VAR
    • VAR_SAMP
End program-specific programming interface information.