group-by-clause

The GROUP BY clause specifies a result table that consists of a grouping of the rows of intermediate result table that is the result of the previous clause.

group-by-clause

Read syntax diagramSkip visual syntax diagramGROUP BY ,grouping-expressiongrouping-setssuper-groups

In its simplest form, a GROUP BY clause contains a grouping-expression.

grouping-expression
A grouping-expression is an expression that defines the grouping of R. The following restrictions apply to grouping-expression:
  • If grouping-expression is a single column, the column name must unambiguously identify a column of R.
  • The result of grouping-expression cannot be a LOB data type (or a distinct type that is based on a LOB) or an XML data type.
  • grouping-expression cannot include any of the following items:
    • A correlated column
    • A host variable
    • An aggregate function
    • Any function or expression that is not deterministic or that is defined to have an external action
    • A scalar fullselect
    • A CASE expression whose searched-when-clause contains a quantified predicate, an IN predicate using a fullselect, or an EXISTS predicate

More complex forms of the GROUP BY clause include grouping-sets and super-groups.

The result of GROUP BY is a set of groups of rows. In each group of more than one row, all values of each grouping-expression are equal, and all rows with the same set of values of the grouping-expression are in the same group. For grouping, all null values for a grouping-expression are considered equal.

If a grouping-expression contains DECFLOAT values, the DECFLOAT values with the same value will be in the same group. But the number of digits returned for each group is unpredictable.

Because every row of a group contains the same value of any grouping-expression, a grouping-expression can be used in a search condition in a HAVING clause or an expression in a SELECT clause, or in a sort-key-expression of an ORDER BY clause. In each case, the reference specifies only one value for each group. For example, if grouping-expression is col1+col2, col1+col2+3 would be an allowed expression in the select list. Associative rules for expressions do not allow the similar expression of 3+col1+col2, unless parentheses are used to ensure that the corresponding expression is evaluated in the same order. Thus, 3+(col1+col2) would also be allowed in the select list. If the concatenation operator is used, grouping-expression must be used exactly as the expression was specified in the select list.

If a grouping-expression contains varying-length strings with trailing blanks, the values in the group can differ in the number of trailing blanks and might not all have the same length. In that case, a reference to grouping-expression still specifies only one value for each group, but the value for a group is chosen arbitrarily from the available set of values. Thus, the actual length of the result value is unpredictable.

Row access controls do not affect the operation of the GROUP BY clause.

grouping-sets

Read syntax diagramSkip visual syntax diagramGROUPING SETS( ,grouping-expressionsuper-groups(,grouping-expressionsuper-groups) )

A grouping-sets specification can be used to specify multiple grouping clauses in a single statement. This can be thought of as the union of two or more groups of rows into a single result set. It is logically equivalent to the union of multiple subselects with the GROUP BY clause in each subselect corresponding to one grouping set. A grouping set can be a single element or can be a list of elements delimited by parentheses, where an element is either a grouping-expression or a super-group. The groups can be computed with a single pass over the base table using grouping-sets.

Grouping sets are the fundamental building blocks for GROUP BY operations. A simple GROUP BY with a single column can be considered a grouping set with one element. For example, the following clauses are equivalent:

Simple GROUP BY clause Equivalent GROUPING SETS clause
   GROUP BY a
   GROUP BY GROUPING SETS((a))
   GROUP BY a,b,c
   GROUP BY GROUPING SETS((a,b,c))

Non-aggregation columns from the select list of the subselect that are excluded from a grouping set will return a null for such columns for each row generated for that grouping set. This reflects the fact that aggregation was done without considering the values for those columns.

Start of changeIf GROUPING SETS is specified, a table-reference in the associated FROM clause must not contain a column defined with a field procedure. End of change

super-groups

Read syntax diagramSkip visual syntax diagramROLLUP(grouping-expression-list)1CUBE(grouping-expression-list)2grand-total
grouping-expression-list
Read syntax diagramSkip visual syntax diagram,grouping-expression(,grouping-expression)
grand-total
Read syntax diagramSkip visual syntax diagram()
Notes:
  • 1 Alternate specification when used alone in a group-by-clause is: grouping-expression-list WITH ROLLUP.
  • 2 Alternate specification when used alone in group-by-clause is: grouping-expression-list WITH CUBE.
ROLLUP (grouping-expression-list)
A ROLLUP grouping produces a result set that contains sub-total rows in addition to the regular grouped rows. Sub-total rows are super-aggregate rows that contain further aggregates whose values are derived by applying the same aggregate functions that were used to obtain the grouped rows. These rows are called sub-total rows, because that is their most common use. However, any aggregate function can be used for the aggregation.

A ROLLUP grouping is a series of grouping-sets. The n elements of the ROLLUP translate to n+1 grouping sets.

The general specification of a ROLLUP with n elements, as shown in the first column of the following table, is equivalent to the n+1 grouping sets as shown in the second column of the table:
ROLLUP clause with n elements Equivalent GROUPING SETS clause
GROUP BY ROLLUP(C1,C2,...,Cn-1,Cn)
GROUP BY GROUPING SETS((C1,C2,...,Cn-1,Cn)
                         (C1,C2,...,Cn-1)
                         ...
                         (C1,C2)
                         (C1)
                         () )
Note: The order in which the elements of the grouping-expression-list are specified is significant for ROLLUP. As the following table shows, the initial element in the grouping-expression-list for the ROLLUP is the final element in the GROUPING SET.
ROLLUP clause Equivalent GROUPING SETS clause
  GROUP BY ROLLUP(a,b)
  GROUP BY GROUPING SETS((a,b)
                         (a)
                         () )
  GROUP BY ROLLUP(b,a)
  GROUP BY GROUPING SETS((b,a)
                         (b)
                         () )

The ORDER BY clause is the only way to guarantee the order of the rows in the result set.

CUBE (grouping-expression-list)
A CUBE grouping produces a result set that contains all the rows of a ROLLUP aggregation and, in addition, contains cross-tabulation rows. Cross-tabulation rows are additional super-aggregate rows that are not part of an aggregation with sub-totals.

Similar to a ROLLUP, a CUBE grouping can also be thought of as a series of grouping-sets. In the case of a CUBE, all permutations of the cubed grouping-expression-list are computed along with the grand total. Therefore, the n elements of a CUBE translate to 2n grouping-sets.

For example, the following clauses are equivalent:

CUBE clause Equivalent GROUPING SETS clause
   GROUP BY CUBE(a,b,c)
   GROUP BY GROUPING SETS((a,b,c)
                          (a,b)
                          (a,c)
                          (b,c)
                          (a)
                          (b)
                          (c)
                          ())
Note: The three elements of the CUBE translate into eight grouping sets.

The order specified for the elements does not matter for CUBE. CUBE (DayOfYear, Sales_Person) and CUBE (Sales_Person, DayOfYear) yield the same result sets (the result sets might not be in the same order).

The ORDER BY clause is the only way to guarantee the order of the rows in the result set.

grouping-expression-list
A grouping-expression-list is used within a ROLLUP or CUBE clause to define the number of elements in the ROLLUP or CUBE operation. The number of elements is controlled by using parentheses to delimit elements with multiple grouping-expression elements.

For example, suppose that a query is to return the total expenses for the ROLLUP of City within Province but not within County. Without a grouping-expression-list, the following clause results in unwanted subtotal rows for County:

   GROUP BY ROLLUP(Province, County, City)

Compare the equivalent GROUPING SETS clause:

   GROUP BY GROUPING SETS((Province, County, City)
                          (Province, County)
                          (Province)
                          () )

In the following clause, the composite (County, City) forms one element in the ROLLUP clause and, therefore, a query that uses the following clause will yield the required result:

   GROUP BY ROLLUP(Province, (County, City))

In other words, this is a two-element ROLLUP, which is equivalent to the following GROUPING SETS clause:

   GROUP BY GROUPING SETS((Province, County, City)
                          (Province)
                          () )
grand-total
Both ROLLUP and CUBE return a row which is the overall aggregation (grand total). This can be separately specified with empty parentheses within the GROUPING SETS clause. It can also be specified directly in the GROUP BY clause, although there is no effect on the result of the query.

Combined groupings

The different types of GROUP BY clause can be combined. When simple grouping-expression expressions are combined with other groups, those expressions are prepended to the resulting grouping sets. When ROLLUP or CUBE expressions are combined, those expressions form additional grouping set entries on the remaining expression, according to the definition of either a ROLLUP expression or a CUBE expression.

Table 1. Combined grouping-expression and equivalent GROUPING SETS clauses
GROUP BY clause with combined grouping-expression Equivalent GROUP BY clause with GROUPING SETS clause
  GROUP BY a, ROLLUP(b,c)
  GROUP BY GROUPING SETS((a,b,c)
                         (a,b)
                         (a) )
  GROUP BY a, b, ROLLUP(c,d)
  GROUP BY GROUPING SETS((a,b,c,d)
                         (a,b,c)
                         (a,b) )
  GROUP BY ROLLUP(a), ROLLUP(b,c)
  GROUP BY GROUPING SETS((a,b,c)
                         (a,b)
                         (a)
                         (b,c)
                         (b)
                         () )
  GROUP BY ROLLUP(a), CUBE(b,c)
  GROUP BY GROUPING SETS((a,b,c)
                         (a,b)
                         (a,c)
                         (a)
                         (b,c)
                         (b)
                         (c)
                         () )
  GROUP BY CUBE(a,b), ROLLUP(c,d)
  GROUP BY GROUPING SETS((a,b,c,d)
                         (a,b,c)
                         (a,b)
                         (a,c,d)
                         (a,c)
                         (a)
                         (b,c,d)
                         (b,c)
                         (b)
                         (c,d)
                         (c)
                         () )
  GROUP BY a, ROLLUP(a,b)
  GROUP BY GROUPING SETS((a,b)
                         (a) )
1
Note:
  1. Combined grouping sets also eliminate duplicates within each grouping set.

A more complete example of combining grouping sets is to construct a result set that eliminates certain rows that might be returned for a full CUBE aggregation. For example, consider the following GROUP BY clause:

  GROUP BY Region,
        ROLLUP(Sales_Person, WEEK(Sales_Date)),
        CUBE(YEAR(Sales_Date), MONTH (Sales_Date))

The column listed immediately to the right of GROUP BY is grouped, the columns within the parenthesis following ROLLUP are rolled up, and the columns within the parenthesis following CUBE are cubed. Thus, the GROUP BY clause results in a cube of MONTH within YEAR which is then rolled up within WEEK within Sales_Person within the Region aggregation. This GROUP BY does not result in any grand total row or any cross-tabulation rows on Region, Sales_Person or WEEK(Sales_Date), so it produces fewer rows than the following clause:

  GROUP BY ROLLUP (Region, Sales_Person, WEEK(Sales_Date),
                   YEAR(Sales_Date), MONTH(Sales_Date))