group-by-clause

The GROUP BY clause specifies an intermediate result table that consists of a grouping of the rows of R. R is the result of the previous clause of the subselect.

Read syntax diagramSkip visual syntax diagramGROUP BY ,grouping-expressiongrouping-setssuper-groups

In its simplest form, a GROUP BY clause contains a grouping expression. A grouping expression is an expression used in defining the grouping of R. Each expression or column name included in grouping-expression must unambiguously identify a column of R (SQLSTATE 42702 or 42703). A grouping expression cannot include a scalar fullselect or an XMLQUERY or XMLEXISTS expression (SQLSTATE 42822), or any expression or function that is not deterministic or has an external action (SQLSTATE 42845).

Note: The following expressions, which do not contain an explicit column reference, can be used in a grouping-expression to identify a column of R:
  • ROW CHANGE TIMESTAMP FOR table-designator
  • ROW CHANGE TOKEN FOR table-designator
  • RID_BIT or RID scalar function

More complex forms of the GROUP BY clause include grouping-sets and super-groups. For a description of grouping-sets, see grouping-sets. For a description of super-groups, see super-groups.

The result of GROUP BY is a set of groups of rows. Each row in this result represents the set of rows for which the grouping-expression is equal. For grouping, all null values from a grouping-expression are considered equal.

If a grouping-expression contains decimal floating-point columns, and multiple representations of the same number exist in these columns, the number that is returned can be any of the representations of the number.

A grouping-expression can be used in a search condition in a HAVING clause, in an expression in a SELECT clause, or in a sort-key-expression of an ORDER BY clause (see order-by-clause for details). In each case, the reference specifies only one value for each group. For example, if the grouping-expression is col1+col2, then an allowed expression in the SELECT list is col1+col2+3. Associativity rules for expressions disallow the similar expression, 3+col1+col2, unless parentheses are used to ensure that the corresponding expression is evaluated in the same order. Thus, 3+(col1+col2) is also allowed in the SELECT list. If the concatenation operator is used, the grouping-expression must be used exactly as the expression was specified in the SELECT list.

If the grouping-expression contains varying-length strings with trailing blanks, the values in the group can differ in the number of trailing blanks and might not all have the same length. In that case, a reference to the grouping-expression still specifies only one value for each group. However, the value for a group is chosen arbitrarily from the available set of values or a normalized form that might or might not be from the available set of values. Thus, the actual length of the result value is unpredictable.

As noted, there are some cases where the GROUP BY clause cannot refer directly to a column that is specified in the SELECT clause as an expression (scalar-fullselect, not deterministic or external action functions). To group using such an expression, use a nested table expression or a common table expression to first provide a result table with the expression as a column of the result. For an example using nested table expressions, see Example 9 in Examples of subselect queries.

grouping-sets

Read syntax diagramSkip visual syntax diagramGROUPING SETS( ,grouping-expressionsuper-groups(,grouping-expressionsuper-groups) )

A grouping-sets specification can be used to specify multiple grouping clauses in a single statement. This can be thought of as the union of two or more groups of rows into a single result set. It is logically equivalent to the union of multiple subselects with the group by clause in each subselect corresponding to one grouping set. A grouping set can be a single element or can be a list of elements delimited by parentheses, where an element is either a grouping-expression or a super-group. The groups can be computed with a single pass over the base table using grouping-sets.

A simple grouping-expression or the more complex forms of super-groups are supported by the grouping-sets specification. For a description of super-groups, see super-groups.

Note that grouping sets are the fundamental building blocks for GROUP BY operations. A simple GROUP BY with a single column can be considered a grouping set with one element. For example:
   GROUP BY a
is the same as
   GROUP BY GROUPING SETS((a))
and
   GROUP BY a,b,c
is the same as
   GROUP BY GROUPING SETS((a,b,c))

Non-aggregation columns from the SELECT list of the subselect that are excluded from a grouping set will return a null for such columns for each row generated for that grouping set. This reflects the fact that aggregation was done without considering the values for those columns.

The use of grouping sets is illustrated in Example 2 through Example 7 in Examples of grouping sets, cube, and rollup queries.

super-groups

Read syntax diagramSkip visual syntax diagramROLLUP(grouping-expression-list)1CUBE(grouping-expression-list)2grand-total
grouping-expression-list
Read syntax diagramSkip visual syntax diagram,grouping-expression(,grouping-expression)
grand-total
Read syntax diagramSkip visual syntax diagram()
Notes:
  • 1 Alternate specification when used alone in group-by-clause is: grouping-expression-list WITH ROLLUP.
  • 2 Alternate specification when used alone in group-by-clause is: grouping-expression-list WITH CUBE.
ROLLUP ( grouping-expression-list )
A ROLLUP grouping is an extension to the GROUP BY clause that produces a result set containing sub-total rows in addition to the regular grouped rows. Sub-total rows are super-aggregate rows that contain further aggregates whose values are derived by applying the same aggregate functions that were used to obtain the grouped rows. These rows are called sub-total rows, because that is their most common use; however, any aggregate function can be used for the aggregation. For instance, MAX and AVG are used in Example 8 in Examples of grouping sets, cube, and rollup queries. The GROUPING aggregate function can be used to indicate if a row was generated by the super-group.
A ROLLUP grouping is a series of grouping-sets. The general specification of a ROLLUP with n elements
  GROUP BY ROLLUP(C1,C2,...,Cn-1,Cn)
is equivalent to
  GROUP BY GROUPING SETS((C1,C2,...,Cn-1,Cn)
                         (C1,C2,...,Cn-1)
                         ...
                         (C1,C2)
                         (C1)
                         () )
Note that the n elements of the ROLLUP translate to n+1 grouping sets. Note also that the order in which the grouping-expressions is specified is significant for ROLLUP. For example, the following clause:
  GROUP BY ROLLUP(a,b)
is equivalent to:
  GROUP BY GROUPING SETS((a,b)
                         (a)
                         () )
Similarly, the following clause:
  GROUP BY ROLLUP(b,a)
is equivalent to:
  GROUP BY GROUPING SETS((b,a)
                         (b)
                         () )

The ORDER BY clause is the only way to guarantee the order of the rows in the result set. Example 3 in Examples of grouping sets, cube, and rollup queries illustrates the use of ROLLUP.

CUBE ( grouping-expression-list )
A CUBE grouping is an extension to the GROUP BY clause that produces a result set that contains all the rows of a ROLLUP aggregation and, in addition, contains "cross-tabulation" rows. Cross-tabulation rows are additional "super-aggregate" rows that are not part of an aggregation with sub-totals. The GROUPING aggregate function can be used to indicate if a row was generated by the super-group.
Similar to a ROLLUP, a CUBE grouping can also be thought of as a series of grouping-sets. In the case of a CUBE, all permutations of the cubed grouping-expression-list are computed along with the grand total. Therefore, the n elements of a CUBE translate to 2**n (2 to the power n) grouping-sets. For example, a specification of:
   GROUP BY CUBE(a,b,c)

is equivalent to:

   GROUP BY GROUPING SETS((a,b,c)
                          (a,b)
                          (a,c)
                          (b,c)
                          (a)
                          (b)
                          (c)
                          () )

Note that the three elements of the CUBE translate into eight grouping sets.

The order of specification of elements does not matter for CUBE. 'CUBE (DayOfYear, Sales_Person)' and 'CUBE (Sales_Person, DayOfYear)' yield the same result sets. The use of the word 'same' applies to content of the result set, not to its order. The ORDER BY clause is the only way to guarantee the order of the rows in the result set. The use of CUBE is illustrated in Example 4 in Examples of grouping sets, cube, and rollup queries.

grouping-expression-list
A grouping-expression-list is used within a CUBE or ROLLUP clause to define the number of elements in the CUBE or ROLLUP operation. This is controlled by using parentheses to delimit elements with multiple grouping-expressions.

For example, suppose that a query is to return the total expenses for the ROLLUP of City within a Province but not within a County. However, the clause:

   GROUP BY ROLLUP(Province, County, City)

results in unwanted subtotal rows for the County. In the clause:

   GROUP BY ROLLUP(Province, (County, City))

the composite (County, City) forms one element in the ROLLUP and, therefore, a query that uses this clause will yield the required result. In other words, the two-element ROLLUP:

   GROUP BY ROLLUP(Province, (County, City))

generates:

   GROUP BY GROUPING SETS((Province, County, City)
                          (Province)
                          () )

and the three-element ROLLUP generates:

   GROUP BY GROUPING SETS((Province, County, City)
                          (Province, County)
                          (Province)
                          () )

Example 2 in Examples of grouping sets, cube, and rollup queries also utilizes composite column values.

grand-total
Both CUBE and ROLLUP return a row which is the overall (grand total) aggregation. This can be separately specified with empty parentheses within the GROUPING SET clause. It can also be specified directly in the GROUP BY clause, although there is no effect on the result of the query. Example 4 in Examples of grouping sets, cube, and rollup queries uses the grand-total syntax.

Combining grouping sets

This can be used to combine any of the types of GROUP BY clauses. When simple grouping-expression fields are combined with other groups, they are "appended" to the beginning of the resulting grouping sets. When ROLLUP or CUBE expressions are combined, they operate similarly to "multipliers" on the remaining expression, forming additional grouping set entries according to the definition of either ROLLUP or CUBE.

For instance, combining grouping-expression elements acts as follows:
  GROUP BY a, ROLLUP(b,c)
is equivalent to
  GROUP BY GROUPING SETS((a,b,c)
                         (a,b)
                         (a) )
Or similarly,
  GROUP BY a, b, ROLLUP(c,d)
is equivalent to
  GROUP BY GROUPING SETS((a,b,c,d)
                         (a,b,c)
                         (a,b) )
Combining of ROLLUP elements acts as follows:
  GROUP BY ROLLUP(a), ROLLUP(b,c)

is equivalent to

  GROUP BY GROUPING SETS((a,b,c)
                         (a,b)
                         (a)
                         (b,c)
                         (b)
                         () )

Similarly,

  GROUP BY ROLLUP(a), CUBE(b,c)

is equivalent to

  GROUP BY GROUPING SETS((a,b,c)
                         (a,b)
                         (a,c)
                         (a)
                         (b,c)
                         (b)
                         (c)
                         () )

Combining of CUBE and ROLLUP elements acts as follows:

  GROUP BY CUBE(a,b), ROLLUP(c,d)

is equivalent to

  GROUP BY GROUPING SETS((a,b,c,d)
                         (a,b,c)
                         (a,b)
                         (a,c,d)
                         (a,c)
                         (a)
                         (b,c,d)
                         (b,c)
                         (b)
                         (c,d)
                         (c)
                         () )
Similar to a simple grouping-expression, combining grouping sets also eliminates duplicates within each grouping set. For instance,
  GROUP BY a, ROLLUP(a,b)
is equivalent to
  GROUP BY GROUPING SETS((a,b)
                         (a) )

A more complete example of combining grouping sets is to construct a result set that eliminates certain rows that might be returned for a full CUBE aggregation.

For example, consider the following GROUP BY clause:
  GROUP BY Region,
           ROLLUP(Sales_Person, WEEK(Sales_Date)),
           CUBE(YEAR(Sales_Date), MONTH (Sales_Date))

The column listed immediately to the right of GROUP BY is grouped, those within the parenthesis following ROLLUP are rolled up, and those within the parenthesis following CUBE are cubed. Thus, the GROUP BY clause results in a cube of MONTH within YEAR which is then rolled up within WEEK within Sales_Person within the Region aggregation. It does not result in any grand total row or any cross-tabulation rows on Region, Sales_Person or WEEK(Sales_Date) so produces fewer rows than the clause:

  GROUP BY ROLLUP (Region, Sales_Person, WEEK(Sales_Date),
                   YEAR(Sales_Date), MONTH(Sales_Date) )