group-by-clause
The GROUP BY clause specifies an intermediate result table that consists of a grouping of the rows of R. R is the result of the previous clause of the subselect.
In its simplest form, a GROUP BY clause contains a grouping expression. A grouping expression is an expression used in defining the grouping of R. Each expression or column name included in grouping-expression must unambiguously identify a column of R (SQLSTATE 42702 or 42703). A grouping expression cannot include a scalar fullselect or an XMLQUERY or XMLEXISTS expression (SQLSTATE 42822), or any expression or function that is not deterministic or has an external action (SQLSTATE 42845).
- ROW CHANGE TIMESTAMP FOR table-designator
- ROW CHANGE TOKEN FOR table-designator
- RID_BIT or RID scalar function
More complex forms of the GROUP BY clause include grouping-sets and super-groups. For a description of grouping-sets, see grouping-sets. For a description of super-groups, see super-groups.
The result of GROUP BY is a set of groups of rows. Each row in this result represents the set of rows for which the grouping-expression is equal. For grouping, all null values from a grouping-expression are considered equal.
If a grouping-expression contains decimal floating-point columns, and multiple representations of the same number exist in these columns, the number that is returned can be any of the representations of the number.
A grouping-expression can be used in a search condition in a HAVING clause, in an expression in a SELECT clause, or in a sort-key-expression of an ORDER BY clause (see order-by-clause for details). In each case, the reference specifies only one value for each group. For example, if the grouping-expression is col1+col2, then an allowed expression in the SELECT list is col1+col2+3. Associativity rules for expressions disallow the similar expression, 3+col1+col2, unless parentheses are used to ensure that the corresponding expression is evaluated in the same order. Thus, 3+(col1+col2) is also allowed in the SELECT list. If the concatenation operator is used, the grouping-expression must be used exactly as the expression was specified in the SELECT list.
If the grouping-expression contains varying-length strings with trailing blanks, the values in the group can differ in the number of trailing blanks and might not all have the same length. In that case, a reference to the grouping-expression still specifies only one value for each group. However, the value for a group is chosen arbitrarily from the available set of values or a normalized form that might or might not be from the available set of values. Thus, the actual length of the result value is unpredictable.
As noted, there are some cases where the GROUP BY clause cannot refer directly to a column that is specified in the SELECT clause as an expression (scalar-fullselect, not deterministic or external action functions). To group using such an expression, use a nested table expression or a common table expression to first provide a result table with the expression as a column of the result. For an example using nested table expressions, see Example 9 in Examples of subselect queries.
grouping-sets
A grouping-sets specification can be used to specify multiple grouping clauses in a single statement. This can be thought of as the union of two or more groups of rows into a single result set. It is logically equivalent to the union of multiple subselects with the group by clause in each subselect corresponding to one grouping set. A grouping set can be a single element or can be a list of elements delimited by parentheses, where an element is either a grouping-expression or a super-group. The groups can be computed with a single pass over the base table using grouping-sets.
A simple grouping-expression or the more complex forms of super-groups are supported by the grouping-sets specification. For a description of super-groups, see super-groups.
GROUP BY a
is
the same as GROUP BY GROUPING SETS((a))
and
GROUP BY a,b,c
is the same as GROUP BY GROUPING SETS((a,b,c))
Non-aggregation columns from the SELECT list of the subselect that are excluded from a grouping set will return a null for such columns for each row generated for that grouping set. This reflects the fact that aggregation was done without considering the values for those columns.
The use of grouping sets is illustrated in Example 2 through Example 7 in Examples of grouping sets, cube, and rollup queries.
super-groups
- 1 Alternate specification when used alone in group-by-clause is: grouping-expression-list WITH ROLLUP.
- 2 Alternate specification when used alone in group-by-clause is: grouping-expression-list WITH CUBE.
- ROLLUP ( grouping-expression-list )
- A ROLLUP grouping is an extension to the GROUP BY clause
that produces a result set containing sub-total rows in addition
to the
regular
grouped rows. Sub-total rows aresuper-aggregate
rows that contain further aggregates whose values are derived by applying the same aggregate functions that were used to obtain the grouped rows. These rows are called sub-total rows, because that is their most common use; however, any aggregate function can be used for the aggregation. For instance, MAX and AVG are used in Example 8 in Examples of grouping sets, cube, and rollup queries. The GROUPING aggregate function can be used to indicate if a row was generated by the super-group.A ROLLUP grouping is a series of grouping-sets. The general specification of a ROLLUP with n elements
is equivalent toGROUP BY ROLLUP(C1,C2,...,Cn-1,Cn)
GROUP BY GROUPING SETS((C1,C2,...,Cn-1,Cn) (C1,C2,...,Cn-1) ... (C1,C2) (C1) () )
Note that the n elements of the ROLLUP translate to n+1 grouping sets. Note also that the order in which the grouping-expressions is specified is significant for ROLLUP. For example, the following clause:
is equivalent to:GROUP BY ROLLUP(a,b)
Similarly, the following clause:GROUP BY GROUPING SETS((a,b) (a) () )
is equivalent to:GROUP BY ROLLUP(b,a)
GROUP BY GROUPING SETS((b,a) (b) () )
The ORDER BY clause is the only way to guarantee the order of the rows in the result set. Example 3 in Examples of grouping sets, cube, and rollup queries illustrates the use of ROLLUP.
- CUBE ( grouping-expression-list )
- A CUBE grouping is an extension to the GROUP BY clause
that produces a result set that contains all the rows of a ROLLUP
aggregation and, in addition, contains "cross-tabulation" rows. Cross-tabulation rows
are additional "super-aggregate" rows that are not part of an aggregation
with sub-totals. The GROUPING aggregate function can be used
to indicate if a row was generated by the super-group. Similar to a ROLLUP, a CUBE grouping can also be thought of as a series of grouping-sets. In the case of a CUBE, all permutations of the cubed grouping-expression-list are computed along with the grand total. Therefore, the n elements of a CUBE translate to 2**n (2 to the power n) grouping-sets. For example, a specification of:
GROUP BY CUBE(a,b,c)
is equivalent to:
GROUP BY GROUPING SETS((a,b,c) (a,b) (a,c) (b,c) (a) (b) (c) () )
Note that the three elements of the CUBE translate into eight grouping sets.
The order of specification of elements does not matter for CUBE. 'CUBE (DayOfYear, Sales_Person)' and 'CUBE (Sales_Person, DayOfYear)' yield the same result sets. The use of the word 'same' applies to content of the result set, not to its order. The ORDER BY clause is the only way to guarantee the order of the rows in the result set. The use of CUBE is illustrated in Example 4 in Examples of grouping sets, cube, and rollup queries.
grouping-expression-list
- A grouping-expression-list is used within
a CUBE or ROLLUP clause to define the number of elements in the CUBE
or ROLLUP operation. This is controlled by using parentheses to delimit
elements with multiple grouping-expressions.
For example, suppose that a query is to return the total expenses for the ROLLUP of City within a Province but not within a County. However, the clause:
GROUP BY ROLLUP(Province, County, City)
results in unwanted subtotal rows for the County. In the clause:
GROUP BY ROLLUP(Province, (County, City))
the composite (County, City) forms one element in the ROLLUP and, therefore, a query that uses this clause will yield the required result. In other words, the two-element ROLLUP:
GROUP BY ROLLUP(Province, (County, City))
generates:
GROUP BY GROUPING SETS((Province, County, City) (Province) () )
and the three-element ROLLUP generates:
GROUP BY GROUPING SETS((Province, County, City) (Province, County) (Province) () )
Example 2 in Examples of grouping sets, cube, and rollup queries also utilizes composite column values.
- grand-total
- Both CUBE and ROLLUP return a row which is the overall (grand total) aggregation. This can be separately specified with empty parentheses within the GROUPING SET clause. It can also be specified directly in the GROUP BY clause, although there is no effect on the result of the query. Example 4 in Examples of grouping sets, cube, and rollup queries uses the grand-total syntax.
Combining grouping sets
This can be used to combine any of the types of GROUP BY clauses. When simple grouping-expression fields are combined with other groups, they are "appended" to the beginning of the resulting grouping sets. When ROLLUP or CUBE expressions are combined, they operate similarly to "multipliers" on the remaining expression, forming additional grouping set entries according to the definition of either ROLLUP or CUBE.
GROUP BY a, ROLLUP(b,c)
is
equivalent to GROUP BY GROUPING SETS((a,b,c)
(a,b)
(a) )
Or similarly, GROUP BY a, b, ROLLUP(c,d)
is
equivalent to GROUP BY GROUPING SETS((a,b,c,d)
(a,b,c)
(a,b) )
Combining of ROLLUP elements
acts as follows: GROUP BY ROLLUP(a), ROLLUP(b,c)
is equivalent to
GROUP BY GROUPING SETS((a,b,c)
(a,b)
(a)
(b,c)
(b)
() )
Similarly,
GROUP BY ROLLUP(a), CUBE(b,c)
is equivalent to
GROUP BY GROUPING SETS((a,b,c)
(a,b)
(a,c)
(a)
(b,c)
(b)
(c)
() )
Combining of CUBE and ROLLUP elements acts as follows:
GROUP BY CUBE(a,b), ROLLUP(c,d)
is equivalent to
GROUP BY GROUPING SETS((a,b,c,d)
(a,b,c)
(a,b)
(a,c,d)
(a,c)
(a)
(b,c,d)
(b,c)
(b)
(c,d)
(c)
() )
GROUP BY a, ROLLUP(a,b)
is
equivalent to GROUP BY GROUPING SETS((a,b)
(a) )
A more complete example of combining grouping sets is to construct a result set that eliminates certain rows that might be returned for a full CUBE aggregation.
GROUP BY Region,
ROLLUP(Sales_Person, WEEK(Sales_Date)),
CUBE(YEAR(Sales_Date), MONTH (Sales_Date))
The column listed immediately to the right of GROUP BY is grouped, those within the parenthesis following ROLLUP are rolled up, and those within the parenthesis following CUBE are cubed. Thus, the GROUP BY clause results in a cube of MONTH within YEAR which is then rolled up within WEEK within Sales_Person within the Region aggregation. It does not result in any grand total row or any cross-tabulation rows on Region, Sales_Person or WEEK(Sales_Date) so produces fewer rows than the clause:
GROUP BY ROLLUP (Region, Sales_Person, WEEK(Sales_Date),
YEAR(Sales_Date), MONTH(Sales_Date) )