DB2 Version 9.7 for Linux, UNIX, and Windows

subselect


>>-select-clause--from-clause--+--------------+----------------->
                               '-where-clause-'   

>--+-----------------+--+---------------+----------------------->
   '-group-by-clause-'  '-having-clause-'   

>--+-----------------+--+--------------------+------------------>
   '-order-by-clause-'  '-fetch-first-clause-'   

>--+------------------+----------------------------------------><
   '-isolation-clause-'

The subselect is a component of the fullselect.

A subselect specifies a result table derived from the tables, views or nicknames identified in the FROM clause. The derivation can be described as a sequence of operations in which the result of each operation is input for the next. (This is only a way of describing the subselect. The method used to perform the derivation can be quite different from this description. If portions of the subselect do not actually need to be executed for the correct result to be obtained, they might or might not be executed.)

The authorization for a subselect is described in the Authorization section in "SQL queries".

The clauses of the subselect are processed in the following sequence:

FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
FETCH FIRST clause

A subselect that contains an ORDER BY or FETCH FIRST clause cannot be specified:

In the outermost fullselect of a view.
In the outer fullselect of a materialized query table.
Unless the subselect is enclosed in parenthesis.

For example, the following is not valid (SQLSTATE 428FJ):

SELECT * FROM T1
   ORDER BY C1
UNION
SELECT * FROM T2
   ORDER BY C1

The following example is valid:

(SELECT * FROM T1
   ORDER BY C1)
UNION
(SELECT * FROM T2
   ORDER BY C1)

Note: An ORDER BY clause in a subselect does not affect the order of the rows returned by a query. An ORDER BY clause only affects the order of the rows returned if it is specified in the outermost fullselect.

select-clause


           .-ALL------.   
>>-SELECT--+----------+----------------------------------------->
           '-DISTINCT-'   

>--+-*-----------------------------------------------+---------><
   | .-,-------------------------------------------. |   
   | V                                             | |   
   '---+-expression--+-------------------------+-+-+-'   
       |             | .-AS-.                  | |       
       |             '-+----+--new-column-name-' |       
       '-exposed-name.*--------------------------'

The SELECT clause specifies the columns of the final result table, R. The column values are produced by the application of the select list to R. The select list is the names or expressions specified in the SELECT clause, and R is the result of the previous operation of the subselect. For example, if the only clauses specified are SELECT, FROM, and WHERE, R is the result of that WHERE clause.

ALL

Retains all rows of the final result table, and does not eliminate redundant duplicates. This is the default.

DISTINCT

Eliminates all but one of each set of duplicate rows of the final result table. If DISTINCT is used, no string column of the result table can be a LOB type, distinct type based on LOB, or structured type. DISTINCT may be used more than once in a subselect. This includes SELECT DISTINCT, the use of DISTINCT in an aggregate function of the select list or HAVING clause, and subqueries of the subselect.

Two rows are duplicates of one another only if each value in the first is equal to the corresponding value in the second. For determining duplicates, two null values are considered equal, and two different decimal floating-point representations of the same number are considered equal. For example, -0 is equal to +0 and 2.0 is equal to 2.00. Each of the decimal floating-point special values are also considered equal: -NAN equals -NAN, -SNAN equals -SNAN, -INFINITY equals -INFINITY, INFINITY equals INFINITY, SNAN equals SNAN, and NAN equals NAN.

When the data type of a column is decimal floating-point, and multiple representations of the same number exist in the column, the particular value that is returned for a SELECT DISTINCT can be any one of the representations in the column. For more information, see Numeric comparisons.

For compatibility with other SQL implementations, UNIQUE can be specified as a synonym for DISTINCT.

Select list notation

*

Represents a list of names that identify the columns of table R, excluding any columns defined as IMPLICITLY HIDDEN. The first name in the list identifies the first column of R, the second name identifies the second column of R, and so on.

The list of names is established when the program containing the SELECT clause is bound. Hence * (the asterisk) does not identify any columns that have been added to a table after the statement containing the table reference has been bound.

expression

Specifies the values of a result column. Can be any expression that is a valid SQL language element, but commonly includes column names. Each column name used in the select list must unambiguously identify a column of R. The result type of the expression cannot be a row type (SQLSTATE 428H2).

new-column-name or AS new-column-name

Names or renames the result column. The name must not be qualified and does not have to be unique. Subsequent usage of column-name is limited as follows:

A new-column-name specified in the AS clause can be used in the order-by-clause, provided the name is unique.
A new-column-name specified in the AS clause of the select list cannot be used in any other clause within the subselect (where-clause, group-by-clause or having-clause).
A new-column-name specified in the AS clause cannot be used in the update-clause.
A new-column-name specified in the AS clause is known outside the fullselect of nested table expressions, common table expressions and CREATE VIEW.

exposed-name.*

Represents the list of names that identify the columns of the result table identified by exposed-name, excluding any columns defined as IMPLICITLY HIDDEN. The exposed-name may be a table name, view name, nickname, or correlation name, and must designate a table, view or nickname named in the FROM clause. The first name in the list identifies the first column of the table, view or nickname, the second name in the list identifies the second column of the table, view or nickname, and so on.

The list of names is established when the statement containing the SELECT clause is bound. Therefore, * does not identify any columns that have been added to a table after the statement has been bound.

The number of columns in the result of SELECT is the same as the number of expressions in the operational form of the select list (that is, the list established when the statement is prepared), and cannot exceed 500 for a 4K page size or 1012 for an 8K, 16K, or 32K page size.

Limitations on string columns

For limitations on the select list, see "Restrictions Using Varying-Length Character Strings".

Applying the select list

Some of the results of applying the select list to R depend on whether or not GROUP BY or HAVING is used. The results are described in two separate lists.

If GROUP BY or HAVING is used

An expression X (not an aggregate function) used in the select list must have a GROUP BY clause with:
- a grouping-expression in which each expression or column-name unambiguously identifies a column of R (see group-by-clause) or
- each column of R referenced in X as a separate grouping-expression.
The select list is applied to each group of R, and the result contains as many rows as there are groups in R. When the select list is applied to a group of R, that group is the source of the arguments of the aggregate functions in the select list.

If neither GROUP BY nor HAVING is used

Either the select list must not include any aggregate functions, or each column-name in the select list must be specified within an aggregate function or must be a correlated column reference.
If the select does not include aggregate functions, then the select list is applied to each row of R and the result contains as many rows as there are rows in R.
If the select list is a list of aggregate functions, then R is the source of the arguments of the functions and the result of applying the select list is one row.

In either case the nth column of the result contains the values specified by applying the nth expression in the operational form of the select list.

Null attributes of result columns

Result columns do not allow null values if they are derived from:

A column that does not allow null values
A constant
The COUNT or COUNT_BIG function
A host variable that does not have an indicator variable
A scalar function or expression that does not include an operand that allows nulls

Result columns allow null values if they are derived from:

Any aggregate function except COUNT or COUNT_BIG
A column that allows null values
A scalar function or expression that includes an operand that allows nulls
A NULLIF function with arguments containing equal values
A host variable that has an indicator variable, an SQL parameter, an SQL variable, or a global variable
A result of a set operation if at least one of the corresponding items in the select list is nullable
An arithmetic expression or view column that is derived from an arithmetic expression and the database is configured with dft_sqlmathwarn set to Yes
A scalar subselect
A dereference operation
A GROUPING SETS grouping-expression

Names of result columns

If the AS clause is specified, the name of the result column is the name specified on the AS clause.
If the AS clause is not specified and a column list is specified in the correlation clause, the name of the result column is the corresponding name in the correlation column list.
If neither an AS clause nor a column list in the correlation clause is specified and the result column is derived only from a single column (without any functions or operators), then the result column name is the unqualified name of that column.
If neither an AS clause nor a column list in the correlation clause is specified and the result column is derived only from a single SQL variable or SQL parameter (without any functions or operators), then the result column name is the unqualified name of that SQL variable or SQL parameter.
If neither an AS clause nor a column list in the correlation clause is specified and the result column is derived using a dereference operation, then the result column name is the unqualified name of the target column of the dereference operation.
All other result column names are unnamed. The system assigns temporary numbers (as character strings) to these columns.

Data types of result columns

Each column of the result of SELECT acquires a data type from the expression from which it is derived.

When the expression is ...	The data type of the result column is ...
the name of any numeric column	the same as the data type of the column, with the same precision and scale for DECIMAL columns, or the same precision for DECFLOAT columns.
a constant	the same as the data type of the constant.
the name of any numeric variable	the same as the data type of the variable, with the same precision and scale for DECIMAL variables, or the same precision for DECFLOAT variables.
the name of any string column	the same as the data type of the column, with the same length attribute.
the name of any string variable	the same as the data type of the variable, with the same length attribute; if the data type of the variable is not identical to an SQL data type (for example, a NUL-terminated string in C), the result column is a varying-length string.
the name of a datetime column	the same as the data type of the column.
the name of a user-defined type column	the same as the data type of the column.
the name of a reference type column	the same as the data type of the column.

from-clause


         .-,---------------.   
         V                 |   
>>-FROM----table-reference-+-----------------------------------><

The FROM clause specifies an intermediate result table.

If only one table-reference is specified, the intermediate result table is simply the result of that table-reference. If more than one table-reference is specified, the intermediate result table consists of all possible combinations of the rows of the specified table-reference (the Cartesian product). Each row of the result is a row from the first table-reference concatenated with a row from the second table-reference, concatenated in turn with a row from the third, and so on. The number of rows in the result is the product of the number of rows in all the individual table references. For a description of table-reference, see table-reference.

table-reference


>>-+-table-name--+------------------------+--+------------------------+---------------+-><
   |             '-| correlation-clause |-'  '-| tablesample-clause |-'               |   
   +-+-+-nickname--+-------------------+--+------------------------+------------------+   
   | | '-view-name-'                   |  '-| correlation-clause |-'                  |   
   | '-+-ONLY--+--(--+-table-name-+--)-'                                              |   
   |   '-OUTER-'     '-view-name--'                                                   |   
   +-TABLE--(--function-name--(--+----------------+--)--)--+------------------------+-+   
   |                             | .-,----------. |        '-| correlation-clause |-' |   
   |                             | V            | |                                   |   
   |                             '---expression-+-'                                   |   
   |                     (1)                                                          |   
   +-xmltable-expression------+------------------------+------------------------------+   
   |                          '-| correlation-clause |-'                              |   
   +-| analyze_table-expression |--+------------------------+-------------------------+   
   |                               '-| correlation-clause |-'                         |   
   +-| nested-table-expression |--+------------------------+--------------------------+   
   |                              '-| correlation-clause |-'                          |   
   +-| data-change-table-reference |--+------------------------+----------------------+   
   |                                  '-| correlation-clause |-'                      |   
   +-| collection-derived-table |--+------------------------+-------------------------+   
   |                               '-| correlation-clause |-'                         |   
   '-joined-table---------------------------------------------------------------------'   

correlation-clause

   .-AS-.                                                
|--+----+--correlation-name--+-----------------------+----------|
                             |    .-,-----------.    |   
                             |    V             |    |   
                             '-(----column-name-+--)-'   

tablesample-clause

|--TABLESAMPLE--+-BERNOULLI-+--(--numeric-expression1--)-------->
                '-SYSTEM----'                              

>--+---------------------------------------+--------------------|
   '-REPEATABLE--(--numeric-expression2--)-'   

analyze_table-expression

|--+-table-name-+----------------------------------------------->
   '-view-name--'   

                 (2)                                    
>--ANALYZE_TABLE------(--| implementation-clause |--)-----------|

implementation-clause

|--IMPLEMENTATION--'--string--'---------------------------------|

nested-table-expression

|--+-----------------------------------------------+--(fullselect)--|
   |         (3)                                   |                 
   '-LATERAL------+------------------------------+-'                 
                  '-| continue-handler |--WITHIN-'                   

continue-handler

                      .-,----------------------------.   
                      V                              |   
|--RETURN DATA UNTIL----| specific-condition-value |-+----------|

specific-condition-value

                        .-VALUE-.                                                       
|--FEDERATED--SQLSTATE--+-------+--string-constant--+-------------------------------+--|
                                                    |          .-,----------------. |   
                                                    |          V                  | |   
                                                    '-SQLCODE----integer-constant-+-'   

data-change-table-reference

|--+-+-FINAL-+--TABLE--(--insert-statement--)----------+--------|
   | '-NEW---'                                         |   
   +-+-FINAL-+--TABLE--(--searched-update-statement--)-+   
   | +-NEW---+                                         |   
   | '-OLD---'                                         |   
   '-OLD TABLE--(--searched-delete-statement--)--------'   

collection-derived-table

|--UNNEST-table-function--+---------------------+---------------|
                          |                 (4) |   
                          '-WITH ORDINALITY-----'

Notes:

An XMLTABLE expression can be part of a table-reference. In this case, subexpressions within the XMLTABLE expression are in-scope of prior range variables in the FROM clause. For more information, see the description of "XMLTABLE".
An ANALYZE_TABLE expression can be part of a table-reference (see Analyze table expressions).
TABLE can be specified in place of LATERAL.
WITH ORDINALITY can be specified only if the argument to the UNNEST table function is one or more ordinary array variables or functions with ordinary array return types; an associative array variable or function with an associative array return type cannot be specified (SQLSTATE 428HT).

Each table-name, view-name or nickname specified as a table-reference must identify an existing table, view or nickname at the application server or the table-name of a common table expression defined preceding the fullselect containing the table-reference. If the table-name references a typed table, the name denotes the UNION ALL of the table with all its subtables, with only the columns of the table-name. Similarly, if the view-name references a typed view, the name denotes the UNION ALL of the view with all its subviews, with only the columns of the view-name.

The use of ONLY(table-name) or ONLY(view-name) means that the rows of the proper subtables or subviews are not included. If the table-name used with ONLY does not have subtables, then ONLY(table-name) is equivalent to specifying table-name. If the view-name used with ONLY does not have subviews, then ONLY(view-name) is equivalent to specifying view-name.

The use of OUTER(table-name) or OUTER(view-name) represents a virtual table. If the table-name or view-name used with OUTER does not have subtables or subviews, then specifying OUTER is equivalent to not specifying OUTER. OUTER(table-name) is derived from table-name as follows:

The columns include the columns of table-name followed by the additional columns introduced by each of its subtables (if any). The additional columns are added on the right, traversing the subtable hierarchy in depth-first order. Subtables that have a common parent are traversed in creation order of their types.
The rows include all the rows of table-name and all the rows of its subtables. Null values are returned for columns that are not in the subtable for the row.

The previous points also apply to OUTER(view-name), substituting view-name for table-name and subview for subtable.

The use of ONLY or OUTER requires the SELECT privilege on every subtable of table-name or subview of view-name.

Each function-name together with the types of its arguments, specified as a table reference must resolve to an existing table function at the application server.

A fullselect in parentheses is called a nested table expression.

A joined-table specifies an intermediate result set that is the result of one or more join operations. For more information, see joined-table.

The exposed names of all table references should be unique. An exposed name is:

A correlation-name
A table-name that is not followed by a correlation-name
A view-name that is not followed by a correlation-name
A nickname that is not followed by a correlation-name
An alias-name that is not followed by a correlation-name

If a correlation-clause does not follow a function-name reference, xmltable-expression, nested table expression, or data-change-table-reference, there is no exposed name for that table reference.

Each correlation-name is defined as a designator of the immediately preceding table-name, view-name, nickname, function-name reference, xmltable-expression, nested table expression, or data-change-table-reference. Any qualified reference to a column must use the exposed name. If the same table name, view or nickname name is specified twice, at least one specification should be followed by a correlation-name. The correlation-name is used to qualify references to the columns of the table, view or nickname. When a correlation-name is specified, column-names can also be specified to give names to the columns of the table reference. If the correlation-clause does not include column-names, the exposed column names are determined as follows:.

Column names of the referenced table, view, or nickname when the table-reference is a table-name, view-name, nickname, or alias-name
Column names specified in the RETURNS clause of the CREATE FUNCTION statement when the table-reference is a function-name reference
Column names specified in the COLUMNS clause of the xmltable-expression when the table-reference is an xmltable-expression
Column names exposed by the fullselect when the table-reference is a nested-table-expression
Column names from the target table of the data change statement, along with any defined INCLUDE columns when the table-reference is a data-change-table-reference

In general, collection-derived tables, table functions, and nested table expressions can be specified on any from-clause. Columns from the table functions, nested table expressions, or collection-derived tables can be referenced in the select list and in the rest of the subselect using the correlation name. The scope of this correlation name is the same as correlation names for other tables, views, or nicknames in the FROM clause. A nested table expression can be used:

In place of a view to avoid creating the view (when general use of the view is not required)
When the desired result table is based on host variables

A collection-derived-table can be used to convert the elements of an array into values of a column in separate rows. If WITH ORDINALITY is specified, an extra column of data type INTEGER is appended. This column contains the position of the element in the array. The columns can be referenced in the select list and the rest of the subselect by using the names specified for the columns in the correlation-clause. The collection-derived-table clause can only be used in a context where arrays are supported (SQLSTATE 42887). See the "UNNEST table function" for details.

An expression in the select list of a nested table expression that is referenced within, or is the target of, a data change statement within a fullselect is only valid when it does not include:

A function that reads or modifies SQL data
A function that is non-deterministic
A function that has external action
An OLAP function

If a view is referenced directly in, or as the target of a nested table expression in a data change statement within a FROM clause, the view must either be symmetric (have WITH CHECK OPTION specified) or satisfy the restriction for a WITH CHECK OPTION view.

If the target of a data change statement within a FROM clause is a nested table expression, the modified rows are not requalified, WHERE clause predicates are not re-evaluated, and ORDER BY or FETCH FIRST operations are not redone.

The optional tablesample-clause can be used to obtain a random subset (a sample) of the rows from the specified table-name, rather than the entire contents of that table-name, for this query. This sampling is in addition to any predicates that are specified in the where-clause. Unless the optional REPEATABLE clause is specified, each execution of the query will usually yield a different sample, except in degenerate cases where the table is so small relative to the sample size that any sample must return the same rows. The size of the sample is controlled by the numeric-expression1 in parentheses, representing an approximate percentage (P) of the table to be returned. The method by which the sample is obtained is specified after the TABLESAMPLE keyword, and can be either BERNOULLI or SYSTEM. For both methods, the exact number of rows in the sample may be different for each execution of the query, but on average should be approximately P percent of the table, before any predicates further reduce the number of rows.

The table-name must be a stored table. It can be a materialized query table (MQT) name, but not a subselect or table expression for which an MQT has been defined, because there is no guarantee that the database manager will route to the MQT for that subselect.

Semantically, sampling of a table occurs before any other query processing, such as applying predicates or performing joins. Repeated accesses of a sampled table within a single execution of a query (such as in a nested-loop join or a correlated subquery) will return the same sample. More than one table may be sampled in a query.

BERNOULLI sampling considers each row individually. It includes each row in the sample with probability P/100 (where P is the value of numeric-expression1), and excludes each row with probability 1 - P/100, independently of the other rows. So if the numeric-expression1 evaluated to the value 10, representing a ten percent sample, each row would be included with probability 0.1, and excluded with probability 0.9.

SYSTEM sampling permits the database manager to determine the most efficient manner in which to perform the sampling. In most cases, SYSTEM sampling applied to a table-name means that each page of table-name is included in the sample with probability P/100, and excluded with probability 1 - P/100. All rows on each page that is included qualify for the sample. SYSTEM sampling of a table-name generally executes much faster than BERNOULLI sampling, because fewer data pages need to be retrieved; however, it can often yield less accurate estimates for aggregate functions (SUM(SALES), for example), especially if the rows of table-name are clustered on any columns referenced in that query. The optimizer may in certain circumstances decide that it is more efficient to perform SYSTEM sampling as if it were BERNOULLI sampling, for example when a predicate on table-name can be applied by an index and is much more selective than the sampling rate P.

The numeric-expression1 specifies the size of the sample to be obtained from table-name, expressed as a percentage. It must be a constant numeric expression that cannot contain columns. The expression must evaluate to a positive number that is less than or equal to 100, but can be between 1 and 0. For example, a value of 0.01 represents one one-hundredth of a percent, meaning that 1 row in 10 000 would be sampled, on average. A numeric-expression1 that evaluates to 100 is handled as if the tablesample-clause were not specified. If numeric-expression1 evaluates to the null value, or to a value that is greater than 100 or less than 0, an error is returned (SQLSTATE 2202H).

It is sometimes desirable for sampling to be repeatable from one execution of the query to the next; for example, during regression testing or query "debugging". This can be accomplished by specifying the REPEATABLE clause. The REPEATABLE clause requires the specification of a numeric-expression2 in parentheses, which serves the same role as the seed in a random number generator. Adding the REPEATABLE clause to the tablesample-clause of any table-name ensures that repeated executions of that query (using the same value for numeric-expression2) return the same sample, assuming, of course, that the data itself has not been updated, reorganized, or repartitioned. To guarantee that the same sample of table-name is used across multiple queries, use of a global temporary table is recommended. Alternatively, the multiple queries could be combined into one query, with multiple references to a sample that is defined using the WITH clause.

Following are some examples:

Example 1: Request a 10% Bernoulli sample of the Sales table for auditing purposes.

  SELECT * FROM Sales
    TABLESAMPLE BERNOULLI(10)

Example 2: Compute the total sales revenue in the Northeast region for each product category, using a random 1% SYSTEM sample of the Sales table. The semantics of SUM are for the sample itself, so to extrapolate the sales to the entire Sales table, the query must divide that SUM by the sampling rate (0.01).

SELECT SUM(Sales.Revenue) / (0.01)
  FROM Sales TABLESAMPLE SYSTEM(1)
  WHERE Sales.RegionName = 'Northeast'
  GROUP BY Sales.ProductCategory

Example 3: Using the REPEATABLE clause, modify the previous query to ensure that the same (yet random) result is obtained each time the query is executed. (The value of the constant enclosed by parentheses is arbitrary.)

SELECT SUM(Sales.Revenue) / (0.01)
  FROM Sales TABLESAMPLE SYSTEM(1) REPEATABLE(3578231)
  WHERE Sales.RegionName = 'Northeast'
  GROUP BY Sales.ProductCategory

Table function references

In general, a table function, together with its argument values, can be referenced in the FROM clause of a SELECT in exactly the same way as a table or view. There are, however, some special considerations which apply.

Table Function Column Names
Unless alternate column names are provided following the correlation-name, the column names for the table function are those specified in the RETURNS clause of the CREATE FUNCTION statement. This is analogous to the names of the columns of a table, which are defined in the CREATE TABLE statement.
Table Function Resolution
The arguments specified in a table function reference, together with the function name, are used by an algorithm called function resolution to determine the exact function to be used. This is no different from what happens with other functions (such as scalar functions) that are used in a statement.

Table Function Arguments

As with scalar function arguments, table function arguments can in general be any valid SQL expression. The following examples are valid syntax:

  Example 1:  SELECT c1
              FROM TABLE( tf1('Zachary') ) AS z
              WHERE c2 = 'FLORIDA';

  Example 2:  SELECT c1
              FROM TABLE( tf2 (:hostvar1, CURRENT DATE) ) AS z;

  Example 3:  SELECT c1
              FROM t
              WHERE c2 IN
                       (SELECT c3 FROM
                        TABLE( tf5(t.c4) ) AS z  -- correlated reference
                       )                         -- to previous FROM clause

Table Functions That Modify SQL Data

Table functions that are specified with the MODIFIES SQL DATA option can only be used as the last table reference in a select-statement, common-table-expression, or RETURN statement that is a subselect, a SELECT INTO, or a row-fullselect in a SET statement. Only one table function is allowed in one FROM clause, and the table function arguments must be correlated to all other table references in the subselect (SQLSTATE 429BL). The following examples have valid syntax for a table function with the MODIFIES SQL DATA property:

  Example 1:  SELECT c1
              FROM TABLE( tfmod('Jones') ) AS z

  Example 2:  SELECT c1
              FROM t1, t2, TABLE( tfmod(t1.c1, t2.c1) ) AS z

  Example 3:  SET var =
              (SELECT c1
              FROM TABLE( tfmod('Jones') ) AS z

  Example 4:  RETURN SELECT c1
              FROM TABLE( tfmod('Jones') ) AS z

  Example 5:  WITH v1(c1) AS
              (SELECT c1
              FROM TABLE( tfmod(:hostvar1) ) AS z)
              SELECT c1 
              FROM v1, t1 WHERE v1.c1 = t1.c1

Error tolerant nested-table-expression

Certain errors that occur within a nested-table-expression can be tolerated, and instead of returning an error, the query can continue and return a result.

Specifying the RETURN DATA UNTIL clause will cause any rows that are returned from the fullselect before the indicated condition is encountered to make up the result set from the fullselect. This means that a partial result set (which could also be an empty result set) from the fullselect is acceptable as the result for the nested-table-expression.

The FEDERATED keyword restricts the condition to handle only errors that occur at a remote data source.

The condition can be specified as an SQLSTATE value, with a string-constant length of 5. You can optionally specify an SQLCODE value for each specified SQLSTATE value. For portable applications, specify SQLSTATE values as much as possible, because SQLCODE values are generally not portable across platforms and are not part of the SQL standard.

Only certain conditions can be tolerated. Errors that do not allow the rest of the query to be executed cannot be tolerated, and an error is returned for the whole query. The specific-condition-value might specify conditions that cannot actually be tolerated by the database manager, even if a specific SQLSTATE or SQLCODE value is specified, and for these cases, an error is returned.

The following SQLSTATE values and SQLCODE values have the potential, when specified, to be tolerated by the database manager:

SQLSTATE 08001; SQLCODEs -1336, -30080, -30081, -30082
SQLSTATE 08004
SQLSTATE 42501
SQLSTATE 42704; SQLCODE -204
SQLSTATE 42720
SQLSTATE 28000

A query or view containing an error tolerant nested-table-expression is read-only.

The fullselect of an error tolerant nested-table-expression is not optimized using materialized query tables.

Correlated references in table-references

Correlated references can be used in nested table expressions or as arguments to table functions. The basic rule that applies for both these cases is that the correlated reference must be from a table-reference at a higher level in the hierarchy of subqueries. This hierarchy includes the table-references that have already been resolved in the left-to-right processing of the FROM clause. For nested table expressions, the LATERAL keyword must appear before the fullselect. So the following examples are valid syntax:

  Example 1:  SELECT t.c1, z.c5
              FROM t, TABLE( tf3(t.c2) ) AS z      -- t precedes tf3
              WHERE t.c3 = z.c4;                   -- in FROM, so t.c2
                                                   -- is known

  Example 2:  SELECT t.c1, z.c5
              FROM t, TABLE( tf4(2 * t.c2) ) AS z  -- t precedes tf4
              WHERE t.c3 = z.c4;                   -- in FROM, so t.c2
                                                   -- is known

  Example 3:  SELECT d.deptno, d.deptname,
                     empinfo.avgsal, empinfo.empcount
              FROM department d,
                   LATERAL (SELECT AVG(e.salary) AS avgsal,
                                 COUNT(*) AS empcount
                          FROM employee e           -- department precedes
                          WHERE e.workdept=d.deptno -- and TABLE is
                         ) AS empinfo;              -- specified, so
                                                    -- d.deptno is known

But the following examples are not valid:

  Example 4:  SELECT t.c1, z.c5
              FROM TABLE( tf6(t.c2) ) AS z, t  -- cannot resolve t in t.c2!
              WHERE t.c3 = z.c4;               -- compare to Example 1 above.

  Example 5:  SELECT a.c1, b.c5
              FROM TABLE( tf7a(b.c2) ) AS a, TABLE( tf7b(a.c6) ) AS b
              WHERE a.c3 = b.c4;               -- cannot resolve b in b.c2!

  Example 6:  SELECT d.deptno, d.deptname,
                     empinfo.avgsal, empinfo.empcount
              FROM department d,
                   (SELECT AVG(e.salary) AS avgsal,
                                 COUNT(*) AS empcount
                          FROM employee e           -- department precedes
                          WHERE e.workdept=d.deptno -- but TABLE is not
                         ) AS empinfo;              -- specified, so
                                                    -- d.deptno is unknown

Data change table references

A data-change-table-reference clause specifies an intermediate result table. This table is based on the rows that are directly changed by the searched UPDATE, searched DELETE, or INSERT statement that is included in the clause. A data-change-table-reference can be specified as the only table-reference in the FROM clause of the outer fullselect that is used in a select-statement, a SELECT INTO statement, or a common table expression. A data-change-table-reference can be specified as the only table reference in the only fullselect in a SET Variable statement (SQLSTATE 428FL). The target table or view of the data change statement is considered to be a table or view that is referenced in the query; therefore, the authorization ID of the query must have SELECT privilege on that target table or view. A data-change-table-reference clause cannot be specified in a view definition, materialized query table definition, or FOR statement (SQLSTATE 428FL).

The target of the UPDATE, DELETE, or INSERT statement cannot be a temporary view defined in a common table expression (SQLSTATE 42807) or a nickname (SQLSTATE 25000).

Expressions in the select list of a view or fullselect as target of a data change statement in a table-reference can only be selected if OLD TABLE is specified or the expression does not include the following elements (SQLSTATE 428G6):

A subquery
A function that reads or modifies sql data
A function is that is non-deterministic or has an external action
An OLAP function
A NEXT VALUE FOR sequence reference

FINAL TABLE: Specifies that the rows of the intermediate result table represent the set of rows that are changed by the SQL data change statement as they appear at the completion of the data change statement. If there are AFTER triggers or referential constraints that result in further operations on the table that is the target of the SQL data change statement, an error is returned (SQLSTATE 560C6). If the target of the SQL data change statement is a view that is defined with an INSTEAD OF trigger for the type of data change, an error is returned (SQLSTATE 428G3).
NEW TABLE: Specifies that the rows of the intermediate result table represent the set of rows that are changed by the SQL data change statement prior to the application of referential constraints and AFTER triggers. Data in the target table at the completion of the statement might not match the data in the intermediate result table because of additional processing for referential constraints and AFTER triggers.
OLD TABLE: Specifies that the rows of the intermediate result table represent the set of rows that are changed by the SQL data change statement as they existed prior to the application of the data change statement.
(searched-update-statement): Specifies a searched UPDATE statement. A WHERE clause or a SET clause in the UPDATE statement cannot contain correlated references to columns outside of the UPDATE statement.
(searched-delete-statement): Specifies a searched DELETE statement. A WHERE clause in the DELETE statement cannot contain correlated references to columns outside of the DELETE statement.
(insert-statement): Specifies an INSERT statement. A fullselect in the INSERT statement cannot contain correlated references to columns outside of the fullselect of the INSERT statement.

The content of the intermediate result table for a data-change-table-reference is determined when the cursor opens. The intermediate result table contains all manipulated rows, including all the columns in the specified target table or view. All the columns of the target table or view for an SQL data change statement are accessible using the column names from the target table or view. If an INCLUDE clause was specified within a data change statement, the intermediate result table will contain these additional columns.

Analyze table expressions

table-name or view-name

The table-name or view-name must identify an existing table or view, or the table-name of a common table expression defined preceding the fullselect containing the table-reference. A nickname can be specified. However, in-database analytics are intended for local data, and retrieving the data for a nickname from another data source does not take advantage of the intended performance benefits.

ANALYZE_TABLE

Returns the result of executing a specific data mining model using a valid in-database analytics provider, a named model implementation, and specified input data. A query referencing ANALYZE_TABLE cannot be a static SQL statement. The statement cannot be a data definition language (DDL) statement. Input or output values cannot be of type CHAR FOR BIT DATA, VARCHAR FOR BIT DATA, BLOB, CLOB, DBCLOB, NCLOB, XML, or DB2SECURITYLABEL.

IMPLEMENTATION 'string'

Specifies how the expression is to be evaluated. The string parameter is a string constant whose maximum length is 1024 bytes. The specified value is used to establish a session with an in-database analytic provider. When SAS is specified as the provider, you must specify values for the following case insensitive parameters.

PROVIDER: Currently, the only supported provider value is SAS.
ROUTINE_SOURCE_TABLE: Specifies a user table containing the DS2 code (and optionally, any required format or metadata) to implement the algorithm that is specified by ROUTINE_SOURCE_NAME. DS2 is a procedural language processor for SAS, designed for data modeling, stored procedures, and data extraction, transformation, and load (ETL) processing.
ROUTINE_SOURCE_TABLE has a defined structure (see ANALYZE_TABLE examples) and, in a partitioned database environment, must reside on the catalog database partition. The table cannot be a global temporary table. The MODELDS2 column for a given row must not be empty or contain the null value; if the value of the MODELFORMATS or MODELMETADATA column is not null, the value must have length greater than 0. If no table schema name is specified, CURRENT SCHEMA is assumed.
ROUTINE_SOURCE_NAME: Specifies the name of the algorithm to perform.

For example:

IMPLEMENTATION
  'PROVIDER=SAS;
  ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE;
  ROUTINE_SOURCE_NAME=SCORING_FUN1;'

If the table name, schema name, or algorithm name contains lowercase or mixed-case letters, specify delimited identifiers. For example:

IMPLEMENTATION
  'PROVIDER=SAS;
  ROUTINE_SOURCE_TABLE="ETLin"."Source_Table";
  ROUTINE_SOURCE_NAME="Scoring_Fun1";'

ANALYZE_TABLE examples

The following examples show you how to use the ANALYZE_TABLE expression.

SAS tooling will help you to define a table to store model implementations for scoring functions. A row in this table stores an algorithm written in DS2, with any required SAS format information and metadata. The MODELNAME column serves as the primary key; for a given value of the ROUTINE_SOURCE_NAME parameter, at most one row will be retrieved from the table that is specified by the ROUTINE_SOURCE_TABLE parameter. For example:

   CREATE TABLE ETLIN.SOURCE_TABLE (
     MODELNAME VARCHAR(128) NOT NULL PRIMARY KEY,
     MODELDS2 BLOB(4M) NOT NULL,
     MODELFORMATS BLOB(4M),
     MODELMETADATA BLOB(4M)
   );

MODELNAME contains the name of the algorithm. MODELDS2 contains the DS2 source code that implements the algorithm. MODELFORMATS contains the aggregated SAS format definition that is required by the algorithm. If the algorithm does not require a SAS format, this column contains the null value. MODELMETADATA contains any additional metadata that is required by the algorithm. If the algorithm does not require any additional metadata, this column contains the null value. Note that if this table is created by the SAS EP installer, it might include additional columns.

Use the data in columns C1 and C2 in table T1 as input data with the scoring model SCORING_FUN1, whose implementation is stored in ETLIN.SOURCE_TABLE:

   WITH sas_score_in (c1,c2) AS
     (SELECT c1,c2 FROM t1)
     SELECT *
       FROM sas_score_in ANALYZE_TABLE(
         IMPLEMENTATION
           'PROVIDER=SAS;
            ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE;
            ROUTINE_SOURCE_NAME=SCORING_FUN1;');

Use all the data in the table T2 with the scoring model SCORING_FUN2, whose implementation is stored in ETLIN.SOURCE_TABLE:

   SELECT *
     FROM t2 ANALYZE_TABLE(
       IMPLEMENTATION
         'PROVIDER=SAS;
          ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE;
          ROUTINE_SOURCE_NAME=SCORING_FUN2;');

Use all of the data in view V1 with the scoring model SCORING_FUN3, whose implementation is stored in ETLIN.SOURCE_TABLE, and return the output in ascending order of the first output column:

   SELECT *
     FROM v1 ANALYZE_TABLE(
       IMPLEMENTATION
         'PROVIDER=SAS;
          ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE;
          ROUTINE_SOURCE_NAME=SCORING_FUN3;')
     ORDER BY 1;

joined-table


                      .-INNER-----.                                                
>>-+-table-reference--+-----------+--JOIN--table-reference--ON--join-condition-+-><
   |                  '-| outer |-'                                            |   
   +-table-reference--CROSS JOIN--table-reference------------------------------+   
   '-(--joined-table--)--------------------------------------------------------'   

outer

              .-OUTER-.   
|--+-LEFT--+--+-------+-----------------------------------------|
   +-RIGHT-+              
   '-FULL--'

A joined table specifies an intermediate result table that is the result of either an inner join or an outer join. The table is derived by applying one of the join operators: CROSS, INNER, LEFT OUTER, RIGHT OUTER, or FULL OUTER to its operands.

Cross joins represent the cross product of the tables, where each row of the left table is combined with every row of the right table. Inner joins can be thought of as the cross product of the tables, keeping only the rows where the join condition is true. The result table might be missing rows from either or both of the joined tables. Outer joins include the inner join and preserve these missing rows. There are three types of outer joins:

Left outer join includes rows from the left table that were missing from the inner join.
Right outer join includes rows from the right table that were missing from the inner join.
Full outer join includes rows from both the left and right tables that were missing from the inner join.

If a join-operator is not specified, INNER is implicit. The order in which multiple joins are performed can affect the result. Joins can be nested within other joins. The order of processing for joins is generally from left to right, but based on the position of the required join-condition. Parentheses are recommended to make the order of nested joins more readable. For example:

   TB1 LEFT JOIN TB2 ON TB1.C1=TB2.C1
     RIGHT JOIN TB3 LEFT JOIN TB4 ON TB3.C1=TB4.C1
       ON TB1.C1=TB3.C1

is the same as:

   (TB1 LEFT JOIN TB2 ON TB1.C1=TB2.C1)
     RIGHT JOIN (TB3 LEFT JOIN TB4 ON TB3.C1=TB4.C1)
       ON TB1.C1=TB3.C1

A joined table can be used in any context in which any form of the SELECT statement is used. A view or a cursor is read-only if its SELECT statement includes a joined table.

A join-condition is a search-condition, except that:

It cannot contain any subqueries, scalar or otherwise
It cannot include any dereference operations or the DEREF function, where the reference value is other than the object identifier column
It cannot include an SQL function
Any column referenced in an expression of the join-condition must be a column of one of the operand tables of the associated join (in the scope of the same joined-table clause)
Any function referenced in an expression of the join-condition of a full outer join must be deterministic and have no external action
It cannot include an XMLQUERY or XMLEXISTS expression

An error occurs if the join condition does not comply with these rules (SQLSTATE 42972).

Column references are resolved using the rules for resolution of column name qualifiers. The same rules that apply to predicates apply to join conditions.

Join operations

A join-condition specifies pairings of T1 and T2, where T1 and T2 are the left and right operand tables of the JOIN operator of the join-condition. For all possible combinations of rows of T1 and T2, a row of T1 is paired with a row of T2 if the join-condition is true. When a row of T1 is joined with a row of T2, a row in the result consists of the values of that row of T1 concatenated with the values of that row of T2. The execution might involve the generation of a null row. The null row of a table consists of a null value for each column of the table, regardless of whether the columns allow null values.

The following summarizes the result of the join operations:

The result of T1 CROSS JOIN T2 consists of all possible pairings of their rows.
The result of T1 INNER JOIN T2 consists of their paired rows where the join-condition is true.
The result of T1 LEFT OUTER JOIN T2 consists of their paired rows where the join-condition is true and, for each unpaired row of T1, the concatenation of that row with the null row of T2. All columns derived from T2 allow null values.
The result of T1 RIGHT OUTER JOIN T2 consists of their paired rows where the join-condition is true and, for each unpaired row of T2, the concatenation of that row with the null row of T1. All columns derived from T1 allow null values.
The result of T1 FULL OUTER JOIN T2 consists of their paired rows and, for each unpaired row of T2, the concatenation of that row with the null row of T1 and, for each unpaired row of T1, the concatenation of that row with the null row of T2. All columns derived from T1 and T2 allow null values.

where-clause


>>-WHERE--search-condition-------------------------------------><

The WHERE clause specifies an intermediate result table that consists of those rows of R for which the search-condition is true. R is the result of the FROM clause of the subselect.

The search-condition must conform to the following rules:

Each column-name must unambiguously identify a column of R or be a correlated reference. A column-name is a correlated reference if it identifies a column of a table-reference in an outer subselect.
An aggregate function must not be specified unless the WHERE clause is specified in a subquery of a HAVING clause and the argument of the function is a correlated reference to a group.

Any subquery in the search-condition is effectively executed for each row of R, and the results are used in the application of the search-condition to the given row of R. A subquery is actually executed for each row of R only if it includes a correlated reference. In fact, a subquery with no correlated references may be executed just once, whereas a subquery with a correlated reference may have to be executed once for each row.

group-by-clause


             .-,-----------------------.   
             V                         |   
>>-GROUP BY----+-grouping-expression-+-+-----------------------><
               +-grouping-sets-------+     
               '-super-groups--------'

The GROUP BY clause specifies an intermediate result table that consists of a grouping of the rows of R. R is the result of the previous clause of the subselect.

In its simplest form, a GROUP BY clause contains a grouping expression. A grouping expression is an expression used in defining the grouping of R. Each expression or column name included in grouping-expression must unambiguously identify a column of R (SQLSTATE 42702 or 42703). A grouping expression cannot include a scalar fullselect or an XMLQUERY or XMLEXISTS expression (SQLSTATE 42822), or any expression or function that is not deterministic or has an external action (SQLSTATE 42845).

Note: The following expressions, which do not contain an explicit column reference, can be used in a grouping-expression to identify a column of R:

ROW CHANGE TIMESTAMP FOR table-designator
ROW CHANGE TOKEN FOR table-designator
RID_BIT or RID scalar function

More complex forms of the GROUP BY clause include grouping-sets and super-groups. For a description of these forms, see grouping-sets and super-groups, respectively.

The result of GROUP BY is a set of groups of rows. Each row in this result represents the set of rows for which the grouping-expression is equal. For grouping, all null values from a grouping-expression are considered equal.

If a grouping-expression contains decimal floating-point columns, and multiple representations of the same number exist in these columns, the number that is returned can be any of the representations of the number.

A grouping-expression can be used in a search condition in a HAVING clause, in an expression in a SELECT clause or in a sort-key-expression of an ORDER BY clause (see order-by-clause for details). In each case, the reference specifies only one value for each group. For example, if the grouping-expression is col1+col2, then an allowed expression in the select list would be col1+col2+3. Associativity rules for expressions would disallow the similar expression, 3+col1+col2, unless parentheses are used to ensure that the corresponding expression is evaluated in the same order. Thus, 3+(col1+col2) would also be allowed in the select list. If the concatenation operator is used, the grouping-expression must be used exactly as the expression was specified in the select list.

If the grouping-expression contains varying-length strings with trailing blanks, the values in the group can differ in the number of trailing blanks and may not all have the same length. In that case, a reference to the grouping-expression still specifies only one value for each group, but the value for a group is chosen arbitrarily from the available set of values. Thus, the actual length of the result value is unpredictable.

As noted, there are some cases where the GROUP BY clause cannot refer directly to a column that is specified in the SELECT clause as an expression (scalar-fullselect, not deterministic or external action functions). To group using such an expression, use a nested table expression or a common table expression to first provide a result table with the expression as a column of the result. For an example using nested table expressions, see Example A9.

grouping-sets


                     .-,-------------------------------------.      
                     V                                       |      
>>-GROUPING SETS--(----+-+-grouping-expression-+-----------+-+--)-><
                       | '-super-groups--------'           |        
                       |    .-,-----------------------.    |        
                       |    V                         |    |        
                       '-(----+-grouping-expression-+-+--)-'        
                              '-super-groups--------'

A grouping-sets specification allows multiple grouping clauses to be specified in a single statement. This can be thought of as the union of two or more groups of rows into a single result set. It is logically equivalent to the union of multiple subselects with the group by clause in each subselect corresponding to one grouping set. A grouping set can be a single element or can be a list of elements delimited by parentheses, where an element is either a grouping-expression or a super-group. Using grouping-sets allows the groups to be computed with a single pass over the base table.

The grouping-sets specification allows either a simple grouping-expression to be used, or the more complex forms of super-groups. For a description of super-groups, see super-groups.

Note that grouping sets are the fundamental building blocks for GROUP BY operations. A simple GROUP BY with a single column can be considered a grouping set with one element. For example:

   GROUP BY a

is the same as

   GROUP BY GROUPING SETS((a))

and

   GROUP BY a,b,c

is the same as

   GROUP BY GROUPING SETS((a,b,c))

Non-aggregation columns from the select list of the subselect that are excluded from a grouping set will return a null for such columns for each row generated for that grouping set. This reflects the fact that aggregation was done without considering the values for those columns.

Example C2 through Example C7 illustrate the use of grouping sets.

super-groups


                                            (1)     
>>-+-ROLLUP--(--grouping-expression-list--)-----+--------------><
   |                                      (2)   |   
   +-CUBE--(--grouping-expression-list--)-------+   
   '-| grand-total |----------------------------'   

grouping-expression-list

   .-,---------------------------------.   
   V                                   |   
|----+-grouping-expression-----------+-+------------------------|
     |    .-,-------------------.    |     
     |    V                     |    |     
     '-(----grouping-expression-+--)-'     

grand-total

|--(--)---------------------------------------------------------|

Notes:

Alternate specification when used alone in group-by-clause is: grouping-expression-list WITH ROLLUP.
Alternate specification when used alone in group-by-clause is: grouping-expression-list WITH CUBE.

ROLLUP ( grouping-expression-list )

A ROLLUP grouping is an extension to the GROUP BY clause that produces a result set containing sub-total rows in addition to the "regular" grouped rows. Sub-total rows are "super-aggregate" rows that contain further aggregates whose values are derived by applying the same aggregate functions that were used to obtain the grouped rows. These rows are called sub-total rows, because that is their most common use; however, any aggregate function can be used for the aggregation. For instance, MAX and AVG are used in Example C8. The GROUPING aggregate function can be used to indicate if a row was generated by the super-group.

A ROLLUP grouping is a series of grouping-sets. The general specification of a ROLLUP with n elements

  GROUP BY ROLLUP(C₁,C₂,...,C_n-1,C_n)

is equivalent to

  GROUP BY GROUPING SETS((C₁,C₂,...,C_n-1,C_n)
                         (C₁,C₂,...,C_n-1)
                         ...
                         (C₁,C₂)
                         (C₁)
                         () )

Note that the n elements of the ROLLUP translate to n+1 grouping sets. Note also that the order in which the grouping-expressions is specified is significant for ROLLUP. For example:

  GROUP BY ROLLUP(a,b)

is equivalent to

  GROUP BY GROUPING SETS((a,b)
                         (a)
                         () )

while

  GROUP BY ROLLUP(b,a)

is the same as

  GROUP BY GROUPING SETS((b,a)
                         (b)
                         () )

The ORDER BY clause is the only way to guarantee the order of the rows in the result set. Example C3 illustrates the use of ROLLUP.

CUBE ( grouping-expression-list )

A CUBE grouping is an extension to the GROUP BY clause that produces a result set that contains all the rows of a ROLLUP aggregation and, in addition, contains "cross-tabulation" rows. Cross-tabulation rows are additional "super-aggregate" rows that are not part of an aggregation with sub-totals. The GROUPING aggregate function can be used to indicate if a row was generated by the super-group.

Like a ROLLUP, a CUBE grouping can also be thought of as a series of grouping-sets. In the case of a CUBE, all permutations of the cubed grouping-expression-list are computed along with the grand total. Therefore, the n elements of a CUBE translate to 2**n (2 to the power n) grouping-sets. For example, a specification of:

   GROUP BY CUBE(a,b,c)

is equivalent to:

   GROUP BY GROUPING SETS((a,b,c)
                          (a,b)
                          (a,c)
                          (b,c)
                          (a)
                          (b)
                          (c)
                          () )

Note that the three elements of the CUBE translate into eight grouping sets.

The order of specification of elements does not matter for CUBE. 'CUBE (DayOfYear, Sales_Person)' and 'CUBE (Sales_Person, DayOfYear)' yield the same result sets. The use of the word 'same' applies to content of the result set, not to its order. The ORDER BY clause is the only way to guarantee the order of the rows in the result set. Example C4 illustrates the use of CUBE.

grouping-expression-list

A grouping-expression-list is used within a CUBE or ROLLUP clause to define the number of elements in the CUBE or ROLLUP operation. This is controlled by using parentheses to delimit elements with multiple grouping-expressions.

The rules for a grouping-expression are described in group-by-clause. For example, suppose that a query is to return the total expenses for the ROLLUP of City within a Province but not within a County. However, the clause:

   GROUP BY ROLLUP(Province, County, City)

results in unwanted subtotal rows for the County. In the clause:

   GROUP BY ROLLUP(Province, (County, City))

the composite (County, City) forms one element in the ROLLUP and, therefore, a query that uses this clause will yield the desired result. In other words, the two-element ROLLUP:

   GROUP BY ROLLUP(Province, (County, City))

generates:

   GROUP BY GROUPING SETS((Province, County, City)
                          (Province)
                          () )

and the three-element ROLLUP generates:

   GROUP BY GROUPING SETS((Province, County, City)
                          (Province, County)
                          (Province)
                          () )

Example C2 also utilizes composite column values.

grand-total: Both CUBE and ROLLUP return a row which is the overall (grand total) aggregation. This may be separately specified with empty parentheses within the GROUPING SET clause. It may also be specified directly in the GROUP BY clause, although there is no effect on the result of the query. Example C4 uses the grand-total syntax.

Combining grouping sets

This can be used to combine any of the types of GROUP BY clauses. When simple grouping-expression fields are combined with other groups, they are "appended" to the beginning of the resulting grouping sets. When ROLLUP or CUBE expressions are combined, they operate like "multipliers" on the remaining expression, forming additional grouping set entries according to the definition of either ROLLUP or CUBE.

For instance, combining grouping-expression elements acts as follows:

  GROUP BY a, ROLLUP(b,c)

is equivalent to

  GROUP BY GROUPING SETS((a,b,c)
                         (a,b)
                         (a) )

Or similarly,

  GROUP BY a, b, ROLLUP(c,d)

is equivalent to

  GROUP BY GROUPING SETS((a,b,c,d)
                         (a,b,c)
                         (a,b) )

Combining of ROLLUP elements acts as follows:

  GROUP BY ROLLUP(a), ROLLUP(b,c)

is equivalent to

  GROUP BY GROUPING SETS((a,b,c)
                         (a,b)
                         (a)
                         (b,c)
                         (b)
                         () )

Similarly,

  GROUP BY ROLLUP(a), CUBE(b,c)

is equivalent to

  GROUP BY GROUPING SETS((a,b,c)
                         (a,b)
                         (a,c)
                         (a)
                         (b,c)
                         (b)
                         (c)
                         () )

Combining of CUBE and ROLLUP elements acts as follows:

  GROUP BY CUBE(a,b), ROLLUP(c,d)

is equivalent to

  GROUP BY GROUPING SETS((a,b,c,d)
                         (a,b,c)
                         (a,b)
                         (a,c,d)
                         (a,c)
                         (a)
                         (b,c,d)
                         (b,c)
                         (b)
                         (c,d)
                         (c)
                         () )

Like a simple grouping-expression, combining grouping sets also eliminates duplicates within each grouping set. For instance,

  GROUP BY a, ROLLUP(a,b)

is equivalent to

  GROUP BY GROUPING SETS((a,b)
                         (a) )

A more complete example of combining grouping sets is to construct a result set that eliminates certain rows that would be returned for a full CUBE aggregation.

For example, consider the following GROUP BY clause:

  GROUP BY Region,
           ROLLUP(Sales_Person, WEEK(Sales_Date)),
           CUBE(YEAR(Sales_Date), MONTH (Sales_Date))

The column listed immediately to the right of GROUP BY is simply grouped, those within the parenthesis following ROLLUP are rolled up, and those within the parenthesis following CUBE are cubed. Thus, the above clause results in a cube of MONTH within YEAR which is then rolled up within WEEK within Sales_Person within the Region aggregation. It does not result in any grand total row or any cross-tabulation rows on Region, Sales_Person or WEEK(Sales_Date) so produces fewer rows than the clause:

  GROUP BY ROLLUP (Region, Sales_Person, WEEK(Sales_Date),
                   YEAR(Sales_Date), MONTH(Sales_Date) )

having-clause


>>-HAVING--search-condition------------------------------------><

The HAVING clause specifies an intermediate result table that consists of those groups of R for which the search-condition is true. R is the result of the previous clause of the subselect. If this clause is not GROUP BY, R is considered to be a single group with no grouping columns.

Each column-name in the search condition must do one of the following:

Unambiguously identify a grouping column of R.
Be specified within an aggregate function.
Be a correlated reference. A column-name is a correlated reference if it identifies a column of a table-reference in an outer subselect.

A group of R to which the search condition is applied supplies the argument for each aggregate function in the search condition, except for any function whose argument is a correlated reference.

If the search condition contains a subquery, the subquery can be thought of as being executed each time the search condition is applied to a group of R, and the results used in applying the search condition. In actuality, the subquery is executed for each group only if it contains a correlated reference. For an illustration of the difference, see Example A6 and Example A7.

A correlated reference to a group of R must either identify a grouping column or be contained within an aggregate function.

When HAVING is used without GROUP BY, the select list can only include column names when they are arguments to an aggregate function, correlated column references, global variables, host variables, literals, special registers, SQL variables, or SQL parameters.

Note: The following expressions can only be specified in a HAVING clause if they are contained within an aggregate function (SQLSTATE 42803):

ROW CHANGE TIMESTAMP FOR table-designator
ROW CHANGE TOKEN FOR table-designator
RID_BIT or RID scalar function

order-by-clause


>>-ORDER BY----------------------------------------------------->

     .-,-------------------------------------------.     
     |                        .-NULLS LAST-.       |     
     V                 .-ASC--+------------+---.   |     
>--+---+-| sort-key |--+-----------------------+-+-+-+---------><
   |   |               |       .-NULLS FIRST-. | |   |   
   |   |               '-DESC--+-------------+-' |   |   
   |   '-ORDER OF--table-designator--------------'   |   
   '-INPUT SEQUENCE----------------------------------'   

sort-key

|--+-simple-column-name--+--------------------------------------|
   +-simple-integer------+   
   '-sort-key-expression-'

The ORDER BY clause specifies an ordering of the rows of the result table. If a single sort specification (one sort-key with associated direction) is identified, the rows are ordered by the values of that sort specification. If more than one sort specification is identified, the rows are ordered by the values of the first identified sort specification, then by the values of the second identified sort specification, and so on. Each sort-key cannot have a data type of CLOB, DBCLOB, BLOB, XML, distinct type on any of these types, or structured type (SQLSTATE 42907).

A named column in the select list may be identified by a sort-key that is a simple-integer or a simple-column-name. An unnamed column in the select list must be identified by an simple-integer or, in some cases, by a sort-key-expression that matches the expression in the select list (see details of sort-key-expression). A column is unnamed if the AS clause is not specified and it is derived from a constant, an expression with operators, or a function.

Ordering is performed in accordance with comparison rules. If an ORDER BY clause contains decimal floating-point columns, and multiple representations of the same number exist in these columns, the ordering of the multiple representations of the same number is unspecified. The null value is higher than all other values. If the ORDER BY clause does not completely order the rows, rows with duplicate values of all identified columns are displayed in an arbitrary order.

simple-column-name

Usually identifies a column of the result table. In this case, simple-column-name must be the column name of a named column in the select list.

The simple-column-name can also identify a column name of a table, view, or nested table identified in the FROM clause if the query is a subselect. This includes columns defined as implicitly hidden. An error occurs if the subselect:

Specifies DISTINCT in the select-clause (SQLSTATE 42822)
Produces a grouped result and the simple-column-name is not a grouping-expression (SQLSTATE 42803).

Determining which column is used for ordering the result is described under "Column names in sort keys" below.

simple-integer

Must be greater than 0 and not greater than the number of columns in the result table (SQLSTATE 42805). The integer n identifies the nth column of the result table.

sort-key-expression

An expression that is not simply a column name or an unsigned integer constant. The query to which ordering is applied must be a subselect to use this form of sort-key. The sort-key-expression cannot include a correlated scalar fullselect (SQLSTATE 42703) or a function with an external action (SQLSTATE 42845).

Any column-name within a sort-key-expression must conform to the rules described under "Column names in sort keys" below.

There are a number of special cases that further restrict the expressions that can be specified.

DISTINCT is specified in the SELECT clause of the subselect (SQLSTATE 42822).
The sort-key-expression must match exactly with an expression in the select list of the subselect (scalar-fullselects are never matched).
The subselect is grouped (SQLSTATE 42803).
The sort-key-expression can:
- be an expression in the select list of the subselect,
- include a grouping-expression from the GROUP BY clause of the subselect
- include an aggregate function, constant or host variable.

ASC

Uses the values of the column in ascending order. This is the default.

DESC

Uses the values of the column in descending order.

ORDER OF table-designator

Specifies that the same ordering used in table-designator should be applied to the result table of the subselect. There must be a table reference matching table-designator in the FROM clause of the subselect that specifies this clause (SQLSTATE 42703). The subselect (or fullselect) corresponding to the specified table-designator must include an ORDER BY clause that is dependant on the data (SQLSTATE 428FI). The ordering that is applied is the same as if the columns of the ORDER BY clause in the nested subselect (or fullselect) were included in the outer subselect (or fullselect), and these columns were specified in place of the ORDER OF clause.

Note that this form is not allowed in a fullselect (other than the degenerative form of a fullselect). For example, the following is not valid:

(SELECT C1 FROM T1
   ORDER BY C1)
UNION
SELECT C1 FROM T2
   ORDER BY ORDER OF T1

The following example is valid:

SELECT C1 FROM
   (SELECT C1 FROM T1
      UNION
    SELECT C1 FROM T2
    ORDER BY C1 ) AS UTABLE
ORDER BY ORDER OF UTABLE

INPUT SEQUENCE

Specifies that, for an INSERT statement, the result table will reflect the input order of ordered data rows. INPUT SEQUENCE ordering can only be specified if an INSERT statement is used in a FROM clause (SQLSTATE 428G4). See table-reference. If INPUT SEQUENCE is specified and the input data is not ordered, the INPUT SEQUENCE clause is ignored.

Notes

Column names in sort keys:
- The column name is qualified.
  The query must be a subselect (SQLSTATE 42877). The column name must unambiguously identify a column of some table, view or nested table in the FROM clause of the subselect (SQLSTATE 42702). The value of the column is used to compute the value of the sort specification.
- The column name is unqualified.
  - The query is a subselect.
    If the column name is identical to the name of more than one column of the result table, the column name must unambiguously identify a column of some table, view or nested table in the FROM clause of the ordering subselect (SQLSTATE 42702). If the column name is identical to one column, that column is used to compute the value of the sort specification. If the column name is not identical to a column of the result table, then it must unambiguously identify a column of some table, view or nested table in the FROM clause of the fullselect in the select-statement (SQLSTATE 42702).
  - The query is not a subselect (it includes set operations such as union, except or intersect).
    The column name must not be identical to the name of more than one column of the result table (SQLSTATE 42702). The column name must be identical to exactly one column of the result table (SQLSTATE 42707), and this column is used to compute the value of the sort specification.
Limits: The use of a sort-key-expression or a simple-column-name where the column is not in the select list may result in the addition of the column or expression to the temporary table used for sorting. This may result in reaching the limit of the number of columns in a table or the limit on the size of a row in a table. Exceeding these limits will result in an error if a temporary table is required to perform the sorting operation.

fetch-first-clause


                .-1-------.                   
>>-FETCH FIRST--+---------+--+-ROW--+--ONLY--------------------><
                '-integer-'  '-ROWS-'

The fetch-first-clause sets a maximum number of rows that can be retrieved. It lets the database manager know that the application does not want to retrieve more than integer rows, regardless of how many rows there might be in the result table when this clause is not specified. An attempt to fetch beyond integer rows is handled the same way as normal end of data (SQLSTATE 02000). The value of integer must be a positive integer (not zero).

Use of the fetch-first-clause influences query optimization of the subselect or fullselect, based on the fact that at most integer rows will be retrieved. If both the fetch-first-clause is specified in the outermost fullselect and the optimize-for-clause is specified for the select statement, the database manager will use the integer from the optimize-for-clause to influence query optimization of the outermost fullselect.

Limiting the result table to the first integer rows can improve performance. The database manager will cease processing the query once it has determined the first integer rows. If both the fetch-first-clause and the optimize-for-clause are specified, the lower of the integer values from these clauses is used to influence the communications buffer size.

If the fullselect contains an SQL data change statement in the FROM clause, all the rows are modified regardless of the limit on the number of rows to fetch.

isolation-clause


>>-+---------------------------------------+-------------------><
   '-WITH--+-RR--+---------------------+-+-'   
           |     '-lock-request-clause-' |     
           +-RS--+---------------------+-+     
           |     '-lock-request-clause-' |     
           +-CS--------------------------+     
           '-UR--------------------------'

The optional isolation-clause specifies the isolation level at which the subselect or fullselect is executed, and whether a specific type of lock is to be acquired.

RR - Repeatable-Read
RS - Read Stability
CS - Cursor Stability
UR - Uncommitted Read

lock-request-clause


>>-USE AND KEEP--+-SHARE-----+--LOCKS--------------------------><
                 +-UPDATE----+          
                 '-EXCLUSIVE-'

The lock-request-clause applies only to queries and to positioning read operations within an insert, update, or delete operation. The insert, update, and delete operations themselves will execute using locking determined by the database manager.

The optional lock-request-clause specifies the type of lock that the database manager is to acquire and hold:

SHARE: Concurrent processes can acquire SHARE or UPDATE locks on the data.
UPDATE: Concurrent processes can acquire SHARE locks on the data, but no concurrent process can acquire an UPDATE or EXCLUSIVE lock.
EXCLUSIVE: Concurrent processes cannot acquire a lock on the data.

isolation-clause restrictions:

The isolation-clause is not supported for a CREATE TABLE, CREATE VIEW, or ALTER TABLE statement (SQLSTATE 42601).
The isolation-clause cannot be specified for a subselect or fullselect that will cause trigger invocation, referential integrity scans, or MQT maintenance (SQLSTATE 42601).
A subselect or fullselect cannot include a lock-request-clause if that subselect or fullselect references any SQL functions that are not declared with the option INHERIT ISOLATION LEVEL WITH LOCK REQUEST (SQLSTATE 42601).
A subselect or fullselect that contains a lock-request-clause are not be eligible for MQT routing.
If an isolation-clause is specified for a subselect or fullselect within the body of an SQL function, SQL method, or trigger, the clause is ignored and a warning is returned.
If an isolation-clause is specified for a subselect or fullselect that is used by a scrollable cursor, the clause is ignored and a warning is returned.
Neither isolation-clause nor lock-request-clause can be specified in the context where they will cause conflict isolation or lock intent on a common table expression (SQLSTATE 42601). This restriction does not apply to aliases or base tables. The following examples create an isolation conflict on a and returns an error:
- View:
```
create view a as (...);
(select * from a with RR USE AND KEEP SHARE LOCKS)
UNION ALL 
(select * from a with UR);
```
- Common table expression:
```
WITH a as (...)
(select * from a with RR USE AND KEEP SHARE LOCKS)
UNION ALL 
(select * from a with UR);
```
An isolation-clause cannot be specified in an XML context (SQLSTATE 2200M).

Examples of subselects

Example A1: Select all columns and rows from the EMPLOYEE table.

  SELECT * FROM EMPLOYEE

Example A2: Join the EMP_ACT and EMPLOYEE tables, select all the columns from the EMP_ACT table and add the employee's surname (LASTNAME) from the EMPLOYEE table to each row of the result.

   SELECT EMP_ACT.*, LASTNAME
     FROM EMP_ACT, EMPLOYEE
     WHERE EMP_ACT.EMPNO = EMPLOYEE.EMPNO

Example A3: Join the EMPLOYEE and DEPARTMENT tables, select the employee number (EMPNO), employee surname (LASTNAME), department number (WORKDEPT in the EMPLOYEE table and DEPTNO in the DEPARTMENT table) and department name (DEPTNAME) of all employees who were born (BIRTHDATE) earlier than 1930.

  SELECT EMPNO, LASTNAME, WORKDEPT, DEPTNAME 
     FROM EMPLOYEE, DEPARTMENT                     
     WHERE WORKDEPT = DEPTNO     
     AND YEAR(BIRTHDATE) < 1930

Example A4: Select the job (JOB) and the minimum and maximum salaries (SALARY) for each group of rows with the same job code in the EMPLOYEE table, but only for groups with more than one row and with a maximum salary greater than or equal to 27000.

   SELECT JOB, MIN(SALARY), MAX(SALARY)
     FROM EMPLOYEE
     GROUP BY JOB
     HAVING COUNT(*) > 1 
     AND MAX(SALARY) >= 27000

Example A5: Select all the rows of EMP_ACT table for employees (EMPNO) in department (WORKDEPT) 'E11'. (Employee department numbers are shown in the EMPLOYEE table.)

  SELECT *
    FROM EMP_ACT  
    WHERE EMPNO IN
             (SELECT EMPNO 
                 FROM EMPLOYEE  
                 WHERE WORKDEPT = 'E11')

Example A6: From the EMPLOYEE table, select the department number (WORKDEPT) and maximum departmental salary (SALARY) for all departments whose maximum salary is less than the average salary for all employees.

  SELECT WORKDEPT, MAX(SALARY) 
    FROM EMPLOYEE 
    GROUP BY WORKDEPT  
    HAVING MAX(SALARY) < (SELECT AVG(SALARY)
                              FROM EMPLOYEE)

The subquery in the HAVING clause would only be executed once in this example.

Example A7: Using the EMPLOYEE table, select the department number (WORKDEPT) and maximum departmental salary (SALARY) for all departments whose maximum salary is less than the average salary in all other departments.

  SELECT WORKDEPT, MAX(SALARY)
     FROM EMPLOYEE EMP_COR
      GROUP BY WORKDEPT 
      HAVING MAX(SALARY) < (SELECT AVG(SALARY)
                              FROM EMPLOYEE
                              WHERE NOT WORKDEPT = EMP_COR.WORKDEPT)

In contrast to Example A6, the subquery in the HAVING clause would need to be executed for each group.

Example A8: Determine the employee number and salary of sales representatives along with the average salary and head count of their departments.

This query must first create a nested table expression (DINFO) in order to get the AVGSALARY and EMPCOUNT columns, as well as the DEPTNO column that is used in the WHERE clause.

 SELECT THIS_EMP.EMPNO, THIS_EMP.SALARY, DINFO.AVGSALARY, DINFO.EMPCOUNT
  FROM EMPLOYEE THIS_EMP,
        (SELECT OTHERS.WORKDEPT AS DEPTNO,
                           AVG(OTHERS.SALARY) AS AVGSALARY,
                           COUNT(*) AS EMPCOUNT
          FROM EMPLOYEE OTHERS
          GROUP BY OTHERS.WORKDEPT
         ) AS DINFO
  WHERE THIS_EMP.JOB = 'SALESREP'
   AND THIS_EMP.WORKDEPT = DINFO.DEPTNO

Using a nested table expression for this case saves the overhead of creating the DINFO view as a regular view. During statement preparation, accessing the catalog for the view is avoided and, because of the context of the rest of the query, only the rows for the department of the sales representatives need to be considered by the view.

Example A9: Display the average education level and salary for 5 random groups of employees.

This query requires the use of a nested table expression to set a random value for each employee so that it can subsequently be used in the GROUP BY clause.

  SELECT RANDID , AVG(EDLEVEL), AVG(SALARY)
    FROM ( SELECT EDLEVEL, SALARY, INTEGER(RAND()*5) AS RANDID
             FROM EMPLOYEE
         ) AS EMPRAND
    GROUP BY RANDID

Example A10: Query the EMP_ACT table and return those project numbers that have an employee whose salary is in the top 10 of all employees.

  SELECT EMP_ACT.EMPNO,PROJNO
     FROM EMP_ACT
     WHERE EMP_ACT.EMPNO IN
         (SELECT EMPLOYEE.EMPNO
          FROM EMPLOYEE
          ORDER BY SALARY DESC
          FETCH FIRST 10 ROWS ONLY)

Example A11: Assuming that PHONES and IDS are two SQL variables with array values of the same cardinality, turn these arrays into a table with three columns (one for each array and one for the position), and one row per array element.

   SELECT T.PHONE, T.ID, T.INDEX FROM UNNEST(PHONES, IDS)
     WITH ORDINALITY AS T(PHONE, ID, INDEX)
     ORDER BY T.INDEX

Examples of joins

Example B1: This example illustrates the results of the various joins using tables J1 and J2. These tables contain rows as shown.

   SELECT * FROM J1

   W   X     
   --- ------
   A       11
   B       12
   C       13

   SELECT * FROM J2

   Y   Z     
   --- ------
   A       21
   C       22
   D       23

The following query does an inner join of J1 and J2 matching the first column of both tables.

   SELECT * FROM J1 INNER JOIN J2 ON W=Y

   W   X      Y   Z     
   --- ------ --- ------
   A       11 A       21
   C       13 C       22

In this inner join example the row with column W='C' from J1 and the row with column Y='D' from J2 are not included in the result because they do not have a match in the other table. Note that the following alternative form of an inner join query produces the same result.

   SELECT * FROM J1, J2 WHERE W=Y

The following left outer join will get back the missing row from J1 with nulls for the columns of J2. Every row from J1 is included.

   SELECT * FROM J1 LEFT OUTER JOIN J2 ON W=Y

   W   X      Y   Z     
   --- ------ --- ------
   A       11 A       21
   B       12 -        -
   C       13 C       22

The following right outer join will get back the missing row from J2 with nulls for the columns of J1. Every row from J2 is included.

   SELECT * FROM J1 RIGHT OUTER JOIN J2 ON W=Y

   W   X      Y   Z     
   --- ------ --- ------
   A       11 A       21
   C       13 C       22
   -        - D       23

The following full outer join will get back the missing rows from both J1 and J2 with nulls where appropriate. Every row from both J1 and J2 is included.

   SELECT * FROM J1 FULL OUTER JOIN J2 ON W=Y

   W   X      Y   Z     
   --- ------ --- ------
   A       11 A       21
   C       13 C       22
   -        - D       23
   B       12 -        -

Example B2: Using the tables J1 and J2 from the previous example, examine what happens when and additional predicate is added to the search condition.

   SELECT * FROM J1 INNER JOIN J2 ON W=Y AND X=13

   W   X      Y   Z     
   --- ------ --- ------
   C       13 C       22

The additional condition caused the inner join to select only 1 row compared to the inner join in Example B1.

Notice what the impact of this is on the full outer join.

   SELECT * FROM J1 FULL OUTER JOIN J2 ON W=Y AND X=13

   W   X      Y   Z     
   --- ------ --- ------
   -        - A       21
   C       13 C       22
   -        - D       23
   A       11 -        -
   B       12 -        -

The result now has 5 rows (compared to 4 without the additional predicate) since there was only 1 row in the inner join and all rows of both tables must be returned.

The following query illustrates that placing the same additional predicate in WHERE clause has completely different results.

   SELECT * FROM J1 FULL OUTER JOIN J2 ON W=Y 
      WHERE X=13

   W   X      Y   Z     
   --- ------ --- ------
   C       13 C       22

The WHERE clause is applied after the intermediate result of the full outer join. This intermediate result would be the same as the result of the full outer join query in Example B1. The WHERE clause is applied to this intermediate result and eliminates all but the row that has X=13. Choosing the location of a predicate when performing outer joins can have significant impact on the results. Consider what happens if the predicate was X=12 instead of X=13. The following inner join returns no rows.

   SELECT * FROM J1 INNER JOIN J2 ON W=Y AND X=12

Hence, the full outer join would return 6 rows, 3 from J1 with nulls for the columns of J2 and 3 from J2 with nulls for the columns of J1.

   SELECT * FROM J1 FULL OUTER JOIN J2 ON W=Y AND X=12

   W   X      Y   Z     
   --- ------ --- ------
   -        - A       21
   -        - C       22
   -        - D       23
   A       11 -        -
   B       12 -        -
   C       13 -        -

If the additional predicate is in the WHERE clause instead, 1 row is returned.

   SELECT * FROM J1 FULL OUTER JOIN J2 ON W=Y 
      WHERE X=12

   W   X      Y   Z     
   --- ------ --- ------
   B       12 -        -

Example B3: List every department with the employee number and last name of the manager, including departments without a manager.

  SELECT DEPTNO, DEPTNAME, EMPNO, LASTNAME
    FROM DEPARTMENT LEFT OUTER JOIN EMPLOYEE
         ON MGRNO = EMPNO

Example B4: List every employee number and last name with the employee number and last name of their manager, including employees without a manager.

  SELECT E.EMPNO, E.LASTNAME, M.EMPNO, M.LASTNAME
    FROM EMPLOYEE E LEFT OUTER JOIN
                                        DEPARTMENT INNER JOIN EMPLOYEE M 
         ON MGRNO = M.EMPNO
         ON E.WORKDEPT = DEPTNO

The inner join determines the last name for any manager identified in the DEPARTMENT table and the left outer join guarantees that each employee is listed even if a corresponding department is not found in DEPARTMENT.

Examples of grouping sets, cube, and rollup

The queries in Example C1 through Example C4 use a subset of the rows in the SALES tables based on the predicate 'WEEK(SALES_DATE) = 13'.

  SELECT WEEK(SALES_DATE) AS WEEK, 
         DAYOFWEEK(SALES_DATE) AS DAY_WEEK, 
         SALES_PERSON, SALES AS UNITS_SOLD 
  FROM SALES 
  WHERE WEEK(SALES_DATE) = 13

which results in:

  WEEK        DAY_WEEK    SALES_PERSON    UNITS_SOLD 
  ----------- ----------- --------------- -----------
           13           6 LUCCHESSI                 3
           13           6 LUCCHESSI                 1
           13           6 LEE                       2
           13           6 LEE                       2
           13           6 LEE                       3
           13           6 LEE                       5
           13           6 GOUNOT                    3
           13           6 GOUNOT                    1
           13           6 GOUNOT                    7
           13           7 LUCCHESSI                 1
           13           7 LUCCHESSI                 2
           13           7 LUCCHESSI                 1
           13           7 LEE                       7
           13           7 LEE                       3
           13           7 LEE                       7
           13           7 LEE                       4
           13           7 GOUNOT                    2
           13           7 GOUNOT                   18
           13           7 GOUNOT                    1

Example C1: Here is a query with a basic GROUP BY clause over 3 columns:

  SELECT WEEK(SALES_DATE) AS WEEK,
         DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
         SALES_PERSON, SUM(SALES) AS UNITS_SOLD       
  FROM SALES
  WHERE WEEK(SALES_DATE) = 13
  GROUP BY WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE), SALES_PERSON
  ORDER BY WEEK, DAY_WEEK, SALES_PERSON

This results in:

  WEEK        DAY_WEEK    SALES_PERSON    UNITS_SOLD 
  ----------- ----------- --------------- -----------
           13           6 GOUNOT                   11
           13           6 LEE                      12
           13           6 LUCCHESSI                 4
           13           7 GOUNOT                   21
           13           7 LEE                      21
           13           7 LUCCHESSI                 4

Example C2: Produce the result based on two different grouping sets of rows from the SALES table.

  SELECT WEEK(SALES_DATE) AS WEEK,
         DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
         SALES_PERSON, SUM(SALES) AS UNITS_SOLD       
  FROM SALES 
  WHERE WEEK(SALES_DATE) = 13
  GROUP BY GROUPING SETS ( (WEEK(SALES_DATE), SALES_PERSON),
                           (DAYOFWEEK(SALES_DATE), SALES_PERSON))
  ORDER BY WEEK, DAY_WEEK, SALES_PERSON

This results in:

  WEEK        DAY_WEEK    SALES_PERSON    UNITS_SOLD 
  ----------- ----------- --------------- -----------
           13           - GOUNOT                   32
           13           - LEE                      33
           13           - LUCCHESSI                 8
            -           6 GOUNOT                   11
            -           6 LEE                      12
            -           6 LUCCHESSI                 4
            -           7 GOUNOT                   21
            -           7 LEE                      21
            -           7 LUCCHESSI                 4

The rows with WEEK 13 are from the first grouping set and the other rows are from the second grouping set.

Example C3: If you use the 3 distinct columns involved in the grouping sets of Example C2 and perform a ROLLUP, you can see grouping sets for (WEEK,DAY_WEEK,SALES_PERSON), (WEEK, DAY_WEEK), (WEEK) and grand total.

SELECT WEEK(SALES_DATE) AS WEEK,
        DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
        SALES_PERSON, SUM(SALES) AS UNITS_SOLD       
  FROM SALES
  WHERE WEEK(SALES_DATE) = 13
  GROUP BY ROLLUP ( WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE), SALES_PERSON )
  ORDER BY WEEK, DAY_WEEK, SALES_PERSON

This results in:

  WEEK        DAY_WEEK    SALES_PERSON    UNITS_SOLD 
  ----------- ----------- --------------- -----------
           13           6 GOUNOT                   11
           13           6 LEE                      12
           13           6 LUCCHESSI                 4
           13           6 -                        27
           13           7 GOUNOT                   21
           13           7 LEE                      21
           13           7 LUCCHESSI                 4
           13           7 -                        46
           13           - -                        73
            -           - -                        73

Example C4: If you run the same query as Example C3 only replace ROLLUP with CUBE, you can see additional grouping sets for (WEEK,SALES_PERSON), (DAY_WEEK,SALES_PERSON), (DAY_WEEK), (SALES_PERSON) in the result.

  SELECT WEEK(SALES_DATE) AS WEEK,
         DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
         SALES_PERSON, SUM(SALES) AS UNITS_SOLD       
  FROM SALES
  WHERE WEEK(SALES_DATE) = 13
  GROUP BY CUBE ( WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE), SALES_PERSON )
  ORDER BY WEEK, DAY_WEEK, SALES_PERSON

This results in:

  WEEK        DAY_WEEK    SALES_PERSON    UNITS_SOLD 
  ----------- ----------- --------------- -----------
           13           6 GOUNOT                   11
           13           6 LEE                      12
           13           6 LUCCHESSI                 4
           13           6 -                        27
           13           7 GOUNOT                   21
           13           7 LEE                      21
           13           7 LUCCHESSI                 4
           13           7 -                        46
           13           - GOUNOT                   32
           13           - LEE                      33
           13           - LUCCHESSI                 8
           13           - -                        73
            -           6 GOUNOT                   11
            -           6 LEE                      12
            -           6 LUCCHESSI                 4
            -           6 -                        27
            -           7 GOUNOT                   21
            -           7 LEE                      21
            -           7 LUCCHESSI                 4
            -           7 -                        46
            -           - GOUNOT                   32
            -           - LEE                      33
            -           - LUCCHESSI                 8
            -           - -                        73

Example C5: Obtain a result set which includes a grand-total of selected rows from the SALES table together with a group of rows aggregated by SALES_PERSON and MONTH.

  SELECT SALES_PERSON,
         MONTH(SALES_DATE) AS MONTH,
         SUM(SALES) AS UNITS_SOLD
  FROM SALES
  GROUP BY GROUPING SETS ( (SALES_PERSON, MONTH(SALES_DATE)),
                           ()        
                         )
  ORDER BY SALES_PERSON, MONTH

This results in:

  SALES_PERSON    MONTH       UNITS_SOLD 
  --------------- ----------- -----------
  GOUNOT                    3          35
  GOUNOT                    4          14
  GOUNOT                   12           1
  LEE                       3          60
  LEE                       4          25
  LEE                      12           6
  LUCCHESSI                 3           9
  LUCCHESSI                 4           4
  LUCCHESSI                12           1
  -                         -         155

Example C6: This example shows two simple ROLLUP queries followed by a query which treats the two ROLLUPs as grouping sets in a single result set and specifies row ordering for each column involved in the grouping sets.

Example C6-1:

  SELECT WEEK(SALES_DATE) AS WEEK,
         DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
         SUM(SALES) AS UNITS_SOLD
  FROM SALES
  GROUP BY ROLLUP ( WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE) )
  ORDER BY WEEK, DAY_WEEK

results in:

  WEEK        DAY_WEEK    UNITS_SOLD 
  ----------- ----------- -----------
           13           6          27
           13           7          46
           13           -          73
           14           1          31
           14           2          43
           14           -          74
           53           1           8
           53           -           8
            -           -         155

Example C6-2:

  SELECT MONTH(SALES_DATE) AS MONTH,
         REGION,
         SUM(SALES) AS UNITS_SOLD
  FROM SALES
  GROUP BY ROLLUP ( MONTH(SALES_DATE), REGION );
  ORDER BY MONTH, REGION

results in:

  MONTH       REGION          UNITS_SOLD 
  ----------- --------------- -----------
            3 Manitoba                 22
            3 Ontario-North             8
            3 Ontario-South            34
            3 Quebec                   40
            3 -                       104
            4 Manitoba                 17
            4 Ontario-North             1
            4 Ontario-South            14
            4 Quebec                   11
            4 -                        43
           12 Manitoba                  2
           12 Ontario-South             4
           12 Quebec                    2
           12 -                         8
            - -                       155

Example C6-3:

  SELECT WEEK(SALES_DATE) AS WEEK,
         DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
         MONTH(SALES_DATE) AS MONTH,
         REGION,
         SUM(SALES) AS UNITS_SOLD
  FROM SALES
  GROUP BY GROUPING SETS ( ROLLUP( WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE) ),
                           ROLLUP( MONTH(SALES_DATE), REGION )  )
ORDER BY WEEK, DAY_WEEK, MONTH, REGION

results in:

  WEEK        DAY_WEEK    MONTH       REGION          UNITS_SOLD 
  ----------- ----------- ----------- --------------- -----------
           13           6           - -                        27
           13           7           - -                        46
           13           -           - -                        73
           14           1           - -                        31
           14           2           - -                        43
           14           -           - -                        74
           53           1           - -                         8
           53           -           - -                         8
            -           -           3 Manitoba                 22
            -           -           3 Ontario-North             8
            -           -           3 Ontario-South            34
            -           -           3 Quebec                   40
            -           -           3 -                       104
            -           -           4 Manitoba                 17
            -           -           4 Ontario-North             1
            -           -           4 Ontario-South            14
            -           -           4 Quebec                   11
            -           -           4 -                        43
            -           -          12 Manitoba                  2
            -           -          12 Ontario-South             4
            -           -          12 Quebec                    2
            -           -          12 -                         8
            -           -           - -                       155
            -           -           - -                       155

Using the two ROLLUPs as grouping sets causes the result to include duplicate rows. There are even two grand total rows.

Observe how the use of ORDER BY has affected the results:

In the first grouped set, week 53 has been repositioned to the end.
In the second grouped set, month 12 has now been positioned to the end and the regions now appear in alphabetic order.
Null values are sorted high.

Example C7: In queries that perform multiple ROLLUPs in a single pass (such as Example C6-3) you may want to be able to indicate which grouping set produced each row. The following steps demonstrate how to provide a column (called GROUP) which indicates the origin of each row in the result set. By origin, we mean which one of the two grouping sets produced the row in the result set.

Step 1: Introduce a way of "generating" new data values, using a query which selects from a VALUES clause (which is an alternate form of a fullselect). This query shows how a table can be derived called "X" having 2 columns "R1" and "R2" and 1 row of data.

  SELECT R1,R2 
  FROM (VALUES('GROUP 1','GROUP 2')) AS X(R1,R2);

results in:

  R1      R2     
  ------- -------
  GROUP 1 GROUP 2

Step 2: Form the cross product of this table "X" with the SALES table. This add columns "R1" and "R2" to every row.

  SELECT R1, R2, WEEK(SALES_DATE) AS WEEK,
         DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
         MONTH(SALES_DATE) AS MONTH,
         REGION,
         SALES AS UNITS_SOLD
  FROM SALES,(VALUES('GROUP 1','GROUP 2')) AS X(R1,R2)

This add columns "R1" and "R2" to every row.

Step 3: Now we can combine these columns with the grouping sets to include these columns in the rollup analysis.

  SELECT R1, R2, 
         WEEK(SALES_DATE) AS WEEK,
         DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
         MONTH(SALES_DATE) AS MONTH,
         REGION, SUM(SALES) AS UNITS_SOLD
  FROM SALES,(VALUES('GROUP 1','GROUP 2')) AS X(R1,R2)
  GROUP BY GROUPING SETS ((R1, ROLLUP(WEEK(SALES_DATE),
                            DAYOFWEEK(SALES_DATE))),
                          (R2,ROLLUP( MONTH(SALES_DATE), REGION ) )  )
  ORDER BY WEEK, DAY_WEEK, MONTH, REGION

results in:

  R1      R2      WEEK     DAY_WEEK  MONTH     REGION          UNITS_SOLD 
  ------- ------- -------- --------- --------- --------------- -----------
  GROUP 1 -             13         6         - -                        27
  GROUP 1 -             13         7         - -                        46
  GROUP 1 -             13         -         - -                        73
  GROUP 1 -             14         1         - -                        31
  GROUP 1 -             14         2         - -                        43
  GROUP 1 -             14         -         - -                        74
  GROUP 1 -             53         1         - -                         8
  GROUP 1 -             53         -         - -                         8
  -       GROUP 2        -         -         3 Manitoba                 22
  -       GROUP 2        -         -         3 Ontario-North             8
  -       GROUP 2        -         -         3 Ontario-South            34
  -       GROUP 2        -         -         3 Quebec                   40
  -       GROUP 2        -         -         3 -                       104
  -       GROUP 2        -         -         4 Manitoba                 17
  -       GROUP 2        -         -         4 Ontario-North             1
  -       GROUP 2        -         -         4 Ontario-South            14
  -       GROUP 2        -         -         4 Quebec                   11
  -       GROUP 2        -         -         4 -                        43
  -       GROUP 2        -         -        12 Manitoba                  2
  -       GROUP 2        -         -        12 Ontario-South             4
  -       GROUP 2        -         -        12 Quebec                    2
  -       GROUP 2        -         -        12 -                         8
  -       GROUP 2        -         -         - -                       155
  GROUP 1 -              -         -         - -                       155

Step 4: Notice that because R1 and R2 are used in different grouping sets, whenever R1 is non-null in the result, R2 is null and whenever R2 is non-null in the result, R1 is null. That means you can consolidate these columns into a single column using the COALESCE function. You can also use this column in the ORDER BY clause to keep the results of the two grouping sets together.

  SELECT COALESCE(R1,R2) AS GROUP,
         WEEK(SALES_DATE) AS WEEK,
         DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
         MONTH(SALES_DATE) AS MONTH,
         REGION, SUM(SALES) AS UNITS_SOLD
  FROM SALES,(VALUES('GROUP 1','GROUP 2')) AS X(R1,R2)
  GROUP BY GROUPING SETS ((R1, ROLLUP(WEEK(SALES_DATE),
                            DAYOFWEEK(SALES_DATE))),
                          (R2,ROLLUP( MONTH(SALES_DATE), REGION ) )  )
  ORDER BY GROUP, WEEK, DAY_WEEK, MONTH, REGION;

results in:

  GROUP   WEEK        DAY_WEEK    MONTH       REGION          UNITS_SOLD 
  ------- ----------- ----------- ----------- --------------- -----------
  GROUP 1          13           6           - -                        27
  GROUP 1          13           7           - -                        46
  GROUP 1          13           -           - -                        73
  GROUP 1          14           1           - -                        31
  GROUP 1          14           2           - -                        43
  GROUP 1          14           -           - -                        74
  GROUP 1          53           1           - -                         8
  GROUP 1          53           -           - -                         8
  GROUP 1           -           -           - -                       155
  GROUP 2           -           -           3 Manitoba                 22
  GROUP 2           -           -           3 Ontario-North             8
  GROUP 2           -           -           3 Ontario-South            34
  GROUP 2           -           -           3 Quebec                   40
  GROUP 2           -           -           3 -                       104
  GROUP 2           -           -           4 Manitoba                 17
  GROUP 2           -           -           4 Ontario-North             1
  GROUP 2           -           -           4 Ontario-South            14
  GROUP 2           -           -           4 Quebec                   11
  GROUP 2           -           -           4 -                        43
  GROUP 2           -           -          12 Manitoba                  2
  GROUP 2           -           -          12 Ontario-South             4
  GROUP 2           -           -          12 Quebec                    2
  GROUP 2           -           -          12 -                         8
  GROUP 2           -           -           - -                       155

Example C8: The following example illustrates the use of various aggregate functions when performing a CUBE. The example also makes use of cast functions and rounding to produce a decimal result with reasonable precision and scale.

  SELECT MONTH(SALES_DATE) AS MONTH,
         REGION,
         SUM(SALES) AS UNITS_SOLD, 
         MAX(SALES) AS BEST_SALE, 
         CAST(ROUND(AVG(DECIMAL(SALES)),2) AS DECIMAL(5,2)) AS AVG_UNITS_SOLD 
  FROM SALES 
  GROUP BY CUBE(MONTH(SALES_DATE),REGION)
  ORDER BY MONTH, REGION

This results in:

MONTH       REGION          UNITS_SOLD  BEST_SALE   AVG_UNITS_SOLD
----------- --------------- ----------- ----------- --------------
          3 Manitoba                 22           7           3.14
          3 Ontario-North             8           3           2.67
          3 Ontario-South            34          14           4.25
          3 Quebec                   40          18           5.00
          3 -                       104          18           4.00
          4 Manitoba                 17           9           5.67
          4 Ontario-North             1           1           1.00
          4 Ontario-South            14           8           4.67
          4 Quebec                   11           8           5.50
          4 -                        43           9           4.78
         12 Manitoba                  2           2           2.00
         12 Ontario-South             4           3           2.00
         12 Quebec                    2           1           1.00
         12 -                         8           3           1.60
          - Manitoba                 41           9           3.73
          - Ontario-North             9           3           2.25
          - Ontario-South            52          14           4.00
          - Quebec                   53          18           4.42
          - -                       155          18           3.87