select-clause

The SELECT clause specifies the columns of the final result table. The column values are produced by the application of the select list to R. The select list is a list of names and expressions specified in the SELECT clause, and R is the result of the previous operation of the subselect. For example, if SELECT, FROM, and WHERE are the only clauses specified, then R is the result of that WHERE clause.

select-clause

ALL

Retains all rows of the final result table and does not eliminate redundant duplicates. This is the default.

DISTINCT

Eliminates all but one of each set of duplicate rows of the final result table.

Two rows are duplicates of one another only if each value in the first row is equal to the corresponding value in the second row. For determining duplicate rows, two null values are considered equal.

When SELECT DISTINCT is specified, no column or expression in the implicit or explicit list can return a value that is a LOB or XML data type. When a column or expression in the list returns a value that is a DECFLOAT data type and multiple bit representations of the same number exists in the intermediate result, the value that is returned is unpredictable. See Numeric comparisons for additional information.

Column access controls do not affect the operation of SELECT DISTINCT. The elimination of duplicated rows is based on the original column values, not the masked values. However, after the application of column masks, the masked values in the final result table might not reflect the uniqueness that is enforced by SELECT DISTINCT.

If a column mask is applied to a column that directly or indirectly derives the result of SELECT DISTINCT, SELECT DISTINCT can return a result that is not deterministic. The following conditions are a few examples of when a result that is not deterministic might be returned:

The definition of the column mask references other columns of the table to which the column mask is applied.
The column is referenced in the argument of a built-in scalar function, such as COALESCE, IFNULL, NULLIF, MAX, MIN, LOCATE, TOTALORDER, etc.
The column is referenced in the argument of an aggregation function.
The column is embedded in an expression and the expression contains a function that is not deterministic or has an external action.

For compatibility with other SQL implementations, UNIQUE can be specified as a synonym for DISTINCT.

Select list notation:

*

Represents a list of columns of table R, excluding any columns that are defined with the IMPLICITLY HIDDEN attribute. The list of names is established when the statement containing the SELECT clause is prepared. Therefore, * does not identify any columns that have been added to a table after the statement has been prepared.

A column that is defined with the IMPLICITLY HIDDEN attribute can be explicitly referenced in the select list.

* cannot be used in the definition of a row permission or a column mask.

expression

Specifies the values of a result column. Each column-name in the expression must unambiguously identify a column of the intermediate result table.

Start of change The result of the expression must not be an array type, except in the following contexts: End of change

The outer select list of a fullselect that does not include a set operator other than UNION ALL, in the definition of a cursor that is not scrollable
The outer select list of a SELECT INTO statement, when the target for the corresponding column of the result table of the fullselect is an array variable
The outer select list of a scalar fullselect that does not include a set operator on the right side of a SET assignment-statement or SQL PL assignment-statement statement, when the corresponding target of the assignment is an array variable

AS new-column-name: Names or renames the result column. The name must not be qualified and does not have to be unique. new-column-name is an SQL identifier of 128 UTF-8 bytes or less.

name.*

Represents a list of columns of name, excluding any columns that are defined as implicitly hidden, in the order the columns are produced by the FROM clause. name can be a table name, view name, or correlation name, and must designate an exposed table, view, or correlation name in the FROM clause that immediately follows the SELECT clause. The first name in the list identifies the first column of the table or view, the second name in the list identifies the second column of the table or view, and so on.

The list of names is established when the statement that contains the SELECT clause is prepared. Therefore, * does not identify any columns that have been added to a table after the statement has been prepared.

name.* cannot be used in the definition of a row permission or a column mask.

SQL statements can be implicitly or explicitly prepared again. The effect of another prepare on statements that include * or name.* is that the list of names is re-established. Therefore, the number of columns returned by the statement might change.

The number of columns in the result of SELECT is the same as the number of expressions in the operational form of the select list (that is, the list established at the time the statement is prepared), and cannot exceed 750. The result of a subquery must be a single column unless the subquery is used in an EXISTS predicate.

Notes:

If the FROM clause contains a MERGE statement:

The SELECT list must not implicitly or explicitly refer to a column that has a LOB data type, a ROWID data type (or a distinct type that is based on a LOB, or ROWID), or an XML data type.

Implicitly hidden columns in the select list:

The result for SELECT * does not include any columns that are defined with the IMPLICITLY HIDDEN attribute. To be included in the result, implicitly hidden columns must be explicitly specified in the select list.

VARBINARY data:

If the identified table has an index on a VARBINARY column or a column that is a distinct type that is based on VARBINARY data type, that index column cannot specify the DESC attribute. To query the identified table, either drop the index or alter the data type of the column to BINARY and then rebuild the index.

Applying the select list:

Some of the results of applying the select list to R depend on whether GROUP BY or HAVING is used. The following three lists describe the results.

IF neither GROUP BY nor HAVING is used:

The select list can include aggregate functions only if it includes other aggregate functions, constants, or expressions that only involve constants.
If the select list does not include aggregate functions, it is applied to each row of R and the result contains as many rows as there are rows in R.
If the select list includes aggregate functions, R is the source of the arguments of the functions and the result of applying the select list is one row, even when R has no rows.
If a column mask is used to mask the values in the final result table, and the select list includes aggregate functions, the definition of the column mask must not reference the following:
- A scalar fullselect
- An aggregate function

If HAVING is used and GROUP BY is not used:

Each expression or column-name in an expression in the select list must be specified within an aggregate function. Constants or expressions that involve only constants can also be in the select list.

If a column mask is used to mask the values in the final result table, the definition of the column mask must not reference the following:

A scalar fullselect
An aggregate function

If GROUP BY is used:

Each expression in the select list must use one or more grouping expressions. Or, each expression or column-name in an expression must:
- Unambiguously identify a grouping column of R.
- Be specified within an aggregate function.
- Be a correlated reference. (A column-name is a correlated reference if it identifies a column of a table or view identified in an outer subselect.)
If an expression in the select list is a scalar fullselect, a correlated reference from the scalar fullselect to a group R must either identify a grouping column or be contained within an aggregate function. For example, the following query fails because the correlated reference T1.C1 || T1.C2 in the select list of the scalar fullselect does not match a grouping column from the outer subselect. (Matching the grouping expression T1.C1 || T1.C2 is not supported.)
```
SELECT MAX(T1.C2) AS X1, 
      (SELECT T1.C1 || T1.C2 FROM T2 GROUP BY T2.C1) AS  Y1
  FROM T1
  GROUP BY  T1.C1, T1.C1 || T1.C2;
```
You cannot use GROUP BY with a name defined using the AS clause unless the name is defined in a nested table expression. Example 6 demonstrates the valid use of AS and GROUP BY in a SELECT statement.

In either case, the nth column of the result contains the values specified by applying the nth expression in the operational form of the select list.

If a column mask is used to mask the column values in the final result table, a column for which the column mask is applied must satisfy one of the following conditions:

The column must be specified in an aggregate function and the definition of the column mask must not reference the following:
- A scalar fullselect
- An aggregate function
The column must identify a column-name in the GROUP BY clause and the column must not be referenced in an expression in the GROUP BY clause. In addition, any columns of the same table as the column for which the column mask is applied and are referenced in the definition of the column mask must be identified with a column-name in the GROUP BY clause. These columns must not be referenced in an expression in the GROUP BY clause.
A column of a non-base tables in the select list must be specified in an aggregate function if a column mask is used to mask the column values in the final result table, and the column of a non-base table maps directly or indirectly to a column name or to an expression in a materialized table expression or view to the table where the column mask is applied.

Effect of column masks on result columns:

When column masks are enabled, they determine the values in the final result table of an outermost select list. When a column mask is enabled for a column, if the column appears in the outermost select list (either implicitly or explicitly), the column mask is applied to the column to produce the values for the final result table. If the column itself does not appear in the outermost select list, but is included in the output (for example, it appears in a materialized table expression or a view), the masked value is included in the result table of the table expression or view so that it can be used in the final result table.

The enabled column masks do not interfere with the operations of other clauses within the statement, such as the WHERE, GROUP BY, HAVING, SELECT DISTINCT, and ORDER BY clauses.

The rows that are returned in the final result table remain the same, except that the values in the result rows might be masked. As such, if a column with masked values also appears in an ORDER BY clause with a sort-key expression, the order is based on the original column values (the masked values in the final result table might not reflect that order). Similarly, the masked values might not reflect the uniqueness enforced by a SELECT DISTINCT. If the masked column is embedded in an expression, the result of the expression might be different because the column mask is applied to the column before the expression is evaluated. For example, a column mask on column SSN can change the result of the function COUNT(DISTINCT SSN) because the DISTINCT operation is performed on the masked values. However, if the expression in the query is the same as the expression that is used to mask the column value in the definition of the column mask, the result of the expression might remain unchanged. For example, the expression in the query is 'XXX-XX-' || SUBSTR(SSN, 8, 4) and the same expression is used in the column mask definition. In this particular example, the expression in the query can be replaced with column SSN to avoid the same expression being evaluated twice.

If a CASE expression appears in the outermost select list, column masks are not applied to the search-condition of the WHEN clause.

When the definition of a column mask is applied to an SQL statement to mask column values in the final result table, the semantics of the column mask might conflict with certain SQL semantics in the statement. In these situations, the combination of the statement and the column mask might return an error.

See ALTER TABLE for more information about the application of enabled column masks.

Null attributes of result columns:

Result columns allow null values if they are derived from one of the following:

Any aggregate function except COUNT or COUNT_BIG
A column that allows null values
A view column in an outer select list that is derived from an arithmetic expression
An arithmetic expression in an outer select list
An arithmetic expression that allows nulls
A scalar function or string expression that allows null values
A host variable that has an indicator variable, an SQL parameter or variable, a global variable, or in the case of Java™, a host variable or expression whose type is able to represent a Java null value
A result of a set operator if at least one of the corresponding items in the select list is nullable

Names of result columns:

In the following cases a result column is considered a named column:

If the AS clause is specified, the name of the result column is the name specified on the AS clause.
If the AS clause is not specified and a column list is specified in the correlation clause, the name of the result column is the corresponding name in the correlation column list.
If neither an AS clause nor a column list in the correlation clause is specified and the result column is derived only from a single column (without any functions or operators), the result column name is the unqualified name of that column.
If neither an AS clause nor a column list in the correlation clause is specified and the result column is derived only from a single SQL variable, global variable, or SQL parameter (without any functions or operators), the result column name is the unqualified name of that SQL variable, global variable, or SQL parameter.

In all other cases, a result column is an unnamed column.

Names of result columns, SQL variables, and global variables are placed into the SQL descriptor area (SQLDA) when the DESCRIBE statement is executed. This allows an interactive SQL processor such as SPUFI, the command line processor, or QMF to use the column names when displaying the results. The names in the SQLDA include those specified by the AS clause.

Data types of result columns:

Each column of the result of SELECT acquires a data type from the expression from which it is derived. The following table shows the data types of result columns.

Table 1. Data types of result columns
When the expression is...	The data type of the result column is...
The name of any numeric column	The same as the data type of the column, with the same precision and scale for decimal columns.
An integer constant	INTEGER.
A decimal or floating-point constant	The same as the data type of the constant, with the same precision and scale for decimal constants. For floating-point constants, the data type is DOUBLE PRECISION.
A decimal floating point constant	DECFLOAT(34)
The name of any numeric host variable	The same as the data type of the variable, with the same precision and scale for decimal variables. The result is decimal if the data type of the host variable is not an SQL data type; for example, DISPLAY SIGN LEADING SEPARATE in COBOL.
An arithmetic or string expression	The same as the data type of the result, with the same precision and scale for decimal results as described in Expressions.
Any function	The data type of the result of the function. For a built-in function, see Built-in functions to determine the data type of the result. For a user-defined function, the data type of the result is what was defined in the CREATE FUNCTION statement for the function.
The name of any string column	The same as the data type of the column, with the same length attribute.
The name of any string host variable	The same as the data type of the variable, with a length attribute equal to the length of the variable. The result is a varying-length character string if the data type of the host variable is not an SQL data type; for example, a NUL-terminated string in C.
A character string constant of length n	VARCHAR(n).
A binary string constant of length n	VARBINARY(n)
A graphic string constant of length n	VARGRAPHIC(n).
The name of a datetime column	The same as the data type of the column.
The name of a ROWID column	Row ID.
The name of a distinct type column	The same as the distinct type of the column, with the same length, precision, and scale attributes, if any.

For information about the CCSID of the result column, see Rules for result data types.