table-reference

¹ The syntax for joined-table is covered in a separate topic; refer to joined-table.
² TABLE can be specified in place of LATERAL.
³ The typed-correlation-clause is required for generic table functions. This clause cannot be specified for any other table functions.
⁴ WITH ORDINALITY can be specified only if the argument to the UNNEST table function is one or more ordinary array variables or functions with ordinary array return types; an associative array variable or function with an associative array return type cannot be specified (SQLSTATE 428HT).
⁵ An XMLTABLE function can be part of a table-reference. In this case, subexpressions within the XMLTABLE expression are in-scope of prior range variables in the FROM clause. For more information, see the description of XMLTABLE.
⁶ Specifying a LIKE clause or at least one column definition is not mandatory for an INSERT INTO <table> SELECT FROM statement, which acts as an implicit LIKE with respect to the INSERT target table.

A table-reference specifies an intermediate result table.

If a single-table-reference is specified without a period-specification or a tablesample-clause, the intermediate result table is the rows of the table. If a period-specification is specified, the intermediate result table consists of the rows of the temporal table where the period matches the specification. If a tablesample-clause is specified, the intermediate result table consists of a sampled subset of the rows of the table.
If a single-view-reference is specified without a period-specification, the intermediate result table is that view. If a period-specification is specified, temporal table references in the view consider only the rows where the period matches the specification.
If a single-nickname-reference is specified, the intermediate result table is the data from the data source for that nickname.
If an only-table-reference is specified, the intermediate result table consists of only the rows of the specified table or view without considering the applicable subtables or subviews.
If an outer-table-reference is specified, the intermediate result table represents a virtual table based on all the subtables of a typed table or the subviews of a typed view.
If an analyze_table-expression is specified, the result table contains the result of executing a specific data mining model by using an in-database analytics provider, a named model implementation, and input data.
If a nested-table-expression is specified, the result table is the result of the specified fullselect.
If a data-change-table-reference is specified, the intermediate result table is the set of rows that are directly changed by the searched UPDATE, searched DELETE, or INSERT statement that is included in the clause.
If a table-function-reference is specified, the intermediate result table is the set of rows that are returned by the table function.
If a collection-derived-table is specified, the intermediate result table is the set of rows that are returned by the UNNEST function.
If an xmltable-expression is specified, the intermediate result table is the set of rows that are returned by the XMLTABLE function.
If a joined-table is specified, the intermediate result table is the result of one or more join operations. For more information, see joined-table.
If an external-table-reference is specified without a tablesample-clause, the intermediate result table is the rows of the external table that is represented by the specified file. If a tablesample-clause is specified, the intermediate result table consists of a sampled subset of the rows of the external table that is represented by the specified file.

single-table-reference

Each table-name specified as a table-reference must identify an existing table at the application server or an existing table at a remote server specified using a remote-object-name. The intermediate result table is the result of the table. If the table-name references a typed table, the intermediate result table is the UNION ALL of the table with all its subtables, with only the columns of the table-name. A period-specification can be used with a temporal table to specify the period from which the rows are returned as the intermediate result table. A tablesample-clause can be used to specify that a sample of the rows be returned as the intermediate result table.

If the CURRENT TEMPORAL SYSTEM_TIME special register is set to a non-null value CTST and table-name identifies a system-period temporal table, the table reference is executed as if it contained the following specification with the special register set to the null value:

   table-name FOR SYSTEM_TIME AS OF CTST

If the CURRENT TEMPORAL BUSINESS_TIME special register is set to a non-null value CTBT and table-name identifies an application-period temporal table, the table reference is executed as if it contained the following specification with the special register set to the null value:

   table-name FOR BUSINESS_TIME AS OF CTBT

single-view-reference

Each view-name specified as a table-reference must identify one of the following objects:

An existing view at the application server
A view at a remote server specified using a remote-object-name
The table-name of a common table expression

The intermediate result table is the result of the view or common table expression. If the view-name references a typed view, the intermediate result table is the UNION ALL of the view with all its subviews, with only the columns of the view-name. A period-specification can be used with a view defined over a temporal table to specify the period from which the rows are returned as the intermediate result table.

If the CURRENT TEMPORAL SYSTEM_TIME special register is set to a non-null value CTST, and view-name identifies a system-period temporal table, the table reference is executed as if it contained the following specification with the special register set to the null value:

view-name FOR SYSTEM_TIME AS OF CTST

If the CURRENT TEMPORAL BUSINESS_TIME special register is set to a non-null value CTBT, and view-name identifies an application-period temporal table, the table reference is executed as if it contained the following specification with the special register set to the null value:

view-name FOR BUSINESS_TIME AS OF CTBT

single-nickname-reference

Each nickname specified as a table-reference must identify an existing nickname at the application server. The intermediate result table is the result of the nickname.

only-table-reference

The use of ONLY(table-name) or ONLY(view-name) means that the rows of the applicable subtables or subviews are not included in the intermediate result table. If the table-name used with ONLY does not have subtables, then ONLY(table-name) is equivalent to specifying table-name. If the view-name used with ONLY does not have subviews, then ONLY(view-name) is equivalent to specifying view-name.

The use of ONLY requires the SELECT privilege on every subtable of table-name or subview of view-name.

outer-table-reference

The use of OUTER(table-name) or OUTER(view-name) represents a virtual table. If the table-name or view-name used with OUTER does not have subtables or subviews, then specifying OUTER is equivalent to not specifying OUTER. If the table-name does have subtables, the intermediate result table from OUTER(table-name) is derived from table-name as follows:

The columns include the columns of table-name followed by the additional columns introduced by each of its subtables, if any. The additional columns are added on the right, traversing the subtable hierarchy in depth-first order. Subtables that have a common parent are traversed in creation order of their types.
The rows include all the rows of table-name and all the rows of its subtables. Null values are returned for columns that are not in the subtable for the row.

If the view-name does have subviews, the intermediate result table from OUTER(view-name) is derived from view-name as follows:

The columns include the columns of view-name followed by the additional columns introduced by each of its subviews, if any. The additional columns are added on the right, traversing the subview hierarchy in depth-first order. Subviews that have a common parent are traversed in creation order of their types.
The rows include all the rows of view-name and all the rows of its subviews. Null values are returned for columns that are not in the subview for the row.

The use of OUTER requires the SELECT privilege on every subtable of table-name or subview of view-name.

analyze_table-expression

table-nameview-nameThe table-name or view-name variable must identify an existing table or view or identify the table-name of a common table expression that you define preceding the fullselect containing the table-reference. You can specify a nickname. However, in-database analytics are intended for local data, and retrieving the data for a nickname from another data source does not take advantage of the intended performance benefits.

ANALYZE_TABLE

Returns the result of executing a specific data mining model by using an in-database analytics provider, a named model implementation, and input data. A query referencing the ANALYZE_TABLE parameter cannot be a static SQL statement or a data definition language (DDL) statement. Input or output values cannot be of the following types:

CHAR FOR BIT DATA or VARCHAR FOR BIT DATA
BINARY or VARBINARY
BLOB, CLOB, DBCLOB, or NCLOB
BOOLEAN
XML
DB2SECURITYLABEL

IMPLEMENTATION 'string'

Specifies how the expression is to be evaluated. The string parameter is a string constant whose maximum length is 1024 bytes. The specified value is used to establish a session with an in-database analytic provider. When you specify SAS as the provider, you must specify values for the following case-insensitive parameters:

PROVIDER: Currently, the only supported provider value is SAS.
ROUTINE_SOURCE_TABLE: Specifies a user table containing the DS2 code (and, optionally, any required format or metadata) to implement the algorithm that is specified by the ROUTINE_SOURCE_NAME parameter. DS2 is a procedural language processor for SAS, designed for data modeling, stored procedures, and data extraction, transformation, and load (ETL) processing.
The routine source table has a defined structure (see the examples at the end of the analyze_table-expression section) and, in a partitioned database environment, must be on the catalog database partition. The table cannot be a global temporary table. The MODELDS2 column for a particular row must not be empty or contain the null value. If the value of the MODELFORMATS or MODELMETADATA column is not null, the value must have a length greater than 0. If you do not specify a table schema name, the value of the CURRENT SCHEMA special register is used.
ROUTINE_SOURCE_NAME: Specifies the name of the algorithm to use.

For example:

IMPLEMENTATION
  'PROVIDER=SAS;
  ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE;
  ROUTINE_SOURCE_NAME=SCORING_FUN1;'

If the table name, schema name, or algorithm name contains lowercase or mixed-case letters, specify delimited identifiers, as shown in the following example:

IMPLEMENTATION
  'PROVIDER=SAS;
  ROUTINE_SOURCE_TABLE="ETLin"."Source_Table";
  ROUTINE_SOURCE_NAME="Scoring_Fun1";'

The following examples show you how to use the ANALYZE_TABLE expression.

SAS tooling helps you to define a table to store model implementations for scoring functions. A row in this table stores an algorithm that is written in DS2, with any required SAS format information and metadata. The MODELNAME column serves as the primary key. For a particular value of the ROUTINE_SOURCE_NAME parameter, at most one row is retrieved from the table that the ROUTINE_SOURCE_TABLE parameter specifies. For example:

   CREATE TABLE ETLIN.SOURCE_TABLE (
     MODELNAME VARCHAR(128) NOT NULL PRIMARY KEY,
     MODELDS2 BLOB(4M) NOT NULL,
     MODELFORMATS BLOB(4M),
     MODELMETADATA BLOB(4M)
   );

The MODELNAME column contains the name of the algorithm. The MODELDS2 column contains the DS2 source code that implements the algorithm. The MODELFORMATS column contains the aggregated SAS format definition that the algorithm requires. If the algorithm does not require a SAS format, this column contains the null value. The MODELMETADATA column contains any additional metadata that the algorithm requires. If the algorithm does not require any additional metadata, this column contains the null value. If the SAS EP installer creates the table, it might include additional columns.

Use the data in columns C1 and C2 in table T1 as input data with the scoring model SCORING_FUN1, whose implementation is stored in ETLIN.SOURCE_TABLE:

   WITH sas_score_in (c1,c2) AS
     (SELECT c1,c2 FROM t1)
     SELECT *
       FROM sas_score_in ANALYZE_TABLE(
         IMPLEMENTATION
           'PROVIDER=SAS;
            ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE;
            ROUTINE_SOURCE_NAME=SCORING_FUN1;');

Use all the data in the table T2 with the scoring model SCORING_FUN2, whose implementation is stored in ETLIN.SOURCE_TABLE:

   SELECT *
     FROM t2 ANALYZE_TABLE(
       IMPLEMENTATION
         'PROVIDER=SAS;
          ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE;
          ROUTINE_SOURCE_NAME=SCORING_FUN2;');

Use all the data in view V1 with the scoring model SCORING_FUN3, whose implementation is stored in ETLIN.SOURCE_TABLE, and return the output in ascending order of the first output column:

   SELECT *
     FROM v1 ANALYZE_TABLE(
       IMPLEMENTATION
         'PROVIDER=SAS;
          ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE;
          ROUTINE_SOURCE_NAME=SCORING_FUN3;')
     ORDER BY 1;

nested-table-expression

A fullselect in parentheses is called a nested table expression. The intermediate result table is the result of that fullselect. The columns of the result do not need unique names, but a column with a non-unique name cannot be explicitly referenced. If LATERAL is specified, the fullselect can include correlated references to results columns of table references specified to the left of the nested table expression. If the nested table expression involves data from a federated data source, a continue-handler can be specified to tolerate certain error conditions from the data source.

An expression in the select list of a nested table expression that is referenced within, or is the target of, a data change statement within a fullselect is valid only when it does not include:

A function that reads or modifies SQL data
A function that is non-deterministic
A function that has external action
An OLAP function

If a view is referenced directly in, or as the target of a nested table expression in a data change statement within a FROM clause, the view must meet either of the following conditions:

Be symmetric (have WITH CHECK OPTION specified)
Satisfy the restriction for a WITH CHECK OPTION view

If the target of a data change statement within a FROM clause is a nested table expression, the following restrictions apply:

Modified rows are not requalified
WHERE clause predicates are not reevaluated
ORDER BY or FETCH FIRST operations are not redone

A nested table expression can be used in the following situations:

In place of a view to avoid creating the view (when general use of the view is not required)
When the required intermediate result table is based on host variables

data-change-table-reference

A data-change-table-reference clause specifies an intermediate result table. This table is based on the rows that are directly changed by the searched UPDATE, searched DELETE, or INSERT statement that is included in the clause. A data-change-table-reference can be specified as the only table-reference in the FROM clause of the outer fullselect that is used in a select-statement, a SELECT INTO statement, or a common table expression. A data-change-table-reference can be specified as the only table reference in the only fullselect in a SET Variable statement (SQLSTATE 428FL). The target table or view of the data change statement is considered to be a table or view that is referenced in the query; therefore, the authorization ID of the query must have SELECT privilege on that target table or view. A data-change-table-reference clause cannot be specified in a view definition, materialized query table definition, or FOR statement (SQLSTATE 428FL).

The target of the UPDATE, DELETE, or INSERT statement cannot be a temporary view defined in a common table expression (SQLSTATE 42807) or a nickname (SQLSTATE 25000).

Expressions in the select list of a view or fullselect as target of a data change statement in a table-reference can be selected only if OLD TABLE is specified or the expression does not include the following elements (SQLSTATE 428G6):

A subquery
A function that reads or modifies SQL data
A function is that is non-deterministic or has an external action
An OLAP function
A NEXT VALUE FOR sequence reference

FINAL TABLE: Specifies that the rows of the intermediate result table represent the set of rows that are changed by the SQL data change statement as they exist at the completion of the data change statement. If there are AFTER triggers or referential constraints that result in further operations on the table that is the target of the SQL data change statement, an error is returned (SQLSTATE 560C6). If the target of the SQL data change statement is a view that is defined with an INSTEAD OF trigger for the type of data change, an error is returned (SQLSTATE 428G3).
NEW TABLE: Specifies that the rows of the intermediate result table represent the set of rows that are changed by the SQL data change statement before the application of referential constraints and AFTER triggers. Data in the target table at the completion of the statement might not match the data in the intermediate result table because of additional processing for referential constraints and AFTER triggers.
OLD TABLE: Specifies that the rows of the intermediate result table represent the set of rows that are changed by the SQL data change statement as they existed before the application of the data change statement.; Specifies a searched UPDATE statement. A WHERE clause or a SET clause in the UPDATE statement cannot contain correlated references to columns outside of the UPDATE statement.; Specifies a searched DELETE statement. A WHERE clause in the DELETE statement cannot contain correlated references to columns outside of the DELETE statement.; Specifies an INSERT statement. A fullselect in the INSERT statement cannot contain correlated references to columns outside of the fullselect of the INSERT statement.

The content of the intermediate result table for a data-change-table-reference is determined when the cursor opens. The intermediate result table contains all manipulated rows, including all the columns in the specified target table or view. All the columns of the target table or view for an SQL data change statement are accessible using the column names from the target table or view. If an INCLUDE clause was specified within a data change statement, the intermediate result table will contain these additional columns.

table-function-reference

In general, a table function, together with its argument values, can be referenced in the FROM clause of a SELECT in exactly the same way as a table or view. Each function-name together with the types of its arguments, specified as a table reference must resolve to an existing table function at the application server. There are, however, some special considerations which apply.

Table function column names: Unless alternative column names are provided following the correlation-name, the column names for the table function are those specified in the RETURNS or RETURNS GENERIC TABLE clause of the CREATE FUNCTION statement. This is analogous to the names of the columns of a table, which are defined in the CREATE TABLE statement.
Table function resolution: The arguments specified in a table function reference, together with the function name, are used by an algorithm called function resolution to determine the exact function to be used. This is no different from what happens with other functions (such as scalar functions) that are used in a statement.

Table function arguments: As with scalar function arguments, table function arguments can generally be any valid SQL expression. The following examples are valid syntax:

  Example 1:  SELECT c1
              FROM TABLE( tf1('Zachary') ) AS z
              WHERE c2 = 'FLORIDA';

  Example 2:  SELECT c1
              FROM TABLE( tf2 (:hostvar1, CURRENT DATE) ) AS z;

  Example 3:  SELECT c1
              FROM t
              WHERE c2 IN
                       (SELECT c3 FROM
                        TABLE( tf5(t.c4) ) AS z  -- correlated reference
                       )                         -- to previous FROM clause

  Example 4:  SELECT c1
              FROM TABLE( tf6('abcd') )                -- tf6 is a generic
                  AS z (c1 int, c2 varchar(100)) -- java table function

Table functions that modify SQL data: Table functions that are specified with the MODIFIES SQL DATA option can be used only as the last table reference in a select-statement, common-table-expression, or RETURN statement that is a subselect, a SELECT INTO, or a row-fullselect in a SET statement. Only one table function is allowed in one FROM clause, and the table function arguments must be correlated to all other table references in the subselect (SQLSTATE 429BL). The following examples have valid syntax for a table function with the MODIFIES SQL DATA property:

  Example 1:  SELECT c1
              FROM TABLE( tfmod('Jones') ) AS z

  Example 2:  SELECT c1
              FROM t1, t2, TABLE( tfmod(t1.c1, t2.c1) ) AS z

  Example 3:  SET var =
              (SELECT c1
              FROM TABLE( tfmod('Jones') ) AS z

  Example 4:  RETURN SELECT c1
              FROM TABLE( tfmod('Jones') ) AS z

  Example 5:  WITH v1(c1) AS
              (SELECT c1
              FROM TABLE( tfmod(:hostvar1) ) AS z)
              SELECT c1 
              FROM v1, t1 WHERE v1.c1 = t1.c1

  Example 6: SELECT z.* 
             FROM t1, t2, TABLE( tfmod(t1.c1, t2.c1) ) 
             AS z (col1 int)

collection-derived-table

A collection-derived-table can be used to convert the elements of an array into values of a column in separate rows. If WITH ORDINALITY is specified, an extra column of data type INTEGER is appended. This column contains the position of the element in the array. The columns can be referenced in the select list and the in rest of the subselect by using the names specified for the columns in the correlation-clause. The collection-derived-table clause can be used only in a context where arrays are supported (SQLSTATE 42887). See the "UNNEST table function" for details.

xmltable-expressionAn xmltable-expression specifies an invocation of the built-in XMLTABLE function which determines the intermediate result table. See XMLTABLE for more information.

external-table-referenceAn external table resides in a text-based, delimited or non-delimited file outside of a database. An external-table-reference specifies the name of the file that contains an external table.

Important: A transient external-table-reference cannot be specified in a view definition (SQLSTATE 428IE).

column-definitionThe attributes of a column.

column-nameNames a column of the table. The name cannot be qualified, and the same name cannot be used for more than one column of the table (SQLSTATE 42711).

built-in-typeOne of the following built-in data types:

SMALLINT

A small integer.

[INTEGER | INT]

A large integer.

BIGINT

A big integer.

[DECIMAL | DEC | NUMERIC | NUM](precision-integer, scale-integer)

A decimal number.

The precision integer specifies the total number of digits. It must be in the range 1 ‑ 31. The default is 5.
The scale integer specifies the number of digits to the right of the decimal point. It cannot be negative and cannot exceed the precision. The default is 0.

FLOAT(integer)

A single or double-precision floating-point number. If the specified length is in the range:

1 - 24, the number uses single precision
25 - 53, the number uses double-precision

Instead of FLOAT, you can specify:

REAL: For single precision floating-point.
DOUBLE: For double-precision floating-point.
DOUBLE PRECISION: For double-precision floating-point.
FLOAT: For double-precision floating-point.

DECFLOAT(precision-integer)

A decimal floating-point number. The precision integer specifies the total number of digits, which can be either 16 or 34. The default is 34.

[CHARACTER | CHAR](integer [OCTETS | CODEUNITS32])

A fixed-length character string of the specified number of code units. This number can range from 1 ‑ 255 OCTETS or from 1 - 63 CODEUNITS32. The default is 1.

[VARCHAR | CHARACTER VARYING | CHAR VARYING](integer [OCTETS | CODEUNITS32])

A varying-length character string with a maximum length of the specified number of code units. This number can range from 1 ‑ 32672 OCTETS or from 1 - 8168 CODEUNITS32.

FOR BIT DATA

Specifies that the contents of the column are to be treated as bit (binary) data. During data exchange with other systems, code page conversions are not performed. Comparisons are done in binary, irrespective of the database collating sequence.

The FOR BIT DATA clause cannot be specified with string units CODEUNITS32 (SQLSTATE 42613).

[CLOB | CHARACTER LARGE OBJECT | CHAR LARGE OBJECT](integer [K] [OCTETS | CODEUNITS32])

A character large object string with a maximum length of the specified number of code units. The default maximum length is 65,535 bytes.

If you want to multiply the length integer by 1024, specify a K (kilo) multiplier.

Regardless of whether you use a K multiplier, the resulting length is limited by the maximum length of a CLOB column in an external table, which is 65,535 OCTETS, 32,767 CODEUNITS16, or 16,383 CODEUNITS32. Note that 64K OCTETS and 16K CODEUNITS32 each exceed the maximum length by one, and so are not allowed.
Any number of spaces (including zero spaces) are allowed between the data type and the length specification or between the length integer and the K multiplier. For example, the following specifications are all equivalent and valid:
```
CLOB(50K)
CLOB(50 K)
CLOB (50   K)
```
The K multiplier can be specified in either uppercase or lowercase.

In a Unicode database, the default string units for a character string data type are determined by the value of the NLS_STRING_UNITS global variable or string_units database configuration parameter. In a non-Unicode database, the default string units for character string data types are OCTETS.

OCTETS: Specifies that the units of the length attribute are bytes.
CODEUNITS32: Specifies that the units of the length attribute are Unicode UTF-32 code units, which approximates counting in characters. This does not affect the underlying code page of the data type. The actual length of a data value is determined by counting the UTF-32 code units as if the data were converted to UTF-32. CODEUNITS32 can be specified only in a Unicode database (SQLSTATE 560AA).

GRAPHIC(integer [CODEUNITS16 | CODEUNITS32])

A fixed-length graphic string of the specified length, which can range from 1 ‑ 127 double bytes, 1 ‑ 127 CODEUNITS16, or 1 ‑ 63 CODEUNITS32. The default length is 1.

VARGRAPHIC(integer [CODEUNITS16 | CODEUNITS32])

A varying-length graphic string of the specified maximum length, which can range from 1 ‑ 16336 double bytes, 1 ‑ 16336 CODEUNITS16, or 1 ‑ 8168 CODEUNITS32.

DBCLOB(integer [K] [CODEUNITS16 | CODEUNITS32])

A character large object string of the specified maximum length in double bytes, Unicode UTF-16 code units, or Unicode UTF-32 code units. The default maximum length is 32,767 double bytes.

If you want to multiply the length integer by 1024, specify a K (kilo) multiplier.

Regardless of whether you use a K multiplier, the resulting length is limited by the maximum length of a DBCLOB column in an external table, which is 32,767 CODEUNITS16 or 16,383 CODEUNITS32. Note that 32K CODEUNITS16 and 16K CODEUNITS32 each exceed the maximum length by one, and so are not allowed.
Any number of spaces (including zero spaces) are allowed between the data type and the length specification or between the length integer and the K multiplier. For example, the following specifications are all equivalent and valid:
```
DBCLOB(50K)
DBCLOB(50 K)
DBCLOB (50   K)
```
The K multiplier can be specified in either uppercase or lowercase.

CODEUNITS16: Specifies that the units of the length attribute are Unicode UTF-16 code units, which is the same as counting in double bytes. CODEUNITS16 can be specified only in a Unicode database (SQLSTATE 560AA).
CODEUNITS32: Specifies that the units of the length attribute are Unicode UTF-32 code units. This does not affect the underlying code page of the data type. The actual length of a data value is determined by counting the UTF-32 code units as if the data were converted to UTF-32. CODEUNITS32 can be specified only in a Unicode database (SQLSTATE 560AA).

[NATIONAL CHARACTER | NATIONAL CHAR | NCHAR](integer)

A fixed-length string of the specified length. The default length is 1.

The NATIONAL CHARACTER type maps to either a fixed-length character or a fixed-length graphic string, depending on the value of the nchar_mapping database configuration parameter, which also defines the string units.

[NATIONAL CHARACTER VARYING | NATIONAL CHAR VARYING | NCHAR VARYING | NVARCHAR](integer)

A varying-length string of the specified maximum length.

The NATIONAL CHARACTER VARYING type maps to either a varying-length character or a varying-length graphic string, depending on the value of the nchar_mapping database configuration parameter, which also defines the string units.

[NATIONAL CHARACTER LARGE OBJECT | NCHAR LARGE OBJECT | NCLOB](integer [K])

A large object string of the specified maximum length. The default maximum length is 16,383 double bytes.

This data type maps to either a character large object (CLOB) or a double-byte character large object (DBCLOB), depending on the current value of the nchar_mapping database configuration parameter, which also defines the string units. See the description of the CLOB or DBCLOB parameter (whichever applies) for information about possible values for the length integer and how to use a K (kilo) multiplier.

BINARY(integer)

A fixed-length binary string of the specified length, which must be in the range 1 ‑ 255 bytes. The default length is 1.

[VARBINARY | BINARY VARYING](integer)

A varying-length binary string of the specified maximum length, which must be in the range 1 ‑ 32672 bytes.

[BLOB | BINARY LARGE OBJECT](integer [K])

A binary large object string with a maximum length of the specified number of code units. The default maximum length is 65,535 bytes.

If you want to multiply the length integer by 1024, specify a K (kilo) multiplier.

Regardless of whether you use a K multiplier, the resulting length is limited by the maximum length of a BLOB column in an external table, which is 65,535 bytes. Note that 64K exceeds the maximum length by one, and so is not allowed.
Any number of spaces (including zero spaces) are allowed between the data type and the length specification or between the length integer and the K multiplier. For example, the following specifications are all equivalent and valid:
```
BLOB(50K)
BLOB(50 K)
BLOB (50   K)
```
The K multiplier can be specified in either uppercase or lowercase.

DATE

A date.

TIME

A time.

TIMESTAMP(integer) or TIMESTAMP

A timestamp. The integer specifies the number of decimal places for fractions of seconds, from 0 (seconds) to 12 (picoseconds). The default is 6 (microseconds).

BOOLEAN

A Boolean value.

LIKE table-name1 or view-name or nickname

Specifies that the columns of the table have the same name and description as the columns of the specified table (table-name1), view (view-name), or nickname (nickname). The specified table, view, or nickname must either exist in the catalog or must be a declared temporary table. A typed table or typed view cannot be specified (SQLSTATE 428EC).

The use of LIKE is an implicit definition of n columns, where n is the number of columns in the identified table (including implicitly hidden columns), view, or nickname. A column of the new table that corresponds to an implicitly hidden column in the existing table will also be defined as implicitly hidden. The implicit definition depends on what is specified after LIKE:

If a table is specified, then the implicit definition includes the column name, data type, hidden attribute, and nullability characteristic of each of the columns of that table. If EXCLUDING COLUMN DEFAULTS is not specified, then the column default is also included.
If a view is specified, then the implicit definition includes the column name, data type, and nullability characteristic of each of the result columns of the fullselect defined in that view. The data types of the view columns must be data types that are valid for columns of a table.
If a nickname is specified, then the implicit definition includes the column name, data type, and nullability characteristic of each column of that nickname.
If a protected table is specified, the new table inherits the same security policy and protected columns as the identified table.
If a table is specified and if that table contains a row-begin column, row-end column, or transaction-start-ID column, the corresponding column of the new table inherits only the data type of the source column. The new column is not considered a generated column.
If a table that includes a period is specified, the new table does not inherit the period definition.
If a system-period temporal table is specified, the new table is not a system-period temporal table.
If a random distribution table that uses the random by generation method is specified, and if the new table that is being created does not share the same table distribution, the RANDOM_DISTRIBUTION_KEY column that is used to generate the random distribution values is not included.

Column default and identity column attributes can be included or excluded, based on the copy-attributes clauses. The implicit definition does not include any other attributes of the identified table, view, or nickname. Consequently, the new table does not have any primary key, unique constraints, foreign key constraints, referential integrity constraints, triggers, indexes, ORGANIZE BY specification, or PARTITIONING KEY specification. The table is created in the table space implicitly or explicitly specified by the IN clause, and the table has any other optional clause only if the optional clause is specified.

When a table is identified in the LIKE clause and that table contains a ROW CHANGE TIMESTAMP column, the corresponding column of the new table inherits only the data type of the ROW CHANGE TIMESTAMP column. The new column is not considered to be a generated column.

If a table is specified, and if row or column level access control is activated for that table, it is not inherited by the new table.

option

The following options control the loading of data to or retrieval of data from an external-table file. The value of each option is a text string and is not case-sensitive.

BINARYNUMERICS

Specifies that numeric data is being input in binary format as opposed to ASCII character representation. Does not affect other, non-numeric data. Supported only with fixed-length format data in text format.

TRUE or ON

Numeric data (with the exception of DECIMAL type) must be input in binary format.

Supported numeric types are SMALLINT, INT, BIGINT, REAL, and DOUBLE. No conversions between datatypes are done, with the exception of conversions done between integer types (SMALLINT, INT, and BIGINT).

Data lengths must match their target column definitions (with the exception of integer conversions; e.g. 2, 4, and 8 bytes are all valid zone lengths for a BIGINT column).

All numeric input in the data file is assumed to be big-endian regardless of the platform on which the load operation is running.

REAL and DOUBLE values must be in IEEE Floating Point format.

FALSE or OFF: Numeric data must be input in standard ASCII character representation.

BOOLSTYLE or BOOLEAN_STYLE

During a load operation, all Boolean values must use the same style. This option specifies the Boolean style that is to be used:

1_0 (this is the default)
T_F
Y_N
YES_NO
TRUE_FALSE

CARDINALITY

Non-zero positive integer value to override the estimation of the expected number of returned rows.

CCSID

The coded character set identifier (CCSID) of the input data file. The value can be any valid integer value from the CCSID specification. There is no default value. The CCSID and ENCODING options are mutually exclusive when the value of the ENCODING option is UTF8, LATIN9, or INTERNAL.

Which styles are used for dates and times depends on whether a CCSID is specified:

When a CCSID is specified, and when DATESTYLE, TIMESTYLE, DATEDELIM, or TIMEDELIM are not specified, the values or defaults for DATE_FORMAT, TIME_FORMAT, and TIMESTAMP_FORMAT are used.
When a CCSID is not specified, and when TIMESTAMP_FORMAT, DATE_FORMAT or TIME_FORMAT are not specified, the values or defaults for DATESTYLE, TIMESTYLE, DATEDELIM, and TIMEDELIM are used.

COMPRESS

For a load operation or an unload operation, whether the data file data is compressed:

GZIP: The data file data is compressed by using the GZIP compression algorithm.
NO: The data file data is not compressed. This is the default.
LZ4: The data file data is compressed by using the LZ4 compression algorithm.

The COMPRESS option cannot be specified if the value of the REMOTESOURCE option is GZIP or LZ4.

CRINSTRING

How to interpret an unescaped carriage-return (CR) or carriage-return line-feed (CRLF) character:

TRUE or ON: An unescaped CR character is interpreted as data, not as a record delimiter.
FALSE or OFF: An unescaped CR is interpreted as a record delimiter. This is the default.; Use fixed-length format for CRINSTRING only if the value of the CtrlChars option is set to OFF.

CTRLCHARS

Whether to allow an ASCII value 1 - 31 in a CHAR or VARCHAR field. Any NULL, CR, or LF characters must be escaped. Allowed values are:

TRUE or ON

An ASCII value 1 - 31 in a CHAR or VARCHAR field is allowed.

If fixed-length format is enabled, all unescaped characters are allowed.

FALSE or OFF

An ASCII value 1 - 31 in a CHAR or VARCHAR field is not allowed. This is the default.

If fixed-length format is enabled, unescaped characters cause an error.

Exceptions for fixed-length format:

\t, \n
\r if the CRinString option is set to ON

DATAOBJECT or FILE_NAME

The fully-qualified name of the file (or any medium that can be treated as a file) that is to contain the external table to be created. This option is mandatory when the name of the file is not specified immediately after the table name; otherwise, it is not allowed.

When both the REMOTESOURCE option is set to LOCAL (this is its default value) and the extbl_strict_io configuration parameter is set to NO, the path to the external table file is an absolute path and must be one of the paths specified by the extbl_location configuration parameter. Otherwise, the path to the external table file is relative to the path that is specified by the extbl_location configuration parameter followed by the authorization ID of the table definer. For example, if extbl_location is set to /home/xyz and the authorization ID of the table definer is user1, the path to the external table file is relative to /home/xyz/user1/.

The file name must be a valid UTF-8 string.
For a load operation, the following conditions apply:

The file must already exist.
Required permissions:
- If the external table is a named external table, the owner must have read permission for the file and write permission for the LOGDIR directory.
- If the external table is a transient external table, the authorization ID of the statement must have read permission for the file and write permission for the LOGDIR directory.

For an unload operation, the following conditions apply:

If the file exists, it is overwritten.
Required permissions:
- If the external table is a named external table, the owner must have read and write permission for the directory of this file.
- If the external table is transient, the authorization ID of the statement must have read and write permission for the directory of this file.

DATEDELIM

The delimiter character that separates the components of a date, according to the format specified by the DATESTYLE option. If you specify an empty string, there is no delimiter between the date components, and days and months must be specified as two-digit numbers. When DATESTYLE is set to MONDY or MONDY2, the default DATEDELIM value is a space. The TIMESTAMP_FORMAT and DATEDELIM options are mutually exclusive.

DATESTYLE

How to interpret the date format. For days or months in the range 1 ‑ 9, use 1 digit, 2 digits, or a space followed by a single digit. When the DATEDELIM option is a space, you can specify a comma after the day. An error occurs if you:

Specify zero for a day, month, or year
Specify a nonexistent date (for example, 32 August or 30 February)

The DATESTYLE option and the DATE_FORMAT or TIMESTAMP_FORMAT option are mutually exclusive.

Table 1. Possible values for the DateStyle option. The example shows how the date 21 March 2014 would be represented when DATEDELIM is set to '-'.
Value	Description	Example
YMD	4-digit year, 2-digit month, 2-digit day. This is the default.	2014-03-21
DMY	2-digit day, 2-digit month, 4-digit year.	21-03-2014
MDY	2-digit month, 2-digit day, 4-digit year.	03-21-2014
MONDY	3-character month, 2-digit day, 4-digit year.	Mar 21 2014
DMONY	2-digit day, 3-character month, 4-digit year.	21-Mar-2014
Y2MD	2-digit year, 2-digit month, 2-digit day. Not supported for unloads.	14-03-21
DMY2	2-digit day, 2-digit month, 2-digit year. Not supported for unloads.	21-03-14
MDY2	2-digit month, 2-digit day, 2-digit year. Not supported for unloads.	03-21-14
MONDY2	3-character month, 2-digit day, 2-digit year. Not supported for unloads.	Mar 21 14
DMONY2	2-digit day, 3-character month, 2-digit year. Not supported for unloads.	21-Mar-14

DATETIMEDELIM

A single-byte character that separates the date component and time component of the timestamp data type.

The default delimiter is a space (' ').

Between the date component and the time component, a delimiter is not required. For example, both of the following values are valid:

2010-10-10 10:10:10
2010-10-1010:10:10

DATE_FORMAT

The format of the date field in the data file. The value can be any of the date format strings that are accepted by the TIMESTAMP_FORMAT scalar function. The default is YYYY-MM-DD. The DATE_FORMAT option and the DATEDELIM or DATESTYLE option are mutually exclusive.

DECIMALDELIM or DECIMAL_CHARACTER

The decimal delimiter for the data types FLOAT, DOUBLE, TIME, and TIMESTAMP. Allowed values are ',' and '.'.

DECPLUSBLANK

Specifies how the positive decimal value is represented during the unload operation.

You can specify one of the following values for this option:

NONE: This is the default.; This value represents a positive decimal value without a sign.
PLUS: Specifies that a positive decimal value is represented by a '+' sign.
BLANK: Specifies that a positive decimal value is represented by a blank sign instead of a '+' sign.

If you specify the DECPLUSBLANK option for the load operation, the output is not affected.

Examples for a table test with ddl (decimal (6,2)) and all the available values for the DECPLUSBLANK option:

1234
-4563

Create external table '/tmp/unload.txt' using (DECPLUSBLANK NONE) as select * from test:
```
unload.txt
1234.00
-4563.00
```
Create external table '/tmp/unload.txt' using (DECPLUSBLANK PLUS) as select * from test:
```
unload.txt
+1234.00
-4563.00
```
Create external table '/tmp/unload.txt' using (DECPLUSBLANK BLANK) as select * from test:
```
unload.txt
 1234.00
-4563.00
```

DELIMITER or COLUMN_DELIMITER

The character that is used to delimit the fields of an input or output record. The default is a vertical bar ('|').

You can specify a character in the 7-bit ASCII range (decimal 1 ‑ 127) in any of the following ways:

As a single character (for example DELIMITER ';')
By specifying its corresponding ASCII decimal value (for example, DELIMITER 59 or DELIMITER '59')
By specifying its corresponding ASCII hex value (for example, DELIMITER x'3B')

The decimal range 128 - 255 is supported only with the ISO character set input file by specifying its corresponding ASCII decimal value or hex value. If the input file is in the UTF8 character set, this delimiter value range is not supported.

ENCODING

The type of data in the file:

UTF8: The file uses UTF8 encoding for all character data.
LATIN9: The file uses LATIN9 encoding for all character data.
INTERNAL: The file uses a mixture of both UTF8 and LATIN9 encoding, or you are unsure which type of encoding is used. The system checks the data and encodes the data as needed. Because this checking of the data reduces overall performance, use this value only when necessary. This is the default.
INTERNAL: This is the default option.; The file uses a mixture of both UTF8 and LATIN9 encoding.; Files are encoded in Netezza internal format and therefore should be used only for files that are extracted from Netezza by using ENCODING (INTERNAL).; When the target column is CODEUINTS32 (NCHAR/VARCHAR), the input data is validated to be valid UTF-8 characters.; This option is supported only in a Unicode database.
DBCS_GRAPHIC: This value is allowed only for a load operation, not an unload operation. If this value is specified, the CCSID option must also be specified. During the load operation, fields of type GRAPHIC or VARGRAPHIC are encoded using the double-byte character set of the specified CCSID; fields of all other types are encoded using the mixed-byte character set of the specified CCSID.
Note: ENCODING cannot be set to DBCS_GRAPHIC for a DEL file that was created by the EXPORT utility, because such DEL files are encoded using a single character set.

The CCSID and ENCODING options are mutually exclusive when the value of the ENCODING option is UTF8, LATIN9, or INTERNAL.

ESCAPECHAR or ESCAPE_CHARACTER

Which character is to be regarded as an escape character. An escape character indicates that the character that follows it, which would otherwise be treated as a field-delimiter character or end-of-row sequence character, is instead treated as part of the value in the field. The escape character is ignored for graphic-string data. There is no default.

FILLRECORD

For a load operation, the field of a record are loaded into the columns of a target table from left to right. This option specifies whether an input record can contain fewer fields than there are columns defined for the target table:

TRUE or ON: An input line can contain fewer fields, provided that all columns for which a value is missing are nullable. Missing values are set to NULL. If one or more columns for which a value is missing is not nullable, the record is rejected.
FALSE or OFF: An input line that contains fewer columns is rejected. This is the default.

FORMAT or FILE_FORMAT

The data format of the source file:

TEXT

The data to be loaded or unloaded is in text-delimited format. This is the default.

INTERNAL

The data is in an internal format used by Netezza Platform Software (NPS). This value is valid only when loading data from a file to the database, not when unloading data to a file. If this value is specified for the FORMAT option, the following options, and only these options, must also be specified:

DATAOBJECT or FILE_NAME.
REMOTESOURCE, SWIFT or S3. If the REMOTESOURCE option is specified, it must have the value LOCAL or YES.
COMPRESS. This must be set to GZIP.

BINARY

The data is in an internal format that is used by Db2.

FIXED

The data is in fixed-length format.

Fixed-length format is supported only for load operations.

Files in fixed-length format use ordinal positions, which are offsets, to identify where fields are within the record.

Note:

The following external table options are not supported for the fixed-length format:
- Delimiter
- Encoding
- EscapeChar
- FillRecord
- IgnoreZero
- IncludeZeroSeconds
- Lfinstring
- QuotedValue
- RequireQuotes
- TimeExtraZeros
- TruncString
There are no field delimiters.
An end-of-record delimiter is required even for the last record.
Usually, data in fixed-length format files does not have decimal delimiters or time delimiters because delimiters are not necessary and use space.
The locations of delimiters are fixed and specified in the layout definition because the fields are fixed in size. This definition comes with the fixed-length format data file.
To load fixed-format data into the database, you must define the target data type for the fields and the locations within the record.
You do not have to load all fields in a fixed-length format file. You can skip them by using the filler specification.
The order of fields in the data file must match the order in the target table. Alternatively, you must create an external table definition that specifies the order of the fields as database columns.
You can change the field order by using an external table definition in combination with an insert-select statement.
Typically, unknown values or null values are represented by known data patterns that are classified as representing null.

The following parameters apply when the FORMAT option of the external table is set to FIXED:

LAYOUT

Mandatory.

A layout is an ordered collection of zone or field definitions. It defines the location of the fields of the input record.

Specify comma-separated zone definitions within braces { }.

Each zone definition is made up of mutually exclusive, non-overlapping clauses.

No default value.

The clauses must be in the following order, even if some of them are optional and can be empty:

USE TYPE

Optional.

Indicates whether a zone is a normal data zone, a reference zone, or a filler zone.

For data zones, this value is omitted.

A reference zone is specified as REF. This specification implies that the zone is referred by another zone for zone length or null values.

A filler zone is specified as FILLER. Filler zones specify that the bytes or characters are treated as fillers in a data file.

NAME

Optional.

The name of the zone.

Currently, this definition is not used. Typically, it is provided to identify the field.

TYPE

Optional.

Defines the type of the zone.

If you do not specify the type, it gets the default value of the corresponding type of a table column.

Valid values are as follows:

CHAR
VARCHAR
NCHAR
NVARCHAR
SMALLINT
BIGINT
BINARY
VARBINARY
GRAPHIC
VARGRAPHIC
FLOAT
DOUBLE
DEC, NUM, or NUMERIC
DECFLOAT
BOOLEAN
DATE
TIME
TIMESTAMP

STYLE

Optional.

Defines the zone representation.

The default representation is based on zone type and format option.

All other styles are valid only for their corresponding non-textual zone types.

Valid values are as follows:

INTERNAL
Valid only for textual zones, that is, char, varchar, nchar, and nvarchar.
DECIMAL
Valid for integer and numeric zone types.
DECIMALDELIM <'decimal-delim'>
Valid for numeric, float, double, and time style (time and timestamp) zone types.
FLOATING
Valid for float or double zone types.
EXPONENTIAL
Valid for float or double zone types.
YMD <'date-delim'>
Valid for date zones, including other date styles that are supported for the DateStyle and DateDelim external table options.
12Hour <'time-delim'>
Valid for time zones, including other time styles that are supported for the TimeStyle and TimeDelim external table options.
24Hour <'time-delim'>
Valid for time zones, including other time styles that are supported for the TimeStyle and TimeDelim external table options.
YMD <'date-delim'> 24Hour <'time-delim'>
Valid for timestamp zones, including other combinations of date and time styles that are supported for the DateStyle, DateDelim, TimeStyle, and TimeDelim external table options.
TRUE_FALSE, Y_N, 1_0
Valid for boolean zones, including other boolean styles that are supported for the BoolStyle external table option. The style must be in accordance with the format.

LENGTH

Optional.

Specified as bytes or characters followed by the number or the internal reference to the reference zone.

Number of bytes or characters as provided or as referenced by the reference zone.

For reference zones or filler zones, you cannot use internal references. For reference zones, the number of bytes specifies how the data is read from the data file to get the referred value.

You can use plus signs and minus signs as follows:

BYTES @2 + 10
BYTES @2 - 10

NULLIF

Optional.

Definition of the zone NULLESS attribute.

Specifies a known data pattern within the field that, when it is present, signifies that the field is null.

The length is equal to or less than the column width. The maximum length is 39 bytes.

The NULLVALUE option is not supported if the NULLIF clause is used on any zone within a table.

You can use the following types of references:

Internal reference to numeric zones.

Exact match of the numeric value.

External reference.

Exact match of the specified value.

Isolated reference.

Leading spaces and trailing spaces are to be skipped with the exact string match.

Nulls are detailed in the following examples:

Table 2. Layout example
Use type	Name	Type	Style	Length	Nullif
NA	f1	int4	DECIMAL	Bytes 10	Nullif & = 0
NA	f2	date	YMD	Bytes 10	Nullif &= '2000-10-10'
NA	f3	char(20)	INTERNAL	Chars 10	Nullif &&='ab'
Filler	f4	char(10)	NA	Bytes 10	NA

Remember:

The referred zone in a length clause must be of type integer.
You must not specify the NULLIF option for reference zones or filler zones.
Reference zones and filler zones cannot have variable lengths.
Variable length zones cannot refer themselves.
Define the referred zone in a length clause as REF.
Length-clause references can use only the INTERNAL (@) reference. External or isolated references are not supported.
Between the referred zone of a length clause and the zone itself, reference zones are not allowed.
If the reference type is INTERNAL (@), the NULLIF clause cannot refer to itself.
If the column is non-nullable, it may not have the NULLIF clause.
Variable length is allowed only for the string type of zones.
The NULLIF clause can refer only to REF zones or the zones themselves.
Between the zone that is referred by the NULLIF clause and the zone itself, other referred zones are not allowed, except for the zone that is referred in the length clause.
The record length can point to zone 1 only for reference.
A REF must have a zone that refers it.
The NULLIF clause can have external references only if the REF zone is non-integer.

Recordlength

Specifies the length of the entire record, where null-indicator bytes are included if they exist, and the record delimiter is excluded if it exists.

The value is a constant integer.

The value can also be an internal reference to the reference zone in the layout definition.

There is no default value.

You can use plus signs and minus signs for an internal reference as follows:

RECORDLENGTH @1 + 10
RECORDLENGTH @1 - 10

IGNOREZERO or TRIM_NULLS

Specifies whether the binary value zero in CHAR fields and VARCHAR fields is to be discarded.

TRUE or ON: The byte value zero is ignored.
FALSE or OFF: The byte value zero is not ignored. This is the default.
KEEP: The binary value zero is accepted and allowed as part of the input field.

INCLUDEHEADER or COLUMN_NAMES

For an unload operation, whether the table column names are to be included as headers in the external-table file:

TRUE or ON: The table column names are to be included as headers.
FALSE or OFF: The table column names are not to be included as headers. This is the default.

INCLUDEZEROSECONDS

For an unload operation, whether to specify 00 as the value for seconds when no value for seconds is available:

TRUE or ON: Specify 00 as the value for seconds.
FALSE or OFF: Do not specify a value for seconds. This is the default.

INCLUDEHIDDEN

For a load operation, specify whether hidden column values are present in a data file.

The INCLUDEHIDDEN option works when you are creating an external table by using the LIKE or SAMEAS clause, and base table has hidden columns.

TRUE: A data file contains values against hidden column.
FALSE: A data file does not contain values against hidden column. This is the default. You can change the default value by using the registry variable DB2_EXTBL_INCLUDE_HIDDEN_COLS.

LFINSTRING

Specifies how to interpret an unescaped line-feed (sometimes called an LF or newline) character within string data:

TRUE or ON: An unescaped LF character is interpreted as a record delimiter only if it is in the last field of a record; otherwise, it is treated as data. To cause an LF character that is in the last field of a record to be treated as data, enclose the value of that field in single or double quotation marks.
FALSE or OFF: An unescaped LF character is interpreted as a record delimiter regardless of its position. This is the default.

This option is not supported for unload operations.

Attention: This SQL compatibility enhancement is only available in Db2 Version 11.5 Mod Pack 2 and later versions.

LOGDIR or ERROR_LOG

The directory to which the following files are written:

<database>.<schema>.<external-table-name>.<file-name>.<application-handle>.<id>.bad: A file containing rejected records (that is, records that could not be processed).
<database>.<schema>.<external-table-name>.<file-name>.<application-handle>.<id>.log: A log file.

The default is the directory to which the external-table file is written. If the length of the name that is constructed for a .bad or .log file would exceed the allowed maximum, the name of the file that contains the external table (indicated by <file-name>) is truncated so that the maximum is not exceeded.

If a .log or .bad file is generated while carrying out an operation on a partition, the name of the generated file is suffixed with a period followed by the 3-digit partition number.

MAXERRORS or MAX_ERRORS

For a load operation, the threshold for the number of rejected records at which the system stops processing and immediately rolls back the load. The default is 1 (that is, a single rejected record results in a rollback).

For fixed-length format, the following conditions apply:

The parser reports errors for each field or zone rather than one error for the row.
Multiple errors can be reported for the same row.
When the parser detects an error in a field or zone, it recovers by using the field length or zone length. It then continues from the next field or zone until the end of record is reached, or an unrecoverable error occurs, or the MaxErrors limit is reached.
Unrecoverable errors include the following errors:
- RecordLength mismatch.
- RecordDelimiter is not found.
- The RecordLength value is not valid, that is, the value is a negative value or zero.
- The zone length is not valid, that is, the value is a negative value.
- The UTF-8 initial byte is not valid.
- The UTF-8 continuation bytes are not valid.

MULTIPARTSIZEMB

When the DB2_ENABLE_COS_SDK registry variable is set to ON, Db2 remote storage communication with cloud object storage is facilitated through an embedded vendor COS SDK which allows Db2 to stream objects/files to cloud object storage in multiple parts (aka ‘multipart upload’). This parameter specifies the part size for multipart upload, in megabytes (MB), for the file being unloaded, and overrides the value specified in the MULTIPARTSIZEMB dbm config parameter. This option is available starting in Version 11.5 Modification Pack 7, in Linux (x86) environments only.

MAXROWS or MAX_ROWS

If set to a positive integer, this specifies the maximum number of records (rows) in the external table that are to be processed. If set to 0 (the default), there is no limit and all rows are processed. During a load operation, if MAXROWS is set to a positive value, after that number of rows are processed, regardless of whether some of the rows were rejected or skipped, the system ends the load operation and commits all inserted records.

MERIDIANDELIM

A single-byte character that separates the seconds component from the AM token or PM token in the 12-hour delimited and undelimited formats of a time value.

The default delimiter is a space (' ').

Between the seconds component and the AM token or PM token, a delimiter is not required. For example, both of the following values are valid:

1:02:46.12345 AM
1:02:46.12345AM

NOLOG

Specifies whether the .log file for the external table is created.

This option does not apply to .bad files.

Possible values are:

TRUE: No .log file is created.
FALSE: The .log file is created.; This is the default.

NULLVALUE or NULL_VALUE

The UTF-8 string of at most 4 bytes that is to be used to indicate a null value. The default is 'NULL'.

PARTITION

If the Database Partitioning Feature (DPF) is enabled for the database, an external table can be partitioned into several files. The name of each of the data files that comprise an external table are suffixed with a period followed by a 3-digit number from 000 to 999 that indicates the number of the partition. For example, if an external table with the name dataFile.txt is divided into three partitions, the files that comprise it have the names dataFile.txt.000, dataFile.txt.001, and dataFile.txt.002. These files must be accessible from all members.

For a partitioned external table, the PARTITION option specifies to which partition or partitions the statement applies:

PARTITION ALL: The statement applies to all of the partitions that comprise the external table. For an unload operation, this is the only value that is allowed.
PARTITION (n TO n): The statement applies to all of the partitions in the specified range, for example, PARTITION (54 TO 62).
PARTITION (n,n,…): The statement applies only to the specified partition or partitions, for example, PARTITION (53) or PARTITION (51,57,58). If more than one partition number is specified, they must be in ascending order (sqlcode SQL0263N with SQLSTATE=42615) and there can be no duplicates (sqlcode SQL0265N with SQLSTATE=42615).

If a .log or .bad file is generated while carrying out an operation on a partitioned external table, the name of the generated file is suffixed with a period followed by the 3-digit partition number.

If the DPF is enabled and the PARTITION option is not specified, the external table is treated as single-partitioned table on the coordinator member. The names of the external table file and the .log and .bad files are not suffixed with a partition number.

If the DPF is not enabled, the PARTITION option can be specified, but only with the value ALL, (0 to 0), or (0) (SQL0644N). It will have no effect.

The REMOTESOURCE and PARTITION options are mutually exclusive.

QUOTEDNULL

For a load operation, how to interpret a value that is enclosed in single or double quotation marks and that matches the null value specified by the NULLVALUE or NULL_VALUE option (for example, "NULL" or 'NULL'):

TRUE or ON: The value is interpreted as a null value. This is the default.
FALSE or OFF: The value is interpreted as a character string.

QUOTEDVALUE or STRING_DELIMITER

Whether data values are enclosed in quotation marks:

SINGLE or YES: Data values are enclosed in single quotation marks (').
DOUBLE: Data values are enclosed in double quotation marks (").
NO: Data values are not enclosed in quotation marks. This is the default.

RECORDDELIM or RECORD_DELIMITER

The string literal that is to be interpreted as a row (record) delimiter. The default is '\n'.

When CRINSTRING is set to TRUE, RECORDDELIM cannot contain a CR ('\r') character - with the sole exception of a CRLF ('\r\n') delimiter allowed with CRINSTRING for text format only.

REMOTESOURCE

Where the external-table file resides and, if it resides on a remote system, whether the file data is to be compressed:

LOCAL: The file resides on the local server, that is, the system that hosts the database. This is the default.
YES: The file resides on a system other than the local server. For example, specify YES if a client system is connected to the database and the file resides on that system. File data is not compressed before it is transferred.
GZIP: Similar to YES, except that the file data is compressed using the GZIP compression algorithm before the data is transferred, and is decompressed after it is received. This improves overall performance when a large amount of compressible data is being transferred.
LZ4: Similar to YES, except that the file data is compressed using the LZ4 compression algorithm before the data is transferred, and is decompressed after it is received. This improves overall performance when a large amount of compressible data is being transferred.

The REMOTESOURCE, SWIFT, and S3 options are mutually exclusive. The REMOTESOURCE and PARTITION options are mutually exclusive. The COMPRESS option cannot be specified if the value of the REMOTESOURCE option is GZIP or LZ4.

REQUIREQUOTES

Whether quotation marks are mandatory:

TRUE or ON: Quotation marks are mandatory. The QUOTEDVALUE option must be set to YES, SINGLE, or DOUBLE.
FALSE or OFF: Quotation marks are not mandatory. This is the default.

SKIPROWS or SKIP_ROWS

For a load operation, the number of rows to skip before beginning to load the data. The default is 0. Because skipped rows are processed before they are skipped, a skipped row is still capable of causing a processing error.

SOCKETBUFSIZE

The size, in bytes, of the chunks of data that are read from the source file. Valid values range from 64 KB ‑ 800 MB. If you specify a value outside this range, the value is set to the nearest valid value. The default is 8 MB.

STRICTNUMERIC

For a load operation, how to treat a value that is to be inserted into a DECIMAL field when its scale exceeds that defined for the field:

TRUE or ON

The row containing the value to be inserted is rejected. For example, if any of the following values were to be loaded into a DECIMAL(5,3) field, the row containing that value would be rejected:

12.666666666
-98.34496862785
0.00089

FALSE or OFF

The row containing the value to be inserted is accepted, and the portion of the decimal fraction that exceeds the scale defined for the field is truncated. This is the default. For example, the values in the previous example would be converted to:

12.666
-98.344
0.000

SWIFT

Specifies that the source data file is located in a Swift object store. The REMOTESOURCE, SWIFT, and S3 options are mutually exclusive. Use the DATAOBJECT option to specify the file name.

Syntax:

SWIFT (endpoint, authKey1, authKey2, bucket)

where:

endpoint

A character string that specifies the URL of the SWIFT web service.

authKey1

A character string that specifies the access ID or username of the Swift open stack account used to validate the user.

authKey2

A character string that specifies the password of the Swift open stack account used to validate the user.

bucket

The name of the Swift open stack container (bucket) in which the file resides.

Example:

CREATE EXTERNAL TABLE exttab1(a int) using 
  (dataobject 'datafile1.dat' 
   swift('https://dal05.objectstorage.softlayer.net/auth/v1.0/', 
    'XXXOS123456-2:xxx123456',
    'b207c6e974020737d92174esdf6d5be9382aa4c335945a14eaa9172c70f8df16', 
    'my_dev'
   )
  )

S3

Specifies that the source data file is located in an S3 compatible object store. The REMOTESOURCE, SWIFT, and S3 options are mutually exclusive. Use the DATAOBJECT option to specify the file name.

Syntax:

S3 (endpoint, authKey1, authKey2, bucket)

where:

endpoint

A character string that specifies the URL of the S3 compatible web service.

authKey1

A character string that specifies the S3 access key ID of the access keys used to validate the user and all user actions. For IBM Cloud Object Storage, this is the access key ID from the HMAC credentials.

authKey2

A character string that specifies the S3 secret key of the access keys that are used to validate the user and all user actions. For IBM Cloud Object Storage, this is the secret access key from the HMAC credentials.

bucket

The name of the S3 bucket in which the file resides.

Note: For IBM Cloud Object Storage, to create HMAC credentials, when creating new service credentials, specify {"HMAC:true} in the Add Inline Configuration Parameters field.

Example using AWS S3:

CREATE EXTERNAL TABLE exttab2(a int) using 
  (dataobject 'datafile2.dat' 
   s3('s3.amazonaws.com', 
    'XXXOS123456-2:xxx123456',
    'bs07c6e974040737d92174e5e96d5be9382aa4c33xxx5a14eaa9172c70f8df16', 
    'my_dev'
   )
  )

Example using IBM Cloud Object Storage:

CREATE EXTERNAL TABLE exttab2(a int) using 
  (dataobject 'datafile2.dat' 
   s3('s3-api.us-geo.objectstorage.softlayer.net',
    '1a2bkXXXsaddntLo0xX0',
    'XXxxiEPjJ7T7WBUz74E6abcdABCDE8Q7RgU4gYY9',
    'my_dev'
   )
  )

AZURE

Specifies that the source data file is located in Microsoft Azure Blob Storage. The REMOTESOURCE, SWIFT, S3, and AZURE options are mutually exclusive. Use the DATAOBJECT option to specify the file name. Syntax:

Syntax:

AZURE (endpoint, authKey1, authKey2, bucket)

where:

endpoint

A character string that specifies the URL of the AZURE web service.

authKey1

A character string that specifies the access ID or username of the Azure Blob Storage account used to validate the user.

authKey2

A character string that specifies the access key of the Azure Blob Storage account used to validate the user.

bucket

The name of the Azure Blob Storage container (bucket) in which the file resides.

Example:

CREATE EXTERNAL TABLE exttab1(a int) using 
  (dataobject 'datafile1.dat' 
   azure('https://my_account.blob.core.windows.net', 
    'my_account',
    'lW+oHjmZecPS++IKgThAHlMUOaFUA5C6Z2RlFmc9JPpK34RO/ZIOywzILxJnzGPHz6d/yDrcQDAwH5wySbOZMQ==', 
    'my_bucket'
   )
  )

Example using IBM Cloud Object Storage:

CREATE EXTERNAL TABLE exttab2(a int) using 
  (dataobject 'datafile2.dat' 
   s3('s3-api.us-geo.objectstorage.softlayer.net',
    '1a2bkXXXsaddntLo0xX0',
    'XXxxiEPjJ7T7WBUz74E6abcdABCDE8Q7RgU4gYY9',
    'my_dev'
   )
  )

TIMEDELIM

The single-byte character that is to separate time components (hours, minutes, and seconds). The default is ':'. If TIMEDELIM is set to an empty string, hours, minutes, and seconds must all be specified as two-digit numbers. The TIMESTAMP_FORMAT and TIMEDELIM options are mutually exclusive.

TIMEROUNDNANOS or TIMEEXTRAZEROS

Note: This option applies only to TIMESTAMP columns.

Specifies whether records that contain time values whose non-zero precision exceeds six decimal places are to be accepted (and rounded to the nearest microsecond) or rejected:

TRUE: All records are accepted. Their time values are rounded to the nearest microsecond.
FALSE: Only those records that can be stored without a loss of precision (for example, '08.15.32.123' or '08.15.32.12345600000', but not '08.15.32.1234567') are accepted. All other records are rejected. This is the default.

TIMESTYLE

The time format that is to be used in the data file:

24HOUR: 24-hour format, for example 23:55. This is the default.
12HOUR: 12-hour format, for example 11:55 PM. An AM or PM token can be preceded by a single space and is not case-sensitive.

The TIMESTYLE option and the TIME_FORMAT or TIMESTAMP_FORMAT option are mutually exclusive.

TIMESTAMP_FORMAT

The format of the timestamp field in the data file. The value can be any of the format strings that are accepted by the TIMESTAMP_FORMAT scalar function. The default is 'YYYY-MM-DD HH.MI.SS'. The TIMESTAMP_FORMAT option and the TIMEDELIM, DATEDELIM, TIMESTYLE, or DATESTYLE option are mutually exclusive.

TIME_FORMAT

The format of the time field in the data file. The value can be any of the time format strings that are accepted by the TIMESTAMP_FORMAT scalar function. The default is HH.MI.SS. The TIME_FORMAT option and a TIMEDELIM or TIMESTYLE option are mutually exclusive.

TRIMBLANKS

How an external table is to treat leading or trailing blanks (that is, leading or trailing space characters) in a string:

LEADING: All leading blanks (that is, blanks that precede the first non-blank character) are removed.
TRAILING: All trailing blanks (that is, blanks that follow the last non-blank character) are removed.
BOTH: All leading and trailing blanks are removed.
NONE: No blanks are removed. This is the default.

When reading data from a file and loading it into an external table:

If QUOTEDVALUE or STRING_DELIMITER is specified with the values SINGLE, YES, or DOUBLE, leading and trailing blanks within quotation marks are not removed.
For CHAR and NCHAR data, the values TRAILING or BOTH will not have any effect on trailing blanks, because the string will automatically be re-padded with trailing blanks.

TRUNCSTRING or TRUNCATE_STRING

How the system processes a CHAR or VARCHAR string that exceeds its declared storage size:

TRUE: The system truncates a string value that exceeds its declared storage size.
FALSE: The system returns an error when a string value exceeds its declared storage size. This is the default.

Y2BASE

The year that is the beginning of the 100-year range. Years that are specified as 2 digits are counted from this year. The default is 2000. This option must be specified when DATESTYLE is set to Y2MD, MDY2, DMY2, MONDY2 or DMONY2.

Table 3. Options
Option	Default	Applies to Load	Applies to Unload
Azure	(no default)	Y	Y
BOOLSTYLE or BOOLEAN_STYLE	1_0	Y	Y
CARDINALITY	(no default)	Y	Y
CCSID	(no default)	Y	Y
COMPRESS	NO	Y	Y
CRINSTRING	FALSE	Y	Y
CTRLCHARS	FALSE	Y	N
DATAOBJECT or FILE_NAME	(no default)	Y	Y
DATEDELIM	'-'	Y	Y
DATETIMEDELIM	A space (' ')	Y	Y
DATESTYLE	YMD	Y	Y
DATE_FORMAT	YYYY-MM-DD	Y	Y
DECIMALDELIM or DECIMAL_CHARACTER	'.'	Y	Y
DELIMITER	'\|'	Y	Y
ENCODING	INTERNAL	Y	Y¹
ESCAPECHAR or ESCAPE_CHARACTER	(no default)	Y	Y
FILLRECORD	FALSE	Y	N
FORMAT or FILE_FORMAT	TEXT	Y	Y
IGNOREZERO or TRIM_NULLS	FALSE	Y	N
INCLUDEHEADER or COLUMN_NAMES	FALSE	N	Y
INCLUDEZEROSECONDS	FALSE	Y	Y
INCLUDEHIDDEN	FALSE	Y	N
LFINSTRING	FALSE	Y	N
LOGDIR or ERROR_LOG	target directory of external-table file	Y	N
MULTIPARTSIZEMB	value specified by the MULTIPARTSIZEMB dbm config parameter.	Y	N
MAXERRORS or MAX_ERRORS	1	Y	N
MAXROWS or MAX_ROWS	0	Y	N
MERIDIANDELIM	A space (' ')	Y	Y
NOLOG	FALSE	Y	Y
NULLVALUE or NULL_VALUE	'NULL'	Y	Y
PARTITION	(no default)	Y	Y
QUOTEDNULL	TRUE	Y	N
QUOTEDVALUE	NO	Y	N
RECORDDELIM or RECORD_DELIMITER	'\n'	Y	N
REMOTESOURCE	LOCAL	Y	Y
REQUIREQUOTES	FALSE	Y	N
SKIPROWS or SKIP_ROWS	0	Y	N
SOCKETBUFSIZE	8 MB	Y	Y
STRICTNUMERIC	FALSE	Y	N
SWIFT	(no default)	Y	Y
S3	(no default)	Y	Y
TIMEDELIM	':'	Y	Y
TIMEROUNDNANOS or TIMEEXTRAZEROS	FALSE	Y	N
TIMESTAMP_FORMAT	'YYYY-MM-DD HH.MI.SS'	Y	Y
TIMESTYLE	24HOUR	Y	Y
TIME_FORMAT	HH.MI.SS	Y	Y
TRIMBLANKS	NONE	Y	Y
TRUNCSTRING or TRUNCATE_STRING	FALSE	Y	N
Y2BASE	2000	Y	N
¹ Only for the values INTERNAL, UTF8, and LATIN9.

joined-table

A joined-table specifies an intermediate result set that is the result of one or more join operations. For more information, see joined-table.

period-specification

A period-specification identifies an intermediate result table consisting of the rows of the referenced table where the period matches the specification. A period-specification can be specified following the name of a temporal table or the name of a view. The same period name must not be specified more than once for the same table reference (SQLSTATE 428HY). The rows of the table reference are derived by application of the period specifications.

If the table is a system-period temporal table and a period-specification for the SYSTEM_TIME period is not specified, the table reference includes all current rows and does not include any historical rows of the table. If the table is an application-period temporal table and a period-specification for the BUSINESS_TIME period is not specified, the table reference includes all rows of the table. If the table is a bitemporal table and a period-specification is not specified for both SYSTEM_TIME and BUSINESS_TIME, the table reference includes all current rows of the table and does not include any historical rows of the table.

If the table reference is a single-view-reference, the rows of the view reference are derived by application of the period specifications to all of the temporal tables accessed when computing the result table of the view. If the view does not access any temporal table, then the period-specification has no effect on the result table of the view. If period-specification is used, the view definition or any view definitions referenced when computing the result table of the view must not include any references to compiled SQL functions or external functions with a data access indication other than NO SQL (SQLSTATE 428HY).

If the CURRENT TEMPORAL SYSTEM_TIME special register is set to a value other than the null value, then a period-specification that references SYSTEM_TIME must not be specified for the table reference or view reference, unless the value in effect for the SYSTIMESENSITIVE bind option is NO (SQLSTATE 428HY).

If the CURRENT TEMPORAL BUSINESS_TIME special register is set to a value other than the null value, then a period specification that references BUSINESS_TIME must not be specified for the table reference or view reference, unless the value in effect for the BUSTIMESENSITIVE bind option is NO (SQLSTATE 428HY).

FOR SYSTEM_TIME

Specifies that the SYSTEM_TIME period is used for the period-specification. If the clause is specified following a table-name, the table must be a system-period temporal table (SQLSTATE 428HY). FOR SYSTEM_TIME must not be specified if the value of the CURRENT TEMPORAL SYSTEM_TIME special register is not the null value and the SYSTIMESENSITIVE bind option is set to YES (SQLSTATE 428HY).

FOR BUSINESS_TIME

Specifies that the BUSINESS_TIME period is used for the period-specification. If the clause is specified following a table-name, BUSINESS_TIME must be a period defined in the table (SQLSTATE 4274M). FOR BUSINESS_TIME must not be specified if the value of the CURRENT TEMPORAL BUSINESS_TIME special register is not the null value and the BUSTIMESENSITIVE bind option is set to YES (SQLSTATE 428HY).

valuevalue1value2

The value, value1, and value2 expressions return the null value or a value of one of the following built-in data types (SQLSTATE 428HY): a DATE, a TIMESTAMP, or a character string that is not a CLOB or DBCLOB. If the argument is a character string, it must be a valid character string representation of a timestamp or a date (SQLSTATE 22007). For the valid formats of string representations of timestamp values, see the section String representations of datetime values in the topic Datetime values.

Each expression can contain any of the following supported operands (SQLSTATE 428HY):

Constant
Special register
Variable. For details, refer to References to variables.
Parameter marker
Scalar function whose arguments are supported operands (user-defined functions and non-deterministic functions cannot be used)
CAST specification where the cast operand is a supported operand
Expression using arithmetic operators and operands

AS OF value

Specifies that the table reference includes each row for which the value of the begin column for the specified period is less than or equal to value, and the value of the end column for the period is greater than value. If value is the null value, the table reference is an empty table.

Example: The following query returns the insurance coverage information for insurance policy number 100 on August 31, 2010.

SELECT coverage FROM policy_info FOR BUSINESS_TIME
		AS OF '2010-08-31' WHERE policy_id = '100'

FROM value1 TO value2

Specifies that the table reference includes rows that exist for the period specified from value1 to value2. A row is included in the table reference if the value of the begin column for the specified period in the row is less than value2, and the value of the end column for the specified period in the row is greater than value1. The table reference contains zero rows if value1 is greater than or equal to value2. If value1 or value2 is the null value, the table reference is an empty table.

Example: The following query returns the insurance coverage information for insurance policy 100, during the year 2009 (from January 1, 2009 at 12:00 AM until before January 1, 2010).

SELECT coverage FROM policy_info FOR BUSINESS_TIME
		FROM '2009-01-01' TO '2010-01-01' WHERE policy_id = '100'

BETWEEN value1 AND value2

Specifies that the table reference includes rows in which the specified period overlaps at any point in time between value1 and value2. A row is included in the table reference if the value of the begin column for the specified period in the row is less than or equal to value2 and the value of the end column for the specified period in the row is greater than value1. The table reference contains zero rows if value1 is greater than value2. If value1 is equal to value2, the expression is equivalent to AS OF value1. If value1 or value2 is the null value, the table reference is an empty table.

Example: The following query returns the insurance coverage information for insurance policy number 100, during the year 2008 (between January 1, 2008 and December 31, 2008 inclusive).

SELECT coverage FROM policy_info FOR BUSINESS_TIME 
		BETWEEN '2008-01-01' AND '2008-12-31' WHERE policy_id = '100'

Following are syntax alternatives for period-specification clauses:

AS OF TIMESTAMP can be specified in place of FOR SYSTEM_TIME AS OF
VERSIONS BETWEEN TIMESTAMP can be specified in place of FOR SYSTEM_TIME BETWEEN

correlation-clause

The exposed names of all table references must be unique. An exposed name is:

A correlation-name
A table-name that is not followed by a correlation-name
A view-name that is not followed by a correlation-name
A nickname that is not followed by a correlation-name
An alias-name that is not followed by a correlation-name

If a correlation-clause clause does not follow a function-name reference, xmltable-expression expression, nested table expression, or data-change-table-reference reference, or if a typed-correlation-clause clause does not follow a function-name reference, then there is no exposed name for that table reference.

Each correlation-name is defined as a designator of the immediately preceding table-name, view-name, nickname, function-name reference, xmltable-expression, nested table expression, or data-change-table-reference. Any qualified reference to a column must use the exposed name. If the same table name, view, or nickname is specified twice, at least one specification must be followed by a correlation-name. The correlation-name is used to qualify references to the columns of the table, view or nickname. When a correlation-name is specified, column-names can also be specified to give names to the columns of the table reference. If the correlation-clause does not include column-names, the exposed column names are determined as follows:

Column names of the referenced table, view, or nickname when the table-reference is a table-name, view-name, nickname, or alias-name
Column names specified in the RETURNS clause of the CREATE FUNCTION statement when the table-reference is a function-name reference
Column names specified in the COLUMNS clause of the xmltable-expression when the table-reference is an xmltable-expression
Column names exposed by the fullselect when the table-reference is a nested-table-expression
Column names from the target table of the data change statement, along with any defined INCLUDE columns when the table-reference is a data-change-table-reference

tablesample-clause

The optional tablesample-clause can be used to obtain a random subset (a sample) of the rows from the specified table-name, rather than the entire contents of that table-name, for this query. This sampling is in addition to any predicates that are specified in the where-clause. Unless the optional REPEATABLE clause is specified, each execution of the query will usually yield a different sample, except in degenerate cases where the table is so small relative to the sample size that any sample must return the same rows. The size of the sample is controlled by the numeric-expression1 in parentheses, representing an approximate percentage (P) of the table to be returned.

TABLESAMPLE

The method by which the sample is obtained is specified after the TABLESAMPLE keyword, and can be either BERNOULLI or SYSTEM. The exact number of rows in the sample might be different for each execution of the query, but on average is approximately P percent of the table, before any predicates further reduce the number of rows.

The table-name must be a stored table. It can be a materialized query table (MQT) name, but not a subselect or table expression for which an MQT has been defined, because there is no guarantee that the database manager will route to the MQT for that subselect.

Semantically, sampling of a table occurs before any other query processing, such as applying predicates or performing joins. Repeated accesses of a sampled table within a single execution of a query (such as in a nested-loop join or a correlated subquery) will return the same sample. More than one table can be sampled in a query.

BERNOULLI

BERNOULLI sampling considers each row individually. It includes each row in the sample with probability P/100 (where P is the value of numeric-expression1), and excludes each row with probability 1 - P/100, independently of the other rows. So if the numeric-expression1 evaluated to the value 10, representing a ten percent sample, each row would be included with probability 0.1, and excluded with probability 0.9.

SYSTEM

SYSTEM sampling permits the database manager to determine the most efficient manner in which to perform the sampling. In most cases, SYSTEM sampling applied to a table-name means that each page of table-name is included in the sample with probability P/100, and excluded with probability 1 - P/100. All rows on each page that is included qualify for the sample. SYSTEM sampling of a table-name generally executes much faster than BERNOULLI sampling, because fewer data pages are retrieved. However, SYSTEM sampling can often yield less accurate estimates for aggregate functions, such as SUM(SALES), especially if the rows of table-name are clustered on any columns referenced in that query. The optimizer might in certain circumstances decide that it is more efficient to perform SYSTEM sampling as if it were BERNOULLI sampling. An example is when a predicate on table-name can be applied by an index and is much more selective than the sampling rate P.

numeric-expression1

The numeric-expression1 specifies the size of the sample to be obtained from table-name, expressed as a percentage. It must be a constant numeric expression that cannot contain columns. The expression must evaluate to a positive number that is less than or equal to 100, but can be between 1 and 0. For example, a value of 0.01 represents one one-hundredth of a percent, meaning that 1 row in 10 000 is sampled, on average. A numeric-expression1 that evaluates to 100 is handled as if the tablesample-clause were not specified. If numeric-expression1 evaluates to the null value, or to a value that is greater than 100 or less than 0, an error is returned (SQLSTATE 2202H).

REPEATABLE (numeric-expression2)

It is sometimes desirable for sampling to be repeatable from one execution of the query to the next; for example, during regression testing or query debugging. This can be accomplished by specifying the REPEATABLE clause. The REPEATABLE clause requires the specification of a numeric-expression2 in parentheses, which serves the same role as the seed in a random number generator. Adding the REPEATABLE clause to the tablesample-clause of any table-name ensures that repeated executions of that query (using the same value for numeric-expression2) return the same sample, assuming that the data itself has not been updated, reorganized, or repartitioned. To guarantee that the same sample of table-name is used across multiple queries, use of a global temporary table is recommended. Alternatively, the multiple queries can be combined into one query, with multiple references to a sample that is defined using the WITH clause.

Examples:

Request a 10% Bernoulli sample of the Sales table for auditing purposes.
```
  SELECT * FROM Sales
    TABLESAMPLE BERNOULLI(10)
```
Compute the total sales revenue in the Northeast region for each product category, using a random 1% SYSTEM sample of the Sales table. The semantics of SUM are for the sample itself, so to extrapolate the sales to the entire Sales table, the query must divide that SUM by the sampling rate (0.01).
```
SELECT SUM(Sales.Revenue) / (0.01)
  FROM Sales TABLESAMPLE SYSTEM(1)
  WHERE Sales.RegionName = 'Northeast'
  GROUP BY Sales.ProductCategory
```
Using the REPEATABLE clause, modify the previous query to ensure that the same (yet random) result is obtained each time the query is executed. The value of the constant enclosed by parentheses is arbitrary.
```
SELECT SUM(Sales.Revenue) / (0.01)
  FROM Sales TABLESAMPLE SYSTEM(1) REPEATABLE(3578231)
  WHERE Sales.RegionName = 'Northeast'
  GROUP BY Sales.ProductCategory
```

table-UDF-cardinality-clause

The table-UDF-cardinality clause can be specified for each user-defined table function reference within the FROM clause. This option indicates the expected number of rows to be returned only for the SELECT statement that contains it. The CARDINALITY and CARDINALITY MULTIPLIER clauses are not allowed if the table function is an inlined SQL table function (SQLSTATE 42887).

CARDINALITY integer-constant

Specifies an estimate of the expected number of rows that are returned by the reference to the user-defined function. The value range of integer-constant is from 0 to 9 223 372 036 854 775 807 inclusive.

CARDINALITY MULTIPLIER numeric-constant

The product of the specified CARDINALITY MULTIPLIER numeric-constant and the reference cardinality value are used by the database server as the expected number of rows that are returned by the table function reference.

In this case, numeric-constant can be in the integer, decimal, or floating-point format. The value must be greater than or equal to zero. If the decimal number notation is used, the number of digits can be up to 31. An integer value is treated as a decimal number with no fraction. If zero is specified or the computed cardinality is less than 1, the cardinality of the reference to the user-defined table function is assumed to be 1.

The value in the CARDINALITY column of SYSSTAT.ROUTINES for the table function name is used as the reference cardinality value. If no value is set in the CARDINALITY column of SYSSTAT.ROUTINES, a finite value is assumed as its default value for the reference cardinality value. This finite value is the same value that is assumed for tables for which the RUNSTATS utility has not gathered statistics.

Only a numeric constant can follow the keyword CARDINALITY or CARDINALITY MULTIPLIER. A host variable or parameter marker is not supported. Specifying a cardinality value in a table function reference does not change the CARDINALITY column value for the function in the SYSSTAT.ROUTINES catalog view.

The CARDINALITY value for external and compiled SQL user-defined table functions can be changed by updating the CARDINALITY column in the SYSSTAT.ROUTINES catalog view. The CARDINALITY value for an external table function can also be initialized by specifying the CARDINALITY option in the CREATE FUNCTION (external table) statement when a user-defined table function is created.

typed-correlation-clause

A typed-correlation-clause clause defines the appearance and contents of the table generated by a generic table function. This clause must be specified when the table-function-references is a generic table function and cannot be specified for any other table reference. The following data-type values are supported in generic table functions:

Table 4. Data types supported in generic table functions
SQL column data type	Equivalent Java™ data type
SMALLINT	short
INTEGER	int
BIGINT	long
REAL	float
DOUBLE	double
DECIMAL(p,s)	java.math.BigDecimal
NUMERIC(p,s)	java.math.BigDecimal
CHAR(n)	java.lang.String
CHAR(n) FOR BIT DATA	COM.ibm.db2.app.Blob
VARCHAR(n)	java.lang.String
VARCHAR(n) FOR BIT DATA	COM.ibm.db2.app.Blob
GRAPHIC(n)	java.lang.String
VARGRAPHIC(n)	String
BLOB(n)	COM.ibm.db2.app.Blob
CLOB(n)	COM.ibm.db2.app.Clob
DBCLOB(n)	COM.ibm.db2.app.Clob
DATE	String
TIME	String
TIMESTAMP	String
XML AS CLOB(n)	COM.ibm.db2.jcc.DB2Xml

continue-handler

Certain errors that occur within a nested-table-expression can be tolerated, and instead of returning an error, the query can continue and return a result. This is referred to as an error tolerant nested-table-expression.

Specifying the RETURN DATA UNTIL clause will cause any rows that are returned from the fullselect before the indicated condition is encountered to make up the result set from the fullselect. This means that a partial result set (which can also be an empty result set) from the fullselect is acceptable as the result for the nested-table-expression.

The FEDERATED keyword restricts the condition to handle only errors that occur at a remote data source.

The condition can be specified as an SQLSTATE value, with a string-constant length of 5. You can optionally specify an SQLCODE value for each specified SQLSTATE value. For portable applications, specify SQLSTATE values as much as possible, because SQLCODE values are generally not portable across platforms and are not part of the SQL standard.

Only certain conditions can be tolerated. Errors that do not allow the rest of the query to be executed cannot be tolerated, and an error is returned for the whole query. The specific-condition-value might specify conditions that cannot actually be tolerated by the database manager, even if a specific SQLSTATE or SQLCODE value is specified, and for these cases, an error is returned.

A query or view containing an error tolerant nested-table-expression is read-only.

The fullselect of an error tolerant nested-table-expression is not optimized using materialized query tables.

specific-condition-value

The following SQLSTATE values and SQLCODE values have the potential, when specified, to be tolerated by the database manager:

SQLSTATE 08001; SQLCODEs -1336, -30080, -30081, -30082
SQLSTATE 08004
SQLSTATE 42501
SQLSTATE 42704; SQLCODE -204
SQLSTATE 42720
SQLSTATE 28000

Correlated references in table-references

Correlated references can be used in nested table expressions or as arguments to table functions. The basic rule that applies for both of these cases is that the correlated reference must be from a table-reference at a higher level in the hierarchy of subqueries. This hierarchy includes the table-references that have already been resolved in the left-to-right processing of the FROM clause. For nested table expressions, the LATERAL keyword must exist before the fullselect. The following examples have valid syntax:

  Example 1:  SELECT t.c1, z.c5
              FROM t, TABLE( tf3(t.c2) ) AS z      -- t precedes tf3
              WHERE t.c3 = z.c4;                   -- in FROM, so t.c2
                                                   -- is known

  Example 2:  SELECT t.c1, z.c5
              FROM t, TABLE( tf4(2 * t.c2) ) AS z  -- t precedes tf4
              WHERE t.c3 = z.c4;                   -- in FROM, so t.c2
                                                   -- is known

  Example 3:  SELECT d.deptno, d.deptname,
                     empinfo.avgsal, empinfo.empcount
              FROM department d,
                   LATERAL (SELECT AVG(e.salary) AS avgsal,
                              COUNT(*) AS empcount
                       FROM employee e           -- department precedes nested
                       WHERE e.workdept=d.deptno -- table expression and
                      ) AS empinfo;              -- LATERAL is specified, 
                                                 -- so d.deptno is known

But the following examples are not valid:

  Example 4:  SELECT t.c1, z.c5
              FROM TABLE( tf6(t.c2) ) AS z, t  -- cannot resolve t in t.c2!
              WHERE t.c3 = z.c4;               -- compare to Example 1 above.

  Example 5:  SELECT a.c1, b.c5
              FROM TABLE( tf7a(b.c2) ) AS a, TABLE( tf7b(a.c6) ) AS b
              WHERE a.c3 = b.c4;               -- cannot resolve b in b.c2!

  Example 6:  SELECT d.deptno, d.deptname,
                     empinfo.avgsal, empinfo.empcount
              FROM department d,
                   (SELECT AVG(e.salary) AS avgsal,
                              COUNT(*) AS empcount
                       FROM employee e           -- department precedes nested
                       WHERE e.workdept=d.deptno -- table expression but 
                      ) AS empinfo;              -- LATERAL is not specified,
                                                 -- so d.deptno is unknown