table-reference
A table-reference specifies an intermediate result table.
- 1 The syntax for joined-table is covered in a separate topic; refer to joined-table.
- 2 TABLE can be specified in place of LATERAL.
- 3 The typed-correlation-clause is required for generic table functions. This clause cannot be specified for any other table functions.
- 4 WITH ORDINALITY can be specified only if the argument to the UNNEST table function is one or more ordinary array variables or functions with ordinary array return types; an associative array variable or function with an associative array return type cannot be specified (SQLSTATE 428HT).
- 5 An XMLTABLE function can be part of a table-reference. In this case, subexpressions within
the XMLTABLE expression are in-scope of prior range variables in the FROM clause. For more
information, see the description of
XMLTABLE
. - 6 Specifying a LIKE clause or at least one column definition is not mandatory for an
INSERT INTO <table> SELECT FROM
statement, which acts as an implicit LIKE with respect to the INSERT target table.
- If a single-table-reference is specified without a period-specification or a tablesample-clause, the intermediate result table is the rows of the table. If a period-specification is specified, the intermediate result table consists of the rows of the temporal table where the period matches the specification. If a tablesample-clause is specified, the intermediate result table consists of a sampled subset of the rows of the table.
- If a single-view-reference is specified without a period-specification, the intermediate result table is that view. If a period-specification is specified, temporal table references in the view consider only the rows where the period matches the specification.
- If a single-nickname-reference is specified, the intermediate result table is the data from the data source for that nickname.
- If an only-table-reference is specified, the intermediate result table consists of only the rows of the specified table or view without considering the applicable subtables or subviews.
- If an outer-table-reference is specified, the intermediate result table represents a virtual table based on all the subtables of a typed table or the subviews of a typed view.
- If an analyze_table-expression is specified, the result table contains the result of executing a specific data mining model by using an in-database analytics provider, a named model implementation, and input data.
- If a nested-table-expression is specified, the result table is the result of the specified fullselect.
- If a data-change-table-reference is specified, the intermediate result table is the set of rows that are directly changed by the searched UPDATE, searched DELETE, or INSERT statement that is included in the clause.
- If a table-function-reference is specified, the intermediate result table is the set of rows that are returned by the table function.
- If a collection-derived-table is specified, the intermediate result table is the set of rows that are returned by the UNNEST function.
- If an xmltable-expression is specified, the intermediate result table is the set of rows that are returned by the XMLTABLE function.
- If a joined-table is specified, the intermediate result table is the result of one or more join operations. For more information, see joined-table.
- If an external-table-reference is specified without a tablesample-clause, the intermediate result table is the rows of the external table that is represented by the specified file. If a tablesample-clause is specified, the intermediate result table consists of a sampled subset of the rows of the external table that is represented by the specified file.
- single-table-reference
Each table-name specified as a table-reference must identify an existing table at the application server or an existing table at a remote server specified using a remote-object-name. The intermediate result table is the result of the table. If the table-name references a typed table, the intermediate result table is the UNION ALL of the table with all its subtables, with only the columns of the table-name. A period-specification can be used with a temporal table to specify the period from which the rows are returned as the intermediate result table. A tablesample-clause can be used to specify that a sample of the rows be returned as the intermediate result table.
If the CURRENT TEMPORAL SYSTEM_TIME special register is set to a non-null value CTST and table-name identifies a system-period temporal table, the table reference is executed as if it contained the following specification with the special register set to the null value:
If the CURRENT TEMPORAL BUSINESS_TIME special register is set to a non-null value CTBT and table-name identifies an application-period temporal table, the table reference is executed as if it contained the following specification with the special register set to the null value:table-name FOR SYSTEM_TIME AS OF CTST
table-name FOR BUSINESS_TIME AS OF CTBT
- single-view-reference
- Each view-name specified as a table-reference must identify one of the following objects:
- An existing view at the application server
- A view at a remote server specified using a remote-object-name
- The table-name of a common table expression
If the CURRENT TEMPORAL SYSTEM_TIME special register is set to a non-null value CTST, and view-name identifies a system-period temporal table, the table reference is executed as if it contained the following specification with the special register set to the null value:view-name FOR SYSTEM_TIME AS OF CTST
view-name FOR BUSINESS_TIME AS OF CTBT
- single-nickname-reference
Each nickname specified as a table-reference must identify an existing nickname at the application server. The intermediate result table is the result of the nickname.
- only-table-reference
The use of ONLY(table-name) or ONLY(view-name) means that the rows of the applicable subtables or subviews are not included in the intermediate result table. If the table-name used with ONLY does not have subtables, then ONLY(table-name) is equivalent to specifying table-name. If the view-name used with ONLY does not have subviews, then ONLY(view-name) is equivalent to specifying view-name.
The use of ONLY requires the SELECT privilege on every subtable of table-name or subview of view-name.
- outer-table-reference
The use of OUTER(table-name) or OUTER(view-name) represents a virtual table. If the table-name or view-name used with OUTER does not have subtables or subviews, then specifying OUTER is equivalent to not specifying OUTER. If the table-name does have subtables, the intermediate result table from OUTER(table-name) is derived from table-name as follows:
- The columns include the columns of table-name followed by the additional columns introduced by each of its subtables, if any. The additional columns are added on the right, traversing the subtable hierarchy in depth-first order. Subtables that have a common parent are traversed in creation order of their types.
- The rows include all the rows of table-name and all the rows of its subtables. Null values are returned for columns that are not in the subtable for the row.
If the view-name does have subviews, the intermediate result table from OUTER(view-name) is derived from view-name as follows:
- The columns include the columns of view-name followed by the additional columns introduced by each of its subviews, if any. The additional columns are added on the right, traversing the subview hierarchy in depth-first order. Subviews that have a common parent are traversed in creation order of their types.
- The rows include all the rows of view-name and all the rows of its subviews. Null values are returned for columns that are not in the subview for the row.
The use of OUTER requires the SELECT privilege on every subtable of table-name or subview of view-name.
- analyze_table-expression
- table-name | view-name
- The table-name or view-name variable must identify an existing table or view or identify the table-name of a common table expression that you define preceding the fullselect containing the table-reference. You can specify a nickname. However, in-database analytics are intended for local data, and retrieving the data for a nickname from another data source does not take advantage of the intended performance benefits.
- ANALYZE_TABLE
- Returns
the result of executing a specific data mining model by using an in-database analytics provider, a
named model implementation, and input data. A query referencing the ANALYZE_TABLE parameter cannot
be a static SQL statement or a data definition language (DDL) statement. Input or output values
cannot be of the following types:
- CHAR FOR BIT DATA or VARCHAR FOR BIT DATA
- BINARY or VARBINARY
- BLOB, CLOB, DBCLOB, or NCLOB
- BOOLEAN
- XML
- DB2SECURITYLABEL
- IMPLEMENTATION 'string'
- Specifies how the expression is to be evaluated. The string parameter is a string constant whose maximum length
is 1024 bytes. The specified value is used to establish a session
with an in-database analytic provider. When you specify SAS as the
provider, you must specify values for the following case-insensitive
parameters:
- PROVIDER
- Currently, the only supported provider value is SAS.
- ROUTINE_SOURCE_TABLE
- Specifies a user table containing the DS2 code (and, optionally,
any required format or metadata) to implement the algorithm that is
specified by the ROUTINE_SOURCE_NAME parameter. DS2 is a procedural
language processor for SAS, designed for data modeling, stored procedures,
and data extraction, transformation, and load (ETL) processing.
The routine source table has a defined structure (see the examples at the end of the
analyze_table-expression
section) and, in a partitioned database environment, must be on the catalog database partition. The table cannot be a global temporary table. The MODELDS2 column for a particular row must not be empty or contain the null value. If the value of the MODELFORMATS or MODELMETADATA column is not null, the value must have a length greater than 0. If you do not specify a table schema name, the value of the CURRENT SCHEMA special register is used. - ROUTINE_SOURCE_NAME
- Specifies the name of the algorithm to use.
If the table name, schema name, or algorithm name contains lowercase or mixed-case letters, specify delimited identifiers, as shown in the following example:IMPLEMENTATION 'PROVIDER=SAS; ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE; ROUTINE_SOURCE_NAME=SCORING_FUN1;'
IMPLEMENTATION 'PROVIDER=SAS; ROUTINE_SOURCE_TABLE="ETLin"."Source_Table"; ROUTINE_SOURCE_NAME="Scoring_Fun1";'
The following examples show you how to use the ANALYZE_TABLE expression.
SAS tooling helps you to define a table to store model implementations for scoring functions. A row in this table stores an algorithm that is written in DS2, with any required SAS format information and metadata. The MODELNAME column serves as the primary key. For a particular value of the ROUTINE_SOURCE_NAME parameter, at most one row is retrieved from the table that the ROUTINE_SOURCE_TABLE parameter specifies. For example:
The MODELNAME column contains the name of the algorithm. The MODELDS2 column contains the DS2 source code that implements the algorithm. The MODELFORMATS column contains the aggregated SAS format definition that the algorithm requires. If the algorithm does not require a SAS format, this column contains the null value. The MODELMETADATA column contains any additional metadata that the algorithm requires. If the algorithm does not require any additional metadata, this column contains the null value. If the SAS EP installer creates the table, it might include additional columns.CREATE TABLE ETLIN.SOURCE_TABLE ( MODELNAME VARCHAR(128) NOT NULL PRIMARY KEY, MODELDS2 BLOB(4M) NOT NULL, MODELFORMATS BLOB(4M), MODELMETADATA BLOB(4M) );
- Use the data in columns C1 and C2 in table T1 as input data with
the scoring model SCORING_FUN1, whose implementation is stored in
ETLIN.SOURCE_TABLE:
WITH sas_score_in (c1,c2) AS (SELECT c1,c2 FROM t1) SELECT * FROM sas_score_in ANALYZE_TABLE( IMPLEMENTATION 'PROVIDER=SAS; ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE; ROUTINE_SOURCE_NAME=SCORING_FUN1;');
- Use all the data in the table T2 with the scoring model SCORING_FUN2,
whose implementation is stored in ETLIN.SOURCE_TABLE:
SELECT * FROM t2 ANALYZE_TABLE( IMPLEMENTATION 'PROVIDER=SAS; ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE; ROUTINE_SOURCE_NAME=SCORING_FUN2;');
- Use all the data in view V1 with the scoring model SCORING_FUN3,
whose implementation is stored in ETLIN.SOURCE_TABLE, and return the
output in ascending order of the first output column:
SELECT * FROM v1 ANALYZE_TABLE( IMPLEMENTATION 'PROVIDER=SAS; ROUTINE_SOURCE_TABLE=ETLIN.SOURCE_TABLE; ROUTINE_SOURCE_NAME=SCORING_FUN3;') ORDER BY 1;
- nested-table-expression
A fullselect in parentheses is called a nested table expression. The intermediate result table is the result of that fullselect. The columns of the result do not need unique names, but a column with a non-unique name cannot be explicitly referenced. If LATERAL is specified, the fullselect can include correlated references to results columns of table references specified to the left of the nested table expression. If the nested table expression involves data from a federated data source, a continue-handler can be specified to tolerate certain error conditions from the data source.
An expression in the select list of a nested table expression that is referenced within, or is the target of, a data change statement within a fullselect is valid only when it does not include:- A function that reads or modifies SQL data
- A function that is non-deterministic
- A function that has external action
- An OLAP function
If a view is referenced directly in, or as the target of a nested table expression in a data change statement within a FROM clause, the view must meet either of the following conditions:- Be symmetric (have WITH CHECK OPTION specified)
- Satisfy the restriction for a WITH CHECK OPTION view
If the target of a data change statement within a FROM clause is a nested table expression, the following restrictions apply:- Modified rows are not requalified
- WHERE clause predicates are not reevaluated
- ORDER BY or FETCH FIRST operations are not redone
A nested table expression can be used in the following situations:- In place of a view to avoid creating the view (when general use of the view is not required)
- When the required intermediate result table is based on host variables
- data-change-table-reference
A data-change-table-reference clause specifies an intermediate result table. This table is based on the rows that are directly changed by the searched UPDATE, searched DELETE, or INSERT statement that is included in the clause. A data-change-table-reference can be specified as the only table-reference in the FROM clause of the outer fullselect that is used in a select-statement, a SELECT INTO statement, or a common table expression. A data-change-table-reference can be specified as the only table reference in the only fullselect in a SET Variable statement (SQLSTATE 428FL). The target table or view of the data change statement is considered to be a table or view that is referenced in the query; therefore, the authorization ID of the query must have SELECT privilege on that target table or view. A data-change-table-reference clause cannot be specified in a view definition, materialized query table definition, or FOR statement (SQLSTATE 428FL).
The target of the UPDATE, DELETE, or INSERT statement cannot be a temporary view defined in a common table expression (SQLSTATE 42807) or a nickname (SQLSTATE 25000).
Expressions in the select list of a view or fullselect as target of a data change statement in a table-reference can be selected only if OLD TABLE is specified or the expression does not include the following elements (SQLSTATE 428G6):- A subquery
- A function that reads or modifies SQL data
- A function is that is non-deterministic or has an external action
- An OLAP function
- A NEXT VALUE FOR sequence reference
- FINAL TABLE
- Specifies that the rows of the intermediate result table represent the set of rows that are changed by the SQL data change statement as they exist at the completion of the data change statement. If there are AFTER triggers or referential constraints that result in further operations on the table that is the target of the SQL data change statement, an error is returned (SQLSTATE 560C6). If the target of the SQL data change statement is a view that is defined with an INSTEAD OF trigger for the type of data change, an error is returned (SQLSTATE 428G3).
- NEW TABLE
- Specifies that the rows of the intermediate result table represent the set of rows that are changed by the SQL data change statement before the application of referential constraints and AFTER triggers. Data in the target table at the completion of the statement might not match the data in the intermediate result table because of additional processing for referential constraints and AFTER triggers.
- OLD TABLE
- Specifies that the rows of the intermediate result table represent the set of rows that are changed by the SQL data change statement as they existed before the application of the data change statement.
- (searched-update-statement)
- Specifies a searched UPDATE statement. A WHERE clause or a SET clause in the UPDATE statement cannot contain correlated references to columns outside of the UPDATE statement.
- (searched-delete-statement)
- Specifies a searched DELETE statement. A WHERE clause in the DELETE statement cannot contain correlated references to columns outside of the DELETE statement.
- (insert-statement)
- Specifies an INSERT statement. A fullselect in the INSERT statement cannot contain correlated references to columns outside of the fullselect of the INSERT statement.
The content of the intermediate result table for a data-change-table-reference is determined when the cursor opens. The intermediate result table contains all manipulated rows, including all the columns in the specified target table or view. All the columns of the target table or view for an SQL data change statement are accessible using the column names from the target table or view. If an INCLUDE clause was specified within a data change statement, the intermediate result table will contain these additional columns.
- table-function-reference
- In general, a table function, together with its argument values, can be referenced in the FROM clause of a SELECT in exactly the same way as a table or view. Each function-name together with the types of its arguments, specified as a table reference must resolve to an existing table function at the application server. There are, however, some special considerations which apply.
- Table function column names: Unless alternative column names are provided following the correlation-name, the column names for the table function are those specified in the RETURNS or RETURNS GENERIC TABLE clause of the CREATE FUNCTION statement. This is analogous to the names of the columns of a table, which are defined in the CREATE TABLE statement.
- Table function resolution: The arguments specified in a table function reference, together with the function name, are used by an algorithm called function resolution to determine the exact function to be used. This is no different from what happens with other functions (such as scalar functions) that are used in a statement.
- Table function arguments: As with scalar function arguments,
table function arguments can generally be any valid SQL expression.
The following examples are valid syntax:
Example 1: SELECT c1 FROM TABLE( tf1('Zachary') ) AS z WHERE c2 = 'FLORIDA'; Example 2: SELECT c1 FROM TABLE( tf2 (:hostvar1, CURRENT DATE) ) AS z; Example 3: SELECT c1 FROM t WHERE c2 IN (SELECT c3 FROM TABLE( tf5(t.c4) ) AS z -- correlated reference ) -- to previous FROM clause
Example 4: SELECT c1 FROM TABLE( tf6('abcd') ) -- tf6 is a generic AS z (c1 int, c2 varchar(100)) -- java table function
- Table functions that modify SQL data: Table functions that
are specified with the MODIFIES SQL DATA option can be used only as
the last table reference in a select-statement, common-table-expression, or RETURN statement
that is a subselect, a SELECT INTO, or a row-fullselect in a SET statement. Only one table function is allowed in
one FROM clause, and the table function arguments must be correlated
to all other table references in the subselect (SQLSTATE 429BL). The
following examples have valid syntax for a table function with the
MODIFIES SQL DATA property:
Example 1: SELECT c1 FROM TABLE( tfmod('Jones') ) AS z Example 2: SELECT c1 FROM t1, t2, TABLE( tfmod(t1.c1, t2.c1) ) AS z Example 3: SET var = (SELECT c1 FROM TABLE( tfmod('Jones') ) AS z Example 4: RETURN SELECT c1 FROM TABLE( tfmod('Jones') ) AS z Example 5: WITH v1(c1) AS (SELECT c1 FROM TABLE( tfmod(:hostvar1) ) AS z) SELECT c1 FROM v1, t1 WHERE v1.c1 = t1.c1
Example 6: SELECT z.* FROM t1, t2, TABLE( tfmod(t1.c1, t2.c1) ) AS z (col1 int)
- collection-derived-table
A collection-derived-table can be used to convert the elements of an array into values of a column in separate rows. If WITH ORDINALITY is specified, an extra column of data type INTEGER is appended. This column contains the position of the element in the array. The columns can be referenced in the select list and the in rest of the subselect by using the names specified for the columns in the correlation-clause. The collection-derived-table clause can be used only in a context where arrays are supported (SQLSTATE 42887). See the "UNNEST table function" for details.
- xmltable-expression
- An xmltable-expression specifies an invocation of the built-in XMLTABLE function which determines the intermediate result table. See XMLTABLE for more information.
- external-table-reference
- An external table resides in a text-based, delimited or
non-delimited file outside of a database. An external-table-reference
specifies the name of the file that contains an external table.Important: A transient external-table-reference cannot be specified in a view definition (SQLSTATE 428IE).
- column-definition
- The attributes of a column.
- column-name
- Names a column of the table. The name cannot be qualified, and the same name cannot be used for more than one column of the table (SQLSTATE 42711).
- built-in-type
- One of the following built-in data types:
- SMALLINT
- A small integer.
- [INTEGER | INT]
- A large integer.
- BIGINT
- A big integer.
- [DECIMAL | DEC | NUMERIC | NUM](precision-integer, scale-integer)
- A decimal number.
- The precision integer specifies the total number of digits. It must be in the range 1 ‑ 31. The default is 5.
- The scale integer specifies the number of digits to the right of the decimal point. It cannot be negative and cannot exceed the precision. The default is 0.
- FLOAT(integer)
- A single or double-precision floating-point number. If the specified length is in the range:
- 1 - 24, the number uses single precision
- 25 - 53, the number uses double-precision
Instead of FLOAT, you can specify:- REAL
- For single precision floating-point.
- DOUBLE
- For double-precision floating-point.
- DOUBLE PRECISION
- For double-precision floating-point.
- FLOAT
- For double-precision floating-point.
- DECFLOAT(precision-integer)
- A decimal floating-point number. The precision integer specifies the total number of digits, which can be either 16 or 34. The default is 34.
- [CHARACTER | CHAR](integer [OCTETS | CODEUNITS32])
- A fixed-length character string of the specified number of code units. This number can range from 1 ‑ 255 OCTETS or from 1 - 63 CODEUNITS32. The default is 1.
- [VARCHAR | CHARACTER VARYING | CHAR VARYING](integer [OCTETS | CODEUNITS32])
- A varying-length character string with a maximum length of the specified number of code units. This number can range from 1 ‑ 32672 OCTETS or from 1 - 8168 CODEUNITS32.
- FOR BIT DATA
- Specifies that the contents of the column are to be treated as bit (binary) data. During data exchange with other systems, code page conversions are not performed. Comparisons are done in binary, irrespective of the database collating sequence.
- [CLOB | CHARACTER LARGE OBJECT | CHAR LARGE OBJECT](integer [K] [OCTETS | CODEUNITS32])
- A character large object string with a maximum length of the specified number of code units. The
default maximum length is 65,535
bytes.If you want to multiply the length integer by 1024, specify a K (kilo) multiplier.
- Regardless of whether you use a K multiplier, the resulting length is limited by the maximum length of a CLOB column in an external table, which is 65,535 OCTETS, 32,767 CODEUNITS16, or 16,383 CODEUNITS32. Note that 64K OCTETS and 16K CODEUNITS32 each exceed the maximum length by one, and so are not allowed.
- Any number of spaces (including zero spaces) are allowed between the data
type and the length specification or between the length integer and the
K multiplier. For
example, the following specifications are all equivalent and
valid:
CLOB(50K) CLOB(50 K) CLOB (50 K)
- The K multiplier can be specified in either uppercase or lowercase.
The default string units are OCTETS.
- OCTETS
- Specifies that the units of the length attribute are bytes.
- CODEUNITS32
- Specifies that the units of the length attribute are Unicode UTF-32 code units, which approximates counting in characters. This does not affect the underlying code page of the data type. The actual length of a data value is determined by counting the UTF-32 code units as if the data were converted to UTF-32.
- GRAPHIC(integer [CODEUNITS16 | CODEUNITS32])
- A fixed-length graphic string of the specified length, which can range from 1 ‑ 127 double bytes, 1 ‑ 127 CODEUNITS16, or 1 ‑ 63 CODEUNITS32. The default length is 1.
- VARGRAPHIC(integer [CODEUNITS16 | CODEUNITS32])
- A varying-length graphic string of the specified maximum length, which can range from 1 ‑ 16336 double bytes, 1 ‑ 16336 CODEUNITS16, or 1 ‑ 8168 CODEUNITS32.
- DBCLOB(integer [K] [CODEUNITS16 | CODEUNITS32])
- A character large object string of the specified maximum length in double bytes, Unicode UTF-16
code units, or Unicode UTF-32 code units. The default maximum length is 32,767 double bytes.If you want to multiply the length integer by 1024, specify a K (kilo) multiplier.
- Regardless of whether you use a K multiplier, the resulting length is limited by the maximum length of a DBCLOB column in an external table, which is 32,767 CODEUNITS16 or 16,383 CODEUNITS32. Note that 32K CODEUNITS16 and 16K CODEUNITS32 each exceed the maximum length by one, and so are not allowed.
- Any number of spaces (including zero spaces) are allowed between the data
type and the length specification or between the length integer and the
K multiplier. For
example, the following specifications are all equivalent and
valid:
DBCLOB(50K) DBCLOB(50 K) DBCLOB (50 K)
- The K multiplier can be specified in either uppercase or lowercase.
The default string units are CODEUNITS16.
- CODEUNITS16
- Specifies that the units of the length attribute are Unicode UTF-16 code units, which is the same as counting in double bytes.
- CODEUNITS32
- Specifies that the units of the length attribute are Unicode UTF-32 code units. This does not affect the underlying code page of the data type. The actual length of a data value is determined by counting the UTF-32 code units as if the data were converted to UTF-32.
- [NATIONAL CHARACTER | NATIONAL CHAR | NCHAR](integer)
- A fixed-length string of the specified length. The default length is
1.
The NATIONAL CHARACTER type maps to a fixed-length character string with string units CODEUNITS32.
- [NATIONAL CHARACTER VARYING | NATIONAL CHAR VARYING | NCHAR VARYING | NVARCHAR](integer)
- A varying-length string of the specified maximum
length.
The NATIONAL CHARACTER VARYING type maps to a varying-length character string with string units CODEUNITS32.
- [NATIONAL CHARACTER LARGE OBJECT | NCHAR LARGE OBJECT | NCLOB](integer [K])
- A large object string of the specified maximum length. The default maximum length is 16,383
double
bytes.
This data type maps to a character large object (CLOB) with string units CODEUNITS32. See the description of the CLOB parameter for information about possible values for the length integer and how to use a K (kilo) multiplier.
- BINARY(integer)
- A fixed-length binary string of the specified length, which must be in the range 1 ‑ 255 bytes. The default length is 1.
- [VARBINARY | BINARY VARYING](integer)
- A varying-length binary string of the specified maximum length, which must be in the range 1 ‑ 32672 bytes.
- [BLOB | BINARY LARGE OBJECT](integer [K])
- A binary large object string with a maximum length of the specified number of code units. The
default maximum length is 65,535 bytes.If you want to multiply the length integer by 1024, specify a K (kilo) multiplier.
- Regardless of whether you use a K multiplier, the resulting length is limited by the maximum length of a BLOB column in an external table, which is 65,535 bytes. Note that 64K exceeds the maximum length by one, and so is not allowed.
- Any number of spaces (including zero spaces) are allowed between the data
type and the length specification or between the length integer and the
K multiplier. For
example, the following specifications are all equivalent and
valid:
BLOB(50K) BLOB(50 K) BLOB (50 K)
- The K multiplier can be specified in either uppercase or lowercase.
- DATE
- A date.
- TIME
- A time.
- TIMESTAMP(integer) or TIMESTAMP
- A timestamp. The integer specifies the number of decimal places for fractions of seconds, from 0 (seconds) to 12 (picoseconds). The default is 6 (microseconds).
- BOOLEAN
- A Boolean value.
- LIKE table-name1 or view-name or nickname
- Specifies that the columns of the table have the same name and description as the columns of the
specified table (table-name1), view (view-name),
or nickname (nickname). The specified table, view, or nickname must either
exist in the catalog or must be a declared temporary table. A typed table or typed view cannot be
specified (SQLSTATE 428EC). The use of LIKE is an implicit definition of n columns, where n is the number of columns in the identified table (including implicitly hidden columns), view, or nickname. A column of the new table that corresponds to an implicitly hidden column in the existing table will also be defined as implicitly hidden. The implicit definition depends on what is specified after LIKE:
- If a table is specified, then the implicit definition includes the column name, data type, hidden attribute, and nullability characteristic of each of the columns of that table. If EXCLUDING COLUMN DEFAULTS is not specified, then the column default is also included.
- If a view is specified, then the implicit definition includes the column name, data type, and nullability characteristic of each of the result columns of the fullselect defined in that view. The data types of the view columns must be data types that are valid for columns of a table.
- If a nickname is specified, then the implicit definition includes the column name, data type, and nullability characteristic of each column of that nickname.
- If a protected table is specified, the new table inherits the same security policy and protected columns as the identified table.
- If a table is specified and if that table contains a row-begin column, row-end column, or transaction-start-ID column, the corresponding column of the new table inherits only the data type of the source column. The new column is not considered a generated column.
- If a table that includes a period is specified, the new table does not inherit the period definition.
- If a system-period temporal table is specified, the new table is not a system-period temporal table.
- If a random distribution table that uses the random by generation method is specified, and if the new table that is being created does not share the same table distribution, the RANDOM_DISTRIBUTION_KEY column that is used to generate the random distribution values is not included.
Column default and identity column attributes can be included or excluded, based on the copy-attributes clauses. The implicit definition does not include any other attributes of the identified table, view, or nickname. Consequently, the new table does not have any primary key, unique constraints, foreign key constraints, referential integrity constraints, triggers, indexes, ORGANIZE BY specification, or PARTITIONING KEY specification. The table is created in the table space implicitly or explicitly specified by the IN clause, and the table has any other optional clause only if the optional clause is specified.
When a table is identified in the LIKE clause and that table contains a ROW CHANGE TIMESTAMP column, the corresponding column of the new table inherits only the data type of the ROW CHANGE TIMESTAMP column. The new column is not considered to be a generated column.
If a table is specified, and if row or column level access control is activated for that table, it is not inherited by the new table.
- option
- The following options control the loading of data to or retrieval of data from an external-table
file. The value of each option is a text string and is not case-sensitive.
- BOOLSTYLE or BOOLEAN_STYLE
- During a load operation, all Boolean values must use the same style. This option specifies the
Boolean style that is to be used:
- 1_0 (this is the default)
- T_F
- Y_N
- YES_NO
- TRUE_FALSE
- CARDINALITY
- Non-zero positive integer value to override the estimation of the expected number of returned rows.
- CCSID
- The coded character set identifier (CCSID) of the input data file. The value can be any valid
integer value from the CCSID specification. There is no default value. The CCSID and ENCODING options are mutually exclusive when
the value of the ENCODING option is UTF8, LATIN9, or INTERNAL.Which styles are used for dates and times depends on whether a CCSID is specified:
- When a CCSID is specified, and when DATESTYLE, TIMESTYLE, DATEDELIM, or TIMEDELIM are not specified, the values or defaults for DATE_FORMAT, TIME_FORMAT, and TIMESTAMP_FORMAT are used.
- When a CCSID is not specified, and when TIMESTAMP_FORMAT, DATE_FORMAT or TIME_FORMAT are not specified, the values or defaults for DATESTYLE, TIMESTYLE, DATEDELIM, and TIMEDELIM are used.
- COMPRESS
- For a load operation or an unload operation, whether the data file data is compressed:
- GZIP
- The data file data is compressed by using the GZIP compression algorithm.
- NO
- The data file data is not compressed. This is the default.
- LZ4
- The data file data is compressed by using the LZ4 compression algorithm.
- CRINSTRING
- How to interpret an unescaped carriage-return (CR) or carriage-return line-feed (CRLF) character:
- TRUE or ON
- An unescaped CR character is interpreted as data, not as a record delimiter. An unescaped CRLF character is split into a CR character, which is interpreted as data, and an LF character, which is interpreted as a record delimiter. The value of the LFINSTRING option does not affect how this LF character is interpreted.
- FALSE or OFF
- An unescaped CR or CRLF character is interpreted as a record delimiter. This is the default.
- CTRLCHARS
- Whether to allow an ASCII value 1 - 31 in a CHAR or VARCHAR field. Any NULL, CR, or LF
characters must be escaped. Allowed values are:
- TRUE or ON
- An ASCII value 1 - 31 in a CHAR or VARCHAR field is allowed.
- FALSE or OFF
- An ASCII value 1 - 31 in a CHAR or VARCHAR field is not allowed. This is the default.
- DATAOBJECT or FILE_NAME
- The fully-qualified name of the file (or any medium that can be treated as a file) that is to
contain the external table to be created. This option is mandatory when the name of the file is not
specified immediately after the table name; otherwise, it is not
allowed.
- For a cloud-based managed service such as Db2® Warehouse on Cloud,
when the REMOTESOURCE option is set to LOCAL (this is its default value), the path to the external
table file is relative to your Db2 managed services home directory.
Otherwise, the path to the external table file is relative to the path
/home/
followed by the authorization ID of the table definer. For example, if the authorization ID of the table definer isuser1
, the path to the external table file is relative to/home/user1/
. - For an on-premise product such as Db2 Warehouse, when
both the REMOTESOURCE option is set to LOCAL (this is its default value) and the
extbl_strict_io configuration parameter is set to NO, the path to the external
table file is an absolute path and must be one of the paths specified by the
extbl_location configuration parameter. Otherwise, the path to the external
table file is relative to the path that is specified by the extbl_location
configuration parameter followed by the authorization ID of the table definer. For example, if
extbl_location is set to
/home/xyz
and the authorization ID of the table definer isuser1
, the path to the external table file is relative to/home/xyz/user1/
.
The file name must be a valid UTF-8 string.
For a load operation, the following conditions apply:- The file must already exist.
- Required permissions:
- If the external table is a named external table, the owner must have read permission for the file and write permission for the LOGDIR directory.
- If the external table is a transient external table, the authorization ID of the statement must have read permission for the file and write permission for the LOGDIR directory.
- If the file exists, it is overwritten.
- Required permissions:
- If the external table is a named external table, the owner must have read and write permission for the directory of this file.
- If the external table is transient, the authorization ID of the statement must have read and write permission for the directory of this file.
- For a cloud-based managed service such as Db2® Warehouse on Cloud,
when the REMOTESOURCE option is set to LOCAL (this is its default value), the path to the external
table file is relative to your Db2 managed services home directory.
Otherwise, the path to the external table file is relative to the path
- DATEDELIM
- The delimiter character that separates the components of a date, according to the format specified by the DATESTYLE option. If you specify an empty string, there is no delimiter between the date components, and days and months must be specified as two-digit numbers. When DATESTYLE is set to MONDY or MONDY2, the default DATEDELIM value is a space. The TIMESTAMP_FORMAT and DATEDELIM options are mutually exclusive.
- DATESTYLE
- How to interpret the date format. For days or months in the range 1 ‑ 9, use 1
digit, 2 digits, or a space followed by a single digit. When the DATEDELIM option is a space, you
can specify a comma after the day. An error occurs if you:
- Specify zero for a day, month, or year
- Specify a nonexistent date (for example, 32 August or 30 February)
Table 1. Possible values for the DateStyle option. The example shows how the date 21 March 2014 would be represented when DATEDELIM is set to '-'. Value Description Example YMD 4-digit year, 2-digit month, 2-digit day. This is the default. 2014-03-21 DMY 2-digit day, 2-digit month, 4-digit year. 21-03-2014 MDY 2-digit month, 2-digit day, 4-digit year. 03-21-2014 MONDY 3-character month, 2-digit day, 4-digit year. Mar 21 2014 DMONY 2-digit day, 3-character month, 4-digit year. 21-Mar-2014 Y2MD 2-digit year, 2-digit month, 2-digit day. Not supported for unloads. 14-03-21 DMY2 2-digit day, 2-digit month, 2-digit year. Not supported for unloads. 21-03-14 MDY2 2-digit month, 2-digit day, 2-digit year. Not supported for unloads. 03-21-14 MONDY2 3-character month, 2-digit day, 2-digit year. Not supported for unloads. Mar 21 14 DMONY2 2-digit day, 3-character month, 2-digit year. Not supported for unloads. 21-Mar-14 - DATETIMEDELIM
- A single-byte character that separates the date component and time component of the timestamp data type.
- DATE_FORMAT
- The format of the date field in the data file. The value can be any of the date format strings that are accepted by the TIMESTAMP_FORMAT scalar function. The default is YYYY-MM-DD. The DATE_FORMAT option and the DATEDELIM or DATESTYLE option are mutually exclusive.
- DECIMALDELIM or DECIMAL_CHARACTER
- The decimal delimiter for the data types FLOAT, DOUBLE, TIME, and TIMESTAMP. Allowed values are
','
and'.'
. - DECPLUSBLANK
- Specifies how the positive decimal value is represented during the unload operation.
- DELIMITER or COLUMN_DELIMITER
- The character that is used to delimit the fields of an input or output record. The default is a
vertical bar (
'|'
).You can specify a character in the 7-bit ASCII range (decimal 1 ‑ 127) in any of the following ways:- As a single character (for example
DELIMITER ';'
) - By specifying its corresponding ASCII decimal value (for example,
DELIMITER 59
orDELIMITER '59'
) - By specifying its corresponding ASCII
hex value (for example,
DELIMITER x'3B'
)
The decimal range 128 - 255 is supported only with the ISO character set input file by specifying its corresponding ASCII decimal value or hex value. If the input file is in the UTF8 character set, this delimiter value range is not supported.
- As a single character (for example
- ENCODING
- The type of data in the file:
- UTF8
- The file uses UTF8 encoding for all character data.
- LATIN9
- The file uses LATIN9 encoding for all character data.
- INTERNAL
- The file uses a mixture of both UTF8 and LATIN9 encoding, or you are unsure which type of encoding is used. The system checks the data and encodes the data as needed. Because this checking of the data reduces overall performance, use this value only when necessary. This is the default.
- INTERNAL
- This is the default option.
- DBCS_GRAPHIC
- This value is allowed only for a load operation, not an unload operation. If this value is
specified, the CCSID option must also be specified. During the load operation, fields of type
GRAPHIC or VARGRAPHIC are encoded using the double-byte character set of the specified CCSID; fields
of all other types are encoded using the mixed-byte character set of the specified
CCSID.Note: ENCODING cannot be set to DBCS_GRAPHIC for a DEL file that was created by the EXPORT utility, because such DEL files are encoded using a single character set.
- ESCAPECHAR or ESCAPE_CHARACTER
- Which character is to be regarded as an escape character. An escape character indicates that the character that follows it, which would otherwise be treated as a field-delimiter character or end-of-row sequence character, is instead treated as part of the value in the field. The escape character is ignored for graphic-string data. There is no default.
- FILLRECORD
- For a load operation, the field of a record are loaded into the columns of a target table from
left to right. This option specifies whether an input record can contain fewer fields than there are
columns defined for the target table:
- TRUE or ON
- An input line can contain fewer fields, provided that all columns for which a value is missing are nullable. Missing values are set to NULL. If one or more columns for which a value is missing is not nullable, the record is rejected.
- FALSE or OFF
- An input line that contains fewer columns is rejected. This is the default.
- FORMAT or FILE_FORMAT
- The data format of the source file:
- TEXT
- The data to be loaded or unloaded is in text-delimited format. This is the default.
- INTERNAL
- The data is in an internal format used by Netezza Platform Software (NPS). This value is valid
only when loading data from a file to the database, not when unloading data to a file. If this value
is specified for the FORMAT option, the following options, and only these options, must also be specified:
- DATAOBJECT or FILE_NAME.
- COMPRESS. This must be set to GZIP.
- BINARY
- The data is in an internal format that is used by Db2.
- FIXED
- The data is in fixed-length format.
- IGNOREZERO or TRIM_NULLS
- Specifies whether the binary value zero in CHAR fields and VARCHAR fields is to be discarded.
- TRUE or ON
- The byte value zero is ignored.
- FALSE or OFF
- The byte value zero is not ignored. This is the default.
- KEEP
- The binary value zero is accepted and allowed as part of the input field.
- INCLUDEHEADER or COLUMN_NAMES
- For an unload operation, whether the table column names are to be included as headers in the
external-table file:
- TRUE or ON
- The table column names are to be included as headers.
- FALSE or OFF
- The table column names are not to be included as headers. This is the default.
- INCLUDEZEROSECONDS
- For an unload operation, whether to specify 00 as the value for seconds when no value for
seconds is available:
- TRUE or ON
- Specify 00 as the value for seconds.
- FALSE or OFF
- Do not specify a value for seconds. This is the default.
- INCLUDEHIDDEN
- For a load operation, specify whether hidden column values are present in a data file.
- LFINSTRING
- Specifies how to interpret an unescaped line-feed (sometimes called an LF or newline) character
within string data:
- TRUE or ON
- An unescaped LF character is interpreted as a record delimiter only if it is in the last field of a record; otherwise, it is treated as data. To cause an LF character that is in the last field of a record to be treated as data, enclose the value of that field in single or double quotation marks.
- FALSE or OFF
- An unescaped LF character is interpreted as a record delimiter regardless of its position. This is the default.
This option is not supported for unload operations. And applies only to line-feed characters, not carriage-return line-feed (CRLF) characters.
Attention: This SQL compatibility enhancement is only available in Db2 Version 11.5 Mod Pack 2 and later versions. - LOGDIR or ERROR_LOG
- The directory to which the following files are written:
- <database>.<schema>.<external-table-name>.<file-name>.<application-handle>.<id>.bad
- A file containing rejected records (that is, records that could not be processed).
- <database>.<schema>.<external-table-name>.<file-name>.<application-handle>.<id>.log
- A log file.
<file-name>
) is truncated so that the maximum is not exceeded.If a .log or .bad file is generated while carrying out an operation on a partition, the name of the generated file is suffixed with a period followed by the 3-digit partition number.
- MAXERRORS or MAX_ERRORS
- For a load operation, the threshold for the number of rejected records at which the system stops processing and immediately rolls back the load. The default is 1 (that is, a single rejected record results in a rollback).
- MAXROWS or MAX_ROWS
- If set to a positive integer, this specifies the maximum number of records (rows) in the external table that are to be processed. If set to 0 (the default), there is no limit and all rows are processed. During a load operation, if MAXROWS is set to a positive value, after that number of rows are processed, regardless of whether some of the rows were rejected or skipped, the system ends the load operation and commits all inserted records.
- MERIDIANDELIM
- A single-byte character that separates the seconds component from the AM token or PM token in the 12-hour delimited and undelimited formats of a time value.
- NOLOG
- Specifies whether the .log file for the external table is created.
- NULLVALUE or NULL_VALUE
- The UTF-8 string of at most 4 bytes that is to be used to indicate a null value. The default is
'NULL'
. - PARTITION
- If you have an MPP
deployment, an external table can be partitioned into several files. The name of each of the
data files that comprise an external table are suffixed with a period followed by a 3-digit number
from 000 to 999 that indicates the number of the partition. For example, if an external table with
the name dataFile.txt is divided into three partitions, the files that comprise it have the
names dataFile.txt.000, dataFile.txt.001, and dataFile.txt.002. These files
must be accessible from all members.For a partitioned external table, the PARTITION option specifies to which partition or partitions the statement applies:
- PARTITION ALL
- The statement applies to all of the partitions that comprise the external table. For an unload operation, this is the only value that is allowed.
- PARTITION (n TO n)
- The statement applies to all of the partitions in the specified range, for example, PARTITION (54 TO 62).
- PARTITION (n,n,…)
- The statement applies only to the specified partition or partitions, for example, PARTITION (53) or PARTITION (51,57,58). If more than one partition number is specified, they must be in ascending order (sqlcode SQL0263N with SQLSTATE=42615) and there can be no duplicates (sqlcode SQL0265N with SQLSTATE=42615).
If a .log or .bad file is generated while carrying out an operation on a partitioned external table, the name of the generated file is suffixed with a period followed by the 3-digit partition number.
If you have an MPP deployment and the PARTITION option is not specified, the external table is treated as single-partitioned table on the coordinator member. The names of the external table file and the .log and .bad files are not suffixed with a partition number.
If you do not have an MPP deployment, the PARTITION option can be specified, but only with the value ALL, (0 to 0), or (0) (sqlcode SQL0644N with SQLSTATE=42615). It will have no effect.
The REMOTESOURCE and PARTITION options are mutually exclusive.
- QUOTEDNULL
- For a load operation, how to interpret a value that is enclosed in single or double quotation
marks and that matches the null value specified by the NULLVALUE or NULL_VALUE option (for example,
"NULL" or 'NULL'):
- TRUE or ON
- The value is interpreted as a null value. This is the default.
- FALSE or OFF
- The value is interpreted as a character string.
- QUOTEDVALUE or STRING_DELIMITER
- Whether data values are enclosed in quotation marks:
- SINGLE or YES
- Data values are enclosed in single quotation marks (').
- DOUBLE
- Data values are enclosed in double quotation marks (").
- NO
- Data values are not enclosed in quotation marks. This is the default.
- RECORDDELIM or RECORD_DELIMITER
- The string literal that is to be interpreted as a row (record) delimiter. The default is
'\n'
. - REMOTESOURCE
- Where the external-table file resides and, if it resides on a remote system, whether the file
data is to be compressed:
- LOCAL
- The file resides on the local server, that is, the system that hosts the database. This is the default.
- YES
- The file resides on a system other than the local server. For example, specify YES if a client system is connected to the database and the file resides on that system. File data is not compressed before it is transferred.
- GZIP
- Similar to YES, except that the file data is compressed using the GZIP compression algorithm before the data is transferred, and is decompressed after it is received. This improves overall performance when a large amount of compressible data is being transferred.
- LZ4
- Similar to YES, except that the file data is compressed using the LZ4 compression algorithm before the data is transferred, and is decompressed after it is received. This improves overall performance when a large amount of compressible data is being transferred.
The REMOTESOURCE and PARTITION options are mutually exclusive. The COMPRESS option cannot be specified if the value of the REMOTESOURCE option is GZIP or LZ4.
- REQUIREQUOTES
- Whether quotation marks are mandatory:
- TRUE or ON
- Quotation marks are mandatory. The QUOTEDVALUE option must be set to YES, SINGLE, or DOUBLE.
- FALSE or OFF
- Quotation marks are not mandatory. This is the default.
- SKIPROWS or SKIP_ROWS
- For a load operation, the number of rows to skip before beginning to load the data. The default is 0. Because skipped rows are processed before they are skipped, a skipped row is still capable of causing a processing error.
- SOCKETBUFSIZE
- The size, in bytes, of the chunks of data that are read from the source file. Valid values range from 64 KB ‑ 800 MB. If you specify a value outside this range, the value is set to the nearest valid value. The default is 8 MB.
- STRICTNUMERIC
- For a load operation, how to treat a value that is to be inserted into a DECIMAL field when its
scale exceeds that defined for the field:
- TRUE or ON
- The row containing the value to be inserted is rejected. For example, if any of the following
values were to be loaded into a DECIMAL(5,3) field, the row containing that value would be
rejected:
12.666666666 -98.34496862785 0.00089
- FALSE or OFF
- The row containing the value to be inserted is accepted, and the portion of the decimal fraction
that exceeds the scale defined for the field is truncated. This is the default. For example, the
values in the previous example would be converted to:
12.666 -98.344 0.000
- SWIFT
- Specifies that the source data file is located in a Swift object store. Use
the DATAOBJECT option to specify the file name.Syntax:
where:SWIFT (endpoint, authKey1, authKey2, bucket)
- endpoint
- A character string that specifies the URL of the SWIFT web service.
- authKey1
- A character string that specifies the access ID or username of the Swift open stack account used to validate the user.
- authKey2
- A character string that specifies the password of the Swift open stack account used to validate the user.
- bucket
- The name of the Swift open stack container (bucket) in which the file resides.
Example:CREATE EXTERNAL TABLE exttab1(a int) using (dataobject 'datafile1.dat' swift('https://dal05.objectstorage.softlayer.net/auth/v1.0/', 'XXXOS123456-2:xxx123456', 'b207c6e974020737d92174esdf6d5be9382aa4c335945a14eaa9172c70f8df16', 'my_dev' ) )
- S3
- Specifies that the source data file is located in an S3 compatible object store. Use
the DATAOBJECT option to specify the file
name.Syntax:where:
S3 (endpoint, authKey1, authKey2, bucket)
- endpoint
- A character string that specifies the URL of the S3 compatible web service.
- authKey1
- A character string that specifies the S3 access key ID of the access keys used to validate the user and all user actions. For IBM Cloud Object Storage, this is the access key ID from the HMAC credentials.
- authKey2
- A character string that specifies the S3 secret key of the access keys that are used to validate the user and all user actions. For IBM Cloud Object Storage, this is the secret access key from the HMAC credentials.
- bucket
- The name of the S3 bucket in which the file resides.
Note: For IBM Cloud Object Storage, to create HMAC credentials, when creating new service credentials, specify{"HMAC:true}
in the Add Inline Configuration Parameters field.Example using AWS S3:CREATE EXTERNAL TABLE exttab2(a int) using (dataobject 'datafile2.dat' s3('s3.amazonaws.com', 'XXXOS123456-2:xxx123456', 'bs07c6e974040737d92174e5e96d5be9382aa4c33xxx5a14eaa9172c70f8df16', 'my_dev' ) )
Example using IBM Cloud Object Storage:CREATE EXTERNAL TABLE exttab2(a int) using (dataobject 'datafile2.dat' s3('s3-api.us-geo.objectstorage.softlayer.net', '1a2bkXXXsaddntLo0xX0', 'XXxxiEPjJ7T7WBUz74E6abcdABCDE8Q7RgU4gYY9', 'my_dev' ) )
- AZURE
- Specifies
that the source data file is located in Microsoft Azure Blob Storage. The REMOTESOURCE, SWIFT, S3,
and AZURE options are mutually exclusive. Use the DATAOBJECT option to specify the file name.
Syntax:Syntax:where:
AZURE (endpoint, authKey1, authKey2, bucket)
- endpoint
- A character string that specifies the URL of the AZURE web service.
- authKey1
- A character string that specifies the access ID or username of the Azure Blob Storage account used to validate the user.
- authKey2
- A character string that specifies the access key of the Azure Blob Storage account used to validate the user.
- bucket
- The name of the Azure Blob Storage container (bucket) in which the file resides.
Example:CREATE EXTERNAL TABLE exttab1(a int) using (dataobject 'datafile1.dat' azure('https://my_account.blob.core.windows.net', 'my_account', 'lW+oHjmZecPS++IKgThAHlMUOaFUA5C6Z2RlFmc9JPpK34RO/ZIOywzILxJnzGPHz6d/yDrcQDAwH5wySbOZMQ==', 'my_bucket' ) )
Example using IBM Cloud Object Storage:CREATE EXTERNAL TABLE exttab2(a int) using (dataobject 'datafile2.dat' s3('s3-api.us-geo.objectstorage.softlayer.net', '1a2bkXXXsaddntLo0xX0', 'XXxxiEPjJ7T7WBUz74E6abcdABCDE8Q7RgU4gYY9', 'my_dev' ) )
- TIMEDELIM
- The single-byte character that is to separate time components (hours, minutes, and seconds). The
default is
':'
. If TIMEDELIM is set to an empty string, hours, minutes, and seconds must all be specified as two-digit numbers. The TIMESTAMP_FORMAT and TIMEDELIM options are mutually exclusive. - TIMEROUNDNANOS or TIMEEXTRAZEROS
-
Note: This option applies only to TIMESTAMP columns.
- TIMESTYLE
- The time format that is to be used in the data file:
- 24HOUR
- 24-hour format, for example 23:55. This is the default.
- 12HOUR
- 12-hour format, for example 11:55 PM. An AM or PM token can be preceded by a single space and is not case-sensitive.
- TIMESTAMP_FORMAT
- The format of the timestamp field in the data file. The value can be any of the format strings that are accepted by the TIMESTAMP_FORMAT scalar function. The default is 'YYYY-MM-DD HH.MI.SS'. The TIMESTAMP_FORMAT option and the TIMEDELIM, DATEDELIM, TIMESTYLE, or DATESTYLE option are mutually exclusive.
- TIME_FORMAT
- The format of the time field in the data file. The value can be any of the time format strings that are accepted by the TIMESTAMP_FORMAT scalar function. The default is HH.MI.SS. The TIME_FORMAT option and a TIMEDELIM or TIMESTYLE option are mutually exclusive.
- TRIMBLANKS
- How an external table is to treat leading or trailing blanks (that is, leading or trailing space
characters) in a string:
- LEADING
- All leading blanks (that is, blanks that precede the first non-blank character) are removed.
- TRAILING
- All trailing blanks (that is, blanks that follow the last non-blank character) are removed.
- BOTH
- All leading and trailing blanks are removed.
- NONE
- No blanks are removed. This is the default.
- TRUNCSTRING or TRUNCATE_STRING
- How the system processes a CHAR or VARCHAR string that exceeds its declared storage size:
- TRUE
- The system truncates a string value that exceeds its declared storage size.
- FALSE
- The system returns an error when a string value exceeds its declared storage size. This is the default.
- Y2BASE
- The year that is the beginning of the 100-year range. Years that are specified as 2 digits are counted from this year. The default is 2000. This option must be specified when DATESTYLE is set to Y2MD, MDY2, DMY2, MONDY2 or DMONY2.
Table 3. Options Option Default Applies to Load Applies to Unload Azure (no default) Y Y BOOLSTYLE or BOOLEAN_STYLE 1_0 Y Y CARDINALITY (no default) Y Y CCSID (no default) Y Y COMPRESS NO Y Y CRINSTRING FALSE Y Y CTRLCHARS FALSE Y N DATAOBJECT or FILE_NAME (no default) Y Y DATEDELIM '-' Y Y DATETIMEDELIM A space (' ') Y Y DATESTYLE YMD Y Y DATE_FORMAT YYYY-MM-DD Y Y DECIMALDELIM or DECIMAL_CHARACTER '.' Y Y DELIMITER '|' Y Y ENCODING INTERNAL Y Y1 ESCAPECHAR or ESCAPE_CHARACTER (no default) Y Y FILLRECORD FALSE Y N FORMAT or FILE_FORMAT TEXT Y Y IGNOREZERO or TRIM_NULLS FALSE Y N INCLUDEHEADER or COLUMN_NAMES FALSE N Y INCLUDEZEROSECONDS FALSE Y Y INCLUDEHIDDEN FALSE Y N LFINSTRING FALSE Y N LOGDIR or ERROR_LOG target directory of external-table file Y N MULTIPARTSIZEMB value specified by the MULTIPARTSIZEMB dbm config parameter. Y N MAXERRORS or MAX_ERRORS 1 Y N MAXROWS or MAX_ROWS 0 Y N MERIDIANDELIM A space (' ') Y Y NOLOG FALSE Y Y NULLVALUE or NULL_VALUE 'NULL' Y Y PARTITION (no default) Y Y QUOTEDNULL TRUE Y N QUOTEDVALUE NO Y N RECORDDELIM or RECORD_DELIMITER '\n' Y N REMOTESOURCE LOCAL Y Y REQUIREQUOTES FALSE Y N SKIPROWS or SKIP_ROWS 0 Y N SOCKETBUFSIZE 8 MB Y Y STRICTNUMERIC FALSE Y N SWIFT (no default) Y Y S3 (no default) Y Y TIMEDELIM ':' Y Y TIMEROUNDNANOS or TIMEEXTRAZEROS FALSE Y N TIMESTAMP_FORMAT 'YYYY-MM-DD HH.MI.SS' Y Y TIMESTYLE 24HOUR Y Y TIME_FORMAT HH.MI.SS Y Y TRIMBLANKS NONE Y Y TRUNCSTRING or TRUNCATE_STRING FALSE Y N Y2BASE 2000 Y N 1 Only for the values INTERNAL, UTF8, and LATIN9. - joined-table
A joined-table specifies an intermediate result set that is the result of one or more join operations. For more information, see joined-table.
- period-specification
A period-specification identifies an intermediate result table consisting of the rows of the referenced table where the period matches the specification. A period-specification can be specified following the name of a temporal table or the name of a view. The same period name must not be specified more than once for the same table reference (SQLSTATE 428HY). The rows of the table reference are derived by application of the period specifications.
If the table is a system-period temporal table and a period-specification for the SYSTEM_TIME period is not specified, the table reference includes all current rows and does not include any historical rows of the table. If the table is an application-period temporal table and a period-specification for the BUSINESS_TIME period is not specified, the table reference includes all rows of the table. If the table is a bitemporal table and a period-specification is not specified for both SYSTEM_TIME and BUSINESS_TIME, the table reference includes all current rows of the table and does not include any historical rows of the table.
If the table reference is a single-view-reference, the rows of the view reference are derived by application of the period specifications to all of the temporal tables accessed when computing the result table of the view. If the view does not access any temporal table, then the period-specification has no effect on the result table of the view. If period-specification is used, the view definition or any view definitions referenced when computing the result table of the view must not include any references to compiled SQL functions or external functions with a data access indication other than NO SQL (SQLSTATE 428HY).
If the CURRENT TEMPORAL SYSTEM_TIME special register is set to a value other than the null value, then a period-specification that references SYSTEM_TIME must not be specified for the table reference or view reference, unless the value in effect for the SYSTIMESENSITIVE bind option is NO (SQLSTATE 428HY).
If the CURRENT TEMPORAL BUSINESS_TIME special register is set to a value other than the null value, then a period specification that references BUSINESS_TIME must not be specified for the table reference or view reference, unless the value in effect for the BUSTIMESENSITIVE bind option is NO (SQLSTATE 428HY).
- FOR SYSTEM_TIME
Specifies that the SYSTEM_TIME period is used for the period-specification. If the clause is specified following a table-name, the table must be a system-period temporal table (SQLSTATE 428HY). FOR SYSTEM_TIME must not be specified if the value of the CURRENT TEMPORAL SYSTEM_TIME special register is not the null value and the SYSTIMESENSITIVE bind option is set to YES (SQLSTATE 428HY).
- FOR BUSINESS_TIME
Specifies that the BUSINESS_TIME period is used for the period-specification. If the clause is specified following a table-name, BUSINESS_TIME must be a period defined in the table (SQLSTATE 4274M). FOR BUSINESS_TIME must not be specified if the value of the CURRENT TEMPORAL BUSINESS_TIME special register is not the null value and the BUSTIMESENSITIVE bind option is set to YES (SQLSTATE 428HY).
- value, value1, and value2
The value, value1, and value2 expressions return the null value or a value of one of the following built-in data types (SQLSTATE 428HY): a DATE, a TIMESTAMP, or a character string that is not a CLOB or DBCLOB. If the argument is a character string, it must be a valid character string representation of a timestamp or a date (SQLSTATE 22007). For the valid formats of string representations of timestamp values, see the section
String representations of datetime values
in the topic Datetime values.Each expression can contain any of the following supported operands (SQLSTATE 428HY):- Constant
- Special register
- Variable.
For details, refer to
References to variables
in Identifiers. - Parameter marker
- Scalar function whose arguments are supported operands (user-defined functions and non-deterministic functions cannot be used)
- CAST specification where the cast operand is a supported operand
- Expression using arithmetic operators and operands
- AS OF value
Specifies that the table reference includes each row for which the value of the begin column for the specified period is less than or equal to value, and the value of the end column for the period is greater than value. If value is the null value, the table reference is an empty table.
Example: The following query returns the insurance coverage information for insurance policy number 100 on August 31, 2010.SELECT coverage FROM policy_info FOR BUSINESS_TIME AS OF '2010-08-31' WHERE policy_id = '100'
- FROM value1 TO value2
Specifies that the table reference includes rows that exist for the period specified from value1 to value2. A row is included in the table reference if the value of the begin column for the specified period in the row is less than value2, and the value of the end column for the specified period in the row is greater than value1. The table reference contains zero rows if value1 is greater than or equal to value2. If value1 or value2 is the null value, the table reference is an empty table.
Example: The following query returns the insurance coverage information for insurance policy 100, during the year 2009 (from January 1, 2009 at 12:00 AM until before January 1, 2010).SELECT coverage FROM policy_info FOR BUSINESS_TIME FROM '2009-01-01' TO '2010-01-01' WHERE policy_id = '100'
- BETWEEN value1 AND value2
Specifies that the table reference includes rows in which the specified period overlaps at any point in time between value1 and value2. A row is included in the table reference if the value of the begin column for the specified period in the row is less than or equal to value2 and the value of the end column for the specified period in the row is greater than value1. The table reference contains zero rows if value1 is greater than value2. If value1 is equal to value2, the expression is equivalent to AS OF value1. If value1 or value2 is the null value, the table reference is an empty table.
Example: The following query returns the insurance coverage information for insurance policy number 100, during the year 2008 (between January 1, 2008 and December 31, 2008 inclusive).SELECT coverage FROM policy_info FOR BUSINESS_TIME BETWEEN '2008-01-01' AND '2008-12-31' WHERE policy_id = '100'
Following are syntax alternatives for period-specification clauses:- AS OF TIMESTAMP can be specified in place of FOR SYSTEM_TIME AS OF
- VERSIONS BETWEEN TIMESTAMP can be specified in place of FOR SYSTEM_TIME BETWEEN
- correlation-clause
- The exposed names of all table references must be unique. An exposed name is:
- A correlation-name
- A table-name that is not followed by a correlation-name
- A view-name that is not followed by a correlation-name
- A nickname that is not followed by a correlation-name
- An alias-name that is not followed by a correlation-name
If a correlation-clause clause does not follow a function-name reference, xmltable-expression expression, nested table expression, or data-change-table-reference reference, or if a typed-correlation-clause clause does not follow a function-name reference, then there is no exposed name for that table reference.
Each correlation-name is defined as a designator of the immediately preceding table-name, view-name, nickname, function-name reference, xmltable-expression, nested table expression, or data-change-table-reference. Any qualified reference to a column must use the exposed name. If the same table name, view, or nickname is specified twice, at least one specification must be followed by a correlation-name. The correlation-name is used to qualify references to the columns of the table, view or nickname. When a correlation-name is specified, column-names can also be specified to give names to the columns of the table reference. If the correlation-clause does not include column-names, the exposed column names are determined as follows:- Column names of the referenced table, view, or nickname when the table-reference is a table-name, view-name, nickname, or alias-name
- Column names specified in the RETURNS clause of the CREATE FUNCTION statement when the table-reference is a function-name reference
- Column names specified in the COLUMNS clause of the xmltable-expression when the table-reference is an xmltable-expression
- Column names exposed by the fullselect when the table-reference is a nested-table-expression
- Column names from the target table of the data change statement, along with any defined INCLUDE columns when the table-reference is a data-change-table-reference
- tablesample-clause
The optional tablesample-clause can be used to obtain a random subset (a sample) of the rows from the specified table-name, rather than the entire contents of that table-name, for this query. This sampling is in addition to any predicates that are specified in the where-clause. Unless the optional REPEATABLE clause is specified, each execution of the query will usually yield a different sample, except in degenerate cases where the table is so small relative to the sample size that any sample must return the same rows. The size of the sample is controlled by the numeric-expression1 in parentheses, representing an approximate percentage (P) of the table to be returned.
- TABLESAMPLE
-
The method by which the sample is obtained is specified after the TABLESAMPLE keyword, and can be either BERNOULLI or SYSTEM. The exact number of rows in the sample might be different for each execution of the query, but on average is approximately P percent of the table, before any predicates further reduce the number of rows.
The table-name must be a stored table. It can be a materialized query table (MQT) name, but not a subselect or table expression for which an MQT has been defined, because there is no guarantee that the database manager will route to the MQT for that subselect.
Semantically, sampling of a table occurs before any other query processing, such as applying predicates or performing joins. Repeated accesses of a sampled table within a single execution of a query (such as in a nested-loop join or a correlated subquery) will return the same sample. More than one table can be sampled in a query.
- BERNOULLI
BERNOULLI sampling considers each row individually. It includes each row in the sample with probability P/100 (where P is the value of numeric-expression1), and excludes each row with probability 1 - P/100, independently of the other rows. So if the numeric-expression1 evaluated to the value 10, representing a ten percent sample, each row would be included with probability 0.1, and excluded with probability 0.9.
- SYSTEM
-
SYSTEM sampling permits the database manager to determine the most efficient manner in which to perform the sampling. In most cases, SYSTEM sampling applied to a table-name means that each page of table-name is included in the sample with probability P/100, and excluded with probability 1 - P/100. All rows on each page that is included qualify for the sample. SYSTEM sampling of a table-name generally executes much faster than BERNOULLI sampling, because fewer data pages are retrieved. However, SYSTEM sampling can often yield less accurate estimates for aggregate functions, such as SUM(SALES), especially if the rows of table-name are clustered on any columns referenced in that query. The optimizer might in certain circumstances decide that it is more efficient to perform SYSTEM sampling as if it were BERNOULLI sampling. An example is when a predicate on table-name can be applied by an index and is much more selective than the sampling rate P.
- numeric-expression1
The numeric-expression1 specifies the size of the sample to be obtained from table-name, expressed as a percentage. It must be a constant numeric expression that cannot contain columns. The expression must evaluate to a positive number that is less than or equal to 100, but can be between 1 and 0. For example, a value of 0.01 represents one one-hundredth of a percent, meaning that 1 row in 10 000 is sampled, on average. A numeric-expression1 that evaluates to 100 is handled as if the tablesample-clause were not specified. If numeric-expression1 evaluates to the null value, or to a value that is greater than 100 or less than 0, an error is returned (SQLSTATE 2202H).
- REPEATABLE (numeric-expression2)
It is sometimes desirable for sampling to be repeatable from one execution of the query to the next; for example, during regression testing or query debugging. This can be accomplished by specifying the REPEATABLE clause. The REPEATABLE clause requires the specification of a numeric-expression2 in parentheses, which serves the same role as the seed in a random number generator. Adding the REPEATABLE clause to the tablesample-clause of any table-name ensures that repeated executions of that query (using the same value for numeric-expression2) return the same sample, assuming that the data itself has not been updated, reorganized, or repartitioned. To guarantee that the same sample of table-name is used across multiple queries, use of a global temporary table is recommended. Alternatively, the multiple queries can be combined into one query, with multiple references to a sample that is defined using the WITH clause.
Examples:- Request a 10% Bernoulli sample of the Sales table for auditing purposes.
SELECT * FROM Sales TABLESAMPLE BERNOULLI(10)
- Compute
the total sales revenue in the Northeast region for each product category, using a random 1% SYSTEM
sample of the Sales table. The semantics of SUM are for the sample itself, so to extrapolate the
sales to the entire Sales table, the query must divide that SUM by the sampling rate (0.01).
SELECT SUM(Sales.Revenue) / (0.01) FROM Sales TABLESAMPLE SYSTEM(1) WHERE Sales.RegionName = 'Northeast' GROUP BY Sales.ProductCategory
- Using
the REPEATABLE clause, modify the previous query to ensure that the same (yet random) result is
obtained each time the query is executed. The value of the constant enclosed by parentheses is
arbitrary.
SELECT SUM(Sales.Revenue) / (0.01) FROM Sales TABLESAMPLE SYSTEM(1) REPEATABLE(3578231) WHERE Sales.RegionName = 'Northeast' GROUP BY Sales.ProductCategory
- table-UDF-cardinality-clause
- The table-UDF-cardinality clause can be specified for each
user-defined table function reference within the FROM clause. This option indicates the expected
number of rows to be returned only for the SELECT statement that contains it. The CARDINALITY and
CARDINALITY MULTIPLIER clauses are not allowed if the table function is an inlined SQL table
function (SQLSTATE 42887).
- CARDINALITY integer-constant
- Specifies an estimate of the expected number of rows that are returned by the reference to the user-defined function. The value range of integer-constant is from 0 to 9 223 372 036 854 775 807 inclusive.
- CARDINALITY MULTIPLIER numeric-constant
-
The product of the specified CARDINALITY MULTIPLIER numeric-constant and the reference cardinality value are used by the database server as the expected number of rows that are returned by the table function reference.
In this case, numeric-constant can be in the integer, decimal, or floating-point format. The value must be greater than or equal to zero. If the decimal number notation is used, the number of digits can be up to 31. An integer value is treated as a decimal number with no fraction. If zero is specified or the computed cardinality is less than 1, the cardinality of the reference to the user-defined table function is assumed to be 1.
The value in the CARDINALITY column of SYSSTAT.ROUTINES for the table function name is used as the reference cardinality value. If no value is set in the CARDINALITY column of SYSSTAT.ROUTINES, a finite value is assumed as its default value for the reference cardinality value. This finite value is the same value that is assumed for tables for which the RUNSTATS utility has not gathered statistics.
Only a numeric constant can follow the keyword CARDINALITY or CARDINALITY MULTIPLIER. A host variable or parameter marker is not supported. Specifying a cardinality value in a table function reference does not change the CARDINALITY column value for the function in the SYSSTAT.ROUTINES catalog view.
The CARDINALITY value for external and compiled SQL user-defined table functions can be changed by updating the CARDINALITY column in the SYSSTAT.ROUTINES catalog view. The CARDINALITY value for an external table function can also be initialized by specifying the CARDINALITY option in the CREATE FUNCTION (external table) statement when a user-defined table function is created.
- typed-correlation-clause
- A typed-correlation-clause clause defines the appearance and contents of the table generated by a generic table function. This clause must be specified when the table-function-references is a generic table function and cannot be specified for any other table reference. The following data-type values are supported in generic table functions:
Table 4. Data types supported in generic table functions SQL column data type Equivalent Java™ data type SMALLINT short INTEGER int BIGINT long REAL float DOUBLE double DECIMAL(p,s) java.math.BigDecimal NUMERIC(p,s) java.math.BigDecimal CHAR(n) java.lang.String CHAR(n) FOR BIT DATA COM.ibm.db2.app.Blob VARCHAR(n) java.lang.String VARCHAR(n) FOR BIT DATA COM.ibm.db2.app.Blob GRAPHIC(n) java.lang.String VARGRAPHIC(n) String BLOB(n) COM.ibm.db2.app.Blob CLOB(n) COM.ibm.db2.app.Clob DBCLOB(n) COM.ibm.db2.app.Clob DATE String TIME String TIMESTAMP String XML AS CLOB(n) COM.ibm.db2.jcc.DB2Xml - continue-handler
Certain errors that occur within a nested-table-expression can be tolerated, and instead of returning an error, the query can continue and return a result. This is referred to as an error tolerant nested-table-expression.
Specifying the RETURN DATA UNTIL clause will cause any rows that are returned from the fullselect before the indicated condition is encountered to make up the result set from the fullselect. This means that a partial result set (which can also be an empty result set) from the fullselect is acceptable as the result for the nested-table-expression.
The FEDERATED keyword restricts the condition to handle only errors that occur at a remote data source.
The condition can be specified as an SQLSTATE value, with a string-constant length of 5. You can optionally specify an SQLCODE value for each specified SQLSTATE value. For portable applications, specify SQLSTATE values as much as possible, because SQLCODE values are generally not portable across platforms and are not part of the SQL standard.
Only certain conditions can be tolerated. Errors that do not allow the rest of the query to be executed cannot be tolerated, and an error is returned for the whole query. The specific-condition-value might specify conditions that cannot actually be tolerated by the database manager, even if a specific SQLSTATE or SQLCODE value is specified, and for these cases, an error is returned.
A query or view containing an error tolerant nested-table-expression is read-only.
The fullselect of an error tolerant nested-table-expression is not optimized using materialized query tables.
- specific-condition-value
- The following SQLSTATE values and SQLCODE values have the potential, when specified, to be tolerated by the database manager:
- SQLSTATE 08001; SQLCODEs -1336, -30080, -30081, -30082
- SQLSTATE 08004
- SQLSTATE 42501
- SQLSTATE 42704; SQLCODE -204
- SQLSTATE 42720
- SQLSTATE 28000
Correlated references in table-references
Example 1: SELECT t.c1, z.c5
FROM t, TABLE( tf3(t.c2) ) AS z -- t precedes tf3
WHERE t.c3 = z.c4; -- in FROM, so t.c2
-- is known
Example 2: SELECT t.c1, z.c5
FROM t, TABLE( tf4(2 * t.c2) ) AS z -- t precedes tf4
WHERE t.c3 = z.c4; -- in FROM, so t.c2
-- is known
Example 3: SELECT d.deptno, d.deptname,
empinfo.avgsal, empinfo.empcount
FROM department d,
LATERAL (SELECT AVG(e.salary) AS avgsal,
COUNT(*) AS empcount
FROM employee e -- department precedes nested
WHERE e.workdept=d.deptno -- table expression and
) AS empinfo; -- LATERAL is specified,
-- so d.deptno is known
Example 4: SELECT t.c1, z.c5
FROM TABLE( tf6(t.c2) ) AS z, t -- cannot resolve t in t.c2!
WHERE t.c3 = z.c4; -- compare to Example 1 above.
Example 5: SELECT a.c1, b.c5
FROM TABLE( tf7a(b.c2) ) AS a, TABLE( tf7b(a.c6) ) AS b
WHERE a.c3 = b.c4; -- cannot resolve b in b.c2!
Example 6: SELECT d.deptno, d.deptname,
empinfo.avgsal, empinfo.empcount
FROM department d,
(SELECT AVG(e.salary) AS avgsal,
COUNT(*) AS empcount
FROM employee e -- department precedes nested
WHERE e.workdept=d.deptno -- table expression but
) AS empinfo; -- LATERAL is not specified,
-- so d.deptno is unknown