CREATE EXTERNAL TABLE statement

While tables typically reside in a database, an external table resides in a text-based, delimited file, or in a fixed-length-format file outside of a database.

Use an external table to:
  • Store data outside the database while retaining the ability to query that data. To unload data from the database into an external file, specify the external table as the target table in one of the following SQL statements:
    • INSERT SQL
    • SELECT INTO SQL
    • CREATE EXTERNAL TABLE AS SELECT SQL
  • Load data from an external file into a table in the database. You can perform operations such as casts, joins, and dropping columns to manipulate data during loading. To load data into the database from an external table, use a FROM clause in a SELECT SQL statement as you would for any other table.
  • Transfer data to another application.

An advantage of using external tables for Extraction-Transformation-Loading (ETL) processes is that they can be carried out using plain SQL. Because an SQL-based ETL process can be initiated from any SQL client, it eliminates the need for special ETL tools.

An external table is of one of the following types:
Named
The external table has a name and catalog entry similar to a normal table.
Transient
The external table has a system-generated name of the form SYSTET<number> and does not have a catalog entry. For example, the system might create a transient external table to hold the result of a query. The lifetime of such a table is the duration of the query.

Invocation

This statement can be embedded in an application program or issued using dynamic SQL statements. It is an executable statement that can be dynamically prepared only if DYNAMICRULES run behavior is in effect for the package (SQLSTATE 42509).

Authorization

The privileges that are held by the authorization ID of the statement must have either DBADM authority or the following privileges or authorities:
  • CREATETAB authority
  • One of the following privileges or authorities:
    • USE privilege on the table space
    • SYSADM authority
    • SYSCTRL authority
  • One of the following privileges or authorities:
    • IMPLICIT_SCHEMA authority on the database (if the implicit or explicit schema name of the table does not exist)
    • CREATEIN privilege on the schema (if the schema name of the table refers to an existing schema)
    • SCHEMAADM authority on the schema (if the schema name of the table refers to an existing schema)
If a subtable is being defined, one of the following conditions must be met:
  • The authorization ID must be the same as the owner of the root table of the table hierarchy.
  • The privileges that are held by the authorization ID must include SCHEMAADM authority on the schema that contains the root table of the table hierarchy.
  • The privileges that are held by the authorization ID must include DBADM authority.

Syntax

The syntax of this statement depends on the nature of the external table that is to be created:
  • Use the following syntax to create, in the catalog, a table definition for a new external table. Specifying a table name is mandatory, so the resulting external table is a named table. A DATAOBJECT or FILE_NAME option must be specified to identify the target file.
    Read syntax diagramSkip visual syntax diagram CREATE EXTERNAL TABLE table-name (,column-definition)LIKEtable-name1view-namenickname USING ( optionoption-value)
  • Use the following syntax to use an existing table as a template for a new external table and to populate it with the contents of the source table. If this statement specifies a table name explicitly, the resulting external table is a named table; otherwise, the resulting external table is a transient table. The file name must be specified by either the file-name parameter or a DATAOBJECT or FILE_NAME option.
    Read syntax diagramSkip visual syntax diagram CREATE EXTERNAL TABLE table-name file-name USING ( optionoption-value) AS fullselect
If you specify one or more column definitions, the following additional parameters apply:
column-definition
Read syntax diagramSkip visual syntax diagram column-name built-in-typeNOT NULL
built-in-type
Read syntax diagramSkip visual syntax diagramSMALLINTINTEGERINTBIGINTDECIMALDECNUMERICNUM(5,0)( precision-integer,0, scale-integer)FLOAT(53)( integer)REALDOUBLEPRECISIONDECFLOAT(34)(16)CHARACTERCHAR(1)( integerOCTETSCODEUNITS32)VARCHARCHARACTERCHARVARYING( integerOCTETSCODEUNITS32)FOR BIT DATACLOBCHARACTERCHARLARGE OBJECT(65535)( integerKOCTETSCODEUNITS32)GRAPHIC(1)( integerCODEUNITS16CODEUNITS32)VARGRAPHIC( integerCODEUNITS16CODEUNITS32)DBCLOB(32767)( integerKCODEUNITS16CODEUNITS32)NCHARNATIONALCHARCHARACTER(1)( integer)NVARCHARNCHAR VARYINGNATIONALCHARCHARACTERVARYING( integer)NCLOBNCHAR LARGE OBJECTNATIONAL CHARACTER LARGE OBJECT(16383)( integerK)BINARY(1)( integer)VARBINARYBINARY VARYING(integer)BLOBBINARY LARGE OBJECT(65535)( integerK)DATETIMETIMESTAMP(6)(integer)BOOLEAN

Description

table-name
The names of the external table. The name, including the implicit or explicit qualifier, must not identify a table, view, nickname, or alias that is already described in the catalog. The schema name cannot be SYSIBM, SYSCAT, SYSFUN, or SYSSTAT (SQLSTATE 42939).
file-name
The fully-qualified name of the file (or any medium that can be treated as a file) that is to contain the external table to be created. If this parameter is specified, the DATAOBJECT or FILE_NAME option cannot be specified.
If the input data file is gzip compressed, this will be detected and the data is uncompressed by the Db2® server. The COMPRESS GZIP option can be used, but it is not mandatory for loading gzip compressed files.
If the input data file is lz4 compressed, the COMPRESS LZ4 option must be specified.
  • For a cloud-based managed service such as Db2 Warehouse on Cloud, when the REMOTESOURCE option is set to LOCAL (this is its default value), the path to the external table file is relative to your Db2 managed services home directory. Otherwise, the path to the external table file is relative to the path /home/ followed by the authorization ID of the table definer. For example, if the authorization ID of the table definer is user1, the path to the external table file is relative to /home/user1/.
  • For an on-premise product such as Db2 Warehouse, when both the REMOTESOURCE option is set to LOCAL (this is its default value) and the extbl_strict_io configuration parameter is set to NO, the path to the external table file is an absolute path and must be one of the paths specified by the extbl_location configuration parameter. Otherwise, the path to the external table file is relative to the path that is specified by the extbl_location configuration parameter followed by the authorization ID of the table definer. For example, if extbl_location is set to /home/xyz and the authorization ID of the table definer is user1, the path to the external table file is relative to /home/xyz/user1/.
The file name must be a valid UTF-8 string.
For a load operation, the following conditions apply:
  • The file must already exist.
  • Required permissions:
    • If the external table is a named external table, the owner must have read permission for the file and write permission for the LOGDIR directory.
    • If the external table is a transient external table, the authorization ID of the statement must have read permission for the file and write permission for the LOGDIR directory.
For an unload operation, the following conditions apply:
  • If the file exists, it is overwritten.
  • Required permissions:
    • If the external table is a named external table, the owner must have read and write permission for the directory of this file.
    • If the external table is transient, the authorization ID of the statement must have read and write permission for the directory of this file.
column-definition
Defines the attributes of a column.
column-name
Names a column of the table. The name cannot be qualified, and the same name cannot be used for more than one column of the table (SQLSTATE 42711).
built-in-type
One of the following built-in data types:
SMALLINT
A small integer.
[INTEGER | INT]
A large integer.
BIGINT
A big integer.
[DECIMAL | DEC | NUMERIC | NUM](precision-integer, scale-integer)
A decimal number.
  • The precision integer specifies the total number of digits. It must be in the range 1 ‑ 31. The default is 5.
  • The scale integer specifies the number of digits to the right of the decimal point. It cannot be negative and cannot exceed the precision. The default is 0.
FLOAT(integer)
A single or double-precision floating-point number. If the specified length is in the range:
  • 1 - 24, the number uses single precision
  • 25 - 53, the number uses double-precision
Instead of FLOAT, you can specify:
REAL
For single precision floating-point.
DOUBLE
For double-precision floating-point.
DOUBLE PRECISION
For double-precision floating-point.
FLOAT
For double-precision floating-point.
DECFLOAT(precision-integer)
A decimal floating-point number. The precision integer specifies the total number of digits, which can be either 16 or 34. The default is 34.
[CHARACTER | CHAR](integer [OCTETS | CODEUNITS32])
A fixed-length character string of the specified number of code units. This number can range from 1 ‑ 255 OCTETS or from 1 - 63 CODEUNITS32. The default is 1.
[VARCHAR | CHARACTER VARYING | CHAR VARYING](integer [OCTETS | CODEUNITS32])
A varying-length character string with a maximum length of the specified number of code units. This number can range from 1 ‑ 32672 OCTETS or from 1 - 8168 CODEUNITS32.
FOR BIT DATA
Specifies that the contents of the column are to be treated as bit (binary) data. During data exchange with other systems, code page conversions are not performed. Comparisons are done in binary, irrespective of the database collating sequence.
The FOR BIT DATA clause cannot be specified with string units CODEUNITS32 (SQLSTATE 42613).
[CLOB | CHARACTER LARGE OBJECT | CHAR LARGE OBJECT](integer [K] [OCTETS | CODEUNITS32])
A character large object string with a maximum length of the specified number of code units. The default maximum length is 65,535 bytes.
If you want to multiply the length integer by 1024, specify a K (kilo) multiplier.
  • Regardless of whether you use a K multiplier, the resulting length is limited by the maximum length of a CLOB column in an external table, which is 65,535 OCTETS, 32,767 CODEUNITS16, or 16,383 CODEUNITS32. Note that 64K OCTETS and 16K CODEUNITS32 each exceed the maximum length by one, and so are not allowed.
  • Any number of spaces (including zero spaces) are allowed between the data type and the length specification or between the length integer and the K multiplier. For example, the following specifications are all equivalent and valid:
    CLOB(50K)
    CLOB(50 K)
    CLOB (50   K)
  • The K multiplier can be specified in either uppercase or lowercase.

The default string units are OCTETS.

OCTETS
Specifies that the units of the length attribute are bytes.
CODEUNITS32
Specifies that the units of the length attribute are Unicode UTF-32 code units, which approximates counting in characters. This does not affect the underlying code page of the data type. The actual length of a data value is determined by counting the UTF-32 code units as if the data were converted to UTF-32.
GRAPHIC(integer [CODEUNITS16 | CODEUNITS32])
A fixed-length graphic string of the specified length, which can range from 1 ‑ 127 double bytes, 1 ‑ 127 CODEUNITS16, or 1 ‑ 63 CODEUNITS32. The default length is 1.
VARGRAPHIC(integer [CODEUNITS16 | CODEUNITS32])
A varying-length graphic string of the specified maximum length, which can range from 1 ‑ 16336 double bytes, 1 ‑ 16336 CODEUNITS16, or 1 ‑ 8168 CODEUNITS32.
DBCLOB(integer [K] [CODEUNITS16 | CODEUNITS32])
A character large object string of the specified maximum length in double bytes, Unicode UTF-16 code units, or Unicode UTF-32 code units. The default maximum length is 32,767 double bytes.
If you want to multiply the length integer by 1024, specify a K (kilo) multiplier.
  • Regardless of whether you use a K multiplier, the resulting length is limited by the maximum length of a DBCLOB column in an external table, which is 32,767 CODEUNITS16 or 16,383 CODEUNITS32. Note that 32K CODEUNITS16 and 16K CODEUNITS32 each exceed the maximum length by one, and so are not allowed.
  • Any number of spaces (including zero spaces) are allowed between the data type and the length specification or between the length integer and the K multiplier. For example, the following specifications are all equivalent and valid:
    DBCLOB(50K)
    DBCLOB(50 K)
    DBCLOB (50   K)
  • The K multiplier can be specified in either uppercase or lowercase.

The default string units are CODEUNITS16.

CODEUNITS16
Specifies that the units of the length attribute are Unicode UTF-16 code units, which is the same as counting in double bytes.
CODEUNITS32
Specifies that the units of the length attribute are Unicode UTF-32 code units. This does not affect the underlying code page of the data type. The actual length of a data value is determined by counting the UTF-32 code units as if the data were converted to UTF-32.
[NATIONAL CHARACTER | NATIONAL CHAR | NCHAR](integer)
A fixed-length string of the specified length. The default length is 1.

The NATIONAL CHARACTER type maps to a fixed-length character string with string units CODEUNITS32.

[NATIONAL CHARACTER VARYING | NATIONAL CHAR VARYING | NCHAR VARYING | NVARCHAR](integer)
A varying-length string of the specified maximum length.

The NATIONAL CHARACTER VARYING type maps to a varying-length character string with string units CODEUNITS32.

[NATIONAL CHARACTER LARGE OBJECT | NCHAR LARGE OBJECT | NCLOB](integer [K])
A large object string of the specified maximum length. The default maximum length is 16,383 double bytes.

This data type maps to a character large object (CLOB) with string units CODEUNITS32. See the description of the CLOB parameter for information about possible values for the length integer and how to use a K (kilo) multiplier.

BINARY(integer)
A fixed-length binary string of the specified length, which must be in the range 1 ‑ 255 bytes. The default length is 1.
[VARBINARY | BINARY VARYING](integer)
A varying-length binary string of the specified maximum length, which must be in the range 1 ‑ 32672 bytes.
[BLOB | BINARY LARGE OBJECT](integer [K])
A binary large object string with a maximum length of the specified number of code units. The default maximum length is 65,535 bytes.
If you want to multiply the length integer by 1024, specify a K (kilo) multiplier.
  • Regardless of whether you use a K multiplier, the resulting length is limited by the maximum length of a BLOB column in an external table, which is 65,535 bytes. Note that 64K exceeds the maximum length by one, and so is not allowed.
  • Any number of spaces (including zero spaces) are allowed between the data type and the length specification or between the length integer and the K multiplier. For example, the following specifications are all equivalent and valid:
    BLOB(50K)
    BLOB(50 K)
    BLOB (50   K)
  • The K multiplier can be specified in either uppercase or lowercase.
DATE
A date.
TIME
A time.
TIMESTAMP(integer) or TIMESTAMP
A timestamp. The integer specifies the number of decimal places for fractions of seconds, from 0 (seconds) to 12 (picoseconds). The default is 6 (microseconds).
BOOLEAN
A Boolean value.
LIKE table-name1 or view-name or nickname
Specifies that the columns of the table have the same name and description as the columns of the specified table (table-name1), view (view-name), or nickname (nickname). The specified table, view, or nickname must either exist in the catalog or must be a declared temporary table. A typed table or typed view cannot be specified (SQLSTATE 428EC).
The use of LIKE is an implicit definition of n columns, where n is the number of columns in the identified table (including implicitly hidden columns), view, or nickname. A column of the new table that corresponds to an implicitly hidden column in the existing table will also be defined as implicitly hidden. The implicit definition depends on what is specified after LIKE:
  • If a table is specified, then the implicit definition includes the column name, data type, hidden attribute, and nullability characteristic of each of the columns of that table. If EXCLUDING COLUMN DEFAULTS is not specified, then the column default is also included.
  • If a view is specified, then the implicit definition includes the column name, data type, and nullability characteristic of each of the result columns of the fullselect defined in that view. The data types of the view columns must be data types that are valid for columns of a table.
  • If a nickname is specified, then the implicit definition includes the column name, data type, and nullability characteristic of each column of that nickname.
  • If a protected table is specified, the new table inherits the same security policy and protected columns as the identified table.
  • If a table is specified and if that table contains a row-begin column, row-end column, or transaction-start-ID column, the corresponding column of the new table inherits only the data type of the source column. The new column is not considered a generated column.
  • If a table that includes a period is specified, the new table does not inherit the period definition.
  • If a system-period temporal table is specified, the new table is not a system-period temporal table.
  • If a random distribution table that uses the random by generation method is specified, and if the new table that is being created does not share the same table distribution, the RANDOM_DISTRIBUTION_KEY column that is used to generate the random distribution values is not included.

Column default attributes can be included or excluded, based on the copy-attributes clauses. The implicit definition does not include any other attributes of the specified table, view, or nickname. Consequently, the new table does not have any primary key, unique constraints, foreign key constraints, referential integrity constraints, triggers, indexes, ORGANIZE BY specification, or PARTITIONING KEY specification.

When a table is identified in the LIKE clause and that table contains a ROW CHANGE TIMESTAMP column, the corresponding column of the new table inherits only the data type of the ROW CHANGE TIMESTAMP column. The new column is not considered to be a generated column.

If a table is specified, and if row or column level access control is activated for that table, it is not inherited by the new table.

option
The following options control the loading of data to or retrieval of data from an external-table file. The value of each option is a text string and is not case-sensitive.
BOOLSTYLE or BOOLEAN_STYLE
During a load operation, all Boolean values must use the same style. This option specifies the Boolean style that is to be used:
  • 1_0 (this is the default)
  • T_F
  • Y_N
  • YES_NO
  • TRUE_FALSE
CARDINALITY
Non-zero positive integer value to override the estimation of the expected number of returned rows.
CCSID
The coded character set identifier (CCSID) of the input data file. The value can be any valid integer value from the CCSID specification. There is no default value. The CCSID and ENCODING options are mutually exclusive when the value of the ENCODING option is UTF8, LATIN9, or INTERNAL.
Which styles are used for dates and times depends on whether a CCSID is specified:
  • When a CCSID is specified, and when DATESTYLE, TIMESTYLE, DATEDELIM, or TIMEDELIM are not specified, the values or defaults for DATE_FORMAT, TIME_FORMAT, and TIMESTAMP_FORMAT are used.
  • When a CCSID is not specified, and when TIMESTAMP_FORMAT, DATE_FORMAT or TIME_FORMAT are not specified, the values or defaults for DATESTYLE, TIMESTYLE, DATEDELIM, and TIMEDELIM are used.
COMPRESS
For a load operation or an unload operation, whether the data file data is compressed:
GZIP
The data file data is compressed by using the GZIP compression algorithm.
NO
The data file data is not compressed. This is the default.
LZ4
The data file data is compressed by using the LZ4 compression algorithm.
The COMPRESS option cannot be specified if the value of the REMOTESOURCE option is GZIP or LZ4.
CRINSTRING
How to interpret an unescaped carriage-return (CR) or carriage-return line-feed (CRLF) character:
TRUE or ON
An unescaped CR character is interpreted as data, not as a record delimiter. An unescaped CRLF character is split into a CR character, which is interpreted as data, and an LF character, which is interpreted as a record delimiter. The value of the LFINSTRING option does not affect how this LF character is interpreted.
FALSE or OFF
An unescaped CR or CRLF character is interpreted as a record delimiter. This is the default.
Use fixed-length format for CRINSTRING only if the value of the CtrlChars option is set to OFF.
CTRLCHARS
Whether to allow an ASCII value 1 - 31 in a CHAR or VARCHAR field. Any NULL, CR, or LF characters must be escaped. Allowed values are:
TRUE or ON
An ASCII value 1 - 31 in a CHAR or VARCHAR field is allowed.
If fixed-length format is enabled, all unescaped characters are allowed.
FALSE or OFF
An ASCII value 1 - 31 in a CHAR or VARCHAR field is not allowed. This is the default.
If fixed-length format is enabled, unescaped characters cause an error.
Exceptions for fixed-length format:
  • \t, \n
  • \r if the CRinString option is set to ON
DATAOBJECT or FILE_NAME
The fully-qualified name of the file (or any medium that can be treated as a file) that is to contain the external table to be created. This option is mandatory when the name of the file is not specified immediately after the table name; otherwise, it is not allowed.
  • For a cloud-based managed service such as Db2 Warehouse on Cloud, when the REMOTESOURCE option is set to LOCAL (this is its default value), the path to the external table file is relative to your Db2 managed services home directory. Otherwise, the path to the external table file is relative to the path /home/ followed by the authorization ID of the table definer. For example, if the authorization ID of the table definer is user1, the path to the external table file is relative to /home/user1/.
  • For an on-premise product such as Db2 Warehouse, when both the REMOTESOURCE option is set to LOCAL (this is its default value) and the extbl_strict_io configuration parameter is set to NO, the path to the external table file is an absolute path and must be one of the paths specified by the extbl_location configuration parameter. Otherwise, the path to the external table file is relative to the path that is specified by the extbl_location configuration parameter followed by the authorization ID of the table definer. For example, if extbl_location is set to /home/xyz and the authorization ID of the table definer is user1, the path to the external table file is relative to /home/xyz/user1/.
The file name must be a valid UTF-8 string.
For a load operation, the following conditions apply:
  • The file must already exist.
  • Required permissions:
    • If the external table is a named external table, the owner must have read permission for the file and write permission for the LOGDIR directory.
    • If the external table is a transient external table, the authorization ID of the statement must have read permission for the file and write permission for the LOGDIR directory.
For an unload operation, the following conditions apply:
  • If the file exists, it is overwritten.
  • Required permissions:
    • If the external table is a named external table, the owner must have read and write permission for the directory of this file.
    • If the external table is transient, the authorization ID of the statement must have read and write permission for the directory of this file.
DATEDELIM
The delimiter character that separates the components of a date, according to the format specified by the DATESTYLE option. If you specify an empty string, there is no delimiter between the date components, and days and months must be specified as two-digit numbers. When DATESTYLE is set to MONDY or MONDY2, the default DATEDELIM value is a space. The TIMESTAMP_FORMAT and DATEDELIM options are mutually exclusive.
DATESTYLE
How to interpret the date format. For days or months in the range 1 ‑ 9, use 1 digit, 2 digits, or a space followed by a single digit. When the DATEDELIM option is a space, you can specify a comma after the day. An error occurs if you:
  • Specify zero for a day, month, or year
  • Specify a nonexistent date (for example, 32 August or 30 February)
The DATESTYLE option and the DATE_FORMAT or TIMESTAMP_FORMAT option are mutually exclusive.
Table 1. Possible values for the DateStyle option. The example shows how the date 21 March 2014 would be represented when DATEDELIM is set to '-'.
Value Description Example
YMD 4-digit year, 2-digit month, 2-digit day. This is the default. 2014-03-21
DMY 2-digit day, 2-digit month, 4-digit year. 21-03-2014
MDY 2-digit month, 2-digit day, 4-digit year. 03-21-2014
MONDY 3-character month, 2-digit day, 4-digit year. Mar 21 2014
DMONY 2-digit day, 3-character month, 4-digit year. 21-Mar-2014
Y2MD 2-digit year, 2-digit month, 2-digit day. Not supported for unloads. 14-03-21
DMY2 2-digit day, 2-digit month, 2-digit year. Not supported for unloads. 21-03-14
MDY2 2-digit month, 2-digit day, 2-digit year. Not supported for unloads. 03-21-14
MONDY2 3-character month, 2-digit day, 2-digit year. Not supported for unloads. Mar 21 14
DMONY2 2-digit day, 3-character month, 2-digit year. Not supported for unloads. 21-Mar-14
DATETIMEDELIM
A single-byte character that separates the date component and time component of the timestamp data type.
The default delimiter is a space (' ').
Between the date component and the time component, a delimiter is not required. For example, both of the following values are valid:
2010-10-10 10:10:10
2010-10-1010:10:10
DATE_FORMAT
The format of the date field in the data file. The value can be any of the date format strings that are accepted by the TIMESTAMP_FORMAT scalar function. The default is YYYY-MM-DD. The DATE_FORMAT option and the DATEDELIM or DATESTYLE option are mutually exclusive.
DECIMALDELIM or DECIMAL_CHARACTER
The decimal delimiter for the data types FLOAT, DOUBLE, TIME, and TIMESTAMP. Allowed values are ',' and '.'.
DECPLUSBLANK
Specifies how the positive decimal value is represented during the unload operation.
You can specify one of the following values for this option:
NONE
This is the default.
This value represents a positive decimal value without a sign.
PLUS
Specifies that a positive decimal value is represented by a '+' sign.
BLANK
Specifies that a positive decimal value is represented by a blank sign instead of a '+' sign.
If you specify the DECPLUSBLANK option for the load operation, the output is not affected.
Examples for a table test with ddl (decimal (6,2)) and all the available values for the DECPLUSBLANK option:
1234
-4563
  • Create external table '/tmp/unload.txt' using (DECPLUSBLANK NONE) as select * from test:
    unload.txt
    1234.00
    -4563.00
  • Create external table '/tmp/unload.txt' using (DECPLUSBLANK PLUS) as select * from test:
    unload.txt
    +1234.00
    -4563.00
  • Create external table '/tmp/unload.txt' using (DECPLUSBLANK BLANK) as select * from test:
    unload.txt
     1234.00
    -4563.00
DELIMITER or COLUMN_DELIMITER
The character that is used to delimit the fields of an input or output record. The default is a vertical bar ('|').
You can specify a character in the 7-bit ASCII range (decimal 1 ‑ 127) in any of the following ways:
  • As a single character (for example DELIMITER ';')
  • By specifying its corresponding ASCII decimal value (for example, DELIMITER 59 or DELIMITER '59')
  • By specifying its corresponding ASCII hex value (for example, DELIMITER x'3B')

The decimal range 128 - 255 is supported only with the ISO character set input file by specifying its corresponding ASCII decimal value or hex value. If the input file is in the UTF8 character set, this delimiter value range is not supported.

ENCODING
The type of data in the file:
UTF8
The file uses UTF8 encoding for all character data.
LATIN9
The file uses LATIN9 encoding for all character data.
INTERNAL
This is the default option.
The file uses a mixture of both UTF8 and LATIN9 encoding.
Files are encoded in Netezza internal format and therefore should be used only for files that are extracted from Netezza by using ENCODING (INTERNAL).
When the target column is CODEUINTS32 (NCHAR/VARCHAR), the input data is validated to be valid UTF-8 characters.
This option is supported only in a Unicode database.
DBCS_GRAPHIC
This value is allowed only for a load operation, not an unload operation. If this value is specified, the CCSID option must also be specified. During the load operation, fields of type GRAPHIC or VARGRAPHIC are encoded using the double-byte character set of the specified CCSID; fields of all other types are encoded using the mixed-byte character set of the specified CCSID.
Note: ENCODING cannot be set to DBCS_GRAPHIC for a DEL file that was created by the EXPORT utility, because such DEL files are encoded using a single character set.
The CCSID and ENCODING options are mutually exclusive when the value of the ENCODING option is UTF8, LATIN9, or INTERNAL.
ESCAPECHAR or ESCAPE_CHARACTER
Which character is to be regarded as an escape character. An escape character indicates that the character that follows it, which would otherwise be treated as a field-delimiter character or end-of-row sequence character, is instead treated as part of the value in the field. The escape character is ignored for graphic-string data. There is no default.
FILLRECORD
For a load operation, the field of a record are loaded into the columns of a target table from left to right. This option specifies whether an input record can contain fewer fields than there are columns defined for the target table:
TRUE or ON
An input line can contain fewer fields, provided that all columns for which a value is missing are nullable. Missing values are set to NULL. If one or more columns for which a value is missing is not nullable, the record is rejected.
FALSE or OFF
An input line that contains fewer columns is rejected. This is the default.
FORMAT or FILE_FORMAT
The data format of the source file:
TEXT
The data to be loaded or unloaded is in text-delimited format. This is the default.
INTERNAL
The data is in an internal format used by Netezza Platform Software (NPS). This value is valid only when loading data from a file to the database, not when unloading data to a file. If this value is specified for the FORMAT option, the following options, and only these options, must also be specified:
  • DATAOBJECT or FILE_NAME.
  • COMPRESS. This must be set to GZIP.
BINARY
The data is in an internal format that is used by Db2.
FIXED
The data is in fixed-length format.
Fixed-length format is supported only for load operations.
Files in fixed-length format use ordinal positions, which are offsets, to identify where fields are within the record.
Note:
  • The following external table options are not supported for the fixed-length format:
    • Delimiter
    • Encoding
    • EscapeChar
    • FillRecord
    • IgnoreZero
    • IncludeZeroSeconds
    • Lfinstring
    • QuotedValue
    • RequireQuotes
    • TimeExtraZeros
    • TruncString
  • There are no field delimiters.
  • An end-of-record delimiter is required even for the last record.
  • Usually, data in fixed-length format files does not have decimal delimiters or time delimiters because delimiters are not necessary and use space.
  • The locations of delimiters are fixed and specified in the layout definition because the fields are fixed in size. This definition comes with the fixed-length format data file.
  • To load fixed-format data into the database, you must define the target data type for the fields and the locations within the record.
  • You do not have to load all fields in a fixed-length format file. You can skip them by using the filler specification.
  • The order of fields in the data file must match the order in the target table. Alternatively, you must create an external table definition that specifies the order of the fields as database columns.
  • You can change the field order by using an external table definition in combination with an insert-select statement.
  • Typically, unknown values or null values are represented by known data patterns that are classified as representing null.
The following parameters apply when the FORMAT option of the external table is set to FIXED:
LAYOUT
Mandatory.
A layout is an ordered collection of zone or field definitions. It defines the location of the fields of the input record.
Specify comma-separated zone definitions within braces { }.
Each zone definition is made up of mutually exclusive, non-overlapping clauses.
No default value.
The clauses must be in the following order, even if some of them are optional and can be empty:
USE TYPE
Optional.
Indicates whether a zone is a normal data zone, a reference zone, or a filler zone.
For data zones, this value is omitted.
A reference zone is specified as REF. This specification implies that the zone is referred by another zone for zone length or null values.
A filler zone is specified as FILLER. Filler zones specify that the bytes or characters are treated as fillers in a data file.
NAME
Optional.
The name of the zone.
Currently, this definition is not used. Typically, it is provided to identify the field.
TYPE
Optional.
Defines the type of the zone.
If you do not specify the type, it gets the default value of the corresponding type of a table column.
Valid values are as follows:
  • CHAR
  • VARCHAR
  • NCHAR
  • NVARCHAR
  • SMALLINT
  • BIGINT
  • BINARY
  • VARBINARY
  • GRAPHIC
  • VARGRAPHIC
  • FLOAT
  • DOUBLE
  • DEC, NUM, or NUMERIC
  • DECFLOAT
  • BOOLEAN
  • DATE
  • TIME
  • TIMESTAMP
STYLE
Optional.
Defines the zone representation.
The default representation is based on zone type and format option.
All other styles are valid only for their corresponding non-textual zone types.
Valid values are as follows:
  • INTERNAL

    Valid only for textual zones, that is, char, varchar, nchar, and nvarchar.

  • DECIMAL

    Valid for integer and numeric zone types.

  • DECIMALDELIM <'decimal-delim'>

    Valid for numeric, float, double, and time style (time and timestamp) zone types.

  • FLOATING

    Valid for float or double zone types.

  • EXPONENTIAL

    Valid for float or double zone types.

  • YMD <'date-delim'>

    Valid for date zones, including other date styles that are supported for the DateStyle and DateDelim external table options.

  • 12Hour <'time-delim'>

    Valid for time zones, including other time styles that are supported for the TimeStyle and TimeDelim external table options.

  • 24Hour <'time-delim'>

    Valid for time zones, including other time styles that are supported for the TimeStyle and TimeDelim external table options.

  • YMD <'date-delim'> 24Hour <'time-delim'>

    Valid for timestamp zones, including other combinations of date and time styles that are supported for the DateStyle, DateDelim, TimeStyle, and TimeDelim external table options.

  • TRUE_FALSE, Y_N, 1_0

    Valid for boolean zones, including other boolean styles that are supported for the BoolStyle external table option. The style must be in accordance with the format.

LENGTH
Optional.
Specified as bytes or characters followed by the number or the internal reference to the reference zone.
Number of bytes or characters as provided or as referenced by the reference zone.
For reference zones or filler zones, you cannot use internal references. For reference zones, the number of bytes specifies how the data is read from the data file to get the referred value.
You can use plus signs and minus signs as follows:
BYTES @2 + 10
BYTES @2 - 10
NULLIF
Optional.
Definition of the zone NULLESS attribute.
Specifies a known data pattern within the field that, when it is present, signifies that the field is null.
The length is equal to or less than the column width. The maximum length is 39 bytes.
You can use the following types of references:
@
Internal reference to numeric zones.
Exact match of the numeric value.
&
External reference.
Exact match of the specified value.
&&
Isolated reference.
Leading spaces and trailing spaces are to be skipped with the exact string match.
Nulls are detailed in the following examples:
Table 2. Layout example
Use type Name Type Style Length Nullif
NA f1 int4 DECIMAL Bytes 10 Nullif & = 0
NA f2 date YMD Bytes 10 Nullif &= '2000-10-10'
NA f3 char(20) INTERNAL Chars 10 Nullif &&='ab'
Filler f4 char(10) NA Bytes 10 NA
Remember:
  • The referred zone in a length clause must be of type integer.
  • You must not specify the NULLIF option for reference zones or filler zones.
  • Reference zones and filler zones cannot have variable lengths.
  • Variable length zones cannot refer themselves.
  • Define the referred zone in a length clause as REF.
  • Length-clause references can use only the INTERNAL (@) reference. External or isolated references are not supported.
  • Between the referred zone of a length clause and the zone itself, reference zones are not allowed.
  • If the reference type is INTERNAL (@), the NULLIF clause cannot refer to itself.
  • If the column is non-nullable, it may not have the NULLIF clause.
  • Variable length is allowed only for the string type of zones.
  • The NULLIF clause can refer only to REF zones or the zones themselves.
  • Between the zone that is referred by the NULLIF clause and the zone itself, other referred zones are not allowed, except for the zone that is referred in the length clause.
  • The record length can point to zone 1 only for reference.
  • A REF must have a zone that refers it.
  • The NULLIF clause can have external references only if the REF zone is non-integer.
Recordlength
Specifies the length of the entire record, where null-indicator bytes are included if they exist, and the record delimiter is excluded if it exists.
The value is a constant integer.
The value can also be an internal reference to the reference zone in the layout definition.
There is no default value.
You can use plus signs and minus signs for an internal reference as follows:
RECORDLENGTH @1 + 10
RECORDLENGTH @1 - 10
IGNOREZERO or TRIM_NULLS
Specifies whether the binary value zero in CHAR fields and VARCHAR fields is to be discarded.
TRUE or ON
The byte value zero is ignored.
FALSE or OFF
The byte value zero is not ignored. This is the default.
KEEP
The binary value zero is accepted and allowed as part of the input field.
INCLUDEHEADER or COLUMN_NAMES
For an unload operation, whether the table column names are to be included as headers in the external-table file:
TRUE or ON
The table column names are to be included as headers.
FALSE or OFF
The table column names are not to be included as headers. This is the default.
INCLUDEZEROSECONDS
For an unload operation, whether to specify 00 as the value for seconds when no value for seconds is available:
TRUE or ON
Specify 00 as the value for seconds.
FALSE or OFF
Do not specify a value for seconds. This is the default.
LFINSTRING
Specifies how to interpret an unescaped line-feed (sometimes called an LF or newline) character within string data:
TRUE or ON
An unescaped LF character is interpreted as a record delimiter only if it is in the last field of a record; otherwise, it is treated as data. To cause an LF character that is in the last field of a record to be treated as data, enclose the value of that field in single or double quotation marks.
FALSE or OFF
An unescaped LF character is interpreted as a record delimiter regardless of its position. This is the default.

This option is not supported for unload operations. And applies only to line-feed characters, not carriage-return line-feed (CRLF) characters.

Attention: This SQL compatibility enhancement is only available in Db2 Version 11.5 Mod Pack 2 and later versions.
LOGDIR or ERROR_LOG
The directory to which the following files are written:
<database>.<schema>.<external-table-name>.<file-name>.<application-handle>.<id>.bad
A file containing rejected records (that is, records that could not be processed).
<database>.<schema>.<external-table-name>.<file-name>.<application-handle>.<id>.log
A log file.
The default is the directory to which the external-table file is written. If the length of the name that is constructed for a .bad or .log file would exceed the allowed maximum, the name of the file that contains the external table (indicated by <file-name>) is truncated so that the maximum is not exceeded.

If a .log or .bad file is generated while carrying out an operation on a partition, the name of the generated file is suffixed with a period followed by the 3-digit partition number.

MAXERRORS or MAX_ERRORS
For a load operation, the threshold for the number of rejected records at which the system stops processing and immediately rolls back the load. The default is 1 (that is, a single rejected record results in a rollback).
For fixed-length format, the following conditions apply:
  • The parser reports errors for each field or zone rather than one error for the row.
  • Multiple errors can be reported for the same row.
  • When the parser detects an error in a field or zone, it recovers by using the field length or zone length. It then continues from the next field or zone until the end of record is reached, or an unrecoverable error occurs, or the MaxErrors limit is reached.
  • Unrecoverable errors include the following errors:
    • RecordLength mismatch.
    • RecordDelimiter is not found.
    • The RecordLength value is not valid, that is, the value is a negative value or zero.
    • The zone length is not valid, that is, the value is a negative value.
    • The UTF-8 initial byte is not valid.
    • The UTF-8 continuation bytes are not valid.
MAXROWS or MAX_ROWS
If set to a positive integer, this specifies the maximum number of records (rows) in the external table that are to be processed. If set to 0 (the default), there is no limit and all rows are processed. During a load operation, if MAXROWS is set to a positive value, after that number of rows are processed, regardless of whether some of the rows were rejected or skipped, the system ends the load operation and commits all inserted records.
MERIDIANDELIM
A single-byte character that separates the seconds component from the AM token or PM token in the 12-hour delimited and undelimited formats of a time value.
The default delimiter is a space (' ').
Between the seconds component and the AM token or PM token, a delimiter is not required. For example, both of the following values are valid:
1:02:46.12345 AM
1:02:46.12345AM
NOLOG
Specifies whether the .log file for the external table is created.
This option does not apply to .bad files.
Possible values are:
TRUE
No .log file is created.
FALSE
The .log file is created.
This is the default.
NULLVALUE or NULL_VALUE
The UTF-8 string of at most 4 bytes that is to be used to indicate a null value. The default is 'NULL'.
PARTITION
If you have an MPP deployment, an external table can be partitioned into several files. The name of each of the data files that comprise an external table are suffixed with a period followed by a 3-digit number from 000 to 999 that indicates the number of the partition. For example, if an external table with the name dataFile.txt is divided into three partitions, the files that comprise it have the names dataFile.txt.000, dataFile.txt.001, and dataFile.txt.002. These files must be accessible from all members.
For a partitioned external table, the PARTITION option specifies to which partition or partitions the statement applies:
PARTITION ALL
The statement applies to all of the partitions that comprise the external table. For an unload operation, this is the only value that is allowed.
PARTITION (n TO n)
The statement applies to all of the partitions in the specified range, for example, PARTITION (54 TO 62).
PARTITION (n,n,…)
The statement applies only to the specified partition or partitions, for example, PARTITION (53) or PARTITION (51,57,58). If more than one partition number is specified, they must be in ascending order (sqlcode SQL0263N with SQLSTATE=42615) and there can be no duplicates (sqlcode SQL0265N with SQLSTATE=42615).

If a .log or .bad file is generated while carrying out an operation on a partitioned external table, the name of the generated file is suffixed with a period followed by the 3-digit partition number.

If you have an MPP deployment and the PARTITION option is not specified, the external table is treated as single-partitioned table on the coordinator member. The names of the external table file and the .log and .bad files are not suffixed with a partition number.

If you do not have an MPP deployment, the PARTITION option can be specified, but only with the value ALL, (0 to 0), or (0) (sqlcode SQL0644N with SQLSTATE=42615). It will have no effect.

The REMOTESOURCE and PARTITION options are mutually exclusive.

QUOTEDNULL
For a load operation, how to interpret a value that is enclosed in single or double quotation marks and that matches the null value specified by the NULLVALUE or NULL_VALUE option (for example, "NULL" or 'NULL'):
TRUE or ON
The value is interpreted as a null value. This is the default.
FALSE or OFF
The value is interpreted as a character string.
QUOTEDVALUE or STRING_DELIMITER
Whether data values are enclosed in quotation marks:
SINGLE or YES
Data values are enclosed in single quotation marks (').
DOUBLE
Data values are enclosed in double quotation marks (").
NO
Data values are not enclosed in quotation marks. This is the default.
RECORDDELIM or RECORD_DELIMITER
The string literal that is to be interpreted as a row (record) delimiter. The default is '\n'.
REMOTESOURCE
Where the external-table file resides and, if it resides on a remote system, whether the file data is to be compressed:
LOCAL
The file resides on the local server, that is, the system that hosts the database. This is the default.
YES
The file resides on a system other than the local server. For example, specify YES if a client system is connected to the database and the file resides on that system. File data is not compressed before it is transferred.
GZIP
Similar to YES, except that the file data is compressed using the GZIP compression algorithm before the data is transferred, and is decompressed after it is received. This improves overall performance when a large amount of compressible data is being transferred.
LZ4
Similar to YES, except that the file data is compressed using the LZ4 compression algorithm before the data is transferred, and is decompressed after it is received. This improves overall performance when a large amount of compressible data is being transferred.

The REMOTESOURCE and PARTITION options are mutually exclusive. The COMPRESS option cannot be specified if the value of the REMOTESOURCE option is GZIP or LZ4.

REQUIREQUOTES
Whether quotation marks are mandatory:
TRUE or ON
Quotation marks are mandatory. The QUOTEDVALUE option must be set to YES, SINGLE, or DOUBLE.
FALSE or OFF
Quotation marks are not mandatory. This is the default.
SKIPROWS or SKIP_ROWS
For a load operation, the number of rows to skip before beginning to load the data. The default is 0. Because skipped rows are processed before they are skipped, a skipped row is still capable of causing a processing error.
SOCKETBUFSIZE
The size, in bytes, of the chunks of data that are read from the source file. Valid values range from 64 KB ‑ 800 MB. If you specify a value outside this range, the value is set to the nearest valid value. The default is 8 MB.
STRICTNUMERIC
For a load operation, how to treat a value that is to be inserted into a DECIMAL field when its scale exceeds that defined for the field:
TRUE or ON
The row containing the value to be inserted is rejected. For example, if any of the following values were to be loaded into a DECIMAL(5,3) field, the row containing that value would be rejected:
12.666666666
-98.34496862785
0.00089
FALSE or OFF
The row containing the value to be inserted is accepted, and the portion of the decimal fraction that exceeds the scale defined for the field is truncated. This is the default. For example, the values in the previous example would be converted to:
12.666
-98.344
0.000
SWIFT
Specifies that the source data file is located in a Swift object store. Use the DATAOBJECT option to specify the file name.
Syntax:
SWIFT (endpoint, authKey1, authKey2, bucket)
where:
endpoint
A character string that specifies the URL of the SWIFT web service.
authKey1
A character string that specifies the access ID or username of the Swift open stack account used to validate the user.
authKey2
A character string that specifies the password of the Swift open stack account used to validate the user.
bucket
The name of the Swift open stack container (bucket) in which the file resides.
Example:
CREATE EXTERNAL TABLE exttab1(a int) using 
  (dataobject 'datafile1.dat' 
   swift('https://dal05.objectstorage.softlayer.net/auth/v1.0/', 
    'XXXOS123456-2:xxx123456',
    'b207c6e974020737d92174esdf6d5be9382aa4c335945a14eaa9172c70f8df16', 
    'my_dev'
   )
  )
S3
Specifies that the source data file is located in an S3 compatible object store. Use the DATAOBJECT option to specify the file name.
Syntax:
S3 (endpoint, authKey1, authKey2, bucket)
where:
endpoint
A character string that specifies the URL of the S3 compatible web service.
authKey1
A character string that specifies the S3 access key ID of the access keys used to validate the user and all user actions. For IBM Cloud Object Storage, this is the access key ID from the HMAC credentials.
authKey2
A character string that specifies the S3 secret key of the access keys that are used to validate the user and all user actions. For IBM Cloud Object Storage, this is the secret access key from the HMAC credentials.
bucket
The name of the S3 bucket in which the file resides.
Note: For IBM Cloud Object Storage, to create HMAC credentials, when creating new service credentials, specify {"HMAC:true} in the Add Inline Configuration Parameters field.
Example using AWS S3:
CREATE EXTERNAL TABLE exttab2(a int) using 
  (dataobject 'datafile2.dat' 
   s3('s3.amazonaws.com', 
    'XXXOS123456-2:xxx123456',
    'bs07c6e974040737d92174e5e96d5be9382aa4c33xxx5a14eaa9172c70f8df16', 
    'my_dev'
   )
  )
Example using IBM Cloud Object Storage:
CREATE EXTERNAL TABLE exttab2(a int) using 
  (dataobject 'datafile2.dat' 
   s3('s3-api.us-geo.objectstorage.softlayer.net',
    '1a2bkXXXsaddntLo0xX0',
    'XXxxiEPjJ7T7WBUz74E6abcdABCDE8Q7RgU4gYY9',
    'my_dev'
   )
  )
AZURE
Specifies that the source data file is located in Microsoft Azure Blob Storage. The REMOTESOURCE, SWIFT, S3, and AZURE options are mutually exclusive. Use the DATAOBJECT option to specify the file name. Syntax:
Syntax:
AZURE (endpoint, authKey1, authKey2, bucket)
where:
endpoint
A character string that specifies the URL of the AZURE web service.
authKey1
A character string that specifies the access ID or username of the Azure Blob Storage account used to validate the user.
authKey2
A character string that specifies the access key of the Azure Blob Storage account used to validate the user.
bucket
The name of the Azure Blob Storage container (bucket) in which the file resides.
Example:
CREATE EXTERNAL TABLE exttab1(a int) using 
  (dataobject 'datafile1.dat' 
   azure('https://my_account.blob.core.windows.net', 
    'my_account',
    'lW+oHjmZecPS++IKgThAHlMUOaFUA5C6Z2RlFmc9JPpK34RO/ZIOywzILxJnzGPHz6d/yDrcQDAwH5wySbOZMQ==', 
    'my_bucket'
   )
  )
Example using IBM Cloud Object Storage:
CREATE EXTERNAL TABLE exttab2(a int) using 
  (dataobject 'datafile2.dat' 
   s3('s3-api.us-geo.objectstorage.softlayer.net',
    '1a2bkXXXsaddntLo0xX0',
    'XXxxiEPjJ7T7WBUz74E6abcdABCDE8Q7RgU4gYY9',
    'my_dev'
   )
  )
TIMEDELIM
The single-byte character that is to separate time components (hours, minutes, and seconds). The default is ':'. If TIMEDELIM is set to an empty string, hours, minutes, and seconds must all be specified as two-digit numbers. The TIMESTAMP_FORMAT and TIMEDELIM options are mutually exclusive.
TIMEROUNDNANOS or TIMEEXTRAZEROS
Note: This option applies only to TIMESTAMP columns.
Specifies whether records that contain time values whose non-zero precision exceeds six decimal places are to be accepted (and rounded to the nearest microsecond) or rejected:
TRUE
All records are accepted. Their time values are rounded to the nearest microsecond.
FALSE
Only those records that can be stored without a loss of precision (for example, '08.15.32.123' or '08.15.32.12345600000', but not '08.15.32.1234567') are accepted. All other records are rejected. This is the default.
TIMESTYLE
The time format that is to be used in the data file:
24HOUR
24-hour format, for example 23:55. This is the default.
12HOUR
12-hour format, for example 11:55 PM. An AM or PM token can be preceded by a single space and is not case-sensitive.
The TIMESTYLE option and the TIME_FORMAT or TIMESTAMP_FORMAT option are mutually exclusive.
TIMESTAMP_FORMAT
The format of the timestamp field in the data file. The value can be any of the format strings that are accepted by the TIMESTAMP_FORMAT scalar function. The default is 'YYYY-MM-DD HH.MI.SS'. The TIMESTAMP_FORMAT option and the TIMEDELIM, DATEDELIM, TIMESTYLE, or DATESTYLE option are mutually exclusive.
TIME_FORMAT
The format of the time field in the data file. The value can be any of the time format strings that are accepted by the TIMESTAMP_FORMAT scalar function. The default is HH.MI.SS. The TIME_FORMAT option and a TIMEDELIM or TIMESTYLE option are mutually exclusive.
TRIMBLANKS
How an external table is to treat leading or trailing blanks (that is, leading or trailing space characters) in a string:
LEADING
All leading blanks (that is, blanks that precede the first non-blank character) are removed.
TRAILING
All trailing blanks (that is, blanks that follow the last non-blank character) are removed.
BOTH
All leading and trailing blanks are removed.
NONE
No blanks are removed. This is the default.
When reading data from a file and loading it into an external table:
  • If QUOTEDVALUE or STRING_DELIMITER is specified with the values SINGLE, YES, or DOUBLE, leading and trailing blanks within quotation marks are not removed.
  • For CHAR and NCHAR data, the values TRAILING or BOTH will not have any effect on trailing blanks, because the string will automatically be re-padded with trailing blanks.
TRUNCSTRING or TRUNCATE_STRING
How the system processes a CHAR or VARCHAR string that exceeds its declared storage size:
TRUE
The system truncates a string value that exceeds its declared storage size.
FALSE
The system returns an error when a string value exceeds its declared storage size. This is the default.
Y2BASE
The year that is the beginning of the 100-year range. Years that are specified as 2 digits are counted from this year. The default is 2000. This option must be specified when DATESTYLE is set to Y2MD, MDY2, DMY2, MONDY2 or DMONY2.
Table 3. Options
Option Default Applies to Load Applies to Unload
Azure (no default) Y Y
BOOLSTYLE or BOOLEAN_STYLE 1_0 Y Y
CARDINALITY (no default) Y Y
CCSID (no default) Y Y
COMPRESS NO Y Y
CRINSTRING FALSE Y Y
CTRLCHARS FALSE Y N
DATAOBJECT or FILE_NAME (no default) Y Y
DATEDELIM '-' Y Y
DATETIMEDELIM A space (' ') Y Y
DATESTYLE YMD Y Y
DATE_FORMAT YYYY-MM-DD Y Y
DECIMALDELIM or DECIMAL_CHARACTER '.' Y Y
DELIMITER '|' Y Y
ENCODING INTERNAL Y Y1
ESCAPECHAR or ESCAPE_CHARACTER (no default) Y Y
FILLRECORD FALSE Y N
FORMAT or FILE_FORMAT TEXT Y Y
IGNOREZERO or TRIM_NULLS FALSE Y N
INCLUDEHEADER or COLUMN_NAMES FALSE Y Y
INCLUDEZEROSECONDS FALSE Y Y
LFINSTRING FALSE Y N
LOGDIR or ERROR_LOG target directory of external-table file Y N
MAXERRORS or MAX_ERRORS 1 Y N
MAXROWS or MAX_ROWS 0 Y N
MERIDIANDELIM A space (' ') Y Y
NOLOG FALSE Y Y
NULLVALUE or NULL_VALUE 'NULL' Y Y
PARTITION (no default) Y Y
QUOTEDNULL TRUE Y N
QUOTEDVALUE NO Y N
RECORDDELIM or RECORD_DELIMITER '\n' Y N
REMOTESOURCE LOCAL Y Y
REQUIREQUOTES FALSE Y N
SKIPROWS or SKIP_ROWS 0 Y N
SOCKETBUFSIZE 8 MB Y Y
STRICTNUMERIC FALSE Y N
SWIFT (no default) Y Y
S3 (no default) Y Y
TIMEDELIM ':' Y Y
TIMEROUNDNANOS or TIMEEXTRAZEROS FALSE Y N
TIMESTAMP_FORMAT 'YYYY-MM-DD HH.MI.SS' Y Y
TIMESTYLE 24HOUR Y Y
TIME_FORMAT HH.MI.SS Y Y
TRIMBLANKS NONE Y Y
TRUNCSTRING or TRUNCATE_STRING FALSE Y N
Y2BASE 2000 Y N
1 Only for the values INTERNAL, UTF8, and LATIN9.
AS SELECT STATEMENT
Specifies that, for each column in the derived result table of the fullselect, a corresponding column is to be defined for the table and populated with the results of the query. Each defined column adopts the following attributes from its corresponding column of the result table (if applicable to the data type):
  • Column name
  • Column description
  • Data type, length, precision, and scale
  • Nullability

Notes

  • Records that cannot be processed (if any) are written to a file with a name of the form:
    <database>.<schema>.<external-table-name>.<file-name>.<application-handle>.<id>.bad
    Errors are logged in a file with a name of the form:
    <database>.<schema>.<external-table-name>.<file-name>.<application-handle>.<id>.log
    These files are located in the directory specified by the LOGDIR or ERROR_LOG option.

    For an operation on a partition, the name of the generated .bad or .log file is suffixed with a period followed by the 3-digit partition number.

  • To create, insert into, or drop a named external table, issue a CREATE, INSERT, or DROP statement. You cannot insert into or drop a transient external table.
  • Dropping an external table deletes the table definition but does not delete the data file that is associated with the table.

Restrictions

  • External tables can be queried only by a user ID defined within the operating system.
  • Remote external table is not allowed within routines.
  • Data being loaded must be properly formatted.
  • You cannot delete, truncate, or update an external table.
    • A single query or subquery cannot select from more than one external table at a time, and cannot reference the same external table more than once. If necessary, combine data from several external tables into a single table and use that table in the query.
    • A union operation cannot involve more than one external table.

Syntax alternatives

The following alternatives are non-standard. They are supported for compatibility with earlier product versions or with other database products.
  • SAMEAS can be used in place of LIKE.
  • For the REMOTESOURCE option, the values ODBC, JDBC, or OLE-DB can be specified in place of YES.
  • If the FORMAT option is set to INTERNAL, the value YES can be specified in place of GZIP for the COMPRESS option.

Examples

  • Unload data to an external table:
    CREATE EXTERNAL TABLE 'order.tbl' USING (DELIMITER '|') AS SELECT * from orders;
    CREATE EXTERNAL TABLE 'export.csv' USING (DELIMITER ',') AS SELECT foo.x, bar.y, bar.dt FROM foo, bar WHERE foo.x = bar.x;
  • Load data from an external table:
    INSERT INTO target SELECT * FROM EXTERNAL 'data.txt' USING (DELIMITER '|');
    INSERT INTO orders SELECT * FROM EXTERNAL 'order.tbl'( order_num INT, order_dt TIMESTAMP) USING (DELIMITER '|');
  • Select data from an external table:
    SELECT * FROM EXTERNAL 'order.tbl' (order_num INT, order_dt TIMESTAMP) USING (DELIMITER '|');
    SELECT * FROM EXTERNAL 'test.txt' LIKE test_table USING (DELIMITER ',');
    SELECT x, y AS dt FROM EXTERNAL 'test.txt' ( x integer, y decimal(18,4) ) USING (DELIMITER ',');