Ingest data from an input file or
pipe into a DB2® table.
Authorization
The privileges
held by the authorization ID used to connect to the database must
include:
- If the target table has any triggers, the authorization ID must
have sufficient privileges to execute the operations that the triggers
specify.
- To insert into or update a table that has protected columns, the
authorization ID must have LBAC credentials that allow write access
to all protected columns in the table. Otherwise, the command fails
and an error is returned.
- If an UPDATE or MERGE statement requires reading a protected
column, the authorization ID must have LBAC credentials that allow
read access to the column. Otherwise, the command fails and an error
is returned.
- To insert into or update a table that has protected rows, the
authorization ID must hold an LBAC credential that meets these criteria:
- The LBAC credential is part of the security policy protecting
the table.
- If the security policy was defined as RESTRICT NOT AUTHORIZED
WRITE SECURITY LABEL, then the LBAC credential must granted to the
authorization ID for write access
The
security label on the row to be inserted, the authorization ID's LBAC
credentials, the security policy definition, and the LBAC rules determine
whether insert or update can be performed on the table with protected
rows.
- If the INGEST command specifies the RESTART
NEW (the default) or RESTART CONTINUE option,
then SELECT, INSERT, UPDATE, and DELETE privilege on the restart table
- If the INGEST command
specifies the RESTART TERMINATE option, then SELECT
and DELETE privilege on the restart table.
- If the INGEST command
specifies the EXCEPTION TABLE option, then INSERT
privilege on the exception table
In addition, the SQL statement on
the INGEST command is subject to the same fine
grained access controls (FGAC) that it would be if the user running
the ingest utility accessed the table outside of the ingest utility.
The
user running the CLP must have the following file permissions:
- read access to the control file
- if the INGEST command specifies the MESSAGES option,
then:
- write access to the directory containing the messages file
- write access to the file if the messages file already exists
- If the INGEST command specifies the DUMPFILE option,
then:
- write access to the directory containing the dump file
- write access to the dump file if the dump file already exists
Required connection
Database
Command syntax
>>-INGEST--+------+--| from-clause |--| format-clause |--------->
'-DATA-'
>--| options |--| SQL-statement |------------------------------><
from-clause
.-,---------.
V |
|--FROM--+-FILE----file-name-+-+--------------------------------|
| .-,---------. |
| V | |
'-PIPE----pipe-name-+-'
format-clause
|--FORMAT--| format-definition |--------------------------------|
options
|--+----------------------------+--+---------------------+------>
'-| error-handling options |-' '-MESSAGES--file-name-'
>--+------------------------------------+-----------------------|
| .-NEW--+--------+-------. |
| | '-job-ID-' | |
'-RESTART--+-OFF-------------------+-'
'-+-CONTINUE--+--job-ID-'
'-TERMINATE-'
error-handling options
|--+-------------------------+---------------------------------->
'-+-DUMPFILE-+--file-name-'
'-BADFILE--'
>--+-----------------------------+--+-----------------+---------|
'-EXCEPTION TABLE--table-name-' '-WARNINGCOUNT--n-'
SQL-statement
|--+-| DELETE-statement |--+------------------------------------|
+-| INSERT-statement |--+
+-| MERGE-statement |---+
+-| REPLACE-statement |-+
'-| UPDATE-statement |--'
format-definition
|--+-DELIMITED--+----------+-----------+------------------------>
| '-BY--char-' |
'-POSITIONAL--+-------------------+-'
'-RECORDLEN--length-'
>--+-------------------------+---------------------------------->
'-INPUT CODEPAGE -integer-'
>--+-----------------------------+--+----------------+----------|
'-+-IMPLICITLYHIDDENINCLUDE-+-' '-| field-list |-'
'-IMPLICITLYHIDDENMISSING-'
field-list
.-,--------------------.
V |
|--(----| field-definition |-+--)-------------------------------|
field-definition
|--$field-name--+-----------------------------------+----------->
'-POSITION--(--start--+--------+--)-'
'-:--end-'
>--| field-type |--+----------------------------------------+---|
'-DEFAULTIF--+----------------+--=--char-'
'-(--position--)-'
field-type
|--+-+-CHARACTER-+--+--------------+--+--------------+--+--------+--+------------------------------+-----------+--|
| '-CHAR------' '-(--length--)-' '-FOR BIT DATA-' +-NOTRIM-+ '-OPTIONALLY ENCLOSED BY--char-' |
| +-LTRIM--+ |
| +-RTRIM--+ |
| '-TRIM---' |
+-+-DECIMAL-+--+-----------------------------------+--+---------------------------------------------------+-+
| '-DEC-----' | .-5,0---------------------. | +-PACKED--------------------------------------------+ |
| | V | | +-ZONED---------------------------------------------+ |
| '-(----precision--+----------+-+--)-' '-EXTERNAL--+--------------+--+-------------------+-' |
| '-,--scale-' '-(--length--)-' '-RADIX POINT--char-' |
+-+-INTEGER--+--+---------------------------------------------------+---------------------------------------+
| +-INT------+ +-+-BIG----+--ENDIAN--------------------------------+ |
| +-SMALLINT-+ | '-LITTLE-' | |
| '-BIGINT---' '-EXTERNAL--+--------------+--+-------------------+-' |
| '-(--length--)-' '-RADIX POINT--char-' |
| .-(34)-. |
+-DECFLOAT--+------+--+---------------------------------------------------+---------------------------------+
| '-(16)-' +-+-BIG----+--ENDIAN--------------------------------+ |
| | '-LITTLE-' | |
| '-EXTERNAL--+--------------+--+-------------------+-' |
| '-(--length--)-' '-RADIX POINT--char-' |
| .-(53)----------. |
+-+-FLOAT--+---------------+-+--+---------------------------------------------------+-----------------------+
| | '-(--integer--)-' | +-+-BIG----+--ENDIAN--------------------------------+ |
| +-REAL---------------------+ | '-LITTLE-' | |
| | .-PRECISION-. | '-EXTERNAL--+--------------+--+-------------------+-' |
| '-DOUBLE--+-----------+----' '-(--length--)-' '-RADIX POINT--char-' |
+-DATE--+--------+--+------------------------------+--------------------------------------------------------+
| '-string-' '-OPTIONALLY ENCLOSED BY--char-' |
+-TIME--+--------+--+------------------------------+--------------------------------------------------------+
| '-string-' '-OPTIONALLY ENCLOSED BY--char-' |
+-TIMESTAMP--+------------+--+--------+--+------------------------------+-----------------------------------+
| '-(--prec--)-' '-string-' '-OPTIONALLY ENCLOSED BY--char-' |
'-DB2SECURITYLABEL--+--------------+--+--------+------------------------------------------------------------'
'-(--length--)-' +-NAME---+
'-STRING-'
DELETE-statement
|--DELETE--FROM--+-table-name-+--+-------------------+---------->
+-view-name--+ '-| period-clause |-'
'-nickname---'
>--WHERE--| search-condition |--+--------------+----------------|
'-WITH--+-RR-+-'
+-RS-+
+-CS-+
'-UR-'
period-clause
|--FOR PORTION OF BUSINESS_TIME FROM--value1--TO--value2--------|
INSERT-statement
|--INSERT--INTO--+-table-name-+--+-----------------------+------>
+-view-name--+ | .-,-----------. |
'-nickname---' | V | |
'-(----column-name-+--)-'
.-,--------------.
V |
>--VALUES--(----+-expression-+-+--)-----------------------------|
+-NULL-------+
'-DEFAULT----'
MERGE-statement
|--MERGE INTO--+-table-name-+--ON--search-condition------------->
'-view-name--'
.----------------------------------------------------------------.
V |
>----WHEN--| matching-condition |--THEN--| modification-operation |-+-->
.-ELSE IGNORE-.
>--+-------------+--+--------------+----------------------------|
'-WITH--+-RR-+-'
+-RS-+
+-CS-+
'-UR-'
matching-condition
|--+-----+--MATCHED--+-----------------------+------------------|
'-NOT-' '-AND--search-condition-'
modification-operation
|--+-UPDATE--+-------------------+--SET--| assignment-clause |-+--|
| '-| period-clause |-' |
+-DELETE--+-------------------+-----------------------------+
| '-| period-clause |-' |
'-| insert-operation |--------------------------------------'
period-clause
|--FOR PORTION OF BUSINESS_TIME FROM--value1--TO--value2--------|
assignment-clause
.-,--------------------------------------------------------.
V |
|----+-column-name-- = --+-expression-+---------------------+-+--|
| +-DEFAULT----+ |
| '-NULL-------' |
| .-,-----------. .-,--------------. |
| V | V | |
'-(----column-name-+--)-- = --(----+-expression-+-+--)-'
+-DEFAULT----+
'-NULL-------'
insert-operation
|--INSERT--+-----------------------+---------------------------->
| .-,-----------. |
| V | |
'-(----column-name-+--)-'
.-,--------------.
V |
>--VALUES----(----+-expression-+-+--)---------------------------|
+-DEFAULT----+
'-NULL-------'
REPLACE-statement
|--REPLACE--INTO--+-table-name-+--+-----------------------+----->
+-view-name--+ | .-,-----------. |
'-nickname---' | V | |
'-(----column-name-+--)-'
.-,--------------.
V |
>--VALUES--(----+-expression-+-+--)-----------------------------|
+-NULL-------+
'-DEFAULT----'
UPDATE-statement
|--UPDATE--+-table-name-+--+-------------------+---------------->
+-view-name--+ '-| period-clause |-'
'-nickname---'
>--SET--| assignment-clause |----------------------------------->
>--+-----------------------------+--+--------------+------------|
'-WHERE--| search-condition |-' '-WITH--+-RR-+-'
+-RS-+
+-CS-+
'-UR-'
assignment-clause
.-,--------------------------------------------------------.
V |
|----+-column-name-- = --+-expression-+---------------------+-+--|
| +-NULL-------+ |
| '-DEFAULT----' |
| .-,-----------. .-,--------------. |
| V | V | |
'-(----column-name-+--)-- = --(----+-expression-+-+--)-'
+-NULL-------+
'-DEFAULT----'
expression
.-operator-----------------------------.
V |
|------+-----+--+-| function-invocation |-+-+-------------------|
+- + -+ +-| (expression) |--------+
'- - -' +-constant----------------+
+-column-name-------------+
+-$field-name-------------+
+-| case-expression |-----+
'-| cast-specification |--'
operator
|--+-CONCAT-+---------------------------------------------------|
+- / ----+
+- * ----+
+- + ----+
'- - ----'
function-invocation
.-,----------------.
V |
|--function-name--(----+--------------+-+--)--------------------|
'-| argument |-'
argument
|--+-expression-----+-------------------------------------------|
+-row-expression-+
'-DEFAULT--------'
case-expression
|--CASE--+-searched-when-clause-+------------------------------->
'-simple-when-clause---'
.-ELSE NULL---------------.
>--+-------------------------+--END-----------------------------|
'-ELSE--result-expression-'
searched-when-clause
.-----------------------------------------------------.
V |
|----WHEN--search-condition--THEN--+-result-expression-+-+------|
'-NULL--------------'
simple-when-clause
.-----------------------------------------------.
V |
|--expression----WHEN--expression--THEN--+-result-expression-+-+--|
'-NULL--------------'
cast-specification
|--CAST--(--+-expression-+--AS--| data-type |--)----------------|
'-NULL-------'
search-condition
|--+-----+----| predicate |------------------------------------->
'-NOT-'
.-----------------------------------------.
V |
>----+-------------------------------------+-+------------------|
'-+-AND-+--+-----+----| predicate |---'
'-OR--' '-NOT-'
predicate
|--+-| basic-predicate |-+--------------------------------------|
'-| NULL-predicate |--'
basic-predicate
|--expression--+-=--+--| expression |---------------------------|
+-<>-+
+-<--+
+->--+
+-<=-+
'->=-'
NULL-predicate
|--| expression |--IS--+-----+--NULL----------------------------|
'-NOT-'
Command parameters
- FROM
- Specifies the source of the input data. The ingest utility always
reads the input data from the client.
- FILE file-name
- Specifies that data is to be read from the specified files. For
the syntax of file names, see File and pipe names.
- PIPE pipe-name
- Specifies that data is to be read from the specified pipes. For
the syntax of pipe names, see File and pipe names.
- FORMAT format-definition
- The data is in the format given by the
specified format definition. The syntax of a format definition is
given in the "format-definition" section described previously.
- DUMPFILE or BADFILE file-name
- Specifies that rows rejected by the formatters are to be written
to the specified file. The formatters reject rows due to the following
types of errors:
- numbers that are invalid or out of range
(based on the field type)
- dates, times, and timestamps that do
not fit the specified format
- Any other errors detected by the formatter
If a relative path is specified, it is relative to the current
directory and the default drive. On Linux and UNIX platforms, if the file does
not exist, the ingest utility creates the file with the permission
flags specified by the user's umask setting. The ingest utility uses
the umask setting that was in effect at the time the DB2 CLP backend process (db2bp) started, which
is usually at the first connection. If the file already exists, the
ingest utility appends the records to the end of the file.
The
ingest utility writes each record in the same format as it appeared
in the input. This means that after correcting the errors, you can
re-run the
INGEST command using the dump file as
input. However, the ingest utility might not write records in the
same order that they appeared across multiple input files or within
each input file. If you want the dump file to contain records in the
same order that they appeared within each input file, set the
num_formatters configuration
parameter to
1. In this case:
- If the operation is INSERT or REPLACE,
records from different input files might still appear out of order
but all the records from a given input file appears in the same order
that they were within that file.
- If the operation is UPDATE, DELETE,
or MERGE, all records are in the same order that
they were across all input files and within each input file
If
an error occurs opening or writing to the file, the ingest utility
issues an error message, including the input source and line number,
and ends the command. In this case, the record is not saved anywhere.
(If the input is from a file, the user can get the original input
record from the file.)
The dump file is guaranteed to be complete
only if the INGEST command completes normally in
a single run. If the INGEST command fails and
the restarted command succeeds, the combination of dump files from
the two commands might be missing records or might contain duplicate
records.
If this parameter is not specified, the ingest utility
issues an error message for each rejected row, but does not save the
row anywhere.
Note: If
you specify this parameter and any rejected input records contain
sensitive data, these records appear in the dump file. You must protect
the file from unauthorized access as needed.
- EXCEPTION TABLE table-name
- Specifies that rows inserted by the
ingest utility and rejected by DB2 with
certain SQLSTATEs are to be written to the specified table. DB2 could reject rows due to the
following types of errors. Note that each of these errors indicates
bad data in the input file:
- For character data, right truncation occurred; for example, an
update or insert value is a string that is too long for the column,
or a datetime value cannot be assigned to a host variable, because
it is too small.
- A null value, or the absence of an indicator parameter was detected;
for example, the null value cannot be assigned to a host variable,
because no indicator variable is specified.
- A numeric value is out of range.
- An invalid datetime format was detected; that is, an invalid string
representation or value was specified.
- The character value for a CAST specification or cast scalar function
is invalid.
- A character is not in the coded character set.
- The data partitioning key value is not valid.
- A resulting row did not satisfy row permissions.
- An insert or update value is null, but the column cannot contain
null values.
- The insert or update value of the FOREIGN KEY is not equal to
any value of the parent key of the parent table.
- A violation of the constraint imposed by a unique index or a unique
constraint occurred.
- The resulting row of the INSERT or UPDATE does not conform to
the check constraint definition.
- The value cannot be converted to a valid security label for the
security policy protecting the table.
- This authorization ID is not allowed to perform the operation
on the protected table.
- The component element is not defined in security label component.
- The specified security label name cannot be found for the specified
security policy.
- The data type, length, or value of the argument to routine is
incorrect.
The specified table must be defined as described in the
"Exception
tables" topic in the related links section. The restrictions listed
in that section and in message SQL3604N also apply, with the following
exception:
- If the target table contains a DB2SECURITYLABEL column, the exception
table column corresponding to the DB2SECURITYLABEL column must be
of type VARCHAR with a length of at least 128. (Type LONG VARCHAR
is also permissible but has been deprecated.)
The ingest utility uses the exception table
only when the operation is INSERT or REPLACE.
If the operation is UPDATE, MERGE,
or DELETE, and the EXCEPTION TABLE parameter
is specified, the utility issues an error.
If an error occurs inserting into the table,
the ingest utility issues an error message, including the input source
and line number, and continues. In this case, the record is not saved
anywhere. (If the input is from a file, the user can get the original
input record from the file.)
If the
table name is not fully qualified, the ingest utility uses the value
of CURRENT SCHEMA for the schema name. If the EXCEPTION TABLE parameter
is not specified, the ingest utility issues an error message for each
rejected row but does not save the row anywhere.
- WARNINGCOUNT n
- Specifies that the INGEST command is to stop
after n warning and error messages have been issued.
This parameter covers only certain types of errors. For more information,
see Categories of errors. The number
is the total for all input sources. If this limit is reached in the
middle of a transaction, the transaction is rolled back. However,
rows that were committed to the target table before this limit was
reached remain in the table. If the number is 0, the INGEST command continues regardless
of the number of warnings and errors. The range is 0 - 2,147,483,647
(max 32-bit signed integer). The default is 0.
- MESSAGES file-name
- Specifies the file to receive informational, warning, and error
messages.
If a relative path is specified, it is relative to the
current directory and the default drive. On Linux and UNIX platforms,
if the file does not exist, the ingest utility creates the file with
the permission flags specified by the user's umask setting. The ingest
utility uses the umask setting that was in effect at the time the DB2 CLP backend process (db2bp)
started, which is usually at the first connection. If the file already
exists, the ingest utility appends to the end of the file. If this
parameter is not specified, the ingest utility writes messages to
standard output.
Even when this parameter is specified, the
ingest utility writes messages to standard output in the following
cases:
- syntax errors
- an input file is not found or not readable
- target or exception table is not found
- the dump file or messages file cannot be opened
- other errors detected at start-up
Also, the summary messages issued at the end (showing number
of rows read, inserted, and so on, and the total number of warnings
and errors) are always written to standard output.
If an error
occurs opening or writing to the messages file, the ingest utility
issues an error message and ends the command.
For examples of
messages from the INGEST command, see Messages from the INGEST command.
- RESTART NEW job-ID
- Specifies that if the INGEST command fails
before completing, it can be restarted from the point of the last
commit by specifying the RESTART CONTINUE option
on a later INGEST command. The job-ID is
a string of up to 128 bytes that uniquely identifies the INGEST command.
This job-ID must be unique across all INGEST commands
in the current database that specified the RESTART option
and are not yet complete. (These could be commands that are still
running or that failed before completing.) Once the INGEST command
completes, you can reuse the job-ID on the RESTART parameter
of a later INGEST command. If the job-ID is omitted,
the ingest utility generates one. This option is the default.
Before
using this option, you must have created the restart log table. For
more information, see "Restarting a failed INGEST command" in the
following related links section.
- RESTART OFF
- Specifies that no restart information is to be saved. If the INGEST command
fails before completing, it cannot be restarted using the RESTART
CONTINUE option. If you want to rerun the command to completion,
you need to restore the target table to its state before running the INGEST command,
and rerun the INGEST command with the same input
data.
- RESTART CONTINUE job-ID
- Specifies that the ingest utility is to restart a previous INGEST command
that specified the RESTART NEW option and failed
before completing. The job-ID specified on this
option must match the job-ID specified on the previous INGEST command.
This restarted command is also restartable. Once the restarted command
completes, you can reuse the job-ID on the RESTART NEW parameter
of a later INGEST command. For more information,
see "Restarting a failed INGEST command" in the following related
links section.
- RESTART TERMINATE job-ID
- Specifies that the ingest utility is to clean up the restart information
for a previous INGEST command that specified the RESTART
NEW option and failed before completing. The string specified
on this option must match the string specified on the previous INGEST command.
This
option is intended for when a previous restartable
INGEST command
failed, and you plan to never resume the job. Once you specify this
option, the
INGEST command that failed earlier
can no longer be restarted. You can, however, reuse the string on
the
RESTART NEW parameter of a later
INGEST command.
Note: Unlike with the TERMINATE parameter of the LOAD command,
data that the original INGEST command committed
before its failure is still in the target table. For more information,
see "Terminating a failed INGEST command" in the following related
links section.
- DELIMITED BY char
- Specifies that each input record is a sequence of text fields
delimited by the specified character. The delimiter can be any single-byte
character except the following values:
- the null character (X'00')
- blank (X'20')
- carriage return or line feed (X'0D' or X'0A')
- dot (X'2E'), regardless of what is specified on the RADIX POINT
clause
- the decimal point (specified by the RADIX POINT clause; the default
is the dot .)
- the string delimiter (specified by the OPTIONALLY
ENCLOSED BY clause; the default is the double quote ")
If the input data code page is a DBCS or EUC code page, the
character must be in the range X'01' to X'3F', inclusive. Note that
this excludes the vertical bar (| or X'7C'). If the input data code
page is not a single-byte code page, the character must be in the
range X'01' to X'7F', inclusive. If the input code page is a single-byte
code page, the character can be in the range X'01' to X'FF'.
The
default is a comma ,.
- string
The metavariable
string specifies any character
string constant, including:
- A sequence of characters that starts and ends with a string delimiter
which is an apostrophe (').
- A hexadecimal constant: X followed by a sequence of hex characters
that starts and ends with an apostrophe.
- A Unicode string constant: U& followed by a sequence of characters
that starts and ends with an apostrophe and that is optionally followed
by the UESCAPE clause.
- A graphic string constant: G or N followed by a sequence of double-byte
characters that starts and ends with an apostrophe.
- A hexadecimal graphic string constant: GX followed by a sequence
of hex characters that starts and ends with an apostrophe.
- A UCS-2 graphic string constant: UX followed by a sequence of
hex characters that starts and ends with an apostrophe.
If the string contains the enclosing character, it must
be doubled inside the string. For example, if the string is "don't
know" and the enclosing character is a single quote, specify 'don''t
know'. Examples of this are 'This is a string', X'303132',
and U&'Unicode chars'.
- char
The metavariable char specifies a character
string constant that is 1 byte in length. Examples of this are 'A' and X'30'.
- POSITIONAL
- Specifies that each input record is a sequence of text and/or
binary fields identified by their position in the input record. The
position of each field is given by the POSITION clause in the field
definition.
- RECORDLEN length
- Specifies that the input is fixed-length records and each record
has the specified length in bytes. Any carriage return or line feed
characters (X'0D' or X'0A') in the record are assumed to be part of
the data. Note that the record length that some editors display does
not include these characters, so you might need to add 1 or 2 to the
record length that the editor displays. If the input contains binary
data, this parameter is required.
The range is
1 - 32,767. If RECORDLEN specifies a length less
than the sum of the field lengths specified on each field definition,
the ingest utility issues an error message and exits. If RECORDLEN specifies
a length greater than the maximum field end position, the ingest utility
ignores the bytes past the end of the field with the maximum end position.
If the last record is too short, the ingest utility issues an error
message and rejects the record.
If this parameter is omitted,
the ingest utility recognizes both of the following as the record
(row) delimiter, regardless of platform:
- line feed (X'0A')
- one or more carriage returns followed by a line
feed (X'0D0A')
- INPUT CODEPAGE integer
- Specifies the code page of the input data. The ingest utility
supports the code pages listed in the "Supported territory codes
and code pages" topic in the Related links section, except
the following values :
- 1200 (UTF-16)
- any code pages listed as "host" code pages
The default is the DB2 application
code page.
If the input data is not
in the code page specified by this parameter, the ingest utility does
not issue an error. However, if the ingest utility or DB2 needs to convert the input data into another
code page, the utility might convert some characters to the substitute
character (Unicode U+001A) or other characters that you did not intend.
For more information, see the topic on "Code page considerations
for the ingest utility".
- IMPLICITLYHIDDENINCLUDE
- Specifies that implicitly hidden columns are included in the default
column list. If the DB2_DMU_DEFAULT registry variable on the client-side
is also specified, then this keyword takes precedence. This keyword
is only allowed for INSERT statements where there is not a column
list following the table name. If the SQL statement is not an INSERT
or there is a column list after the table name, then an error is returned
(SQLCODE -2918).
- IMPLICITLYHIDDENMISSING
- Specifies that implicitly hidden columns are omitted from the
default column list. If the DB2_DMU_DEFAULT registry variable on the
client-side is also specified, then this keyword takes precedence.
This keyword is only allowed for INSERT statements where there is
not a column list following the table name. If the SQL statement is
not an INSERT or there is a column list after the table name, then
an error is returned (SQLCODE -2918).
- field-list
- Specifies the list of field definitions. If the list is omitted,
it defaults as described in Default field definition list.
- $field-name
- Field names can be from 2 to 129 bytes in length (including the
dollar sign) and follow the same rules as SQL identifiers. You can
also specify delimited field names by specifying a dollar sign followed
by a delimited name, for example, $"My Field Name".
If
the SQL statement specifies a table name column name that starts with
a dollar sign, specify the table name or column name as a delimited
identifier. For example, the ingest utility interprets the following
identifiers as indicated:
Table 1. The meaning of field name identifiersIdentifier |
Meaning |
$name |
field "$NAME" |
$NAME |
field "$NAME" |
$"My Name" |
field "$My Name" |
"$name" |
column "$name" (in lowercase) |
"$NAME" |
column "$NAME" (in upper case) |
"$My Name" |
column "$My Name" |
If you specify the
INGEST command on the
system command line in UNIX,
in order to avoid the field names being seen as environment variables,
you need to put the command in single quotation marks. For more information,
see
Using INGEST on the system command line (UNIX).
- POSITION (start:end)
- This clause is valid only when the format is POSITIONAL. It specifies
the starting position and optionally the ending position of the field
within the record. The units are bytes. The first byte in the record
is at position 1. If the ending position is specified, it must be
greater than or equal to the starting position.
Positions specified
on different fields can overlap. For example, if the full name of
an American state is in positions 1 - 10, you could define the following
two fields:
- $state_full_name POSITION(1:10)
- $state_abbreviation POSITION(1:2)
If there is data outside the
specified field positions, the ingest utility ignores it. For example,
if the field definition list specifies two fields at POSITION(1:10)
and POSITION(21:30), the ingest utility ignores data in positions
11:20 and any data past position 30 to the end of the record.
If
the POSITION clause is omitted, the starting position is assumed to
be the next byte following the previous field, or position 1 if this
is the first field. If the ending position is omitted, the utility
uses a default or issues an error, as described in Rules and defaults for field lengths.
When
the field definition does not specify a length, the ingest utility
uses a length of end_position - start_position + 1. This length is
subject to the following restrictions:
- Fields of type SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, FLOAT,
and DECFLOAT with the EXTERNAL modifier have a maximum length of 50.
- If a format string is specified for fields of type DATE, TIME,
and TIMESTAMP(p), the field length must be greater than or equal to
length of the shortest value that matches the format string
- If a format string is not specified:
- For DATE, the field length must be between 8 and 10, inclusive.
- For TIME, the field length must be between 4 and 8, inclusive.
- For TIMESTAMP, the field length must be between 19 and 32, inclusive.
- field-type
- Specifies the data type of the field.
- CHARACTER (length)
- Character data. The code page is specified by the INPUT
CODEPAGE parameter. The default is the application code
page.
For information about the default length, see Rules and defaults for field lengths.
- NOTRIM, LTRIM, RTRIM, TRIM
- Specifies that leading blanks (LTRIM), trailing blanks (RTRIM),
or both leading and trailing blanks (TRIM) in the field and not enclosed
by the string delimiter are not part of the string. When the format
is DELIMITED, the default is TRIM. When the format is POSITIONAL,
the default is NOTRIM. If this parameter is specified on any character
field, it must have the same value on all other character fields.
- OPTIONALLY ENCLOSED BY char
- This clause is valid only when the format is
DELIMITED. The string can be enclosed by the specified character.
The restrictions are the same as for the field delimiter. In addition,
the character cannot be the same as the field delimiter or the decimal
point. If this parameter is specified on any character, date, time,
or timestamp field, it must have the same value on all other character,
date, time, and timestamp fields.
The utility uses the following
algorithm to determine whether the value starts with the delimiter:
- If the TRIM or LTRIM option is specified (or defaults), the utility
tests the first character in the field after trimming of leading blanks.
- If the RTRIM or NOTRIM option is specified, the utility tests
the first character in the field. (In other words, when RTRIM or NOTRIM
is specified, the utility does not skip over leading blanks to locate
the string delimiter.)
Except for record delimiters,
the utility considers all characters between the enclosing characters
to be part of the string. If the string contains the enclosing character,
it must be doubled inside the string. For example, if the string is don't
know and the enclosing character is a single quote, specify 'don''t
know'. If the string does not end with the enclosing character,
the utility considers the opening delimiter to be part of the string.
The
utility issues warning SQL3114W under the following conditions:
- TRIM or RTRIM is specified (or defaults) and there are extra non-blank
characters between the closing delimiter and the next field delimiter
(or end of record).
- NOTRIM or LTRIM is specified and there are extra characters (including
blanks) between the closing delimiter and the next field delimiter
(or end of record).
When this parameter is not
specified, the default is OPTIONALLY ENCLOSED BY '"'.
In that case, if the value does not start with a double quote, the
utility considers any character (other than the field delimiter) around
the string to be part of the string.
- FOR BIT DATA
- Specifies that the field contains bit data and that the ingest
utility is not to convert it from the input data code page to the
ingest utility application code page. If this field is used to assign
to a column that specifies FOR BIT DATA, it is strongly recommended
that the field definition also specify FOR BIT DATA. Otherwise, the
value assigned to the column is unpredictable, because of how the
ingest utility sometimes optimizes code page conversion.
- INTEGER/SMALLINT/BIGINT EXTERNAL (length)
- A character string of up to length bytes representing
a number. The string can be an integer, decimal, or floating point
constant. If the string specifies a decimal or floating point value,
the ingest utility converts it to the specified integer type by truncating
the fractional part. If the string is not a valid number, or the number
is out of range, the ingest utility issues an error message and rejects
the record. For more information, see Handling of invalid numeric data.
Use EXTERNAL with numeric field
types to indicate that the field value is specified as ASCII characters
rather than in binary. If EXTERNAL is specified on any integer field,
it must be specified on all other integer, decfloat, and float fields.
For
information about default length, see Rules and defaults for field lengths.
- DECIMAL (precision, scale) EXTERNAL (length)
- A character string of up to length bytes
representing a decimal number. The string can be an integer, decimal,
or floating point constant. The ingest utility converts it to a decimal
value with the specified precision and scale. If the string specifies
a decimal value with a scale larger than the scale specified on the
field type, the ingest utility truncates the fractional part of the
value with no warning or error. If the string specifies a decimal
value with a scale smaller than the scale specified on the field type,
the ingest utility pads the fractional part of the value with zeros.
If the string is not a valid number, or the number is out of range
(other than the scale), the ingest utility issues an error message
and rejects the record. For more information, see Handling of invalid numeric data.
The precision can range from 1 to 31. The
scale can range from 0 to the precision. The default is (5,0). For
more information aboutdefault length, see Rules and defaults for field lengths.
If EXTERNAL is specified
on any decimal field, it must be specified on all other decimal fields.
- DECFLOAT (16|34) EXTERNAL (length)
- A character string of up to length bytes representing
a decimal floating point number. The string can be an integer, decimal,
or floating point constant, or a decimal floating point special value.
The ingest utility converts it to a decimal floating point value of
the specified precision. If the string is not a valid number or decimal
floating point special value, or the number is out of range, the ingest
utility issues an error message and rejects the record. For more information,
see Handling of invalid numeric data. The precision can be either
16 or 34 (the default).
For more information about the default length,
see Rules and defaults for field lengths.
If EXTERNAL is specified
on any decfloat field, it must be specified on all other integer,
decfloat, and float fields.
- FLOAT(integer) EXTERNAL(length)
- A character string of up to length bytes representing
a floating point number. The string can be an integer, decimal, or
floating point constant. If the string is not a valid number, or the
number is out of range, the ingest utility issues an error message
and rejects the record. For more information, see Handling of invalid numeric data.
The value of the integer must be in the
range 1 through 53. Values 1 through 24 indicate single precision,
and values 25 through 53 indicate double precision. The default is
53. You can also specify:
- REAL
- For single-precision floating-point.
- DOUBLE, DOUBLE PRECISION, or FLOAT
- For double-precision floating-point.
If EXTERNAL is specified on any float field,
it must be specified on all other integer, decfloat, and float fields.
For
information about the default length, see Rules and defaults for field lengths.
- RADIX POINT char
- The character to use as the decimal point. The restrictions are
the same as those for the field delimiter. In addition, the character
cannot be the same as the field delimiter or the string delimiter.
If this parameter is specified on any integer, decimal, decfloat,
or float field, it must have the same value on all other integer,
decimal, decfloat, and float fields. The default is a dot ..
- INTEGER, SMALLINT, or BIGINT
- An integer of the specified type in binary. These types can be
specified only when the format is POSITIONAL and the RECORDLEN parameter
specified. The following can also be specified:
- BIG | LITTLE ENDIAN
- Specifies whether the integer is in big endian
format (most significant byte at low address) or little endian format
(least significant byte at low address). This option applies to integer,
float, and decfloat types, but not decimal types. The default is the
default format on the current hardware platform where the ingest utility
is running.
- DECIMAL (precision, scale) PACKED | ZONED
- A decimal number of the specified precision and scale in packed
or zoned format. These types can be specified only when the format
is POSITIONAL and the RECORDLEN parameter is specified.
The precision can range from 1 to 31. The scale can range from 0 to
the precision. The default is (5,0). The PACKED/ZONED setting must
be the same on all other decimal fields. The default is PACKED.
- DECFLOAT(16|34)
- A decimal floating point number of the specified precision in
binary. The precision can be either 16 or 34 (the default). This type
can be specified only when the format is POSITIONAL and the RECORDLEN parameter
is specified.
- FLOAT(integer)
- A single- or double-precision floating point number of the specified
size in binary (IEEE floating point format). The value of the integer
must be in the range 1 through 53. Values 1 through 24 indicate single
precision and values 25 through 53 indicate double precision. The
default is 53. These types can be specified only when the format is
POSITIONAL and the RECORDLEN parameter is specified.
You can also specify:
- REAL
- For single-precision floating-point.
- DOUBLE, DOUBLE PRECISION, or FLOAT
- For double-precision floating-point.
- DATE, TIME, TIMESTAMP
- A date, time, or timestamp specified as a string. If the database
has thedate_compat database configuration parameter
set to ON, DATE is equivalent to TIMESTAMP(0).
For
information about the default length, see Rules and defaults for field lengths.
- precision
- Specifies the precision of a timestamp. The range is 0 - 12. The
default is 6. If the input data specifies a timestamp with a precision
larger than the precision specified on the field type, the ingest
utility truncates the fractional part of the timestamp with no warning
or error (provided that the timestamp has the format specified on
the format string). If the input data specifies a timestamp with a
precision smaller than the precision specified on the field type,
the ingest utility pads the fractional part of the timestamp with
zeros.
- string
- Specifies the format of the date, time, or timestamp value. The
elements of the format string are those currently supported by the IMPORT and LOAD commands.
For more information, see the descriptions of the dateformat, timeformat,
and timestampformat file type modifiers in the "IMPORT
command" topic.
If the format is POSITIONAL and the string is
longer than the length specified by the POSITION parameter,
the ingest utility issues an error. For example, if the string is
'yyyy-mm-dd' (10 bytes), but the position is 1:6, the ingest utility
issues an error.
The date, time, or timestamp value in the input
data must match the format, except that you can omit sub-fields on
the right. For example, if the timestamp format is "yyyy/mm/dd hh:mm:ss.uuu",
the following timestamps are valid:
2010/06/03 12:34
2010/06/03 12:34:56
2010/06/03 12:34:56.1
2010/06/03 12:34:56.12
2010/06/03 12:34:56.123
However, "2010/06/03 12:34:56.1234"
is not valid. When you omit sub-fields, they default as described
in the descriptions of the
dateformat,
timeformat,
and
timestampformat file type modifiers in the
"IMPORT
command" topic.
There is one exception to the previous description
of omitting sub-fields: When the format is POSITIONAL and the field
is the last field in the record and the record ends before the field's
end position, the utility issues a warning and uses NULL for the value.
If
the date, time, or timestamp value is not in the specified format
(other than sub-fields missing from the right), the ingest utility
issues an error message and rejects the row.
The following rules
also apply
- If this parameter is specified on any DATE field, it must also
be specified and have the same value on all other DATE fields. If
the database has thedate_compat database configuration
parameter on (in which case, field type DATE is equivalent to TIMESTAMP(0)),
it must also be specified and have the same value on all other TIMESTAMP
fields.
- If it is specified on any TIME field, it must also be specified
and have the same value on all other TIME fields.
- If it is specified on any TIMESTAMP field, it must also be specified
and have the same value on all other TIMESTAMP fields. If the database
has thedate_compat database configuration parameter
on (in which case, field type DATE is equivalent to TIMESTAMP(0)),
it must also be specified and have the same value on all other DATE
fields.
If the format string is not specified, the ingest
utility recognizes various formats, as shown in
Table 2.
Table 2. Default values for DATE, TIME, and TIMESTAMP Field type |
Default format |
DATE |
- If thedate_compat database configuration
parameter is set to OFF, then:
- 'yyyymmdd' (delimited format only)
- 'yyyy-mm-dd'
- 'mm/dd/yyyy'
- 'dd.mm.yyyy'
Note: In all but the first format, the month and day can also
be specified as a single digit, for example, 'yyyy-m-d'.
- If thedate_compat database configuration
parameter is set to ON, then the same defaults as
TIMESTAMP(0).
|
TIME |
- 'hh.mm.ss'
- 'hh:mm tt'
- 'hh:mm:ss'
|
TIMESTAMP |
'yyyy-mm-dd-hh.mm.ss.uuuuuuuuuuuu' or 'yyyy-mm-dd
hh:mm:ss.uuuuuuuuuuuu' - 'yyyy-mm-dd-hh.mm.ss.uuuuuuuuuuuu'
- 'yyyy-mm-dd hh:mm:ss.uuuuuuuuuuuu'
- 'yyyy-mm-dd'
- 'mm/dd/yyyy'
- 'dd.mm.yyyy'
- 'yyymmdd' (delimited format only)
Note: In all but the first format, the month and day can also
be specified as a single digit, for example, 'yyyy-m-d-hh.mm.ss.uuuuuuuuuuuu''.
|
- OPTIONALLY ENCLOSED BY char
- This clause is valid only when the format is
DELIMITED. The string can be enclosed by the specified character.
The restrictions are the same as for the field delimiter. In addition,
the character cannot be the same as the field delimiter or the decimal
point. If this parameter is specified on any character, date, time,
or timestamp field, it must have the same value on all other character,
date, time, and timestamp fields.
The utility uses the following
algorithm to determine whether the value starts with the delimiter:
- If the TRIM or LTRIM option is specified (or defaults), the utility
tests the first character in the field after trimming of leading blanks.
- If the RTRIM or NOTRIM option is specified, the utility tests
the first character in the field. (In other words, when RTRIM or NOTRIM
is specified, the utility does not skip over leading blanks to locate
the string delimiter.)
Except for record delimiters,
the utility considers all characters between the enclosing characters
to be part of the string. If the string contains the enclosing character,
it must be doubled inside the string. For example, if the string is don't
know and the enclosing character is a single quote, specify 'don''t
know'. If the string does not end with the enclosing character,
the utility considers the opening delimiter to be part of the string.
The
utility issues warning SQL3114W under the following conditions:
- TRIM or RTRIM is specified (or defaults) and there are extra non-blank
characters between the closing delimiter and the next field delimiter
(or end of record).
- NOTRIM or LTRIM is specified and there are extra characters (including
blanks) between the closing delimiter and the next field delimiter
(or end of record).
When this parameter is not
specified, the default is OPTIONALLY ENCLOSED BY '"'.
In that case, if the value does not start with a double quote, the
utility considers any character (other than the field delimiter) around
the string to be part of the string.
- DB2SECURITYLABEL (length)
- Specifies a DB2 security
label. If neither NAME nor STRING is specified and the format is POSITIONAL,
the default format is encoded numeric format; otherwise an error is
returned.
For information about the default length, see Rules and defaults for field lengths.
- NAME
- The DB2 security label is
specified by its name. If the format is DELIMITED, either NAME or
STRING must be specified. If this parameter is specified on any DB2SECURITYLABEL
field, it must have the same value on all other DB2SECURITYLABEL fields.
If no security label exists with the indicated name for the security
policy protecting the table, the row is not loaded and a warning is
returned.
- STRING
- The DB2 security label is
specified in string format. If the format is DELIMITED, either NAME
or STRING must be specified. If this parameter is specified on any
DB2SECURITYLABEL field, it must have the same value on all other DB2SECURITYLABEL
fields. If a string is not in the proper format, the row is not loaded
and a warning is returned. If the string does not represent a valid
security label that is part of the security policy protecting the
table, the row is not loaded and a warning is returned.
The default
is encoded numeric format and is allowed only when the format is POSITIONAL.
- DEFAULTIF (position) = char
- Specifies
a input value for a column in the input file to convert to the database
table column's default value. The position is optional and can be
specified only if the format is POSITIONAL. If the position is specified,
the utility reads the character from the input record before applying
any specified or default trim option. The position does not have to
be in the field. If the position is omitted, the utility first applies
any specified or default trim option and then reads the first character
of the field. In either case, if the character read from the input
record matches the one specified, the utility considers the field
value to be the default value of the table column to which the field
is assigned.
The
DEFAULTIF parameter has the
following restrictions:
- When the position is omitted, there must be at least one space
between DEFAULTIF and the equal sign. Otherwise a
syntax error occurs.
- The DEFAULTIF character can be any single-byte
character except the following values:
- the null character (X'00')
- carriage return or line feed (X'0D' or X'0A')
If the input data code page is a DBCS or EUC code page, the character
must be in the range X'01' to X'3F', inclusive. If the input data
code page is not a single-byte code page, the character must be in
the range X'01' to X'7F', inclusive. If the input code page is a single-byte
code page, the character can be in the range X'01' to X'FF'.
- The field must be assigned to only one column and cannot be part
of an expression that is assigned to that column. For example, consider
the following INGEST command:
INGEST FROM FILE ...
UPDATE my_table
SET (c1, c2, c3, c4, c5) =
($field1, $field2, $field2, $field3, $field4+$field5)
WHERE $field3 = 1;
Only $field1 can specify DEFAULTIF.
$field2 cannot because it is assigned to multiple columns. $field3
cannot because it is also used in a predicate. $field4 and $field5
cannot because they are used in an expression.
- The default value of the table column must be a constant or NULL.
It cannot be a special register.
Note: If a column is nullable (that
is, NOT NULL is not specified on the column definition) and the WITH
DEFAULT clause is not specified on the column definition,
the column still has a default value of NULL.
- If the table column type differs from the field type, the ingest
utility converts the default value to the field type, applying the
same rules used to convert input data to the field type. For example,
if the table column is defined as FLOAT WITH DEFAULT 3.7 and the field
is defined as INTEGER DEFAULTIF=..., the ingest utility uses a default
value of 3. Similarly, if the length of the table column's default
value is greater than the field length, the utility truncates the
default value to the field length.
If these restrictions are violated, the ingest utility issues
an error and the
INGEST command ends.
Examples
Basic ingest examples
The following example
inserts data from a delimited text file:
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
INSERT INTO my_table;
The following example inserts
data from a delimited text file with fields separated by a comma (the
default). The fields in the file correspond to the table columns.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
INSERT INTO my_table
VALUES($field1, $field2, $field3);
Delimiter override example
The following example
inserts data like the previous example, but the fields are separated
by a vertical bar.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED by '|'
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
INSERT INTO my_table
VALUES($field1, $field2, $field3);
Omitting the field definition and VALUES list example
In
the following example, the table is defined as follows:
CREATE TABLE my_table (
c1 VARCHAR(32),
c2 INTEGER GENERATED BY DEFAULT AS IDENTITY,
c3 INTEGER GENERATED ALWAYS AS (c2 + 1),
);
The user issues
the following
INGEST command:
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
INSERT INTO mytable;
- The default field definition list will be:
(
$C1 CHARACTER(32),
$C2 INTEGER EXTERNAL,
$C3 INTEGER EXTERNAL
)
- The default VALUES list on the INSERT statement is:
VALUES($C1, $C2, DEFAULT)
Note
that the third value is DEFAULT because the column that corresponds
to field $C3 is defined as GENERATED ALWAYS. The fourth value is omitted
because it has no field.
Extra fields used to compute column values example
The
following example is the same as the delimiter override example, but
only the first two fields correspond to the first two table columns
(PROD_ID and DESCRIPTION), whereas the value for the third table column
(TOTAL_PRICE) is computed from the remaining three fields
INGEST FROM FILE my_file.txt
FORMAT DELIMITED BY '|'
(
$prod_ID CHAR(8),
$description CHAR(32),
$price DECIMAL(5,2) EXTERNAL,
$sales_tax DECIMAL(4,2) EXTERNAL,
$shipping DECIMAL(3,2) EXTERNAL
)
INSERT INTO my_table(prod_ID, description, total_price)
VALUES($prod_id, $description, $price + $sales_tax + $shipping);
Filler fields example
The following example
inserts data from a delimited text file with fields separated by a
comma (the default). The fields in the file correspond to the table
columns except that there are extra fields between the fields for
columns 2 and 3 and columns 3 and 4.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$field1 INTEGER,
$field2 CHAR(8),
$filler1 CHAR,
$field3 CHAR(32),
$filler2 CHAR,
$field4 DATE
)
INSERT INTO my_table VALUES($field1, $field2, $field3, $field4);
Format modifiers example
The following example
inserts data from a delimited text file in code page 850. Date fields
are in American format and char fields are enclosed in equal signs.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
INPUT CODEPAGE 850
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32) OPTIONALLY ENCLOSED BY '='
)
INSERT INTO my_table
VALUES($field1, $field2, $field3);
Positional example
The following example inserts
data from a file with fields in the specified positions. The fields
in the file correspond to the table columns.
INGEST FROM FILE my_file.txt
FORMAT POSITIONAL
(
$field1 POSITION(1:8) INTEGER EXTERNAL,
$field2 POSITION(10:19) DATE 'yyyy-mm-dd',
$field3 POSITION(25:34) CHAR(10)
)
INSERT INTO my_table
VALUES($field1, $field2, $field3);
DEFAULTIF examples
This example is similar
to the previous example, except if the second field starts with a
blank, the ingest utility inserts the default value:
INGEST FROM FILE my_file.txt
FORMAT POSITIONAL
(
$field1 POSITION(1:8) INTEGER EXTERNAL,
$field2 POSITION(10:19) DATE 'yyyy-mm-dd' DEFAULTIF = ' ',
$field3 POSITION(25:34) CHAR(10)
)
INSERT INTO my_table
VALUES($field1, $field2, $field3);
This example is the
same as the previous example, except that the default indicator is
in the column after the data columns:
INGEST FROM FILE my_file.txt
FORMAT POSITIONAL
(
$field1 POSITION(1:8) INTEGER EXTERNAL,
$field2 POSITION(10:19) DATE 'yyyy-mm-dd' DEFAULTIF(35) = ' ',
$field3 POSITION(25:34) CHAR(10)
)
INSERT INTO my_table
VALUES($field1, $field2, $field3);
Multiple input sources example
This example
inserts data from three delimited text files:
INGEST FROM FILE my_file.txt, my_file2.txt, my_file3.txt
FORMAT DELIMITED
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
INSERT INTO my_table
VALUES($field1, $field2, $field3);
Pipe example
This example
inserts data from a pipe:
INGEST FROM PIPE my_pipe
FORMAT DELIMITED
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
INSERT INTO my_table
VALUES($field1, $field2, $field3);
Options example
This example inserts data from
a delimited text file with fields separated by a comma (the default).
The fields in the file correspond to the table columns. The command
specifies that write rows rejected by DB2 (for
example, due to constraint violations) are to be written to table
EXCP_TABLE, rows rejected due to other errors are to be discarded,
and messages are to be written to file
messages.txt.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
EXCEPTION TABLE excp_table
MESSAGES messages.txt
INSERT INTO my_table
VALUES($field1, $field2, $field3);
Restart example
This example issues an
INGEST command
(which is restartable, by default) with a specified ingest job id:
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
RESTART NEW 'ingestcommand001'
INSERT INTO my_table
VALUES($field1, $field2, $field3);
If the command terminates
before completing, you can restart it with the following command:
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
RESTART CONTINUE 'ingestcommand001'
INSERT INTO my_table
VALUES($field1, $field2, $field3);
Restart terminate example
This example issues
the same
INGEST command as the previous "Restart
example":
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
RESTART NEW 'ingestcommand001'
INSERT INTO my_table
VALUES($field1, $field2, $field3);
If the command terminates
before completing and you do not plan to restart it, you can clean
up the restart records with the following command.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
RESTART TERMINATE 'ingestcommand001'
INSERT INTO my_table
VALUES($field1, $field2, $field3);
After issuing this
command, you can no longer restart the
INGEST command
with the job id: "ingestcommand001", but you can reuse that string
on the
RESTART NEW parameter of a new
INGEST command.
Reordering columns example
This example inserts
data from a delimited text file with fields separated by a comma.
The table has three columns and the fields in the input data are in
the reverse order of the table columns.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$field1 INTEGER EXTERNAL,
$field2 DATE 'mm/dd/yyyy',
$field3 CHAR(32)
)
INSERT INTO my_table
VALUES($field3, $field2, $field1);
Basic UPDATE, MERGE, and DELETE examples
The
following examples update the table rows whose primary key matches
the corresponding fields in the input file.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$key1 INTEGER EXTERNAL,
$key2 INTEGER EXTERNAL,
$data1 CHAR(8),
$data2 CHAR(32),
$data3 DECIMAL(5,2) EXTERNAL
)
UPDATE my_table
SET (data1, data2, data3) = ($data1, $data2, $data3)
WHERE (key1 = $key1) AND (key2 = $key2);
or
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$key1 INTEGER EXTERNAL,
$key2 INTEGER EXTERNAL,
$data1 CHAR(8),
$data2 CHAR(32),
$data3 DECIMAL(5,2) EXTERNAL
)
UPDATE my_table
SET data1 = $data1, data2 = $data2, data3 = $data3
WHERE (key1 = $key1) AND (key2 = $key2);
This example
merges data from the input file into the target table. For input rows
whose primary key fields match a table row, it updates that table
row with the input row. For other input rows, it adds the row to the
table.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$key1 INTEGER EXTERNAL,
$key2 INTEGER EXTERNAL,
$data1 CHAR(8),
$data2 CHAR(32),
$data3 DECIMAL(5,2) EXTERNAL
)
MERGE INTO my_table
ON (key1 = $key1) AND (key2 = $key2)
WHEN MATCHED THEN
UPDATE SET (data1, data2, data3) = ($data1, $data2, $data3)
WHEN NOT MATCHED THEN
INSERT VALUES($key1, $key2, $data1, $data2, $data3);
This
example deletes table rows whose primary key matches the corresponding
primary key fields in the input file.
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$key1 INTEGER EXTERNAL,
$key2 INTEGER EXTERNAL
)
DELETE FROM my_table
WHERE (key1 = $key1) AND (key2 = $key2);
Complex SQL examples
Consider the following
example in which there is a table with columns KEY, DATA, and ACTION.
The following command updates the DATA column of table rows where
the primary key column (KEY) matches the corresponding field in the
input file and the ACTION column is 'U':
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$key_fld INTEGER EXTERNAL,
$data_fld INTEGER EXTERNAL
)
UPDATE my_table
SET data = $data_fld
WHERE (key = $key_fld) AND (action = 'U');
The following
example is the same as the previous example except that if the keys
match and the ACTION column is 'D', then it deletes the row from the
table:
INGEST FROM FILE my_file.txt
FORMAT DELIMITED
(
$key_fld INTEGER EXTERNAL,
$data_fld INTEGER EXTERNAL
)
MERGE INTO my_table
ON (keyl = $key_fld)
WHEN MATCHED AND (action = 'U') THEN
UPDATE SET data = $data_fld
WHEN MATCHED AND (action = 'D') THEN
DELETE;
Usage notes
Using INGEST on the system command
line (UNIX)
If you
specify the INGEST command on the system command
line in a UNIX command shell,
you must specify field names in single quotation marks or put an escape
character (backslash) before the $ that starts
the field name. Otherwise, the command shell interprets the field
names as environment variables and replace them with the value of
the corresponding environment variable.
For example, suppose
environment variable
$field1 has value
abc and
environment variable
$field2 has no value, and
you enter the following command in the UNIX command
shell:
db2 INGEST ... FORMAT DELIMITED ($field1 INTEGER, $field2 CHAR) ...
The DB2 CLP sees the following command,
which is not valid:
INGEST ... FORMAT DELIMITED (abc INTEGER, CHAR) ...
You can avoid this by using single quotation marks,
as follows:
db2 'INGEST ... FORMAT DELIMITED ($field1 INTEGER, $field2 CHAR) ...'
If
the
INGEST command contains single quotation marks,
you can enclose the entire command in double quotation marks
" and
use escape characters
\ before the
$ that
starts field names. For example:
db2 "INGEST ... FORMAT DELIMITED BY X'7C' (\$field1 ...) ..."
The
simplest way to avoid all the previously mentioned problems is to
put the command in a file and specify the -f option
on the db2 command.
Commit frequency
Updates
from the
INGEST command are committed at the end
of an ingest operation. The
INGEST command issues
commits based on the
commit_period and
commit_count configuration
parameters. As
a result of this, the following do not affect the
INGEST command:
- the CLP -c or +c options, which normally affect whether the CLP
automatically commits
- the NOT LOGGED INITIALLY option on the CREATE TABLE statement
Categories of error
For
purposes of determining how the ingest utility handles errors, we
can group errors into the following categories:
- 1. Start-up errors
- These errors include:
- syntax errors
- an input file is not found or not readable
- the target or exception table is not found
- the dump file or messages file cannot be opened
- not enough memory
- other errors detected at startup
When the utility detects any of these errors, it issues an error
message and exits without ingesting any data.
- 2. Data errors
- These errors include:
- Errors in the input data that the formatter detects, for example:
- numbers that are invalid or out of range (based on the field type)
- some dates, times, timestamps that do not fit the specified format
- other errors detected by the formatters
- Errors from SQL statements. These are listed in the description
of the EXCEPTION TABLE parameter on the INGEST command.
The most common are
- data too long, out of range, or incorrect format (not caught by
formatter)
- constraint violations on the target table, including not null,
unique, referential, or check
- range violations on the target table
When the utility detects any of these errors, it issues a warning
or error message and continues on. In the case of formatter errors,
the utility also writes the record to the dump file (if specified).
In the case of SQL errors, the utility also inserts the row into
the exception table (if specified). To have the utility end after
a certain number of these errors, use the WARNINGCOUNT parameter.
- 3. Recoverable errors
- These are errors that might go away if the utility reconnects
(if needed), waits some small amount of time, and tries again. When
the utility detects any of these errors and the retry_count or reconnect_count configuration
parameter is greater than 0, the utility attempts to recover from
the error. If retry_count and reconnect_count are
0, or the utility cannot recover, the utility considers these errors
to be terminating errors.
- 4. Terminating errors
- These are all other errors not listed in the previous three types.
When the utility detects any of these errors, it issues an error message
and exits. Any data that was committed remains in the target table.
You can restart the command from the last commit point.
Messages from the INGEST command
If the utility read at least one record from the
input source, the utility issues a summary of the number of rows read,
inserted, and rejected (similar to the import and load utilities)
and a successful completion message. For example:
INGEST FROM FILE my_table.del FORMAT DELIMITED
($field1 INTEGER EXTERNAL, $field2 CHAR(32))
INSERT INTO my_table VALUES($field1, $field2)
Number of rows read = 6
Number of rows inserted = 6
Number of rows rejected = 0
SQL2980I The ingest utility completed successfully at timestamp "11/02/2011 12:34:56.123456".
The
meaning of the
"Number of rows..." messages is as follows:
- Number of rows read
- The number of records the utility read from the input source.
- Number of rows inserted (updated, deleted, merged)
- The number of rows affected by the execution of the SQL statement
against the target table and committed to the database. The message
says "inserted", "updated", "deleted", or "merged",
depending on the SQL statement.
- Number of rows rejected
- The number of rows rejected (by the utility or DB2).
If there is only one warning
or error and the error is such that no input records could be processed,
the command issues only that error message and does not display the
summary of rows read, rejected, and so on. For example:
INGEST FROM FILE bad_file.del FORMAT DELIMITED
($field1 INTEGER EXTERNAL, $field2 CHAR)
INSERT INTO my_table VALUES($field1, $field2)
SQL2036N The path for the file or device "bad_file.del" is not valid.
After
the
INGEST command begins reading records, if there
are one or more warning or error conditions detected, the command
issues messages, along with the summary of rows read, skipped, and
so on, and a summary of the number of warnings and errors. Consider
the following example which tries to insert two duplicate keys:
INGEST FROM FILE my_table1.del, my_table2.del FORMAT DELIMITED
( $field1 integer external, $field2 char(32) )
INSERT INTO my_table VALUES($field1, $field2)
SQL2905I The following error occurred issuing the SQL "INSERT"
statement on table "ALVEYWD.MY_TABLE" using data from line "6" of
input file "my_table1.del".
SQL0803N One or more values in the INSERT statement, UPDATE statement,
or foreign key update caused by a DELETE statement are not valid
because the primary key, unique constraint or unique index identified
by "1" constrains table "ALVEYWD.MY_TABLE" from having duplicate values
for the index key. SQLSTATE=23505
SQL2905I The following error occurred issuing the SQL "INSERT" statement
on table "ALVEYWD.MY_TABLE" using data from line "9" of input file "my_table2.del".
SQL0803N One or more values in the INSERT statement, UPDATE statement,
or foreign key update caused by a DELETE statement are not valid because
the primary key, unique constraint or unique index identified by "1"
constrains table "ALVEYWD.MY_TABLE" from having duplicate values
for the index key. SQLSTATE=23505
Number of rows read = 6
Number of rows inserted = 4
Number of rows rejected = 2
Number of rows committed = 4
SQL2902I The ingest utility completed at timestamp "11/02/2011 12:34:56.123456".
Number of errors: 2. Number of warnings: 0.
If
the MESSAGES parameter is specified, most messages
are written to that file, except as specified in the description of
that parameter.
File
and pipe names
The metavariables
file-name and
pipe-name specify
a file or pipe name that follows the operating system syntax. File
or pipe names specified on the
INGEST command that
contain any characters other than the following must be enclosed in
single quotation marks:
- alphanumeric
- underscore (_)
- dash (-)
- forward-slash (/) or back-slash (\)
- dot (.)
The quotation marks need to be around the entire name.
For example, if a filename contains the equal sign (=), specify:
'dir1/dir2/my=file'
instead
of
dir1/dir2/'my=file'
Default field definition
list
When all of the following are true, you can omit the
field definition list:
- The format is delimited.
- The SQL statement is INSERT.
- The VALUES clause is omitted from the INSERT statement.
The field definition list defaults as follows:
- If a column list follows the table name on the INSERT statement,
there is one field for each column in the list.
- If the INSERT statement omits the column
list and there are no implicitly hidden columns, then there is one
field for each column in the table.
- If the INSERT statement omits the column
list and there are implicitly hidden columns, then you must explicitly
specify whether or not the implicitly hidden columns are included.
Use the DB2_DMU_DEFAULT registry variable, or the IMPLICITLYHIDDENINCLUDE
or IMPLICITLYHIDDENMISSING keywords to specify if implicitly hidden
columns are included.
- Each field has the same name as the corresponding table column,
prefixed with a dollar sign $.
- Each field has the same data type and length (or precision and
scale) as the corresponding table column. Numeric fields (integer,
decimal, and so on) default to EXTERNAL format. DB2SECURITYLABEL fields
default to STRING format.
Table 3 shows how the field, column, and
values lists default for every possible combination of omitting or
including those lists.
Table 3. Possible combinations when field definition list is specified
or omittedField definition list specified? |
Column list specified? |
VALUES clause specified? |
Defaults |
no |
no |
no |
Field list and
column list default to all columns. Values list defaults to field
list. Note: If the table contains implicitly
hidden columns, then you must explicitly specify whether or not the
implicitly hidden columns are included using the DB2_DMU_DEFAULT registry
variable, or the IMPLICITLYHIDDENINCLUDE or IMPLICITLYHIDDENMISSING
keywords.
|
no |
no |
yes |
Not allowed |
no |
yes |
no |
Field list and values list
default to the specified column list
|
no |
yes |
yes |
Not allowed |
yes |
no |
no |
Column list defaults
to all columns.
Values list defaults to user-specified field
list.
Notes: - The number of fields must be same as number of columns. Otherwise
an error occurs.
- If the table contains implicitly hidden columns, then you must
explicitly specify whether or not the implicitly hidden columns are
included using the DB2_DMU_DEFAULT registry variable, or the IMPLICITLYHIDDENINCLUDE
or IMPLICITLYHIDDENMISSING keywords.
|
yes |
no |
yes |
Column list defaults
to all columns.
Note: If the table contains
implicitly hidden columns, then you must explicitly specify whether
or not the implicitly hidden columns are included using the DB2_DMU_DEFAULT
registry variable, or the IMPLICITLYHIDDENINCLUDE or IMPLICITLYHIDDENMISSING
keywords.
|
yes |
yes |
no |
Values list defaults to user-specified field
list. Note: Number of fields must be same as number of columns.
Otherwise an error occurs.
|
yes |
yes |
yes |
No defaults |
Rules and defaults
for field lengths
Field lengths specified on field types
are in bytes and must be 1 - 32 767, inclusive.
The following
tables show the length of a field under all combinations of input
data format, field type, and whether the length and ending position
is specified. The cell value
the specified length means
that the field length is explicitly specified; for example, the length
that can follow "INTEGER EXTERNAL". It is not the precision that
is on some numeric types. The cell value
n/a means not
applicable or not allowed.
Table 4. Field lengths for DELIMITED
formatField type |
Field type omits length (or datetime format) 1 |
Field type specifies length (or datetime format) 1 |
CHARACTER |
2552 |
the specified length |
SMALLINT , INTEGER, BIGINT EXTERNAL |
2552 |
the specified length |
DECIMAL(p,s) EXTERNAL (default
for (p,s) is (5,0)) |
2552 |
the specified length |
DECFLOAT(16 or 34) EXTERNAL |
2552 |
the specified length |
REAL EXTERNAL |
2552 |
the specified length |
FLOAT or DOUBLE EXTERNAL |
2552 |
the specified length |
DATE |
- date_compat off: length of maximum length
default format for a date (10)
- date_compat on: length of maximum length
default format for a timestamp (32)
|
- date_compat off: length of maximum length
default format for a date (10)
- date_compat on: length of maximum length
default format for a timestamp (32)
|
TIME |
length of maximum length default format (8) |
length of max length default format (8) |
TIMESTAMP(p) (default for p is
6) |
length of maximum length default format (32) |
length of max length default format (32) |
DB2SECURITYLABEL with NAME modifier |
2552 |
the specified length |
DB2SECURITYLABEL with STRING modifier |
2552 |
the specified length |
1 In delimited format,
the field length means the length to which the utility truncates the
field data if it exceeds that length. The truncation occurs after
applying the trim option. If the truncation results in loss of non-blank
characters, the utility issues a warning.
2 In delimited
format, character, and numeric fields that are specified in ASCII
and omit the length have a default length of 255 bytes. If you need
a longer length, specify an explicit length on the field type.
|
Table 5. Field lengths for POSITIONAL formatField type |
Field type omits length (or date/time format),
POSITION omits end position |
Field type specifies length (or date/time format),
POSITION omits end position |
Field type omits length (or date/time format),
POSITION specifies end position |
Field type specifies length (or date/time format),
and POSITION specifies end position |
CHARACTER |
n/a3 |
the specified length |
(end - start +
1) |
the lesser of the specified length and (end - start +
1) 7 |
SMALLINT (binary) |
2 |
n/a4 |
If (end -start + 1) is 2, 4, or 8, use that.
Otherwise, use length of binary type. If conflict, issue warning.5 |
n/a4 |
INTEGER (binary) |
4 |
n/a4 |
If (end - start +
1) is 2, 4, or 8, use that. Otherwise, use length of binary type.
If conflict, issue warning.5 |
n/a4 |
BIGINT (binary) |
8 |
n/a4 |
If (end - start +
1) is 2, 4, or 8, use that. Otherwise, use length of binary type.
If conflict, issue warning.5 |
n/a4 |
SMALLINT, INTEGER, BIGINT EXTERNAL |
n/a3 |
the specified length, which must be <= 50 |
(end - start +
1), which must be <= 50 |
the lesser of the specified length and (end - start +
1)7 |
DECIMAL(p,s) PACKED (default for (p,s) is (5,0)) |
(p+2)/2 |
n/a4 |
Use (p+2)/2. If conflict with end position,
issue warning.5 |
n/a4 |
DECIMAL(p,s) ZONED (default for (p,s) is (5,0)) |
p |
n/a4 |
Use precision. If conflict with end position,
issue warning.5 |
n/a4 |
DECIMAL(p,s) EXTERNAL (default for (p,s) is
(5,0)) |
n/a3 |
the specified length |
(end - start +
1) |
the lesser of the specified length and (end - start +
1)7 |
DECFLOAT(16) (binary) |
8 |
n/a4 |
Use 8. If conflict with end position, issue
warning.5 |
n/a4 |
DECFLOAT(34) (binary) |
16 |
n/a4 |
Use 16. If conflict with end position, issue
warning.5 |
n/a4 |
DECFLOAT(16 or 34) EXTERNAL |
n/a3 |
the specified length |
(end - start +
1) |
the lesser of the specified length and (end - start +
1)7 |
REAL (binary) |
4 |
n/a4 |
Use 4. If conflict with end position, issue
warning.5 |
n/a4 |
REAL EXTERNAL |
n/a3 |
the specified length |
(end - start +
1) |
the lesser of the specified length and (end - start +
1)7 |
FLOAT or DOUBLE (binary) |
8 |
n/a4 |
Use 8. If conflict with end position, issue
warning.5 |
n/a4 |
FLOAT or DOUBLE EXTERNAL |
n/a3 |
the specified length |
(end - start +
1)) |
the lesser of the specified length and (end - start +
1)7 |
DATE |
- date_compat off: length of maximum length
default format for a date (10)
- date_compat on: length of maximum length
default format for a TIMESTAMP(0) (19)
|
length of format string, which must be >=
1 |
- date_compat off: (end - start +
1) , which must be >= 8 and <= 10
- date_compat on: (end - start +
1) , which must be >= 19 and <= 32
|
If (end - start +
1) is greater than or equal to length of the format string, use that.
Otherwise, (end - start + 1)
must be greater than or equal to the length of the shortest value
that matches the format string.6 |
TIME |
length of max length default format (8) |
length of format string, which must be >=
1 |
(end - start +
1) , which must be >= 4 and <= 8 |
If (end - start +
1) is greater than or equal to length of the format string, use that.
Otherwise, (end - start + 1)
must be greater than or equal to the length of the shortest value
that matches the format string.6 |
TIMESTAMP(p) (default for p is
6) |
length of default format (If p = 0, then 19.
Otherwise, 20 + p.) |
length of format string, which must be >=
1 |
(end - start +
1) , which must be >= 19 and <= 32 |
If (end - start +
1) is greater than or equal to length of the format string, use that.
Otherwise, (end - start + 1)
must be greater than or equal to the length of the shortest value
that matches the format string.6 |
DB2SECURITYLABEL (binary) |
n/a3 |
the specified length |
(end-start+1) |
the lesser of the specified length and (end - start +
1)7 |
DB2SECURITYLABEL with NAME modifier |
n/a3 |
the specified length |
(end-start+1) |
the lesser of the specified length and (end - start +
1)7 |
DB2SECURITYLABEL with STRING modifier |
n/a3 |
the specified length |
(end-start+1) |
the lesser of the specified length and (end - start +
1)7 |
3 These cases (positional
format, length and end position omitted) are not allowed because the
utility cannot determine the intended length of the field and there
is no suitable default.
4 These cases (positional
format, numeric binary field, length specified, end position omitted)
are not allowed because with binary numeric types, the length is implied
by the binary type and the syntax does not allow specifying an explicit
field length in bytes.
5 The various utilities handle
conflicts between the numeric binary type specified and the length
specified on the POSITION parameter as shown in Table 6.
6 The
various utilities handle conflicts between the date/time format string
length and the length specified on the POSITION parameter
as shown in Table 7.
7 In
these cases (length and end position are specified but conflict),
the ingest utility issues a warning and uses the lesser of the specified
length and the value of (end - start +
1).
|
Table 6. Conflicts between numeric
binary type and specified lengthField type |
Load1 |
Ingest |
SMALLINT, INTEGER, BIGINT |
If (end - start +
1) is 2, 4, or 8, use that length with no warning or error. Otherwise
issue warning, and use NULL or reject the row. |
If (end - start +
1) is 2, 4, or 8, use that length and issue a warning. Otherwise,
use the length of the binary type and issue a warning. |
DECIMAL |
Issue error SQL3123W and reject the row. |
Use the length of the binary type and issue
a warning. |
REAL,FLOAT, DECFLOAT |
Use the length of the binary type. Do not issue
a warning or error. |
Use the length of the binary type and issue
a warning. |
1 Import does not
support binary numeric types (MODIFIED BY binarynumerics).
|
Table 7. Conflicts between date/time format
string length and specified lengthConflict |
Import/Load |
Ingest |
(end - start +
1) is greater than or equal to the length of the format string. |
Use end - start +
1. No warning or error. |
Use end - start +
1. No warning or error. |
(end - start +
1) is less than the length of the format string. |
Issue error. |
Issue error. |
How the ingest
utility determines field lengths
The algorithm for determining
field length is as follows:
- Determine the length of the field as it appears in the input data:
- If the format is DELIMITED:
- Apply the specified or default trim option. (For CHAR fields,
the default is TRIM. For all other non-binary field types, the ingest
utility always trims leading and trailing blanks.)
- If OPTIONALLY ENCLOSED BY is not specified, the
data length is the number of bytes left in the field.
- If OPTIONALLY ENCLOSED BY is specified, the data
length is the number of bytes between the string delimiters (after
converting doubled delimiters inside the string to single delimiters).
- If the format is POSITIONAL:
- Apply the specified or default trim option. (For CHAR fields,
the default is NOTRIM. For all other non-binary field types, the ingest
utility always trims leading and trailing blanks.)
- If the length determined in step 1 is greater than the field length
as described in Rules and defaults for field lengths:
- For fields other than DATE, TIME, or TIMESTAMP, truncate the data
to the field length. If non-blank characters were truncated, issue
a warning.
- For DATE, TIME, and TIMESTAMP fields, issue an error.
- If the SQL statement sets a column to the field value and the
value is too long or out of range, DB2 issues
an error message and rejects the row.
Note: This can happen only on
an INSERT, UPDATE, or MERGE statement.
The following tables
show some examples. In these tables, the tilde (~) means a blank.
Table 8. FORMAT DELIMITED BY '|'Field definition |
Input field |
Final field value |
CHAR TRIM |
|~ABC~| |
ABC |
CHAR NOTRIM |
|~ABC~| |
~ABC~ |
CHAR OPTIONALLY ENCLOSED BY '"' |
|~"ABC"~| |
ABC |
CHAR OPTIONALLY ENCLOSED BY '"' |
|~"~ABC~"~| |
~ABC~ |
CHAR NOTRIM OPTIONALLY ENCLOSED BY '"' |
|"~ABC~"| |
~ABC~ |
CHAR NOTRIM OPTIONALLY ENCLOSED BY '"' |
|~"ABC"~| |
~"ABC"~ |
CHAR(1) TRIM |
|~ABC~| |
A (and warning because non-blanks were trimmed) |
CHAR(1) NOTRIM |
|~ABC~| |
~ (and warning because non-blanks were trimmed) |
CHAR(1) OPTIONALLY ENCLOSED BY '"' |
|~ABC~| |
A (and warning because non-blanks were trimmed) |
CHAR(6) TRIM |
|~ABC~| |
ABC |
CHAR(6) NOTRIM |
|~ABC~| |
~ABC~ |
CHAR(6) OPTIONALLY ENCLOSED BY '"' |
|~"ABC"~| |
ABC |
CHAR(6) OPTIONALLY ENCLOSED BY '" |
|~"~ABC~"~| |
~ABC~ |
INTEGER EXTERNAL |
|~12345~| |
12345 |
INTEGER EXTERNAL(3) |
|~12345~| |
123 (and warning because non-blanks were trimmed) |
INTEGER EXTERNAL(9) |
|~12345~| |
12345 |
Table 9. FORMAT POSITIONALField definition |
Input field |
Final field value |
POSITION(1:5) CHAR TRIM |
~ABC~ |
ABC |
POSITION(1:5) CHAR NOTRIM |
~ABC~ |
~ABC~ |
POSITION(1:5) CHAR(3) TRIM |
ABC |
A (and warning because end position conflicts
with length) |
POSITION(1:5) CHAR(5) NOTRIM |
~ABC~ |
~ABC~ |
POSITION(1:5) CHAR(5) TRIM |
~ABC~ |
ABC |
POSITION(1:5) CHAR(7) NOTRIM |
~ABC~ |
~ABC~ (and warning because end position conflicts
with length) |
POSITION(1:7) INTEGER EXTERNAL |
~12345~ |
12345 |
POSITION(1:7) INTEGER EXTERNAL(3) |
123 |
123 (and warning because end pos conflicts with
length) |
POSITION(1:7) INTEGER EXTERNAL(9) |
~12345~ |
12345 (and warning because end position conflicts
with length) |
Handling of invalid numeric
data
In general, when numeric data is specified as a character
string and the data cannot be converted to a number or is out of range
for the field type, the ingest utility rejects the record and issues
an error. This is different than the import and load utilities, which
handle these errors as follows:
- For all numeric types, if the data cannot be converted to the
column's type (for example, "ABC" in an integer field):
- If the column is nullable, the import and load utilities replace
the value with NULL and issue a warning.
- If the column is not nullable, the import and load utilities reject
the record and issue an error.
- For all numeric types except DECIMAL, if the data is out of range:
- If the column is nullable, the import and load utilities replace
the value with NULL and issue a warning.
- If the column is not nullable, the import and load utilities reject
the record and issue an error.
- For DECIMAL fields, if the data is out of range, the import and
load utilities reject the record and issue an error.
Notes regarding
the FORMAT clause
- The priority for delimiters is the same as for the load utility:
record, character, field. The import and load utilities also support
the delprioritychar file type modifier, which changes
the priority to character, record, field, but the ingest utility does
not.
- When modifiers are specified on a field, they must be specified
on all fields that have a similar type. For example, if a decimal
field specifies RADIX POINT, the same RADIX POINT character must be
specified on all other fields of type INTEGER, DECIMAL, DECLFOAT,
and FLOAT.
- For character fields, the various TRIM options specify whether
the ingest utility strips leading and/or trailing blanks. For all
other non-binary field types (for example, INTEGER EXTERNAL), the
ingest utility always trims leading and trailing blanks.
- When the format is DELIMITED or the format is POSITIONAL and the
RECORDLEN parameter is omitted:
- If a record contains fewer fields than defined in the format,
the ingest utility considers the missing fields to be NULL.
- If a record contains extra fields, the ingest utility ignores
them.
- When the format is POSITIONAL:
- If the RECORDLEN parameter is specified and the
last record is too short, the ingest utility issues an error and rejects
the record.
- If there is data outside the specified field positions, the ingest
utility ignores it. For example, if the field specifies two fields
at POSITION(1:10) and POSITION(21:30), the ingest utility ignores
data in positions 11:20 and any data past position 30 to the end of
the record.
SQL statements on the INGEST command
- The ingest utility supports the following values in the PERIOD
clause:
- constants
- special registers
Note: Only the following special registers affect
the execution of the SQL statement on the
INGEST command:
- CURRENT SCHEMA
- CURRENT TEMPORAL SYSTEM_TIME
- CURRENT TEMPORAL BUSINESS_TIME
- scalar functions whose arguments are supported operands (although
nested function invocations and user-defined functions cannot be used)
- CAST specifications where the cast operand is a supported operand
expression using arithmetic operators and operands
- expressions using arithmetic operators and operands
You can also specify a field name, which the utility converts
to a parameter marker.
- There are several other considerations for ingesting into a temporal
table:
- When special register CURRENT TEMPORAL SYSTEM_TIME is set, you
cannot ingest into a system-period temporal table or a bi-temporal
table.
- When special register CURRENT TEMPORAL BUSINESS_TIME is set, you
can still ingest into a business-time temporal table or a bi-temporal
table (subject to the other restrictions on system-time temporal tables).
- You cannot use the REPLACE or DELETE operation on a system-period
temporal table or bi-temporal table.
- The ingest utility cannot insert values for the SYSTEM_TIME period
in a system-period temporal table.
- Any strings in the SQL statements are assumed to be in the application
code page.
- Any column names that start with a dollar sign ($) must be specified
as SQL delimited identifiers (enclosed in double quotation marks),
in order to distinguish them from field names.
- In most cases, after converting field names to parameter markers
and binding in values for the fields, the ingest utility passes these
statements as is to DB2.
They are therefore subject to the same restrictions. For example:
- Fields are bound as their field type. Therefore they can be
used only where a value of the specified type is allowed. For example,
fields of type DB2SECURITYLABEL can be used only where a security
label is allowed.
- If the INSERT statement omits the column
list but specifies the VALUES list, the VALUES list must contain an
item for each column. If the table contains implicitly hidden columns,
then you must explicitly specify whether or not the implicitly hidden
columns are included. Use the DB2_DMU_DEFAULT registry variable, or
the IMPLICITLYHIDDENINCLUDE or IMPLICITLYHIDDENMISSING keywords to
specify if implicitly hidden columns are included.
- The ingest utility issues an error and ends the INGEST command
if any of the following conditions is true:
- If the SQL statement specified on the command does not reference
any fields
- The SQL statement is DELETE or UPDATE and it has no WHERE clause
or the WHERE clause does not reference any fields.
- The SQL statement is MERGE and the ON clause does not reference
any fields.
- The default isolation level is cursor stability (CS). This can
be overridden by specifying the WITH clause on the SQL statement (except
INSERT), or by setting the CLI IsolationLevel keyword in the file
specified by the DB2CLIINIPATH environment variable
(the default is sqllib/cfg/db2cli.ini). Setting
the IsolationLevel keyword, however, affects all CLI applications.
- The MERGE statement
The
INGEST command issues the MERGE statement
once for each input data record, treating the record as a one-row
table (equivalent to the
USING table-reference parameter
on the SQL MERGE statement). This can occasionally produce different
results than the equivalent SQL MERGE statement. Consider the following
example:
The SQL MERGE statement is:
MERGE INTO target_table
USING (SELECT * FROM source_table) source_table
ON target_table.c1 <> source_table.c1
WHEN NOT MATCHED THEN
INSERT VALUES(source_table.c1, source_table.c2);
The
INGEST command (using the equivalent
MERGE statement) is:
INGEST FROM FILE source_table.del
FORMAT DELIMITED
(
$c1 INTEGER EXTERNAL,
$c2 CHAR(32)
)
MERGE INTO target_table
ON c1 <> $c1
WHEN NOT MATCHED THEN
INSERT VALUES($c1, $c2);
Following the SQL
MERGE statement, TARGET_TABLE contains the following rows:
(NULL, 'target data 1')
(1, 'source data 1')
(2, 'source data 2')
Following the
INGEST command,
TARGET_TABLE contains the following rows:
(NULL, 'target data 1')
(1, 'source data 1')
The reason the INGEST command
results in fewer rows is that after the INGEST command
processes the first input data record, the target table contains the
two rows shown previously. From that point onwards, the ON clause
has a matching row, which means the INSERT statement on the WHEN NOT
MATCHED clause is not processed again.
Note that this example
is not a common use of the MERGE statement. When the MERGE statement
has the more common format of updating or inserting rows based on
matching keys, the SQL MERGE statement and the INGEST command
produce the same results.
- The REPLACE statement
- REPLACE is basically the same as INSERT except that the ingest
utility issues the DELETE statement to delete all the table rows before
beginning the inserts. If the table is large, the DELETE could take
some time and use significant log space.
- Data type conversion
For each field the utility ingests, there are two places that
data conversion can occur:
- When the input data is in character format (including numbers
specified in ASCII), the utility converts the data from character
to the field type.
- If a field's type differs from its corresponding column type, DB2 converts the field value from
the field type to the column type.
In both cases, the ingest utility and DB2 use the rules described in the topic
"Assignments
and comparisons". This can occasionally produce different results
than converting directly from character to the column type. Consider
the following example:
- The input file specifies string '9.99E6143', which is a valid
DECFLOAT(34) number but outside the range of DECFLOAT(16).
- The field type is DECFLOAT(34).
- The column type is DECFLOAT(16).
In this example, the ingest utility converts the field value
from character to DECFLOAT(34). Then DB2 converts
from DECFLOAT(34) to DECFLOAT(16). That results in a warning and a
value of DECFLOAT infinity in the DECFLOAT(16) column.
- Casting a field to an SQL data type
Before issuing the SQL statement that you specify on the
INGEST command,
the ingest utility converts field names to parameter markers and then
issues the SQL statement dynamically. In order to determine the data
types of the parameter markers, DB2 uses
the rules described in the topic
"Determining data types of untyped
expressions". Just as with dynamic SQL, if the data type of a field
is not obvious from the context where it is used, you might get an
error, in which case you need to do perform one of the following actions:
- Set the DB2_DEFERRED_PREPARE_SEMANTICS registry
variable to YES.
- Use a CAST specification to cast the field name to an SQL data
type.
For example, consider the following
INGEST command:
INGEST FROM FILE my_file.txt
FORMAT DELIMITED ( $date_fld DATE )
INSERT INTO my_table(int_col) VALUES(day($date_fld));
If
you have the
DB2_DEFERRED_PREPARE_SEMANTICS registry
variable set to
NO (the default), when the ingest
utility tries to issue the statement, it might return an error because DB2 cannot determine which version
of the DAY function to use. (This is true even though you defined
the field as a DATE and the ingest utility binds it as a DATE.)
You
can fix the problem by casting field names to their corresponding
field types, for example:
INSERT INTO my_table(int_col) VALUES( day(CAST($date_fld AS DATE)) );
In
some cases, you can also fix the problem by setting registry variable
DB2_DEFERRED_PREPARE_SEMANTICS=YES. However, this does not always
work.
As an additional example, consider the following
INGEST command:
INGEST FROM FILE my_file.txt
FORMAT DELIMITED ( $field1 DECIMAL(5,2) )
INSERT INTO my_table(dec_col) VALUES($field1 + 1);
Because
$field1 is added to an integer, DB2 assigns
type INTEGER to the field. In order to have DB2 assign type DECIMAL(5,2), you need to change
the SQL statement to:
INSERT INTO my_table(dec_col) VALUES(CAST($field1 AS DECIMAL(5,2)) + 1);
or
INSERT INTO my_table(dec_col) VALUES($field1 + 001.00);