LOAD command for IBM Db2 Warehouse
The LOAD command efficiently loads large amounts of data. The LOAD command loads data at the page level and delays constraint checking until after the data is loaded into the table. Data can be in the form of a file, tape, or named pipe. You can also load data by using a user-written script or application.
If possible, use the dbload command instead, which is easier to use.
Authorization
- Choice 1: DATAACCESS authority, which is included by default if you have DBADM authority
- Choice 2: DBADM authority without DATACCESS authority but with the INSERT privilege
- Choice 3: LOAD authority without DATAACCESS authority or DBADM authority (by default, granting DATAACCESS authority or DBADM authority also grants LOAD authority) but with the INSERT privilege
Command syntax
Command parameters
- CLIENT
- Specifies that the data to be loaded is on the client. This parameter is ignored if you aren't
invoking the load operation from a client.
Code page conversion is not performed during a remote load operation. If the code page of the data is different from that of the server, specify the data code page by using the codepage modifier.
The dumpfile modifier refers to a file on the server even if you specify the CLIENT parameter.
- FROM filename | pipename | device
- Specifies the file, pipe, or device that refers to an SQL statement that contains the data that
is being loaded.
The file, pipe, or device must be accessible from the head node on the server.
If you specify several names, they are processed in sequence. If the last item that you specify is a tape device and you are prompted for a tape, the LOAD fails and returns an error.
You must use a fully qualified path file name, and it must exist on the server.
- OF filetype
- Specifies the format of the data, which can be as follows:
- ASC
- Non-delimited ASCII format.
- DEL
- Delimited ASCII format.
- MODIFIED BY filetype_mod
- Specifies file type modifier options. See File type modifiers. The LOAD command does not issue a warning if you attempt to use unsupported file types with the MODIFIED BY parameter. If you attempt this, the load operation fails, and an error code is returned.
- WARNINGCOUNT n
- Stops the load operation after n warnings. Set this parameter if you do not expect any warnings but you want to verify that the correct load file and target table are being used. If you incorrectly specify the load file or the target table, the LOAD command generates a warning for each row that it attempts to load, which causes the load operation to fail. If n is zero or you do not specify this parameter, the load operation continues regardless of the number of warnings that it issues.
- MESSAGES ON SERVER
- Saves the message file that the LOAD command creates on the server. This
parameter applies only if you are running the LOAD command by using the command
line processor plus (CLPPlus) interface with the ADMIN_CMD procedure. The result set includes the
following two columns:
- MSG_RETRIEVAL
- Contains the SQL statement that is required to retrieve all the warnings and error messages that occur during the load operation.
- MSG_REMOVAL
- Contains the SQL statement that is required to clean up the messages.
If you do not specify this parameter, the message file is deleted when the ADMIN_CMD procedure returns to the caller. The MSG_RETRIEVAL and MSG_REMOVAL column in the result set contains null values.
- MESSAGES message_file
- Specifies the destination for warning and error messages that occur during the load operation.
This parameter applies only if you are running the LOAD command by using CLP. If
you do not specify this parameter when you run the LOAD command by using CLP,
messages are written to standard output. If you do not specify the complete path to the file, the
LOAD command uses the current directory and the default drive as the destination.
If you specify the name of an existing file, the command appends the messages to that file.
The message file is usually populated with messages at the end of the load operation and, as such, is not suitable for monitoring the progress of the operation.
- INSERT
- Adds the loaded data to the table without changing the existing table data.
A LOAD INSERT operation into a column-organized table maintains table statistics by default if the table was empty at the start of the load operation.
- TERMINATE
- Terminates a previously interrupted load operation and rolls back the operation to the point in
time at which it started, even if consistency points were passed. The states of any table spaces
that were involved in the operation return to normal, and all table objects are made consistent. If
the load operation that is being terminated is a LOAD INSERT operation, the table
retains all of its original records after the LOAD TERMINATE operation.
The TERMINATE parameter does not remove a backup pending state from table spaces.
- INTO table-name
- Specifies the database table into which the data is to be loaded. This table cannot be a system table, a declared temporary table, or a created temporary table. You can specify an alias or the fully qualified or unqualified table name. A qualified table name is in the form schema.tablename. If you specify an unqualified table name, the table name is qualified with the value of the CURRENT SCHEMA special register.
- NONRECOVERABLE
- Avoids the use of transactional logging during a load operation. If the load operation fails, you can only drop the table or restore it from a backup.
Usage notes
- Data is loaded in the sequence that appears in the input file. If you want a particular sequence, sort the data before you attempt the load.
- Any path that you use for the LOAD command must be a valid fully qualified path on the server head node.
Running the LOAD command
- You can run the LOAD command in the DB2 command line processor (CLP)
interface, such as by including the command in a script. To use the CLP, you must install the IBM
Data Server Client (such as by deploying the Db2® Warehouse client container) and connect the CLP to Db2 Warehouse. See Connecting to IBM Db2 Warehouse with the command-line processor (CLP) or Call Level Interface (CLI). If the data that you want to load is on the client, specify the CLIENT parameter. If you do not specify the CLIENT parameter, it is assumed that the data is on the server. For loading from the server, the easiest approach is to put the data in /mnt/clusterfs/scratch/datafile, where datafile specifies the file that contains your data. However, if you have large data sets on an NFS share that you want to load, it's preferable to mount that share on the host instead of copying it to the host. To mount the share and map the volume inside the container, use the following instructions:
- Mount the remote drive on the host where you deployed the
container:
mount remote_server:/data_location /mnt/clusterfs/scratch/nfsdata - Stop the Db2 Warehouse
services:
docker exec -it Db2wh stop - Stop the Db2 Warehouse container:
docker stop Db2wh - Remove the Db2 Warehouse container:
docker rm Db2wh - Re-create the Db2 Warehouse container and start it,
specifying the -v option to create a new mapped
volume:
For the latest version of the container, specify one of the following tag values.docker run -d -it --privileged=true --net=host --name=Db2wh -v /mnt/clusterfs:/mnt/bludata0 -v /mnt/clusterfs:/mnt/blumeta0 -v /mnt/clusterfs/scratch/nfs_data:/mnt/clusterfs/scratch/nfsdata tag- For a container for IBM POWER LE hardware:
store/ibmcorp/db2wh_ee:v11.5.7.0-cn5-db2wh-ppcle - For a container for IBM z Systems hardware:
store/ibmcorp/db2wh_ee:v11.5.7.0-cn5-db2wh-s390x - For a container for x86 hardware:
store/ibmcorp/db2wh_ee:v11.5.7.0-cn5-db2wh-linux
- For a container for IBM POWER LE hardware:
In the following example, the LOAD command is specified in the load.sql script, which is then run in the CLP:
For information about the CLP options, see Command line processor options.$ cat load.sql load client from /usr/data/unload.del of del modified by escapechar coldel| insert into user_table; $ db2 -tvf load.sql - Mount the remote drive on the host where you deployed the
container:
- You can run the LOAD command in the command line processor plus (CLPPlus) interface. CLPPlus is included in the Db2 Warehouse container. For information, see Loading data using CLPPlus and CLPPlus LOAD command results.
File type modifiers
| Modifier | Description |
|---|---|
| noheader | This modifier applies only if you want to perform a LOAD_ONLY operation by using a file that does not have a header. |
| norowwarnings | This modifier suppresses all warnings about rejected rows. |
| usedefaults | If you specify a source column for a target table column but it contains no data for one or
more row instances, default values are loaded. Examples of missing data are as follows:
|
| Modifier | Description |
|---|---|
| codepage=x | The x value is an ASCII character string. The value is interpreted as the
code page of the data in the input data set. Character data and numeric data that you specify in
characters is converted from this code page to the database code page during the load operation.
The following rules apply:
|
| dateformat="x" | The x value is the format of the date in the source file.1
Valid date elements are as follows:
A default value of 1 is assigned for each element that is not specified. Sample formats
follow:
|
| dumpfile = x | The x value is the fully qualified (according to the server database
partition) name of an exception file to which rejected rows are written. A maximum of 32 KB of data
is written per record. The following example shows how to specify a dump file:
The file is created and owned by the instance owner. To override the default file permissions, use the dumpfileaccessall file type modifier. In a partitioned database environment, use a path that is local to the loading database partition so that concurrently running load operations do not attempt to write to the same file. If the specified file exists, it is not re-created, but it is truncated. The dumpfile modifier refers to a file on the server even if you specify the CLIENT parameter. |
| dumpfileaccessall | Grants read access to 'OTHERS' when a dump file is created. This file type modifier is valid if you use it with the dumpfile file type modifier and you have the SELECT privilege on the load target table. If the specified file exists, its permissions are not changed. |
| fastparse | Use this modifier with caution. It reduces syntax checking on user-supplied column values and enhances performance. Tables are guaranteed to be architecturally correct (the command performs sufficient data checking to prevent a segmentation violation or trap); however, the coherence of the data is not validated. Use this modifier only if you are certain that your data is coherent and correct. For example, if the user-supplied data contains an invalid timestamp column value of :1>0-00-20-07.11.12.000000, this value is inserted into the table if you specify the fastparse modifier and is rejected if you do not specify the fastparse modifier. |
| implieddecimal | The location of an implied decimal point is determined by the column definition; it is no
longer assumed to be at the end of the value. For example, the value 12345 is loaded into a
DECIMAL(8,2) column as 123.45, not 12345.00. You cannot use this modifier with the packeddecimal modifier. |
| timeformat="x" | The x value is the format of the time in the source file.1
Valid time elements are as follows:
A default value of 0 is assigned for each element that is not specified. Sample formats
follow:
|
| timestampformat="x" | The x value is the format of the time stamp in the source
file.1 Valid time stamp elements are as follows:
|
| timestampformat="x" (Continued) |
A default value of 1 is assigned for unspecified YYYY, M, MM, D, DD, or DDD elements. A default
value of 'Jan' is assigned to an unspecified MMM element. A default value of 0 is assigned for all
other unspecified elements. A sample time stamp format
follows:
The valid values for the MMM element are 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', and 'dec'. These values are case insensitive. If you do not specify the timestampformat modifier, the
LOAD command formats the time stamp field by using one of two possible formats:
The LOAD command chooses the format based on the separator that is between the DD and HH. If it is a dash (-), the LOAD command uses the regular dashes and dots format (YYYY-MM-DD-HH.MM.SS). If it is a blank space, the LOAD command expects a colon (:) to separate the HH, MM, and SS. In either format, if you include the microseconds field (UUUUUU), the LOAD command expects the dot (.) as the separator. Either YYYY-MM-DD-HH.MM.SS.UUUUUU or YYYY-MM-DD HH:MM:SS.UUUUUU is acceptable. The following example illustrates how to load data that contains user-defined date and time
formats into a table that is called schedule:
|
| usegraphiccodepage | If you specify the usegraphiccodepage modifier, it is assumed that data
that you are loading into graphic data fields is in the graphic code page. The rest of the data is
assumed to be in the character code page. The graphic code page is associated with the character
code page. The LOAD command determines the character code page by using either
the codepage modifier, if you specify it, or the code page of the database if
you do not specify the codepage modifier. Use the usegraphiccodepage modifier with the delimited data file that is generated by drop table recovery only if the table that you are recovering has graphic data. |
| Modifier | Description |
|---|---|
| binarynumerics | Numeric (but not DECIMAL) data must be in binary form, not the character representation. This
avoids costly conversions. This modifier is supported only with positional ASC, using fixed length records that you specify by using the reclen modifier. The
following rules apply:
NULL values cannot be present in the data for columns that are affected by this modifier. Blanks (normally interpreted as NULL) are interpreted as a binary value when you use this modifier. |
| nochecklengths | If you specify the nochecklengths modifier, an attempt is made to load each row even if the source data has a column definition that exceeds the size of the target table column. Such rows can be successfully loaded if code page conversion causes the source data to shrink. For example, 4-byte EUC data in the source might shrink to 2-byte DBCS data in the target and require half the space. The modifier is particularly useful if you know that the source data fits in all cases despite mismatched column definitions. |
| packeddecimal | Use this modifier to load packed-decimal data directly. The
binarynumerics modifier does not apply to the DECIMAL field type. This modifier is supported only with positional ASC, using fixed length records that you specify by using the reclen modifier. Supported values for the sign nibble are as follows:
NULL values cannot be present in the data for columns that are affected by this modifier. Blanks (normally interpreted as NULL) are interpreted as a binary value when you use this modifier. Regardless of the server platform, the byte order of binary data in the load source file is assumed to be big endian. That is, when you use this modifier on Windows operating systems, the byte order must not be reversed. You cannot use this modifier with the implieddecimal modifier. |
| reclen=x | The x value is an integer with a maximum value of 32767. For each row, x characters are read, and a newline character is not used to indicate the end of the row. |
| striptblanks | Use this modifier to truncate any trailing blank spaces when you load data into a
variable-length field. If you do not specify this modifier, blank spaces are kept. You cannot specify this modifier with the striptnulls modifier. |
| striptnulls | Use this modifier to truncate any trailing NULL values (0x00 characters) when you load data
into a variable-length field. If you do not specify this modifier, NULL values are kept. You cannot specify this modifier with the striptblanks modifier. |
| zoneddecimal | Use this modifier to load zoned decimal data. The binarynumerics
modifier does not apply to the DECIMAL field type. The zoneddecimal modifier is supported only with positional ASC, using fixed length records that you specify by using the reclen modifier. Half-byte sign values can be one of the following
values:
Supported values for
digits are Supported values for zones are
|
| Modifier | Description |
|---|---|
| chardelx | This modifier specifies a single-character string delimiter. This character is used instead
of double quotation marks, which are the default character string delimiter.2,
3 If you want to explicitly specify the double quotation mark (") as the character
string delimiter, specify it as follows: You can
specify the single quotation mark (') as a character string delimiter as follows:
|
| coldelx | This modifier specifies a single-character column delimiter. This character is used instead of a comma, which is the default column delimiter. 2, 3 |
| decplusblank | This modifier specifies the plus sign character. It causes positive decimal values to be prefixed with a blank space instead of a plus sign (+). The default action is to prefix positive decimal values with a plus sign. |
| decptx | This modifier specifies a single-character decimal point value. The default value is a period (.). |
| delprioritychar | The current default priority for delimiters is record delimiter (the new line character),
character delimiter, and column delimiter. The delprioritychar modifier
protects applications that depend on the older priority of character delimiter, record delimiter,
and column delimiter. For example, consider the following DEL data file:
If you specify the delprioritychar modifier, only two rows in this data file are recognized. The second <row delimiter> value is interpreted as part of the first data column of the second row, and the first and the third <row delimiter> values are interpreted as actual record delimiters. If you do not specify the delprioritychar modifier, three rows are recognized, each delimited by a <row delimiter> value. |
| escapechar | This modifier specifies that the backslash (\) is used as the escape character for the character string delimiter, column delimiter, and record delimiter. The default character and column delimiters and the modifiers for changing the defaults are mentioned earlier in this table. The default record delimiter is a new line character. |
| keepblanks | This modifier preserves the leading and trailing blanks in each field of type CHAR, VARCHAR,
or LONG VARCHAR. Without this modifier, all leading and trailing blanks that are not inside
character delimiters are removed, and a NULL is inserted into the table for all blank fields. The
following example illustrates how to load data into the TABLE1 table while preserving all leading
and trailing spaces in the data file:
|
| nochardel | If you specify this modifier, character delimiters are parsed as part of column data; they
are not treated as special characters. Do not specify this modifier if you exported the data from a
DB2® database system unless you also specified this modifier at
export time. The modifier is provided to support vendor data files that do not have character
delimiters. Improper usage might result in data loss or corruption. You cannot specify this modifier with the nodoubledel modifier. |
| nodoubledel | This modifier suppresses recognition of double character delimiters. |
- Double quotation marks around the date format string are mandatory. Field separators cannot
contain any of the following characters: a - z, A - Z, and 0 - 9. Use a field separator that is not
the same as the character delimiter or field delimiter in the DEL file format. A field separator is
optional if the start and end positions of an element are unambiguous. Ambiguity can exist if
(depending on the modifier) you use elements such as D, H, M, or S, because of the variable length
of the entries. For time stamp formats, take care to avoid ambiguity between the month and the minute descriptors, because they both use the letter M. A month field must be next to other date fields. A minute field must be next to other time fields. Following are some ambiguous time stamp formats:
In ambiguous cases, the LOAD command issues an error message, and the load operation fails."M" (Might be a month or a minute) "M:M" (Which is which?) "M:YYYY:M" (Both are interpreted as month.) "S:M:YYYY" (Adjacent to both a time value and a date value)Following are some unambiguous time stamp formats:"M:YYYY" (Month) "S:M" (Minute) "M:YYYY:S:M" (Month....Minute) "M:H:YYYY:M:D" (Minute....Month)You must precede some characters, such as double quotation marks and backslashes, by an escape character (for example, \).
- The character values that you provide for the chardel and
coldel file type modifiers must be in the code page of the source data. You can
specify the character code point (instead of the character symbol) by using the syntax
xJJ or 0xJJ, where JJ is the hexadecimal
representation of the code point. For example, to specify the # character as a column delimiter, use
one of the following statements:
... modified by coldel# ... ... modified by coldel0x23 ... ... modified by coldelX23 ... -
Certain restrictions help prevent a specified delimiter character from being treated as a part of the data:
- Delimiters are mutually exclusive.
- A delimiter cannot be binary zero, a line-feed character, a carriage return, or a blank space. As well, the default decimal point (.) cannot be a string delimiter.
- In A DBCS environment, the pipe (|) character delimiter is not supported.
There are differences in delimiters for ASCII-family and EBCDIC-family code pages:- The shift-in (0x0F) and the shift-out (0x0E) characters cannot be delimiters for an EBCDIC MBCS data file.
- Delimiters for MBCS, EUC, or DBCS code pages cannot be greater than 0x40, except the default decimal point for EBCDIC MBCS data, which is 0x4b.
- Default delimiters for data files in ASCII code pages or EBCDIC MBCS code pages are as follows:
- String delimiter:
"(0x22, double quotation mark) - Column delimiter:
,(0x2c, comma)
- String delimiter:
- Default delimiters for data files in EBCDIC SBCS code pages are as follows:
- String delimiter:
"(0x7F, double quotation mark) - Column delimiter:
,(0x6B, comma)
- String delimiter:
- The default decimal point for ASCII data files is 0x2e (period). The default decimal point for EBCDIC data files is 0x4B (period).
- If the code page of the server is different from the code page of the client, specify the hexadecimal representation of non-default delimiters.
| codepage=x | usegraphiccodepage | LOAD behavior |
|---|---|---|
| Absent | Absent | All data in the file is assumed to be in the database code page, not the application code page. |
| Present | Absent | All data in the file is assumed to be in code page x. Warning: Graphic data will be corrupted if you load it into the database if x is a
single-byte code page.
|
| Absent | Present | Character data in the file is assumed to be in the database code page. Graphic data is
assumed to be in the code page of the database graphic data. If the database code page is single byte, all data is assumed to be in the database code page. Warning: Graphic data will
be corrupted if you load it into a database with a single-byte code page.
|
| Present | Present | Character data is assumed to be in code page x. Graphic data is assumed to
be in the graphic code page of x. If x is a single-byte or double-byte code page, all data is assumed to be in code page x. Warning: Graphic data will be corrupted if you load it into the database if
x is a single-byte code page.
|
