LOAD command for IBM Db2 Warehouse

The LOAD command efficiently loads large amounts of data. The LOAD command loads data at the page level and delays constraint checking until after the data is loaded into the table. Data can be in the form of a file, tape, or named pipe. You can also load data by using a user-written script or application.

If possible, use the dbload command instead, which is easier to use.

Authorization

If you are loading by using the INSERT parameter, you must be authorized in one of the following three ways:

Choice 1: DATAACCESS authority, which is included by default if you have DBADM authority
Choice 2: DBADM authority without DATACCESS authority but with the INSERT privilege
Choice 3: LOAD authority without DATAACCESS authority or DBADM authority (by default, granting DATAACCESS authority or DBADM authority also grants LOAD authority) but with the INSERT privilege

Command syntax

Command parameters

CLIENT

Specifies that the data to be loaded is on the client. This parameter is ignored if you aren't invoking the load operation from a client.

Code page conversion is not performed during a remote load operation. If the code page of the data is different from that of the server, specify the data code page by using the codepage modifier.

The dumpfile modifier refers to a file on the server even if you specify the CLIENT parameter.

FROM filename | pipename | device

Specifies the file, pipe, or device that refers to an SQL statement that contains the data that is being loaded.

The file, pipe, or device must be accessible from the head node on the server.

If you specify several names, they are processed in sequence. If the last item that you specify is a tape device and you are prompted for a tape, the LOAD fails and returns an error.

You must use a fully qualified path file name, and it must exist on the server.

OF filetype

Specifies the format of the data, which can be as follows:

ASC: Non-delimited ASCII format.
DEL: Delimited ASCII format.

MODIFIED BY filetype_mod

Specifies file type modifier options. See File type modifiers. The LOAD command does not issue a warning if you attempt to use unsupported file types with the MODIFIED BY parameter. If you attempt this, the load operation fails, and an error code is returned.

WARNINGCOUNT n

Stops the load operation after n warnings. Set this parameter if you do not expect any warnings but you want to verify that the correct load file and target table are being used. If you incorrectly specify the load file or the target table, the LOAD command generates a warning for each row that it attempts to load, which causes the load operation to fail. If n is zero or you do not specify this parameter, the load operation continues regardless of the number of warnings that it issues.

In a partitioned database environment, a load operation can have multiple load and partition agents. Each agent has a WARNINGCOUNT value. If the value of n is reached on a single agent, the load operation fails. The n values are not cumulative. For example, if n is 3 and there are two agents, each with a WARNINGCOUNT value of 2, the load operation is successful.

MESSAGES ON SERVER

Saves the message file that the LOAD command creates on the server. This parameter applies only if you are running the LOAD command by using the command line processor plus (CLPPlus) interface with the ADMIN_CMD procedure. The result set includes the following two columns:

MSG_RETRIEVAL: Contains the SQL statement that is required to retrieve all the warnings and error messages that occur during the load operation.
MSG_REMOVAL: Contains the SQL statement that is required to clean up the messages.

If you do not specify this parameter, the message file is deleted when the ADMIN_CMD procedure returns to the caller. The MSG_RETRIEVAL and MSG_REMOVAL column in the result set contains null values.

MESSAGES message_file

Specifies the destination for warning and error messages that occur during the load operation. This parameter applies only if you are running the LOAD command by using CLP. If you do not specify this parameter when you run the LOAD command by using CLP, messages are written to standard output. If you do not specify the complete path to the file, the LOAD command uses the current directory and the default drive as the destination. If you specify the name of an existing file, the command appends the messages to that file.

The message file is usually populated with messages at the end of the load operation and, as such, is not suitable for monitoring the progress of the operation.

INSERT

Adds the loaded data to the table without changing the existing table data.

A LOAD INSERT operation into a column-organized table maintains table statistics by default if the table was empty at the start of the load operation.

TERMINATE

Terminates a previously interrupted load operation and rolls back the operation to the point in time at which it started, even if consistency points were passed. The states of any table spaces that were involved in the operation return to normal, and all table objects are made consistent. If the load operation that is being terminated is a LOAD INSERT operation, the table retains all of its original records after the LOAD TERMINATE operation.

The TERMINATE parameter does not remove a backup pending state from table spaces.

INTO table-name

Specifies the database table into which the data is to be loaded. This table cannot be a system table, a declared temporary table, or a created temporary table. You can specify an alias or the fully qualified or unqualified table name. A qualified table name is in the form schema.tablename. If you specify an unqualified table name, the table name is qualified with the value of the CURRENT SCHEMA special register.

NONRECOVERABLE

Avoids the use of transactional logging during a load operation. If the load operation fails, you can only drop the table or restore it from a backup.

Usage notes

Data is loaded in the sequence that appears in the input file. If you want a particular sequence, sort the data before you attempt the load.
Any path that you use for the LOAD command must be a valid fully qualified path on the server head node.

Running the LOAD command

You can run the LOAD command in two ways:

You can run the LOAD command in the DB2 command line processor (CLP) interface, such as by including the command in a script. To use the CLP, you must install the IBM Data Server Client (such as by deploying the Db2® Warehouse client container) and connect the CLP to Db2 Warehouse. See Connecting to IBM Db2 Warehouse with the command-line processor (CLP) or Call Level Interface (CLI).
If the data that you want to load is on the client, specify the CLIENT parameter. If you do not specify the CLIENT parameter, it is assumed that the data is on the server. For loading from the server, the easiest approach is to put the data in /mnt/clusterfs/scratch/datafile, where datafile specifies the file that contains your data. However, if you have large data sets on an NFS share that you want to load, it's preferable to mount that share on the host instead of copying it to the host. To mount the share and map the volume inside the container, use the following instructions:
1. Mount the remote drive on the host where you deployed the container:
```
mount remote_server:/data_location /mnt/clusterfs/scratch/nfsdata
```
2. Stop the Db2 Warehouse services:
```
docker exec -it Db2wh stop
```
3. Stop the Db2 Warehouse container:
```
docker stop Db2wh
```
4. Remove the Db2 Warehouse container:
```
docker rm Db2wh
```
5. Re-create the Db2 Warehouse container and start it, specifying the -v option to create a new mapped volume:
```
docker run -d -it --privileged=true --net=host --name=Db2wh -v /mnt/clusterfs:/mnt/bludata0 -v /mnt/clusterfs:/mnt/blumeta0 -v /mnt/clusterfs/scratch/nfs_data:/mnt/clusterfs/scratch/nfsdata tag
```
  For the latest version of the container, specify one of the following tag values.
  - For a container for IBM POWER LE hardware:
```
store/ibmcorp/db2wh_ee:v11.5.7.0-cn5-db2wh-ppcle
```
  - For a container for IBM z Systems hardware:
```
store/ibmcorp/db2wh_ee:v11.5.7.0-cn5-db2wh-s390x
```
  - For a container for x86 hardware:
```
store/ibmcorp/db2wh_ee:v11.5.7.0-cn5-db2wh-linux
```
  If you do not want to deploy the latest version of the container, see the naming convention information in IBM Db2 Warehouse containers.
In the following example, the LOAD command is specified in the load.sql script, which is then run in the CLP:
```
$ cat load.sql 
load client from /usr/data/unload.del of del modified by escapechar coldel| insert into user_table;
$ db2 -tvf load.sql
```
For information about the CLP options, see Command line processor options.
You can run the LOAD command in the command line processor plus (CLPPlus) interface. CLPPlus is included in the Db2 Warehouse container. For information, see Loading data using CLPPlus and CLPPlus LOAD command results.

File type modifiers

Table 1. All file formats
Modifier	Description
noheader	This modifier applies only if you want to perform a LOAD_ONLY operation by using a file that does not have a header.
norowwarnings	This modifier suppresses all warnings about rejected rows.
usedefaults	If you specify a source column for a target table column but it contains no data for one or more row instances, default values are loaded. Examples of missing data are as follows: For DEL files: Two adjacent column delimiters (",,") or two adjacent column delimiters that are separated by an arbitrary number of spaces (", ,") are specified for a column value. For DEL or ASC files: A row does not have enough columns or is not long enough for the original specification. For ASC files, NULL column values are not considered explicitly missing, and a default is not substituted for NULL column values. NULL column values are represented by all space characters for numeric, date, time, and time stamp columns or by using the NULL INDICATOR for a column of any type to indicate that the column is NULL. If you do not specify this modifier and a source column contains no data for a row instance, one of the following actions occurs: If the column is nullable, a NULL is loaded. If the column is not nullable, the LOAD command rejects the row.

Table 2. ASCII file formats (ASC/DEL)
Modifier	Description
codepage=`x`	The `x` value is an ASCII character string. The value is interpreted as the code page of the data in the input data set. Character data and numeric data that you specify in characters is converted from this code page to the database code page during the load operation. The following rules apply: For pure DBCS (graphic), mixed DBCS, and EUC data, delimiters are restricted to x00 - x3F. For DEL data that you specify in an EBCDIC code page, the delimiters might not coincide with the shift-in and shift-out DBCS characters.
dateformat=`"x"`	The `x` value is the format of the date in the source file.¹ Valid date elements are as follows: YYYY Year (four digits, 0000 - 9999). M Month (one or two digits, 1 - 12). MM Month (two digits, 01 - 12; mutually exclusive with M). D Day (one or two digits, 1 - 31). DD Day (two digits, 01 - 31; mutually exclusive with D) DDD Day of the year (three digits, 001 - 366; mutually exclusive with other day or month elements). A default value of 1 is assigned for each element that is not specified. Sample formats follow: `"D-M-YYYY" "MM.DD.YYYY" "YYYYDDD"`
dumpfile = `x`	The `x` value is the fully qualified (according to the server database partition) name of an exception file to which rejected rows are written. A maximum of 32 KB of data is written per record. The following example shows how to specify a dump file: `load from data of del modified by dumpfile = /u/user/filename insert into table_name` The file is created and owned by the instance owner. To override the default file permissions, use the dumpfileaccessall file type modifier. In a partitioned database environment, use a path that is local to the loading database partition so that concurrently running load operations do not attempt to write to the same file. If the specified file exists, it is not re-created, but it is truncated. The dumpfile modifier refers to a file on the server even if you specify the CLIENT parameter.
dumpfileaccessall	Grants read access to 'OTHERS' when a dump file is created. This file type modifier is valid if you use it with the dumpfile file type modifier and you have the SELECT privilege on the load target table. If the specified file exists, its permissions are not changed.
fastparse	Use this modifier with caution. It reduces syntax checking on user-supplied column values and enhances performance. Tables are guaranteed to be architecturally correct (the command performs sufficient data checking to prevent a segmentation violation or trap); however, the coherence of the data is not validated. Use this modifier only if you are certain that your data is coherent and correct. For example, if the user-supplied data contains an invalid timestamp column value of `:1>0-00-20-07.11.12.000000`, this value is inserted into the table if you specify the fastparse modifier and is rejected if you do not specify the fastparse modifier.
implieddecimal	The location of an implied decimal point is determined by the column definition; it is no longer assumed to be at the end of the value. For example, the value 12345 is loaded into a DECIMAL(8,2) column as 123.45, not 12345.00. You cannot use this modifier with the packeddecimal modifier.
timeformat=`"x"`	The `x` value is the format of the time in the source file.¹ Valid time elements are as follows: H Hour (one or two digits, 0 - 12 for a 12-hour system and 0 - 24 for a 24-hour system). HH Hour (two digits, 00 - 12 for a 12-hour system and 00 - 24 for a 24-hour system; mutually exclusive with H). M Minute (one or two digits, 0 - 59.) MM Minute (two digits, 00 - 59; mutually exclusive with M). S Second (one or two digits, 0 - 59). SS Second (two digits, 00 - 59; mutually exclusive with S). SSSSS Second of the day after midnight (five digits, 00000 - 86400; mutually exclusive with other time elements). TT Meridian indicator (AM or PM). A default value of 0 is assigned for each element that is not specified. Sample formats follow: `"HH:MM:SS" "HH.MM TT" "SSSSS"`
timestampformat=`"x"`	The `x` value is the format of the time stamp in the source file.¹ Valid time stamp elements are as follows: YYYY Year (four digits, 0000 - 9999). M Month (one or two digits, 1 - 12). MM Month (two digits, 01 - 12; mutually exclusive with M and MMM). MMM Month (three-letter case-insensitive abbreviation for the month name; mutually exclusive with M and MM). D Day (one or two digits, 1 - 31). DD Day (two digits, 01 - 31; mutually exclusive with D). DDD Day of the year (three digits, 001 - 366; mutually exclusive with other day or month elements). H Hour (one or two digits, 0 - 12 for a 12-hour system and 0 - 24 for a 24-hour system). HH Hour (two digits, 00 - 12 for a 12-hour system and 00 - 24 for a 24-hour system; mutually exclusive with H). M Minute (one or two digits, 0 - 59). MM Minute (two digits, 00 - 59; mutually exclusive with M). S Second (one or two digits, 0 - 59). SS Second (two digits, 00 - 59; mutually exclusive with S). SSSSS Second of the day after midnight (five digits, 00000 - 86400; mutually exclusive with other time elements). U (1 - 12 times) Fractional seconds (number of occurrences of U represent the number of digits; each digit is 0 - 9). TT Meridian indicator (AM or PM).
timestampformat=`"x"` (Continued)	A default value of 1 is assigned for unspecified YYYY, M, MM, D, DD, or DDD elements. A default value of 'Jan' is assigned to an unspecified MMM element. A default value of 0 is assigned for all other unspecified elements. A sample time stamp format follows: `"YYYY/MM/DD HH:MM:SS.UUUUUU"` The valid values for the MMM element are 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', and 'dec'. These values are case insensitive. If you do not specify the timestampformat modifier, the LOAD command formats the time stamp field by using one of two possible formats: `YYYY-MM-DD-HH.MM.SS YYYY-MM-DD HH:MM:SS` The LOAD command chooses the format based on the separator that is between the DD and HH. If it is a dash (-), the LOAD command uses the regular dashes and dots format (YYYY-MM-DD-HH.MM.SS). If it is a blank space, the LOAD command expects a colon (:) to separate the HH, MM, and SS. In either format, if you include the microseconds field (UUUUUU), the LOAD command expects the dot (.) as the separator. Either YYYY-MM-DD-HH.MM.SS.UUUUUU or YYYY-MM-DD HH:MM:SS.UUUUUU is acceptable. The following example illustrates how to load data that contains user-defined date and time formats into a table that is called schedule: `load from delfile2 of del modified by timestampformat="yyyy.mm.dd hh:mm tt" insert into schedule`
usegraphiccodepage	If you specify the usegraphiccodepage modifier, it is assumed that data that you are loading into graphic data fields is in the graphic code page. The rest of the data is assumed to be in the character code page. The graphic code page is associated with the character code page. The LOAD command determines the character code page by using either the codepage modifier, if you specify it, or the code page of the database if you do not specify the codepage modifier. Use the usegraphiccodepage modifier with the delimited data file that is generated by drop table recovery only if the table that you are recovering has graphic data.

Table 3. ASC file formats (non-delimited ASCII)
Modifier	Description
binarynumerics	Numeric (but not DECIMAL) data must be in binary form, not the character representation. This avoids costly conversions. This modifier is supported only with positional ASC, using fixed length records that you specify by using the reclen modifier. The following rules apply: No conversion between data types, except for BIGINT, INTEGER, and SMALLINT, is performed. Data lengths must match their target column definitions. FLOAT data must be in IEEE floating point format. Binary data in the load source file is assumed to be big endian, regardless of the platform on which the load operation is running. NULL values cannot be present in the data for columns that are affected by this modifier. Blanks (normally interpreted as NULL) are interpreted as a binary value when you use this modifier.
nochecklengths	If you specify the nochecklengths modifier, an attempt is made to load each row even if the source data has a column definition that exceeds the size of the target table column. Such rows can be successfully loaded if code page conversion causes the source data to shrink. For example, 4-byte EUC data in the source might shrink to 2-byte DBCS data in the target and require half the space. The modifier is particularly useful if you know that the source data fits in all cases despite mismatched column definitions.
packeddecimal	Use this modifier to load packed-decimal data directly. The binarynumerics modifier does not apply to the DECIMAL field type. This modifier is supported only with positional ASC, using fixed length records that you specify by using the reclen modifier. Supported values for the sign nibble are as follows: `+ = 0xC 0xA 0xE 0xF - = 0xD 0xB` NULL values cannot be present in the data for columns that are affected by this modifier. Blanks (normally interpreted as NULL) are interpreted as a binary value when you use this modifier. Regardless of the server platform, the byte order of binary data in the load source file is assumed to be big endian. That is, when you use this modifier on Windows operating systems, the byte order must not be reversed. You cannot use this modifier with the implieddecimal modifier.
reclen=`x`	The `x` value is an integer with a maximum value of 32767. For each row, `x` characters are read, and a newline character is not used to indicate the end of the row.
striptblanks	Use this modifier to truncate any trailing blank spaces when you load data into a variable-length field. If you do not specify this modifier, blank spaces are kept. You cannot specify this modifier with the striptnulls modifier.
striptnulls	Use this modifier to truncate any trailing NULL values (0x00 characters) when you load data into a variable-length field. If you do not specify this modifier, NULL values are kept. You cannot specify this modifier with the striptblanks modifier.
zoneddecimal	Use this modifier to load zoned decimal data. The binarynumerics modifier does not apply to the DECIMAL field type. The zoneddecimal modifier is supported only with positional ASC, using fixed length records that you specify by using the reclen modifier. Half-byte sign values can be one of the following values: `+ = 0xC 0xA 0xE 0xF 0x3 - = 0xD 0xB 0x7` Supported values for digits are `0x0` - `0x9`. Supported values for zones are `0x3` and `0xF`.

Table 4. DEL file formats (delimited ASCII)
Modifier	Description
chardel`x`	This modifier specifies a single-character string delimiter. This character is used instead of double quotation marks, which are the default character string delimiter.^2,³ If you want to explicitly specify the double quotation mark (") as the character string delimiter, specify it as follows: `modified by chardel""` You can specify the single quotation mark (') as a character string delimiter as follows: `modified by chardel''`
coldel`x`	This modifier specifies a single-character column delimiter. This character is used instead of a comma, which is the default column delimiter. ^2,³
decplusblank	This modifier specifies the plus sign character. It causes positive decimal values to be prefixed with a blank space instead of a plus sign (+). The default action is to prefix positive decimal values with a plus sign.
decpt`x`	This modifier specifies a single-character decimal point value. The default value is a period (.).
delprioritychar	The current default priority for delimiters is record delimiter (the new line character), character delimiter, and column delimiter. The delprioritychar modifier protects applications that depend on the older priority of character delimiter, record delimiter, and column delimiter. For example, consider the following DEL data file: `"Smith, Joshua",4000,34.98<row delimiter> "Vincent,<row delimiter>, is a manager", ... ... 4005,44.37<row delimiter>` If you specify the delprioritychar modifier, only two rows in this data file are recognized. The second <row delimiter> value is interpreted as part of the first data column of the second row, and the first and the third <row delimiter> values are interpreted as actual record delimiters. If you do not specify the delprioritychar modifier, three rows are recognized, each delimited by a <row delimiter> value.
escapechar	This modifier specifies that the backslash (\) is used as the escape character for the character string delimiter, column delimiter, and record delimiter. The default character and column delimiters and the modifiers for changing the defaults are mentioned earlier in this table. The default record delimiter is a new line character.
keepblanks	This modifier preserves the leading and trailing blanks in each field of type CHAR, VARCHAR, or LONG VARCHAR. Without this modifier, all leading and trailing blanks that are not inside character delimiters are removed, and a NULL is inserted into the table for all blank fields. The following example illustrates how to load data into the TABLE1 table while preserving all leading and trailing spaces in the data file: `load from delfile3 of del modified by keepblanks insert into table1`
nochardel	If you specify this modifier, character delimiters are parsed as part of column data; they are not treated as special characters. Do not specify this modifier if you exported the data from a DB2® database system unless you also specified this modifier at export time. The modifier is provided to support vendor data files that do not have character delimiters. Improper usage might result in data loss or corruption. You cannot specify this modifier with the nodoubledel modifier.
nodoubledel	This modifier suppresses recognition of double character delimiters.

Note:

Double quotation marks around the date format string are mandatory. Field separators cannot contain any of the following characters: a - z, A - Z, and 0 - 9. Use a field separator that is not the same as the character delimiter or field delimiter in the DEL file format. A field separator is optional if the start and end positions of an element are unambiguous. Ambiguity can exist if (depending on the modifier) you use elements such as D, H, M, or S, because of the variable length of the entries.
For time stamp formats, take care to avoid ambiguity between the month and the minute descriptors, because they both use the letter M. A month field must be next to other date fields. A minute field must be next to other time fields. Following are some ambiguous time stamp formats:
```
"M" (Might be a month or a minute)
"M:M" (Which is which?)
"M:YYYY:M" (Both are interpreted as month.)
"S:M:YYYY" (Adjacent to both a time value and a date value)
```
In ambiguous cases, the LOAD command issues an error message, and the load operation fails.
Following are some unambiguous time stamp formats:
```
"M:YYYY" (Month)
"S:M" (Minute)
"M:YYYY:S:M" (Month....Minute)
"M:H:YYYY:M:D" (Minute....Month)
```
You must precede some characters, such as double quotation marks and backslashes, by an escape character (for example, \).
The character values that you provide for the chardel and coldel file type modifiers must be in the code page of the source data. You can specify the character code point (instead of the character symbol) by using the syntax xJJ or 0xJJ, where JJ is the hexadecimal representation of the code point. For example, to specify the # character as a column delimiter, use one of the following statements:
```
... modified by coldel# ...
... modified by coldel0x23 ...
... modified by coldelX23 ...
```
Certain restrictions help prevent a specified delimiter character from being treated as a part of the data:
- Delimiters are mutually exclusive.
- A delimiter cannot be binary zero, a line-feed character, a carriage return, or a blank space. As well, the default decimal point (.) cannot be a string delimiter.
- In A DBCS environment, the pipe (|) character delimiter is not supported.
There are differences in delimiters for ASCII-family and EBCDIC-family code pages:
- The shift-in (0x0F) and the shift-out (0x0E) characters cannot be delimiters for an EBCDIC MBCS data file.
- Delimiters for MBCS, EUC, or DBCS code pages cannot be greater than 0x40, except the default decimal point for EBCDIC MBCS data, which is 0x4b.
- Default delimiters for data files in ASCII code pages or EBCDIC MBCS code pages are as follows:
  - String delimiter: " (0x22, double quotation mark)
  - Column delimiter: , (0x2c, comma)
- Default delimiters for data files in EBCDIC SBCS code pages are as follows:
  - String delimiter: " (0x7F, double quotation mark)
  - Column delimiter: , (0x6B, comma)
- The default decimal point for ASCII data files is 0x2e (period). The default decimal point for EBCDIC data files is 0x4B (period).
- If the code page of the server is different from the code page of the client, specify the hexadecimal representation of non-default delimiters.

Table 5. LOAD command behavior when you use the codepage and usegraphiccodepage modifiers
codepage=`x`	usegraphiccodepage	LOAD behavior
Absent	Absent	All data in the file is assumed to be in the database code page, not the application code page.
Present	Absent	All data in the file is assumed to be in code page `x`. Warning: Graphic data will be corrupted if you load it into the database if `x` is a single-byte code page.
Absent	Present	Character data in the file is assumed to be in the database code page. Graphic data is assumed to be in the code page of the database graphic data. If the database code page is single byte, all data is assumed to be in the database code page. Warning: Graphic data will be corrupted if you load it into a database with a single-byte code page.
Present	Present	Character data is assumed to be in code page `x`. Graphic data is assumed to be in the graphic code page of `x`. If `x` is a single-byte or double-byte code page, all data is assumed to be in code page `x`. Warning: Graphic data will be corrupted if you load it into the database if `x` is a single-byte code page.