Loading data to IBM DB2

These are the properties that can be set to customize the bulk load operation when the stage is configured for the IBM® DB2® database type.

Load method
Specifies the mechanism to use for providing data to the database to be loaded to the target table. The following values are supported
Sequential file
The connector stores data to a file from which the loader loads it to the database
Named pipe
Default. The connector delivers the data directly to the loader.
Load immediate
Specifies whether to load the data as the job runs or to store data to a file to be loaded later. The allowed values are Yes and No. The default is Yes which means to load data during the job runtime.
Load from client
Specifies whether the load is done to a remote database or to the local database (installed on the InfoSphere® DataStage® Server machine).
Remove intermediate data file
Specifies whether to remove the data file after completing the load operation. This property is applicable only when the Load method property is set to Sequential file.

The supported values are Yes and No.

The default value is Yes, meaning the data file is removed after the data has been loaded.

Data format
File type of the data format
Specifies the data format to use in the data files when the Load method property is set to Sequential file.
The supported values are:
Non-delimited ASCII format
ASC
ASCII data file with values specified in fixed size format.

This value is the default value.

Delimited ASCII format
DEL
ASCII data file with delimiter used as the value separator.

This value is the default value.

IXF
Proprietary data format called Integration Exchange Format (IXF), PC Version.
Cursor
A cursor declared against a Select of VALUES statement.
LOB path
Specifies the path to the directory with data files that contain LOB values that need to be loaded to the target table. There is no default value.
File type modifier options
File type modifier
Specifies the modifier options (as free-form text) to be provided for the MODIFY load parameter.

No default value is used for the stage. The default is an empty string.

The default value is lobsinfile noheader.

Method
Specifies how to perform the resolution of columns to which the data should be loaded based on the input data contents.
The supported values are:
Default
The stage automatically determines the method to use based on the specified data format.

This method is supported for DEL data format only.

Locations
The stage uses the start and end columns numbers specified by the user.
Names
The stage uses column names specified by the user
Positional
The stage uses the fields numbers of the input data fields specified by the user
Column-start Column-end
Applies when the Method property is set to Locations. It specifies a comma-separated list of start end pairs of column numbers for the columns in the input data file from which to load data to the target table. The value is in format: n1 n2, n3 n4, etc. The specified numbers represent byte offsets for the respective columns from the beginning of each data row. There is no default value.
Column name
Applies when the Method property is set to Names. It specifies a comma-separated list of column names in the input data file from which to load data to the target table. There is no default value.
Column position
Applies when the Method property is set to Positional. It specifies a comma-separated list of field numbers (starting with 1) of the input data fields to be loaded. There is no default value.
Null indicators
Applies when the Method property is set to Locations. It specifies a comma-separated list of column numbers for the null indicator fields. Column numbers represent byte offsets of the null indicator fields from the beginning of each data row. There is no default value.
Insert-column
Specifies a comma-separated list of column names in the target table to which to insert data. Double quotation marks should be used around column names that contain spaces. There is no default value.
Datalink specification
Provides column specifications for DATALINK columns. There can be one column specification for each DATALINK column. Column specifications are enclosed by parentheses. Each column specification consists of one or more of DL_LINKTYPE, prefix and DL_URL_SUFFIX specifications. The prefix specification can be either DL_URL_REPLACE_PREFIX or DL_URL_DEFAULT_PREFIX specification. There is no default value. This property is deprecated as the DATALINK SPECIFICATION option is not supported for LOAD utility in DB2 9.1 and later.
Load without prompting
Without prompting
Specifies whether the list of specified data files contains all the files that need to be loaded and that the devices or directories listed are sufficient for the entire load operation. The supported values are Yes and No.

The default value is Yes, which indicates that the provided list of data files is sufficient to complete the load operation.

The default value is No.

Rows buffer size
Specifies the size of the buffer in kilobytes to use for the rows from the input link.
Load mode
specifies the mode in which to load the data.
The supported values are:
Insert
Appends new data to the table without changing the existing data in the table.
Replace
Deletes all the data from the table before loading the new data.

This value is the default value.

Restart
Restarts the load operation following the interruption of the previous load operation.
Terminate
Terminates the load operation that was previously interrupted.
Save count
Specifies the number of rows to load before establishing consistency point. The default value 0 is used to specify that no consistency points are established, unless determined to be necessary by the load utility.
Row count
Specifies the total number of initial rows to load. The default value 0 specifies that all the rows in the input data file should be loaded.
Restart count
Specifies the number of rows to skip before starting to load rows. This property should be used when the previous load attempt failed with some records committed to the target table.

The default value 0 specifies that not records should be skipped and to start loading from the first row.

Restart phase
Specifies the phase at which to restart the load operation. Build and Delete values should not be specified if Insert or Replace values are specified for the Load mode property.
The supported values are:
Load
Start the operation in the Load phase.
Build
Start the operation in the Build phase.
Delete
Start the operation in Delete phase.
Warning count
Specifies the number of warnings after which to stop the load operation. The default value is 0 which means the load operation proceeds regardless of the number of warnings that were reported.
Indexing mode
Specifies whether to rebuild indexes or to extend them incrementally. The supported values are:
Auto-select
AUTOSELECT
The load utility automatically chooses the most optimal option.

This value is the default value.

Rebuild
REBUILD
The indexes are rebuilt after the load
Incremental
INCREMENTAL
The indexes are maintained (incrementally built) during the load
Deferred
DEFERRED
The decision is deferred
Load message file name
Specifies the path to the local file to use for storing warning and error messages during the load operation.

There is no default value.

Temporary files directory
Directory for temporary files
Specifies the path to the directory that DB2 uses for storing temporary files that it creates for the load operation.
Exception table name
Specifies the name of the table to which to insert rows in error.

There is no default value.

Statistics
The types of statistics to collect for the table.

The collection of statistics is not supported if the Load mode property is set to Insert or Replace.

The supported values are:
Table statistics
TableStats
Table statistics are gathered.
Table and index statistics
TableAndIndexStats
Table and index statistics are gathered.
Index statistics
Index statistics are gathered.
Table and distributed statistics
TableAndDistributedStats
Table and distributed statistics are gathered.
Table and distributed statistics and basic indexes
TableAndDistributedStatsAndBasicIndexes
Table, distributed and basic index statistics are gathered.
Extended statistics for index only
ExtendedStatsForIndexOnly
Extended index statistics are gathered.
Extended statistics for indexes and basic table statistics
ExtendedStatsForIndexesAndBasicTableStats
Extended index statistics and basic table statistics are gathered.
All statistics
AllStatistics
All statistics are gathered.
No statistics
NoStatistics
None of the statistics are gathered.

This value is the default value.

Unrecoverable load
Non Recoverable
Specifies whether the transaction for the data load operation unrecoverable. The supported values are Yes and No. The default value Yes indicates that the data load operation is unrecoverable, meaning that the loaded data cannot be recovered by a subsequent roll forward operation.
Data buffer size
Specifies the number of 4-kilobyte pages in the buffer to use for transferring data within the load utility. The default value 0 specifies that the load utility should automatically determine the optimal value.
Sort buffer size
Specifies the number of 4-kilobyte pages in the buffer to use for sorting index keys in the load utility when the Indexing mode property is not set to value Deferred. The default value 0 specifies that the load utility should automatically determine the optimal value.
Working directory
Specifies the optional working directory for sorting index keys. There is no default value which indicates that sqllib/tmp directory should be used. This property is deprecated as DB2 changed the sort operations to spill into a bufferpool associated with the temporary table space.
CPU parallelism
Specifies the number of processes or threads that the load utility will create for parsing, converting, and formatting records when building table objects. The default value 0 specifies that the load utility should automatically choose the optimal CPU parallelism value based on the current environment.
Disk parallelism
Specifies the number of processes or threads that the load utility will create for writing data to the table space containers. The default value 0 specifies that the load utility should automatically choose the optimal disk parallelism value based on the current environment.
Tracing level
Specifies the level of tracing for the messages added to the log. The actual value specified for the property is a combination of the supported values obtained by adding them together to produce the effective tracing level.

The DRS Connector supports different levels of logging through the CC_MSG_LEVEL environment variable.

The supported values are:
0
No tracing.
1
Tracing of important events.
2
Tracing performance messages.
4
Tracing function messages.
Copy loaded data
Specifies whether to save a copy of the loaded data.
The supported values are:
No
No copy is made. This is the default value.
Yes
C copy is made by the user-specified library.
Yes, Use Tivoli® Storage Manager
Use TSM
A copy is made using Tivoli Storage Manager.
Use ADSM
A copy is made using ADSTAR Distributed Transaction Manager.

This option is deprecated as it is not supported starting with DB2 9.1.

Copy to device or directory name
Copy To device/directory name
Specifies the name of the device or directory to which to save a copy of the loaded data. There is no default value. This property is applicable when the Copy loaded data property is not set to value No.
Copy library name
Copy Load library name
Specifies the name of the shared library containing the backup and restore I/O functions to be used for saving a copy of the loaded data. There is no default value. This property is applicable when the Copy loaded data property is set to value Yes
Allow access mode
The access level to allow to other applications for the target table to which the data is loaded.
The supported values are:
No access
No
The LOAD utility locks table for exclusive access during the load.
Read access
Read
The LOAD utility locks the table in shared access mode so that other applications can read data from it while the load is taking place.
Tablespace for read access
Use table space to allow read access
Specifies the optional tablespace to use if rebuilding indexes. A shadow copy of the index is built in the specified tablespace and then copied to the original tablespace at the end of the load.

There is no default value.

Set integrity pending cascade
Check pending cascade
Specifies whether to automatically set Integrity pending state to all the descendent tables.
The supported values are:
Deferred
Only the table that is loaded is placed in the Integrity pending state.

This value is the default value.

Immediate
The Integrity pending state for foreign key constraints is immediately extended to all descended foreign key tables.
Lock with force
Specifies whether the load utility should be allowed to force off other applications that hold conflicting locks on the target table in order to acquire the necessary locks for the load operation. The supported values are Yes and No. The default value is Yes.
Partitioned database configuration
Partitioned DB configuration
Specifies whether the data is loaded to a table distributed across multiple database partitions. The supported values are Yes and No.

The default value is No.

HOSTNAME
Specifies the host name of the machine where the data file resides. If the value is not provided, the value “nohost” is used.

This property is deprecated as this option has been deprecated starting with DB2 9.1.

FILE_TRANSFER_CMD
Specifies the name of the executable to use to provide data to the load utility.

There is no default value.

This property is deprecated as this option has been deprecated starting with DB2 9.1

PART_FILE_LOCATION
Specifies the path of the directory with partitioned files.

There is no default value.

Output partition numbers
OUTPUT_DBPARTNUMS
Specifies the comma-separated list of database partitions to which to load the data. The list must be enclosed by parenthesis. The word “to” can be used to specify a range or partitions instead of a single partition, for example: (0, 2 to 10, 15).

There is no default value.

Partitioning partition numbers
PARTITIONING_DBPARTNUMS
Specifies the comma-separated list of database partitions to be used in the distribution process. The list must be enclosed by parenthesis. The word “to” can be used to specify a range or partitions instead of a single partition number, for example: (0, 2 to 10, 15).

There is no default value.

MODE
Specifies the mode to use for loading data to the table in a partitioned database.
The supported values are:
PARTITION_AND_LOAD
Partition and load the data.
PARTITION_ONLY
Partition the data only.
LOAD_ONLY
Load the data.

This value is the default value.

LOAD_ONLY_VERIFY_PART
Load the data for verification only.
ANALYZE
Analyze the data only.
Maximum number of partitioning agents
MAX_NUM_PART_AGENTS
Specifies the maximum number of partitioning agents in a load session.

The default value is 25.

Partition errors isolation mode
MAX_NUM_PART_AGENTS
Specifies the mode for handling errors that occur on individual database partitions.
The supported values are:
Setup errors only
SETUP_ERRS_ONLY
Handle setup errors only.
Load errors only
LOAD_ERRS_ONLY
Handle load errors only.

This value is the default value.

Setup and load errors
SETUP_AND_LOAD_ERRS
Handle setup and load errors.
No isolation
NO_ISOLATION
Handle no isolation level errors.
Status interval
STATUS_INTERVAL
Specifies the amount of data to load in megabytes before issuing a progress message.

The valid values are in the range of 1 to 4000.

The default value is 100.

Port range
PORT_RANGE
Specifies the range of TCP ports used to create sockets for internal communications. The format is: (lower-port,higher-port).

The default value is 6000,6063.

Check truncation
CHECK_TRUNCATION
Specifies whether to check for data truncations on input/output.

The valid values are Yes and No.

The default value is No.

MAP_FILE_INPUT
Specifies the name of the input distribution map file name to use when the MODE property is set to ANALYZE.

There is no default value.

MAP_FILE_OUTPUT
Specifies the name of the output distribution map file name to use when the MODE property is set to ANALYZE.

There is no default value.

Trace record number
TRACE
Specifies the number of records to trace when a review of a dump of the data conversion process and the output of the hashing values is required.

The default value is 0.

NEWLINE
Specifies whether to force checking for newline character at the end of each record.

The option is valid when non-delimited ASCII format file is loaded and the [ASC] property is set and the reclen file type modifier is specified.

The supported values are Yes and No.

The default value is No.

Distribution file name
DISTFILE
Specifies the name of the database partition distribution file that the LOAD utility should generate. If the value is not specified, the file is not generated.

There is no default value.

Omit distribution map header
OMIT_HEADER
Specifies whether to prevent inclusion of the distribution map header in the distribution file.

The supported values are Yes and No.

The default value is No.

Statistics partition number
RUN_STAT_DBPARTNUM
Specifies the number of the partition for which to collect statistics if the stage was configured to collect statistics. The value -1 specifies to collect statistics on the first database partition in the output database partition list.

The default value is -1.