DB2 Version 10.1 for Linux, UNIX, and Windows

Load configuration options for partitioned database environments

There are a number of configuration options that you can use to modify a load operation in a partitioned database environment.

MODE X
Specifies the mode in which the load operation occurs when loading a multi-partition database. PARTITION_AND_LOAD is the default. Valid values are:
  • PARTITION_AND_LOAD. Data is distributed (perhaps in parallel) and loaded simultaneously on the corresponding database partitions.
  • PARTITION_ONLY. Data is distributed (perhaps in parallel) and the output is written to files in a specified location on each loading database partition. For file types other than CURSOR, the format of the output file name on each database partition is filename.xxx, where filename is the input file name specified in the LOAD command and xxx is the 3-digit database partition number. For the CURSOR file type, the name of the output file on each database partition is determined by the PART_FILE_LOCATION option. See the PART_FILE_LOCATION option for details on how to specify the location of the distribution file for each database partition.
    Note:
    1. This mode cannot be used for a CLI load operation.
    2. If the table contains an identity column that is needed for distribution, then this mode is not supported, unless the identityoverride file type modifier is specified.
    3. Distribution files generated for file type CURSOR are not compatible between DB2® releases. This means that distribution files of file type CURSOR that were generated in a previous release cannot be loaded using the LOAD_ONLY mode. Similarly, distribution files of file type CURSOR that were generated in the current release cannot be loaded in a future release using the LOAD_ONLY mode.
  • LOAD_ONLY. Data is assumed to be already distributed; the distribution process is skipped, and the data is loaded simultaneously on the corresponding database partitions. For file types other than CURSOR, the format of the input file name for each database partition should be filename.xxx, where filename is the name of the file specified in the LOAD command and xxx is the 3-digit database partition number. For the CURSOR file type, the name of the input file on each database partition is determined by the PART_FILE_LOCATION option. See the PART_FILE_LOCATION option for details on how to specify the location of the distribution file for each database partition.
    Note:
    1. This mode cannot be used for a CLI load operation, or when the CLIENT parameter of LOAD command is specified.
    2. If the table contains an identity column that is needed for distribution, then this mode is not supported, unless the identityoverride file type modifier is specified.
  • LOAD_ONLY_VERIFY_PART. Data is assumed to be already distributed, but the data file does not contain a partition header. The distributing process is skipped, and the data is loaded simultaneously on the corresponding database partitions. During the load operation, each row is checked to verify that it is on the correct database partition. Rows containing database partition violations are placed in a dump file if the dumpfile file type modifier is specified. Otherwise, the rows are discarded. If database partition violations exist on a particular loading database partition, a single warning is written to the load message file for that database partition. The format of the input file name for each database partition should be filename.xxx, where filename is the name of the file specified in the LOAD command and xxx is the 3-digit database partition number. See the PART_FILE_LOCATION option for details on how to specify the location of the distribution file for each database partition.
    Note:
    1. This mode cannot be used for a CLI load operation, or when the CLIENT parameter of LOAD command is specified.
    2. If the table contains an identity column that is needed for distribution, then this mode is not supported, unless the identityoverride file type modifier is specified.
  • ANALYZE. An optimal distribution map with even distribution across all database partitions is generated.
PART_FILE_LOCATION X
In the PARTITION_ONLY, LOAD_ONLY, and LOAD_ONLY_VERIFY_PART modes, this parameter can be used to specify the location of the distributed files. This location must exist on each database partition specified by the OUTPUT_DBPARTNUMS option. If the location specified is a relative path name, the path is appended to the current directory to create the location for the distributed files.

For the CURSOR file type, this option must be specified, and the location must refer to a fully qualified file name. This name is the fully qualified base file name of the distributed files that are created on each output database partition in the PARTITION_ONLY mode, or the location of the files to be read from for each database partition in the LOAD_ONLY mode. When using the PARTITION_ONLY mode, multiple files can be created with the specified base name if the target table contains LOB columns.

For file types other than CURSOR, if this option is not specified, the current directory is used for the distributed files.

OUTPUT_DBPARTNUMS X
X represents a list of database partition numbers. The database partition numbers represent the database partitions on which the load operation is to be performed. The database partition numbers must be a subset of the database partitions on which the table is defined. All database partitions are selected by default. The list must be enclosed in parentheses and the items in the list must be separated by commas. Ranges are permitted (for example, (0, 2 to 10, 15)).
PARTITIONING_DBPARTNUMS X
X represents a list of database partition numbers that are used in the distribution process. The list must be enclosed in parentheses and the items in the list must be separated by commas. Ranges are permitted (for example, (0, 2 to 10, 15)). The database partitions specified for the distribution process can be different from the database partitions being loaded. If PARTITIONING_DBPARTNUMS is not specified, the load utility determines how many database partitions are needed and which database partitions to use in order to achieve optimal performance.

If the anyorder file type modifier is not specified in the LOAD command, only one partitioning agent is used in the load session. Furthermore, if there is only one database partition specified for the OUTPUT_DBPARTNUMS option, or the coordinator partition of the load operation is not an element of OUTPUT_DBPARTNUMS, the coordinator partition of the load operation is used in the distribution process. Otherwise, the first database partition (not the coordinator partition) in OUTPUT_DBPARTNUMS is used in the distribution process.

If the anyorder file type modifier is specified, the number of database partitions used in the distribution process is determined as follows: (number of partitions in OUTPUT_DBPARTNUMS/4 + 1).

MAX_NUM_PART_AGENTS X
Specifies the maximum numbers of partitioning agents to be used in a load session. The default is 25.
ISOLATE_PART_ERRS X
Indicates how the load operation reacts to errors that occur on individual database partitions. The default is LOAD_ERRS_ONLY, unless both the ALLOW READ ACCESS and COPY YES parameters of the LOAD command are specified, in which case the default is NO_ISOLATION. Valid values are:
  • SETUP_ERRS_ONLY. Errors that occur on a database partition during setup, such as problems accessing a database partition, or problems accessing a table space or table on a database partition, cause the load operation to stop on the failing database partitions but to continue on the remaining database partitions. Errors that occur on a database partition while data is being loaded cause the entire operation to fail.
  • LOAD_ERRS_ONLY. Errors that occur on a database partition during setup cause the entire load operation to fail. If an error occurs while data is being loaded, the load operation will stop on the database partition where the error occurred. The load operation continues on the remaining database partitions until a failure occurs or until all the data is loaded. The newly loaded data will not be visible until a load restart operation is performed and completes successfully.
    Note: This mode cannot be used when both the ALLOW READ ACCESS and the COPY YES parameters of the LOAD command are specified.
  • SETUP_AND_LOAD_ERRS. In this mode, database partition-level errors during setup or loading data cause processing to stop only on the affected database partitions. As with the LOAD_ERRS_ONLY mode, when partition errors do occur while data is loaded, newly loaded data will not be visible until a load restart operation is performed and completes successfully.
    Note: This mode cannot be used when both the ALLOW READ ACCESS and the COPY YES options of the LOAD command are specified.
  • NO_ISOLATION. Any error during the load operation causes the load operation to fail.
STATUS_INTERVAL X
X represents how often you are notified of the volume of data that has been read. The unit of measurement is megabytes (MB). The default is 100 MB. Valid values are whole numbers from 1 to 4000.
PORT_RANGE X
X represents the range of TCP ports used to create sockets for internal communications. The default range is from 49152 to 65535. If defined at the time of invocation, the value of the DB2ATLD_PORTS registry variable replaces the value of the PORT_RANGE load configuration option. For the DB2ATLD_PORTS registry variable, the range should be provided in the following format:
   <lower-port-number:higher-port-number>
From the CLP, the format is:
   ( lower-port-number, higher-port-number )
CHECK_TRUNCATION
Specifies that the program should check for truncation of data records at input/output. The default behavior is that data is not checked for truncation at input/output.
MAP_FILE_INPUT X
X specifies the input file name for the distribution map. This parameter must be specified if the distribution map is customized, as it points to the file containing the customized distribution map. A customized distribution map can be created by using the db2gpmap program to extract the map from the database system catalog table, or by using the ANALYZE mode of the LOAD command to generate an optimal map. The map generated by using the ANALYZE mode must be moved to each database partition in your database before the load operation can proceed.
MAP_FILE_OUTPUT X
X represents the output filename for the distribution map. The output file is created on the database partition issuing the LOAD command assuming that database partition is participating in the database partition group where partitioning is performed. If the LOAD command is invoked on a database partition that is not participating in partitioning (as defined by PARTITIONING_DBPARTNUMS), the output file is created at the first database partition defined with the PARTITIONING_DBPARTNUMS parameter. Consider the following partitioned database environment setup:
      1 serv1 0
      2 serv1 1
      3 serv2 0
      4 serv2 1
      5 serv3 0

Running the following LOAD command on serv3, creates the distribution map on serv1.

LOAD FROM file OF ASC METHOD L ( ...) INSERT INTO table CONFIG 
MODE ANALYZE PARTITIONING_DBPARTNUMS(1,2,3,4) 
MAP_FILE_OUTPUT '/home/db2user/distribution.map'
This parameter should be used when the ANALYZE mode is specified. An optimal distribution map with even distribution across all database partitions is generated. If this parameter is not specified and the ANALYZE mode is specified, the program exits with an error.
TRACE X
Specifies the number of records to trace when you require a review of a dump of the data conversion process and the output of the hashing values. The default is 0.
NEWLINE
Used when the input data file is an ASC file with each record delimited by a new line character and the reclen file type modifier is specified in the LOAD command. When this option is specified, each record is checked for a new line character. The record length, as specified in the reclen file type modifier, is also checked.
DISTFILE X
If this option is specified, the load utility generates a database partition distribution file with the given name. The database partition distribution file contains 32 768 integers: one for each entry in the distribution map for the target table. Each integer in the file represents the number of rows in the input files being loaded that hashed to the corresponding distribution map entry. This information can help you identify skew in your data and also help you decide whether a new distribution map should be generated for the table using the ANALYZE mode of the utility. If this option is not specified, the default behavior of the load utility is to not generate the distribution file.
Note: When this option is specified, a maximum of one partitioning agent is used for the load operation. Even if you explicitly request multiple partitioning agents, only one is used.
OMIT_HEADER
Specifies that a distribution map header should not be included in the distribution file. If not specified, a header is generated.
RUN_STAT_DBPARTNUM X
If the STATISTICS USE PROFILE parameter is specified in the LOAD command, statistics are collected only on one database partition. This parameter specifies on which database partition to collect statistics. If the value is -1 or not specified at all, statistics are collected on the first database partition in the output database partition list.