Loading data files with dbload command

Depending on where the source data files are located, you can use the dbload command in local mode or remote mode, when the files must be transferred over the network.

About this task

Data can be loaded using local or remote mode:
  • Local mode means that the input data file is directly available for the database server on its filesystem. You would usually run the dbload command from the Db2® Warehouse container.
  • Remote mode is used if dbload is invoked from a client machine, that is you use the dbload command from the client container. In this case, data file is stored on client filesystem and it needs to be transferred over the network.
By default, the mode is identified automatically depending on the value of the –host parameter: if host is localhost or 127.0.0.1, then the transfer is local, otherwise it is remote. You can also control it with the -remoteSource <YES|GZIP> parameter. When set, remote mode is forced. With the GZIP option, data is compressed before it is transferred, and then decompressed. This might improve performance if network is bottleneck.

Procedure

  • Use the dbload command in the Db2 Warehouse container to load data in local mode:
    1. Ensure the data files are located in a directory specified by the parameter EXTBL_LOCATION of database configuration.
      The path is different for Db2 Warehouse and for Integrated Analytics System. Ensure you set dbload -df option to a proper directory. To find out EXTBL_LOCATION run the following command:
      db2 get db cfg | grep -i extbl_location
      Strict I/O for EXTBL_LOCATION (EXTBL_STRICT_IO) = NO
      Allowed paths for external tables (EXTBL_LOCATION) = /mnt/blumeta0/home
      Default paths are:
      • Db2 Warehouse: /mnt/blumeta0/home
      • Integrated Analytics System: /scratch/home
      Note that these are the paths of the container filesystem.
    2. Set -host option to localhost.
    3. Do not specify -remoteSource.
    Example:
    dbload -host localhost -port 50000 -u bluadmin -db bludb -schema clients -t ACCOUNT 
    -df /scratch/home/ACCOUNT.dat -delim '|' -verbose
  • Use the dbload command in the client container to load data in remote mode:
    1. Ensure that the source files reside on the client machine.
    2. Set -host option to Db2 Warehouse host. Remote mode is set automatically.
    Example:
    dbload -host db_host -port 50000  -u bluadmin -pw bluadmin -db bludb -schema clients 
    -t ACCOUNT -df /home/user1/dbload_dmo/ACCOUNT.dat -delim '|' 
    -outputDir /home/user1/dbload_demo -verbose 
  • If for some reason a client container is deployed on the same host machine as the Db2 Warehouse container, you can run dbload from this client container and use a local load option, but this requires additional configuration steps:
    1. The data file to be loaded must be in a directory specified by the parameter EXTBL_LOCATION of database configuration. The path is different for Db2 Warehouse and for Integrated Analytics System. Ensure you set dbload -df option to a proper directory. To find out EXTBL_LOCATION run the following command:
      db2 get db cfg | grep -i extbl_location
      Strict I/O for EXTBL_LOCATION (EXTBL_STRICT_IO) = NO
      Allowed paths for external tables (EXTBL_LOCATION) = /mnt/blumeta0/home
    2. When deploying the client container, you must add an option to the docker run command to mount a directory used by Db2 Warehouse database as EXTBL_LOCATION, for example -v /mnt/clusterfs/home:/mnt/blumeta0/home
    3. Set dbload -outputDir option to a directory under EXTBL_LOCATION. If you do not specify the outputDir parameter, the following error message is displayed:
      Error: [IBM][CLI Driver][DB2/LINUXX8664] SQL0644N Invalid value specified for keyword "LOGDIR" in statement "EXTERNAL TABLE". SQLSTATE=42615
    4. Set -host to localhost and do not specify -remoteSource.
    Example:
    dbload -host localhost -port 50000  -u bluadmin -pw bluadmin -db bludb -schema clients 
    -t ACCOUNT -df /mnt/blumeta0/home/dbload_demo/ACCOUNT.dat -delim '|’ 
    -outputDir /mnt/blumeta0/home/dbload_demo -verbose
    Note: When running dbload from the client container you must always provide either the -host value or set the variable DB_HOST. Otherwise you might get the following error: Error: [IBM][CLI Driver][DB2/LINUXX8664] SQL0644N Invalid value specified for keyword "REMOTESOURCE" in statement "EXTERNAL TABLE". SQLSTATE=42615