Data movement

IBM® Fast Data Movement enables you to quickly transfer your data between your Hadoop and Db2® Warehouse environments.

There are two ways in which data can flow between Db2 Warehouse and Hadoop:
  • Import

    Data transfer from Db2 Warehouse to Hadoop.

    Data is transferred in text mode and then stored on Hadoop in the format of your choice: you use the fq.data.format parameter in the configuration XML file to set the target format. You can select one of the following: TXT, PARQUET, ORC, RCFILE, AVRO, SEQUENCEFILE.

  • Export

    Data transfer from Hadoop to Db2 Warehouse.

    You can export the following data types to Db2 Warehouse:
    • Data files that were previously imported from Db2 Warehouse.
    • Hadoop data files in text format.
    • Hadoop data types in Hadoop-specific formats ( Parquet, Avro, ORC, RCFile) that were previously imported from Db2 Warehouse.
    • Hive tables that were not previously imported from Db2 Warehouse.
    • NZBAK files that were imported to Hadoop from NPS. The files are exported to Db2 Warehouse using the same process as exporting Hive tables.

You run the import and export operations from Hadoop or from the Db2 Warehouse container. Depending which system you plan to run the operations from, different configuration and execution steps apply. These different scenarios are described in Configuring and running data movement.

Notes on the data movement feature

  1. The data movement feature transfers data directly between Hadoop data nodes and Db2 Warehouse. Therefore, a working network connection between all data nodes and Db2 Warehouse is a prerequisite.
  2. The data movement feature is supported on systems delivered by BigInsights, Hortonworks, and Cloudera.
  3. Data types mappings are static. They are described in Data type mapping for data movement.
  4. Data movement with the same destination table executed in parallel is not supported.
  5. A Db2 Warehouse user needs to have the following privileges to import data with Fast Data Movement:
    USAGE on sequence "SYSTOOLS"."DB2LOOK_TOKEN"
    SELECT on view "SYSTOOLS"."DB2LOOK_INFO_V"
    SELECT,INSERT,DELETE on table "SYSTOOLS"."DB2LOOK_INFO"
    Run the following commands to grant the required privileges:
    GRANT USAGE ON SEQUENCE "SYSTOOLS"."DB2LOOK_TOKEN" TO USER "<DB_USER>";
    GRANT SELECT ON TABLE "SYSTOOLS"."DB2LOOK_INFO_V" TO USER "<DB_USER>";
    GRANT DELETE,INSERT,SELECT ON TABLE "SYSTOOLS"."DB2LOOK_INFO" TO USER "<DB_USER>";