Running data movement from Hadoop with nzcodec.jar

Note: This method is an alternative to running data movement with the fdm.sh script, which is described in Running data movement from Hadoop.

You can run the data transfer with nzcodec.jar directly from the Hadoop nodes on which you installed the data movement software.

Before you begin

Ensure that you have installed the IBM® Fast Data Movement software on Hadoop, as described in Installing or upgrading the data movement feature on Hadoop.

Procedure

  1. Prepare the XML configuration file for an import or an export transfer, based on one of the following templates:
    • fq-import-remote-conf.xml
    • fq-export-remote-conf.xml
    Each file defines a single transfer job, that is, the transfer of one or more tables from or to a particular database. The XML configuration details are described in Preparing configuration XML files.
  2. Export the classpath for the Hadoop job:
    • When running import or export via nzcodec.jar, the path depends on the service provider:
      For BigInsights 3:
      When using Hive, run the following command:
      export HADOOP_CLASSPATH="/opt/ibm/biginsights/hive/lib/*:/opt/ibm/biginsights/hive/conf/:/fastDataMovement/nzjdbc3.jar"
      When using BigSQL, run the following command:
      export HADOOP_CLASSPATH="/home/bigsql/sqllib/java/*:/opt/ibm/biginsights/hive/conf/:/fastDataMovement/nzjdbc3.jar"
      For BigInsights 4:
      When using Hive, run the following command:
      export HADOOP_CLASSPATH="/usr/iop/current/hive-client/lib/*:/usr/ibmpacks/bigsql/4.0/hive/conf/:/fastDataMovement/nzjdbc3.jar"
      When using BigSQL, run the following command:
      export HADOOP_CLASSPATH="/home/bigsql/sqllib/java/*:/usr/ibmpacks/bigsql/4.0/hive/conf/:/fastDataMovement/nzjdbc3.jar"
      For Hortonworks:
      For versions earlier than 2.3.2, run the following command:
      export HADOOP_CLASSPATH="/usr/hdp/current/hive-client/lib/*:/usr/hdp/current/hive-client/conf/:/fastDataMovement/nzjdbc3.jar"
      For Hortonworks 2.3.2 and later, run the following command:
      export HADOOP_CLASSPATH="/usr/hdp/current/atlas-server/hook/hive/*:/usr/hdp/current/hive-client/lib/*:/usr/hdp/current/hive-client/conf/:/fastDataMovement/nzjdbc3.jar"
      For Cloudera 4:
      Run the following command:
      export HADOOP_CLASSPATH="/etc/hive/conf/:/usr/lib/hive/lib/*:/usr/share/cmf/cloudera-navigator-server/libs/cdh4/*:/fastDataMovement/nzjdbc3.jar"
      For Cloudera 5:
      Run the following command:
      export HADOOP_CLASSPATH="/etc/hive/conf/:/opt/cloudera/parcels/CDH/lib/hive/lib/*:/fastDataMovement/nzjdbc3.jar"
      For Cloudera QuickStart:
      Run the following command:
      export HADOOP_CLASSPATH="/etc/hive/conf/:/usr/lib/hive/lib/*:/fastDataMovement/*"
  3. Optionally, validate the connection configuration as described in Validating data movement configuration.
  4. From the command line, run nzcodec.jar and pass the import or export XML configuration file with the -conf parameter. For example:
    To import data, run the following command:
    hadoop jar /fastDataMovement/nzcodec.jar -conf fq-import-conf.xml
    To export data, run the following command:
    hadoop jar /fastDataMovement/nzcodec.jar -conf fq-export-conf.xml
    Tip: You can also override any parameter that is set in the configuration file by using the -D parameter:
    hadoop jar /fastDataMovement/nzcodec.jar -conf fq-import-conf.xml -D fq.tables=import_test -D fq.append.mode=overwrite
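
The classpath selection in step 2 and the command line in step 4 can be combined in a small wrapper. The following is a minimal sketch under stated assumptions, not part of the data movement software: the function names fdm_classpath and fdm_command and the distribution keys (cloudera5, hdp-pre-2.3.2, quickstart) are hypothetical, only three of the distributions from step 2 are covered, and the paths are copied from that step unchanged. The second function only prints the command for review and does not run it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not shipped with the product.

# Print the HADOOP_CLASSPATH value from step 2 for a distribution key.
# The keys are invented for this sketch; the paths come from step 2.
fdm_classpath() {
  case "$1" in
    cloudera5)
      printf '%s\n' "/etc/hive/conf/:/opt/cloudera/parcels/CDH/lib/hive/lib/*:/fastDataMovement/nzjdbc3.jar" ;;
    hdp-pre-2.3.2)
      printf '%s\n' "/usr/hdp/current/hive-client/lib/*:/usr/hdp/current/hive-client/conf/:/fastDataMovement/nzjdbc3.jar" ;;
    quickstart)
      printf '%s\n' "/etc/hive/conf/:/usr/lib/hive/lib/*:/fastDataMovement/*" ;;
    *)
      printf 'unknown distribution key: %s\n' "$1" >&2
      return 1 ;;
  esac
}

# Print the step 4 command line: a configuration file followed by
# zero or more name=value overrides, each passed with its own -D.
fdm_command() {
  local conf="$1"
  shift
  local cmd="hadoop jar /fastDataMovement/nzcodec.jar -conf $conf"
  local kv
  for kv in "$@"; do
    cmd="$cmd -D $kv"
  done
  printf '%s\n' "$cmd"
}

# Example use (commented out so that loading this file has no side effects):
#   export HADOOP_CLASSPATH="$(fdm_classpath cloudera5)"
#   fdm_command fq-import-conf.xml fq.tables=import_test fq.append.mode=overwrite
```

Printing the command instead of running it keeps the sketch safe to try on a node without changing any data. Values that contain spaces would need extra quoting, which this sketch does not handle.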

Results

The data transfer runs according to the properties that you provided.