Using the Hadoop Distributed File System (HDFS) with the CDC Replication Engine for InfoSphere DataStage
The CDC Replication Engine for InfoSphere® DataStage® version 11.3.3.1 and later supports specifying a Hadoop Distributed File System (HDFS) directory as the output directory for flat files.
When a table is mapped to a local HDFS, the following configuration options must be set:
- The path to the Hadoop JAR file must be specified in the CLASSPATH environment variable.
- The HADOOP_CONF_DIR environment variable must point to the directory that contains the configuration files for the target Hadoop cluster.
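
The two prerequisites above can be verified before the replication engine is started. The following Python script is a minimal sketch of such a check and is not part of the product. The function name check_hdfs_environment is hypothetical. The script assumes that the Hadoop JAR file has "hadoop" in its file name and that the configuration directory contains core-site.xml, the standard Hadoop core configuration file. Wildcard CLASSPATH entries are not expanded, so adjust the checks to match your installation.

```python
import os
from pathlib import Path


def check_hdfs_environment(env=None):
    """Return a list of problems found; an empty list means both checks passed."""
    env = os.environ if env is None else env
    problems = []

    # Check 1: CLASSPATH should contain an entry that looks like a Hadoop JAR.
    # Assumption: the JAR file name contains "hadoop" and ends in ".jar".
    entries = [e for e in env.get("CLASSPATH", "").split(os.pathsep) if e]
    hadoop_jars = [
        e for e in entries
        if "hadoop" in Path(e).name.lower() and e.lower().endswith(".jar")
    ]
    if not hadoop_jars:
        problems.append("CLASSPATH has no entry that looks like a Hadoop JAR file")

    # Check 2: HADOOP_CONF_DIR should be a directory holding cluster config files.
    # Assumption: core-site.xml is present in that directory.
    conf_dir = env.get("HADOOP_CONF_DIR", "")
    if not conf_dir:
        problems.append("HADOOP_CONF_DIR is not set")
    elif not Path(conf_dir).is_dir():
        problems.append(f"HADOOP_CONF_DIR is not a directory: {conf_dir}")
    elif not (Path(conf_dir) / "core-site.xml").is_file():
        problems.append(f"core-site.xml not found in {conf_dir}")

    return problems


if __name__ == "__main__":
    issues = check_hdfs_environment()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks ready for HDFS output.")
```

Run the script as the same user, and in the same shell environment, that starts the replication engine instance, so that it inspects the variables the engine will actually see.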