Db2® Warehouse supports conversion to the following
Hadoop-specific file formats: Parquet, Avro, ORC, RCFile, and SequenceFile. You can import files
from Db2 Warehouse and store them on Hadoop in the format of
your choice.
-
Edit the fq-import-remote-conf.xml template.
-
Set the fq.data.format property to one of the following values: PARQUET, ORC,
RCFILE, AVRO, or SEQUENCEFILE. For example:
<property>
  <name>fq.data.format</name>
  <value>PARQUET</value>
</property>
-
Set the fq.output.compressed property to select the compression type. If you
set the property to false or leave it empty, the default compression type that is
specified on Hadoop for the selected format is used.
Depending on the format that you use, select one of the following values:
- PARQUET:
Snappy, gzip, uncompressed
- ORC:
NONE, ZLIB, SNAPPY
- RCFILE: The value must be the fully qualified class name of a compression codec that is
available on the Hadoop system. For example:
org.apache.hadoop.io.compress.SnappyCodec
- AVRO:
snappy, deflate
- SEQUENCEFILE: The value must be the fully qualified class name of a compression codec that is
available on the Hadoop system. For example:
org.apache.hadoop.io.compress.SnappyCodec
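For example, to compress ORC output with Snappy, the property would look as follows (SNAPPY is an illustrative choice; use a value that is valid for your selected format):
<property>
  <name>fq.output.compressed</name>
  <value>SNAPPY</value>
</property>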
-
Because mixed mode of transfer is not supported with Hadoop-specific formats, the
fq.compress property must be set to false or left empty.
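For example:
<property>
  <name>fq.compress</name>
  <value>false</value>
</property>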
-
Save the XML file and take note of the file path.