Mapping a single source table to Hadoop using Web HDFS

You can use the Map Tables wizard to map a single source table to Hadoop through Web HDFS, using the Hadoop option in the CDC replication engine for InfoSphere® DataStage®.

Procedure

  1. Click Configuration > Subscriptions.
  2. Right-click a subscription and select Map Tables.
  3. Select Custom InfoSphere DataStage Mappings and click Next.
  4. Select Hadoop > Web HDFS and click Next.
  5. Expand the database, schema, or table node in the Source Tables list to view the tables in your database that are available for mapping. If your table is not listed, right-click the database user or schema and click Refresh.

    If Automatically prompt for filter when expanding nodes is enabled in your preferences, you might be prompted to filter databases, schemas, or tables. To define a filter manually, select a datastore, database, or schema and click Specify Filter.

  6. Enable the table that you want to map in the Source Tables list. If your table is not listed, right-click the database user or schema and click Refresh.
  7. To hide columns from the target, select a source table and click Filter Columns. Clear the check box for each column that you want to hide and click OK.
  8. Click Next.
  9. Enter the output directory for the flat files in the Web HDFS Directory box.
  10. Enable one of the following options for the flat file output records and click Next:
    Single Record
    An update operation is sent as a single row.
    Multiple Records
    An update operation is sent as two rows.
  11. Review the mapping settings.
  12. Click Finish.
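
Before entering a path in the Web HDFS Directory box, it can help to confirm that the directory is reachable through the Hadoop WebHDFS REST interface. The following Python sketch builds a WebHDFS URL and issues a GETFILESTATUS request. The host name, port, path, and user name are hypothetical placeholders, and the sketch assumes a cluster that accepts simple `user.name` authentication over plain HTTP; it is not part of the CDC product.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, hdfs_path, op, **params):
    """Build a WebHDFS REST URL: http://<host>:<port>/webhdfs/v1/<path>?op=<OP>."""
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be an absolute HDFS path")
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"


def directory_exists(host, port, hdfs_path, user):
    """Return True if hdfs_path exists and is a directory (performs an HTTP GET)."""
    url = webhdfs_url(host, port, hdfs_path, "GETFILESTATUS", **{"user.name": user})
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            status = json.load(resp)["FileStatus"]
    except urllib.error.HTTPError as err:
        if err.code == 404:  # WebHDFS answers 404 when the path does not exist
            return False
        raise
    return status["type"] == "DIRECTORY"


# Hypothetical usage (all values are placeholders for your own cluster):
# directory_exists("namenode.example.com", 9870, "/user/cdc/output", "cdcuser")
```

If the directory is missing, the same URL helper can build an `op=MKDIRS` request, which WebHDFS expects as an HTTP PUT.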
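
The difference between the Single Record and Multiple Records options can be pictured with a small Python sketch. It is an illustration only: the `U`, `B`, and `A` markers, the column order, and the comma-separated layout are assumptions made for this example and do not reproduce the actual flat file format that the CDC replication engine writes. The sketch also assumes that an update carries the row values before and after the change.

```python
import csv
import io


def update_rows(before, after, columns, single_record):
    """Render one update operation as one row or as two rows (hypothetical layout)."""
    before_vals = [before[c] for c in columns]
    after_vals = [after[c] for c in columns]
    if single_record:
        # Single Record: the update is sent as one row.
        return [["U", *before_vals, *after_vals]]
    # Multiple Records: the update is sent as two rows.
    return [["B", *before_vals], ["A", *after_vals]]


def to_flat_file_text(rows):
    """Serialize rows as comma-separated lines, one record per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


columns = ["ID", "CITY"]
before = {"ID": 1, "CITY": "Austin"}
after = {"ID": 1, "CITY": "Boston"}

single = to_flat_file_text(update_rows(before, after, columns, single_record=True))
multiple = to_flat_file_text(update_rows(before, after, columns, single_record=False))
```

Whichever option you choose, a downstream job that reads the flat files has to expect the matching number of rows per update.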