Configuring the Hive connector for partitioned write

You can configure a Hive Connector stage to connect to a Hive data source and write data to a partitioned table.

When a Hive Connector stage is configured to perform a partitioned write, each processing node of the stage processes a portion of the input data, and the records are inserted into the partitioned table based on their partition key values.
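For context, a partitioned write targets a Hive table that was created with a PARTITIONED BY clause. The following sketch shows what such a table might look like; the table and column names (MY_TABLE, Field001, and so on) are illustrative and match the example statement later in this procedure, not names required by the connector:

```sql
-- Hypothetical partitioned Hive table; names are illustrative.
-- Field001 is the partition key: Hive stores rows in a separate
-- directory for each distinct Field001 value.
CREATE TABLE MY_TABLE (
  Field002 STRING,
  Field003 INT
)
PARTITIONED BY (Field001 STRING);
```

Note that the partition key column is declared in the PARTITIONED BY clause, not in the main column list, which is why it appears in the PARTITION clause of the INSERT statement rather than in the VALUES list.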
  1. On the job design canvas, double-click the Hive Connector stage, and then click the Stage tab.
  2. On the Advanced page, set Execution mode to Parallel or Default (Parallel), and then click the Input tab.
  3. Define the INSERT statement that the connector uses at run time:
    • Set Generate SQL at runtime to No, and then specify the INSERT statement in the Insert statement property, or set Read insert statement from file to Yes, specify the name of the file in the Insert statement property, and include the INSERT statement in that file.
    • Set Generate SQL at runtime to Yes and then specify the name of the target table in the Table name field.
  4. Set Enable partitioned write to Yes. The following statement is an example of the INSERT statement that the connector uses when Generate SQL at runtime is set to Yes:
     INSERT INTO MY_TABLE PARTITION(Field001 = ORCHESTRATE.Field001) VALUES (ORCHESTRATE.Field002, ORCHESTRATE.Field003)
     The column names that follow the word ORCHESTRATE must match the names of key columns on the input link.
  5. Click OK, and then save the job.
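If you set Generate SQL at runtime to No in step 3, the INSERT statement that you supply must follow the same pattern as the generated statement shown in step 4. The following is a sketch, assuming a target table MY_TABLE that is partitioned on Field001; the table and column names are illustrative:

```sql
-- Hypothetical user-specified INSERT statement for a partitioned write.
-- Each ORCHESTRATE.<name> placeholder is resolved at run time with the
-- value of the matching column on the input link.
INSERT INTO MY_TABLE
  PARTITION (Field001 = ORCHESTRATE.Field001)
  VALUES (ORCHESTRATE.Field002, ORCHESTRATE.Field003)
```

The partition key column (here Field001) appears only in the PARTITION clause, while the remaining input link columns appear in the VALUES list.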