Hive partition partitioned read method
When this method is specified, the connector determines the number of partitions in the table specified in the Usage > Table name property or in the Select statement property. The connector associates each node with a list of partitions. For each node, the connector reads the rows that belong to the partitions that are associated with that node.
Example of using the Hive partition partitioned read method
For this example, the Hive connector is configured in the following way:
- The Generate SQL property is set to No.
- The Select statement property is set to SELECT * FROM TABLE1 WHERE COL2 = [[part-value]] AND COL1 > 10.
- The connector is configured to run in parallel mode on four nodes, and the table has eight partitions.
- The Partitioned reads method property is set to Hive partition.
The connector determines the partition key values of the eight partitions, which are "Paris", "Warsaw", "Berlin", "Moscow", "Tokyo", "London", "Prague" and "Rome". The connector associates two partitions with each node and runs one of the following SELECT statements on each of the four nodes:
SELECT * FROM TABLE1 WHERE (COL2 = 'Paris' OR COL2 = 'Warsaw') AND (COL1 > 10)
SELECT * FROM TABLE1 WHERE (COL2 = 'Berlin' OR COL2 = 'Moscow') AND (COL1 > 10)
SELECT * FROM TABLE1 WHERE (COL2 = 'Tokyo' OR COL2 = 'London') AND (COL1 > 10)
SELECT * FROM TABLE1 WHERE (COL2 = 'Prague' OR COL2 = 'Rome') AND (COL1 > 10)
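The distribution in this example can be reproduced with a short Python sketch. It is an illustration only, not the connector's implementation: the function names assign_partitions and build_select are hypothetical, and the rule for handing out leftover partitions when the count does not divide evenly across the nodes is an assumption that this document does not describe.

```python
def assign_partitions(partition_values, node_count):
    """Split partition values into contiguous blocks, one block per node.

    Assumption: when the values do not divide evenly, the first nodes
    each receive one extra partition.
    """
    base, extra = divmod(len(partition_values), node_count)
    blocks = []
    start = 0
    for node in range(node_count):
        size = base + (1 if node < extra else 0)
        blocks.append(partition_values[start:start + size])
        start += size
    return blocks


def quote_literal(value):
    """Render a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_select(table, partition_column, values, extra_filter):
    """Build the SELECT statement for one node's list of partition values."""
    predicate = " OR ".join(
        f"{partition_column} = {quote_literal(v)}" for v in values
    )
    return f"SELECT * FROM {table} WHERE ({predicate}) AND ({extra_filter})"


if __name__ == "__main__":
    cities = ["Paris", "Warsaw", "Berlin", "Moscow",
              "Tokyo", "London", "Prague", "Rome"]
    # Four nodes and eight partitions, as in the example above.
    for node, block in enumerate(assign_partitions(cities, 4)):
        print(node, build_select("TABLE1", "COL2", block, "COL1 > 10"))
```

With eight partition values and four nodes, each node receives two values, and the generated statements match the four SELECT statements listed in the example.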