Run parameters
mapreduce jar /nz/export/ae/products/netezza/mapreduce/current/mapreducestreaming.jar
-db <db>
-input <table_name> <key_column> <value_column>
-output <output_table> <key_column> <value_column>
-mapper <mapper_cmd>
-mapper_out_key_size <size>
-mapper_out_value_size <size>
-reducer <reducer_cmd>
-reducer_out_key_size <size>
-reducer_out_value_size <size>
-file <file>
Parameter | Description |
---|---|
db | Specifies the name of the database containing input data. |
input <table_name> <key_column> <value_column> | Specifies the name of the table containing the input data and the names of the columns where key and value data is stored. |
output <table_name> <key_column> <value_column> | Specifies the name of the table where output data will be stored, followed by the names of key and value columns. |
mapper | Executes the map step on the SPU. |
mapper_out_key_size | Specifies the size (number of characters) of the output key column created after the map step. |
mapper_out_value_size | Specifies the size (number of characters) of the output value column created after the map step. |
combiner | Executes the combine step on the SPU. |
combiner_out_key_size | Specifies the size (number of characters) of the output key column created after the combine step. |
combiner_out_value_size | Specifies the size (number of characters) of the output value column created after the combine step. |
reducer | Executes the reduce step on the SPU. |
reducer_out_key_size | Specifies the size (number of characters) of the output key column created after the reduce step. |
reducer_out_value_size | Specifies the size (number of characters) of the output value column created after the reduce step. |
You must use the file parameter (shown in the streaming command syntax) to specify each file that you want to run mapper, combiner, or reducer commands on. Multiple file parameters are allowed. All files are copied to a temporary directory that is accessible by the SPUs.
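The command syntax can also be assembled programmatically. The following is a minimal sketch that only builds the argument list in the order shown in the syntax; it does not run anything. The function name `build_streaming_args`, the default sizes of 100 characters, and every database, table, column, and file name passed to it are hypothetical.

```python
# Sketch only: builds the streaming command line as a list of arguments.
# The jar path is taken from the syntax listing; everything else is supplied
# by the caller and is hypothetical in the usage example.
JAR = "/nz/export/ae/products/netezza/mapreduce/current/mapreducestreaming.jar"


def build_streaming_args(db, input_spec, output_spec, mapper, reducer, files,
                         key_size=100, value_size=100):
    """Return the argument list for one streaming job.

    input_spec and output_spec are (table, key_column, value_column) tuples.
    files lists every file to pass with its own -file parameter.
    key_size and value_size are assumed defaults, in characters.
    """
    args = ["mapreduce", "jar", JAR, "-db", db]
    args += ["-input", *input_spec]
    args += ["-output", *output_spec]
    for step, cmd in (("mapper", mapper), ("reducer", reducer)):
        args += [
            f"-{step}", cmd,
            f"-{step}_out_key_size", str(key_size),
            f"-{step}_out_value_size", str(value_size),
        ]
    for path in files:
        # Multiple file parameters are allowed: one -file per file.
        args += ["-file", path]
    return args
```

For example, `build_streaming_args("mydb", ("words", "id", "text"), ("counts", "word", "n"), "mapper.py", "reducer.py", ["mapper.py", "reducer.py"])` yields a list that contains two `-file` entries, one for each script.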
Within the streaming program, each line of input and output consists of a key and a value separated by a tab character. To use a different separator between key and value, specify the following parameters.
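Input separators | Description |
---|---|
mapper_output_separator | Tab character by default, or set to any chosen symbol. |
combiner_output_separator | Tab character by default, or set to any chosen symbol. |
reducer_output_separator | Tab character by default, or set to any chosen symbol. |

Output separators | Description |
---|---|
mapper_output_separator | Tab character by default, or set to any chosen symbol. |
combiner_output_separator | Tab character by default, or set to any chosen symbol. |
reducer_output_separator | Tab character by default, or set to any chosen symbol. |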
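A streaming mapper that follows this line format might look like the sketch below. It assumes the default tab separator and a hypothetical word-count task; the names `split_record`, `format_record`, and `map_words` are illustrative and are not part of the product.

```python
def split_record(line, separator="\t"):
    """Split one input line into (key, value) at the first separator."""
    key, _, value = line.rstrip("\n").partition(separator)
    return key, value


def format_record(key, value, separator="\t"):
    """Format one output line: key, separator, value, newline."""
    return f"{key}{separator}{value}\n"


def map_words(lines, separator="\t"):
    """Yield one (word, 1) output line for every word in each input value."""
    for line in lines:
        _key, value = split_record(line, separator)
        for word in value.split():
            yield format_record(word, 1, separator)
```

To use this as a mapper script, the generator would be connected to the standard streams, for example with `sys.stdout.writelines(map_words(sys.stdin))` after `import sys`. If a different separator is configured for the job, the same symbol would be passed as `separator`.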
Parameters to the mapper, combiner, or reducer are passed using environment variables on the SPUs. To pass parameters, use the cmdenv option. For example, enter the following to set the environment variable “NAME” on the SPU to contain the value ADAM:
-cmdenv "NAME=ADAM"
The variable can be subsequently read in a program or script with commands specific to the programming language being used.
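In a Python mapper, combiner, or reducer, for instance, the variable could be read from the process environment. This is a minimal sketch; the helper name `read_param` and its `environ` argument (included so the function can be exercised without touching the real environment) are assumptions made for illustration.

```python
import os


def read_param(name, default=None, environ=None):
    """Return the value of an environment variable, or default if unset.

    environ is an optional mapping used in place of os.environ,
    which makes the helper easy to try out locally.
    """
    env = os.environ if environ is None else environ
    return env.get(name, default)
```

With `-cmdenv "NAME=ADAM"` on the command line, a call such as `read_param("NAME")` inside the script would be expected to return `ADAM`. Supplying a default avoids a failure when the script is run outside the job and the variable is not set.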