mrsh

The command to submit MapReduce jobs. The MapReduce framework is supported on IBM® Spectrum Symphony Advanced Edition on Linux®; however, you can run MapReduce jobs in an IBM Spectrum Symphony Developer Edition environment, and you can submit MapReduce jobs from IBM Spectrum Symphony Developer Edition to IBM Spectrum Symphony Advanced Edition.

Synopsis

mrsh subcommand [options]
mrsh -h
mrsh -V
mrsh version
mrsh subcommand -h

Description

Use the mrsh command to submit MapReduce jobs.
-h
Outputs command usage and exits.
-V
Prints the IBM Spectrum Symphony-MapReduce framework version to stderr and exits.
version
Prints the IBM Spectrum Symphony-MapReduce framework API version to stderr and exits.
subcommand -h
Outputs subcommand usage and exits. The mrsh command supports the jar, pipes, and cleanup subcommands. Refer to the subcommand synopsis for details.
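
Because -V and version write to stderr rather than stdout, a plain command substitution does not capture their output. The following is a minimal sketch of one way to capture it in a POSIX shell. The helper name capture_stderr is hypothetical and is not part of mrsh. The call to mrsh is guarded, so the script does nothing on a host where mrsh is not installed.

```shell
#!/bin/sh
# Capture only the stderr of a command and discard its stdout.
# Redirection order matters: 2>&1 first points stderr at the capturing
# pipe, then >/dev/null sends stdout away.
capture_stderr() {
    "$@" 2>&1 >/dev/null
}

# Only call mrsh if it is actually on the PATH.
if command -v mrsh >/dev/null 2>&1; then
    version=$(capture_stderr mrsh -V)
    printf 'Framework version: %s\n' "$version"
fi
```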

Subcommand synopsis

jar jar_file [classname] [-options option] [args...]
pipes [-options]
cleanup [-u user_name] [-x password]

jar jar_file [classname] [-options option] [-args …]

Submits MapReduce jobs in the IBM Spectrum Symphony-MapReduce framework with optional class, option, and argument specifications.
jar jar_file
Runs the MapReduce job packaged as a JAR file, using the specified JAR file name.
classname
Specifies the class to be invoked. If the class is not specified, the class that is specified by the JAR manifest is run.
-options
Specifies MapReduce job configuration options. Supported options are as follows:
-conf configuration_file
Specifies an application configuration file.
-Dname=value
Specifies a job configuration property as a name-value pair. For example, to set the number of threads used to execute a MapReduce job, use the pmr.reduce.multithread.num property; to use three threads in your job submission, specify -Dpmr.reduce.multithread.num=3.
Note: In IBM Spectrum Symphony Developer Edition, if you run MapReduce jobs to show the execution profile (by specifying pmr.performance.instrument.level=value_greater_than_0), you see an error similar to "The required file /directory_path/datasource.xml does not exist." The message is a warning that the reporting framework requires the datasource.xml file; IBM Spectrum Symphony Developer Edition does not support this framework, so you can safely ignore the message. The error does not appear if you run the MapReduce job using IBM Spectrum Symphony.
-fs local|namenode:port
Specifies the file system address: either local or the actual name address of the NameNode (not its URL). The port is the file system port, such as the HDFS port.
-jt local|namenode:port
Specifies the job tracker address: either local or the actual name address of the NameNode (not its URL). The port is the job tracker port, such as the HDFS port.
-files comma_separated_list
Specifies a list of files to be copied to the MapReduce cluster. Separate file names with a comma.
-libjars comma_separated_list
Specifies a list of JAR files to be included in the classpath. Separate file names with a comma.
-archives comma_separated_list
Specifies a list of archives to be unarchived on the compute hosts.
-args
Specifies the arguments to be invoked for the MapReduce job.
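
The jar subcommand and its options can be combined as in the following sketch, which prints the command line instead of running it so that it can be reviewed first. The JAR file (wordcount.jar), class name (WordCount), side files, and the /input and /output arguments are hypothetical placeholders, as are the helper functions and the MRSH_RUN switch. The sketch also assumes that the options are passed directly after the class name, in the Hadoop generic-option style; check this against your installation.

```shell
#!/bin/sh
# Print the command instead of running it unless MRSH_RUN=1
# (MRSH_RUN is a hypothetical switch used only by this sketch).
run_or_print() {
    if [ "${MRSH_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Join file names with commas, the form that -files, -libjars,
# and -archives expect. The subshell keeps the IFS change local.
join_commas() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Hypothetical word-count submission using three reduce threads
# and two side files copied to the cluster.
submit_wordcount() {
    side_files=$(join_commas lookup.txt stopwords.txt)
    run_or_print mrsh jar wordcount.jar WordCount \
        -Dpmr.reduce.multithread.num=3 \
        -files "$side_files" \
        /input /output
}

submit_wordcount
```

Setting MRSH_RUN=1 in the environment makes the same script execute the command instead of printing it.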

pipes [-input path] | [-output path] | [-jar jar_file] | [-inputformat class] | [-map class] | [-partitioner class] | [-reduce class] | [-writer class] | [-program executable] | [-reduces number] | [-lazyOutput true|false]

Submits MapReduce jobs in the IBM Spectrum Symphony-MapReduce framework with pipe (redirection) specifications. Specify any of the supported pipe options to pass information for MapReduce jobs.
pipes [-input path]
Runs the MapReduce job from the specified input directory.
pipes [-output path]
Runs the MapReduce job to the specified output directory.
pipes [-jar jar_file]
Runs the MapReduce job packaged as a JAR file, using the specified JAR file name.
pipes [-inputformat class]
Runs the MapReduce job using the specified input format class.
pipes [-map class]
Runs the MapReduce job using the specified Java map class.
pipes [-partitioner class]
Runs the MapReduce job using the specified Java partitioner class.
pipes [-reduce class]
Runs the MapReduce job using the specified Java reduce class.
pipes [-writer class]
Runs the MapReduce job using the specified Java record writer.
pipes [-program executable]
Runs the MapReduce job using the specified executable URI.
pipes [-reduces number]
Runs the MapReduce job using the specified number of reduces.
pipes [-lazyOutput true|false]
Controls lazy output for the MapReduce job. Specify mrsh pipes -lazyOutput true to enable lazy output, or mrsh pipes -lazyOutput false to disable it. By default, lazy output is disabled.
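
As an illustration of the pipes options, the following sketch builds an mrsh pipes command line and checks the two values that have a restricted form (-reduces takes a number, -lazyOutput takes true or false) before printing it. The function name build_pipes_cmd and the example paths are hypothetical, and the sketch assumes that several pipes options can be given in a single invocation.

```shell
#!/bin/sh
# Build an "mrsh pipes" command line after validating two values.
# Usage: build_pipes_cmd INPUT OUTPUT PROGRAM REDUCES LAZY
build_pipes_cmd() {
    input=$1
    output=$2
    program=$3
    reduces=$4
    lazy=$5

    # -reduces takes a number: reject anything that is not all digits.
    case $reduces in
        ''|*[!0-9]*)
            echo "reduces must be a non-negative integer" >&2
            return 1
            ;;
    esac

    # -lazyOutput accepts only true or false.
    case $lazy in
        true|false) ;;
        *)
            echo "lazyOutput must be true or false" >&2
            return 1
            ;;
    esac

    printf 'mrsh pipes -input %s -output %s -program %s -reduces %s -lazyOutput %s\n' \
        "$input" "$output" "$program" "$reduces" "$lazy"
}

# Hypothetical paths; prints the command line for review.
build_pipes_cmd /data/in /data/out /apps/wordcount 2 true
```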

cleanup [-u user_name] [-x password]

Removes intermediate data for aborted or terminated MapReduce jobs. Intermediate data relates to files generated by map tasks on the local disk that are used as input for the reduce task.
-u user_name
Specifies the name of the user to connect to IBM Spectrum Symphony for this command. If you are already logged on to IBM Spectrum Symphony using the soamlogon command, the user name specified here overrides, for this command only, the user name entered in the soamlogon command.
-x password
Specifies the user password to connect to IBM Spectrum Symphony for this command. If you are already logged on to IBM Spectrum Symphony using the soamlogon command, the password specified here overrides, for this command only, the password entered in the soamlogon command.
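
The following sketch wraps the cleanup subcommand so that -u and -x are added only when a value is supplied, which matches the case where an existing soamlogon session already provides the credentials. The wrapper name mrsh_cleanup, the MRSH_BIN override, and the example credentials are hypothetical. The dry run at the end substitutes echo for mrsh, so nothing is removed.

```shell
#!/bin/sh
# Run "mrsh cleanup", adding -u and -x only when values are given.
# MRSH_BIN is a hypothetical override that lets the wrapper be
# dry-run with another program (for example, echo).
# Usage: mrsh_cleanup [USER [PASSWORD]]
mrsh_cleanup() {
    user=${1:-}
    password=${2:-}

    # Rebuild the positional parameters as the argument list.
    set -- cleanup
    if [ -n "$user" ]; then
        set -- "$@" -u "$user"
    fi
    if [ -n "$password" ]; then
        set -- "$@" -x "$password"
    fi

    "${MRSH_BIN:-mrsh}" "$@"
}

# Dry run in a subshell: echo stands in for mrsh and prints the arguments.
( MRSH_BIN=echo; mrsh_cleanup alice secret )
```

Passing the password on the command line can expose it in the process list and in shell history, so a real script would normally read it from a protected source instead of hard-coding it.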