MapReduce at a glance
Use the information described in this section as a quick reference for your MapReduce installation.
Installation location
The directory you specify for installing IBM® Spectrum Symphony is stored in the $EGO_TOP variable, which defaults to /opt/ibm/spectrumcomputing. To set up your environment, source the environment script file that matches your shell:
- (csh or tcsh) $EGO_TOP/cshrc.platform
- (sh, ksh, or bsh) $EGO_TOP/profile.platform
The root directory for the MapReduce framework in IBM Spectrum Symphony is saved as the $PMR_HOME variable, whose value is by default $EGO_TOP/soam/mapreduce. This variable is set automatically when you source the environment script file.
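The following is a minimal sketch of choosing and sourcing the environment script from an sh-compatible shell. The helper name env_script_for_shell is hypothetical, and the fallback path assumes the default installation directory; adjust EGO_TOP if you installed elsewhere.

```shell
# Hypothetical helper: print the environment script for a given shell name.
env_script_for_shell() {
    # Fall back to the documented default location if EGO_TOP is not set.
    ego_top="${EGO_TOP:-/opt/ibm/spectrumcomputing}"
    case "$1" in
        csh|tcsh) echo "$ego_top/cshrc.platform" ;;
        *)        echo "$ego_top/profile.platform" ;;   # sh, ksh, bsh
    esac
}

script="$(env_script_for_shell sh)"
if [ -f "$script" ]; then
    . "$script"                          # sourcing sets PMR_HOME automatically
    echo "PMR_HOME=${PMR_HOME:-}"
else
    echo "Environment script not found: $script" >&2
fi
```

In csh or tcsh, the equivalent step is to run source on the cshrc.platform file instead.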
JAVA_HOME environment variable
Set the JAVA_HOME environment variable to specify the directory under which Oracle or IBM Java is installed. For example:
- For bash:
export JAVA_HOME=/usr/java/latest
- For csh:
setenv JAVA_HOME /usr/java/latest
The MapReduce framework cannot be enabled without your Java installation location. Provide it by setting the JAVA_HOME environment variable before you install IBM Spectrum Symphony.
If you install IBM Spectrum Symphony without setting the JAVA_HOME environment variable, you can set it later by defining JAVA_HOME in the $SOAM_HOME/mapreduce/conf/pmr-env.sh file. See pmr-env.sh reference for details.
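Before installing, or before editing pmr-env.sh, it can help to confirm that JAVA_HOME points at a usable Java installation. The check below is a sketch only: the function name check_java_home is hypothetical, and it assumes the usual layout in which the java executable is at $JAVA_HOME/bin/java.

```shell
# Hypothetical helper: verify that JAVA_HOME is set and contains bin/java.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "No executable found at $JAVA_HOME/bin/java"
        return 1
    fi
    echo "JAVA_HOME looks usable: $JAVA_HOME"
}

# Report a problem without stopping the calling script.
check_java_home || echo "Set JAVA_HOME before enabling the MapReduce framework" >&2
```

If the check fails after installation, the same export line shown above for bash is the kind of setting that would be defined in pmr-env.sh, assuming that file uses sh syntax as its extension suggests.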
Application
Each application in IBM Spectrum Symphony has a profile that specifies attributes that are shared by workloads in the application.
IBM Spectrum Symphony provides a default application profile for its MapReduce framework: "MapReduceversion", which is registered by default to the MapReduceConsumer consumer. The following placeholders appear in names and paths in this section:
- version identifies the IBM Spectrum Symphony release; for example, 7.3.2.
- os_type identifies the platform on which IBM Spectrum Symphony is installed; for example, linux2.6-glibc2.3-x86_64. The MapReduce framework in IBM Spectrum Symphony is supported only on Linux® 64-bit hosts.
Configuration files
- pmr-env.sh
- The environment file used to set up the local host.
  Note: To configure the service-side environment, define settings in the environment section of the application profile.
- pmr-site.xml
- The property file used to define settings that apply to all MapReduce jobs submitted on the host.
User roles
- Cluster administrator
- A super user able to accomplish all administrative and workload tasks, with access to all areas of the cluster management console and all actions within it.
- Cluster administrator (Read-only)
- This administrator has read-only access to all cluster information, useful for monitoring the cluster. This user role cannot perform any add, delete, or modify actions on the cluster.
- Consumer administrator
- This administrator has access and control only over their own branch of the consumer tree. Consumer administrators are assigned at the first-level consumer and are administrators for all sub-consumers in that branch of the tree.
Within the MapReduce framework in IBM Spectrum Symphony, consumer administrators have access to all MapReduce applications that are registered by default to the MapReduceConsumer tree.
- Consumer user
- This user has access and control over their own workload units only. Within the MapReduce framework in IBM Spectrum Symphony, consumer users are assigned to individual consumers on the MapReduceConsumer tree.
- All execution users for the MapReduceConsumer consumer must belong to the same user group as the cluster administrator.
- If you are using the Hadoop Distributed File System (HDFS), the cluster administrator must ensure the following setup:
- That permissions for the OS user match permissions set for the work/input/output directory in HDFS.
- That the OS user has write permissions to access the HDFS directory defined by the hadoop.tmp.dir parameter in the Hadoop configuration files: core-default.xml or core-site.xml.
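The HDFS requirements above can be inspected from the command line. The following sketch assumes that the hdfs command from a Hadoop installation is on the PATH; the function name show_hdfs_permissions is hypothetical, and any directories passed to it are your own work, input, or output paths.

```shell
# Hypothetical helper: show owner, group, and mode of the hadoop.tmp.dir
# directory, plus any extra HDFS directories passed as arguments.
show_hdfs_permissions() {
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs command not found on PATH"
        return 0
    fi
    # Resolve hadoop.tmp.dir from core-default.xml / core-site.xml.
    tmp_dir="$(hdfs getconf -confKey hadoop.tmp.dir)" || return 1
    echo "hadoop.tmp.dir = $tmp_dir"
    # -d lists the directories themselves rather than their contents.
    hdfs dfs -ls -d "$tmp_dir" "$@"
}

show_hdfs_permissions || echo "Could not query HDFS" >&2
```

Compare the owner, group, and permission columns in the listing with the OS user that runs the MapReduce workload.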
Samples
Samples for the MapReduce framework in IBM Spectrum Symphony are located under $PMR_HOME/version/os_type/samples/.
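As an illustration of how the placeholders expand, the sketch below builds the samples path from a release number and platform string. The function name samples_dir is hypothetical, the values 7.3.2 and linux2.6-glibc2.3-x86_64 are the examples given earlier in this section, and the fallback for PMR_HOME assumes the default installation directory.

```shell
# Hypothetical helper: build the samples directory path.
#   $1 = version (for example, 7.3.2)
#   $2 = os_type (for example, linux2.6-glibc2.3-x86_64)
samples_dir() {
    pmr_home="${PMR_HOME:-/opt/ibm/spectrumcomputing/soam/mapreduce}"
    echo "$pmr_home/$1/$2/samples"
}

dir="$(samples_dir 7.3.2 linux2.6-glibc2.3-x86_64)"
if [ -d "$dir" ]; then
    ls "$dir"
else
    echo "No samples directory at $dir" >&2
fi
```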