MapReduce at a glance

Use this section as a quick reference for your MapReduce installation.

Installation location

The directory in which you install IBM® Spectrum Symphony is saved in the $EGO_TOP variable, which defaults to /opt/ibm/spectrumcomputing.

To set up your environment, source the script that matches your shell:
  • (csh or tcsh) $EGO_TOP/cshrc.platform
  • (sh, ksh, or bash) $EGO_TOP/profile.platform

The root directory for the MapReduce framework in IBM Spectrum Symphony is saved in the $PMR_HOME variable, which defaults to $EGO_TOP/soam/mapreduce. This variable is set automatically when you source the environment script.
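As a minimal sketch for sh, ksh, or bash, the following sources the environment script and prints the resulting locations. It assumes the default installation directory; the fallback assignments are for illustration only, so that the snippet also runs on a host where the product is not installed.

```shell
# Assumption: default install location. Change this if you installed elsewhere.
EGO_TOP="${EGO_TOP:-/opt/ibm/spectrumcomputing}"

# Source the sh/ksh/bash environment script when it is present on this host.
if [ -f "$EGO_TOP/profile.platform" ]; then
    . "$EGO_TOP/profile.platform"
fi

# The script normally sets PMR_HOME; fall back to the documented default
# only so that this illustration still prints a value.
PMR_HOME="${PMR_HOME:-$EGO_TOP/soam/mapreduce}"

echo "EGO_TOP=$EGO_TOP"
echo "PMR_HOME=$PMR_HOME"
```

For csh or tcsh, the equivalent step is `source $EGO_TOP/cshrc.platform`.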

JAVA_HOME environment variable

Set the JAVA_HOME environment variable to specify the directory under which Oracle or IBM Java is installed. For example:
  • For bash: export JAVA_HOME=/usr/java/latest
  • For csh: setenv JAVA_HOME /usr/java/latest

The MapReduce framework requires the location of your Java installation. Provide it by setting the JAVA_HOME environment variable before you install IBM Spectrum Symphony.

If you install IBM Spectrum Symphony without setting the JAVA_HOME environment variable, you can set it later by defining JAVA_HOME in the $SOAM_HOME/mapreduce/conf/pmr-env.sh file. See pmr-env.sh reference for details.
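Before setting JAVA_HOME in either place, it can help to confirm that the directory actually contains a Java runtime. The following is a sketch only: check_java_home is a hypothetical helper, not part of IBM Spectrum Symphony, and it tests nothing more than the presence of an executable bin/java.

```shell
# Hypothetical helper: prints "ok" if the given directory contains an
# executable bin/java, and "missing" otherwise.
check_java_home() {
    if [ -n "$1" ] && [ -x "$1/bin/java" ]; then
        echo "ok"
    else
        echo "missing"
    fi
}

# Check the value currently set in this shell (may be empty).
check_java_home "$JAVA_HOME"

# If the check passes, a line such as the following could be added to
# pmr-env.sh (the path is the example used in this section):
#   export JAVA_HOME=/usr/java/latest
```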

Application

Each application in IBM Spectrum Symphony has a profile that specifies attributes that are shared by workloads in the application.

IBM Spectrum Symphony provides a default application profile for its MapReduce framework, "MapReduceversion", where version identifies the IBM Spectrum Symphony release. This profile is registered by default to the MapReduceConsumer consumer.

MapReduce application profiles are located under $PMR_HOME/version/os_type/profile/, where:
  • version identifies the IBM Spectrum Symphony release; for example, 7.3.2
  • os_type identifies the platform on which IBM Spectrum Symphony is installed; for example, linux2.6-glibc2.3-x86_64. The MapReduce framework in IBM Spectrum Symphony is supported only on Linux® 64-bit hosts.
Note: To optimize performance for your MapReduce workload, ensure that you prestart services (in the application profile, select Pre-start Application).
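The profile directory path can be assembled as in the following sketch. The version and os_type values are the examples given above, and the PMR_HOME fallback is the documented default; substitute the values for your own installation.

```shell
# Assumption: fall back to the documented default if PMR_HOME is not set.
PMR_HOME="${PMR_HOME:-/opt/ibm/spectrumcomputing/soam/mapreduce}"

# Example values from this section; replace with your release and platform.
version="7.3.2"
os_type="linux2.6-glibc2.3-x86_64"

profile_dir="$PMR_HOME/$version/$os_type/profile"
echo "$profile_dir"

# List the profiles if the directory exists on this host.
ls "$profile_dir" 2>/dev/null || echo "profile directory not found on this host"
```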

Configuration files

Configuration files required for the MapReduce framework in IBM Spectrum Symphony are located under the $PMR_HOME/conf/ directory. The key configuration files are:
  • pmr-env.sh: The environment file used to set up the local host. Note: To configure the service-side environment, define settings in the environment section of the application profile.
  • pmr-site.xml: The property file used to define settings that apply to all MapReduce jobs submitted on the host.

User roles

By default, IBM Spectrum Symphony has four user roles that can be assigned to any user account. Each user role has a predefined level of access and control in the cluster management console:
  • Cluster administrator: A superuser who can perform all administrative and workload tasks, with access to all areas of the cluster management console and all actions within it.
  • Cluster administrator (Read-only): This administrator has read-only access to all cluster information, which is useful for monitoring the cluster. This user role cannot perform any add, delete, or modify actions on the cluster.
  • Consumer administrator: This administrator has access to and control over only their own branch of the consumer tree. Consumer administrators are assigned at the first-level consumer and are administrators for all sub-consumers in that branch of the tree. Within the MapReduce framework in IBM Spectrum Symphony, consumer administrators have access to all MapReduce applications that are registered by default to the MapReduceConsumer tree.
  • Consumer user: This user has access to and control over their own workload units only. Within the MapReduce framework in IBM Spectrum Symphony, consumer users are assigned to individual consumers on the MapReduceConsumer tree.

For a multi-user configuration, if you change the default OS user for the MapReduceConsumer consumer (to which MapReduce applications are registered by default), ensure that the following prerequisites are met:
  • All execution users for the MapReduceConsumer consumer must belong to the same user group as the cluster administrator.
  • If you are using the Hadoop Distributed File System (HDFS), the cluster administrator must ensure that:
    • Permissions for the OS user match the permissions set for the work/input/output directory in HDFS.
    • The OS user has write permission on the HDFS directory defined by the hadoop.tmp.dir parameter in the Hadoop configuration file core-default.xml or core-site.xml.
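
The group prerequisite above can be checked with a short script. The following is a sketch under stated assumptions: share_group is a hypothetical helper, not a product command, and it reports whether two OS users have at least one group in common. This section does not state whether the shared group must be the primary group, so treat the result as a first check only.

```shell
# Hypothetical helper: returns 0 if the two OS users share at least one
# group, 1 if they share none, and 2 if either user cannot be looked up.
share_group() {
    groups_a=$(id -Gn "$1" 2>/dev/null) || return 2
    groups_b=$(id -Gn "$2" 2>/dev/null) || return 2
    for g in $groups_a; do
        for h in $groups_b; do
            if [ "$g" = "$h" ]; then
                return 0
            fi
        done
    done
    return 1
}

# Example: compare the current user with itself (always shares a group).
if share_group "$(id -un)" "$(id -un)"; then
    echo "users share a group"
fi
```

In practice you would pass the execution user and the cluster administrator account as the two arguments. For the HDFS prerequisites, running `hdfs dfs -ls -d` on a directory shows its owner, group, and permissions for comparison.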

Samples

Samples for the MapReduce framework in IBM Spectrum Symphony are located under $PMR_HOME/version/os_type/samples/.