mrss.xml reference
The mrss.xml configuration file applies to MapReduce workloads, which are available only with IBM® Spectrum Symphony Advanced Edition. An entitlement key is required to enable the MapReduce framework.
The MapReduce framework in IBM Spectrum Symphony provides a data transfer
daemon for the shuffle service phase of a MapReduce job. This shuffle service daemon runs as an
IBM Spectrum Symphony service on each
host in the cluster. It optimizes local memory and disk usage to facilitate faster shuffling
processes for map task output to local and remote reduce tasks.
Configure the environment for the shuffle service daemon by using the mrss.xml file.
Location
This file is installed with IBM Spectrum Symphony at $EGO_ESRVDIR/esc/conf/services/.
Environment Variables
PMR_MRSS_SHUFFLE_CLIENT_PORT
The port that the shuffle service uses. By default, this port is BASEPORT+10. If you use the default base port of 7869, the shuffle service port is 7879.
Note: The shuffle service port is listed in mrss.xml and in $SOAM_HOME/mapreduce/conf/pmr-env.sh. If you change the port, ensure that you update the port number in both configuration files.
Default: 7879
PMR_MRSS_SHUFFLE_DATA_WRITE_PORT
The port that the shuffle service uses for data writes. By default, this port is BASEPORT+11. If you use the default base port of 7869, the shuffle service port for data writes is 7880.
Note: The shuffle service port for data writes is listed in mrss.xml and in $SOAM_HOME/mapreduce/conf/pmr-env.sh. If you change the port, ensure that you update the port number in both configuration files.
Default: 7881
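The requirement to keep both files in step can be checked with a short script. The following Python sketch is illustrative only: it assumes that pmr-env.sh sets each port on a line of the form export NAME=1234 (or NAME=1234), which this reference does not confirm, and all function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

PORT_VARS = ("PMR_MRSS_SHUFFLE_CLIENT_PORT", "PMR_MRSS_SHUFFLE_DATA_WRITE_PORT")


def ports_from_mrss_xml(xml_text):
    """Return {name: value} for the port variables found in mrss.xml text."""
    found = {}
    for elem in ET.fromstring(xml_text).iter():
        # Compare the local tag name only, so the namespace URI does not matter.
        if elem.tag.rsplit("}", 1)[-1] == "EnvironmentVariable":
            name = elem.get("name")
            if name in PORT_VARS:
                found[name] = (elem.text or "").strip()
    return found


def ports_from_pmr_env(sh_text):
    """Return {name: value} from lines like `export NAME=1234` (assumed syntax)."""
    found = {}
    for name in PORT_VARS:
        match = re.search(rf"^\s*(?:export\s+)?{name}=(\d+)", sh_text, re.MULTILINE)
        if match:
            found[name] = match.group(1)
    return found


def mismatched_ports(xml_text, sh_text):
    """Return the port variables whose values differ between the two files.

    A variable that is absent from both files is not reported.
    """
    xml_ports = ports_from_mrss_xml(xml_text)
    sh_ports = ports_from_pmr_env(sh_text)
    return sorted(n for n in PORT_VARS if xml_ports.get(n) != sh_ports.get(n))
```

Read both files as text and pass their contents to mismatched_ports; an empty list means that the two files agree on both ports.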
PMR_MRSS_WORKING_THREADS_NUMBER
The number of working threads for data-copy requests.
Default: 20
PMR_MRSS_DATA_WRITE_WORKING_THREADS_NUMBER
The number of working threads for data-write requests.
Default: 24
PMR_MRSS_TASK_LOG_DIR
The log directory for map and reduce tasks. The shuffle service checks this directory every PMR_MRSS_TASK_LOG_CLEAN_INTERVAL minutes. If a subdirectory was created more than PMR_MRSS_TASK_DIRECTORY_DELETE_INTERVAL hours ago, the shuffle service deletes the subdirectory.
Default: ${PMR_HOME}/logs
PMR_MRSS_TASK_LOG_CLEAN_INTERVAL
The interval, in minutes, at which the shuffle service checks and cleans the log directory for a map or reduce task.
Default: 30
PMR_MRSS_TASK_DIRECTORY_DELETE_INTERVAL
The interval, in hours, at which the shuffle service deletes the log directory for a map or reduce task.
Default: 48
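The interaction of the three log variables can be pictured as a periodic check that removes subdirectories older than a cutoff. The following Python sketch is an illustration under stated assumptions, not the shuffle service's implementation: it takes the units from the two interval descriptions (minutes for the check, hours for deletion), and the function names and input structure are hypothetical.

```python
from datetime import datetime, timedelta


def expired_task_log_dirs(created_at, now, delete_interval_hours=48):
    """Return names of subdirectories created more than delete_interval_hours ago.

    created_at maps a subdirectory name to its creation time (datetime).
    """
    cutoff = now - timedelta(hours=delete_interval_hours)
    return sorted(name for name, created in created_at.items() if created < cutoff)


def next_check(last_check, clean_interval_minutes=30):
    """Return the time of the next scan of the task log directory."""
    return last_check + timedelta(minutes=clean_interval_minutes)
```

With the default values, a scan at 12:00 on a given day would select only subdirectories created before 12:00 two days earlier, and the next scan would run at 12:30.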
PMR_MRSS_CHUNK_SIZE
The size, in KB, of the data chunk that is copied to the reducer during the shuffle phase.
Default: 64
PMR_MRSS_CACHE_PATH
A location on the local disk to store the map file of the input split. This variable relates to cache-aware scheduling, which enables a job to get its input split from the cache.
Default: ${PMR_HOME}/work/datacache
PMR_MRSS_INPUTCACHE_MAX_MEMSIZE_MB
The maximum memory limit of the input split cache (in MB). If the total size of the memory cache does not exceed the configured size, the cache files are mapped to system memory and used as in-memory cache. If the total size exceeds the configured size, the cache files are not mapped to system memory and are used as on-disk cache instead. This variable relates to cache-aware scheduling, which enables a job to get its input split from the cache.
Default: 2
PMR_MRSS_INPUTCACHE_CLEAN_INTERVAL
The duration (in seconds) that a split can remain cached without being accessed by any job of any application. When a cached split exceeds this duration, the oldest data input split (and its local disk file) is deleted from the MapReduce shuffle service cache. This variable relates to cache-aware scheduling, which enables a job to get its input split from the cache.
Default: 3600
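The two input cache rules, the memory limit and the idle timeout, can be expressed as simple comparisons. The following Python sketch only restates those rules and does not reflect the shuffle service's internals: the function names are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
def cache_mode(total_cache_bytes, max_memsize_mb):
    """Return "in-memory" while the cache fits within the limit, else "on-disk"."""
    limit_bytes = max_memsize_mb * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
    return "in-memory" if total_cache_bytes <= limit_bytes else "on-disk"


def stale_splits(last_access, now, clean_interval_seconds=3600):
    """Return splits that no job has accessed for longer than the clean interval.

    last_access maps a split name to its last access time in seconds.
    """
    return sorted(
        split for split, accessed in last_access.items()
        if now - accessed > clean_interval_seconds
    )
```

For example, with a limit of 2048 MB, a cache of exactly 2048 MB is still treated as in-memory because the limit is not exceeded; one more byte switches the result to on-disk.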
Example of mrss.xml file
<sc:ActivityDescription>
...
<ego:ActivitySpecification>
...
<ego:EnvironmentVariable name="PMR_MRSS_SHUFFLE_CLIENT_PORT">35010</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_SHUFFLE_DATA_WRITE_PORT">35011</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_WORKING_THREADS_NUMBER">20</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_DATA_WRITE_WORKING_THREADS_NUMBER">24</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_TASK_LOG_DIR">${PMR_HOME}/logs</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_TASK_LOG_CLEAN_INTERVAL">30</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_TASK_DIRECTORY_DELETE_INTERVAL">48</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_CHUNK_SIZE">64</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_CACHE_PATH">${PMR_HOME}/work/datacache</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_INPUTCACHE_MAX_MEMSIZE_MB">2048</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_INPUTCACHE_CLEAN_INTERVAL">3600</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_PRINCIPALNAME">testuser/iMapReduce@EXAMPLE.COM</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_KEYTAB">/dev/sym_mr/kernel/conf/abcuser.keytab</ego:EnvironmentVariable>
<ego:EnvironmentVariable name="PMR_MRSS_PRINCIPALNAME">/usr/bin</ego:EnvironmentVariable>
...
</ego:ActivitySpecification>
</sc:ActivityDescription>
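Because every setting in this file is an EnvironmentVariable element, a short script can list the configured values and catch a variable name that is defined more than once. The following Python sketch is a hedged illustration: the namespace URIs in SAMPLE are placeholders rather than the real ones, and the function names are hypothetical. Elements are matched by local name, so the sketch does not depend on the actual namespace declarations.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Placeholder document: the urn:example:* namespace URIs are NOT the real ones.
SAMPLE = """\
<sc:ActivityDescription xmlns:sc="urn:example:sc" xmlns:ego="urn:example:ego">
  <ego:ActivitySpecification>
    <ego:EnvironmentVariable name="PMR_MRSS_CHUNK_SIZE">64</ego:EnvironmentVariable>
    <ego:EnvironmentVariable name="PMR_MRSS_WORKING_THREADS_NUMBER">20</ego:EnvironmentVariable>
    <ego:EnvironmentVariable name="PMR_MRSS_WORKING_THREADS_NUMBER">24</ego:EnvironmentVariable>
  </ego:ActivitySpecification>
</sc:ActivityDescription>
"""


def environment_variables(xml_text):
    """Return (name, value) pairs for every EnvironmentVariable element, in order."""
    pairs = []
    for elem in ET.fromstring(xml_text).iter():
        # Strip any "{namespace}" prefix and compare the local tag name.
        if elem.tag.rsplit("}", 1)[-1] == "EnvironmentVariable":
            pairs.append((elem.get("name"), (elem.text or "").strip()))
    return pairs


def duplicate_names(pairs):
    """Return variable names that are defined more than once."""
    counts = Counter(name for name, _ in pairs)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    for name in duplicate_names(environment_variables(SAMPLE)):
        print("defined more than once:", name)
```

To check a real file, read its contents and pass them to environment_variables in place of SAMPLE.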