Configuring YARN and MapReduce
This topic lists the steps to configure YARN and MapReduce with Kerberos.
For Apache Hadoop, YARN and MapReduce need to be installed on the clients. For more information, see the HDFS clients configuration and the MapReduce/YARN clients configuration.
- Update yarn-site.xml.
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>rm/_HOST@IBM.COM</value>
</property>
<property>
  <name>yarn.resourcemanager.keytab</name>
  <value>/etc/security/keytab/rm.service.keytab</value>
</property>
<property>
  <name>yarn.nodemanager.principal</name>
  <value>nm/_HOST@IBM.COM</value>
</property>
<property>
  <name>yarn.nodemanager.keytab</name>
  <value>/etc/security/keytab/nm.service.keytab</value>
</property>
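Before restarting the services, it can help to confirm that each keytab contains the principal that is referenced in yarn-site.xml. The following is a minimal sketch that uses the MIT Kerberos klist and kinit tools with the keytab paths shown above; the host name rm-host.example.com is an assumption and must be replaced with the actual Resource Manager host.
# List the principals stored in the ResourceManager keytab.
klist -kt /etc/security/keytab/rm.service.keytab
# Optionally verify that the keytab can obtain a ticket for the service principal.
kinit -kt /etc/security/keytab/rm.service.keytab rm/rm-host.example.com@IBM.COM
klist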
- Update mapred-site.xml.
<property>
  <name>mapreduce.jobhistory.keytab</name>
  <value>/etc/security/keytab/jhs.service.keytab</value>
</property>
<property>
  <name>mapreduce.jobhistory.principal</name>
  <value>jhs/_HOST@IBM.COM</value>
</property>
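These properties are read by the MapReduce JobHistory server. If the history server runs in this cluster, it can be started and checked once the keytab is in place. The following is a sketch only, assuming the same installation path that is used elsewhere in this topic and that the current node is the designated history server node.
# Start the JobHistory server (Hadoop 3.x syntax).
cd /opt/hadoop-3.0.x/bin/
./mapred --daemon start historyserver
# Confirm that the daemon is running.
jps | grep JobHistoryServer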
- Synchronize /opt/hadoop-3.x.x to all the other Hadoop nodes and keep the same location on all the hosts.
Use scp to copy the configuration files from HADOOP_HOME to the other Hadoop nodes where the services are installed.
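A minimal sketch of the copy, assuming that HADOOP_HOME is /opt/hadoop-3.x.x and that node2 and node3 are hypothetical names of the other Hadoop hosts:
# Copy the updated configuration files to each of the other Hadoop nodes,
# keeping the same directory layout on every host.
for host in node2 node3; do
  scp /opt/hadoop-3.x.x/etc/hadoop/yarn-site.xml \
      /opt/hadoop-3.x.x/etc/hadoop/mapred-site.xml \
      ${host}:/opt/hadoop-3.x.x/etc/hadoop/
done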
- On the Resource Manager node, run the following commands to start the YARN service:
cd /opt/hadoop-3.0.x/sbin/
export YARN_NODEMANAGER_USER=root
export YARN_RESOURCEMANAGER_USER=root
./start-yarn.sh
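After the daemons start, one way to confirm that the Node Managers have registered with the Kerberos-enabled Resource Manager is to list the cluster nodes. The following is a sketch that assumes a valid Kerberos ticket is available; the principal shown is an assumption and must be replaced with one that exists in your realm.
# Obtain a ticket if one is not already cached, then query the cluster.
kinit hdfsuser@IBM.COM
/opt/hadoop-3.0.x/bin/yarn node -list
# ResourceManager and NodeManager processes should also appear in jps on their hosts.
jps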
- Run the following commands to submit the word count job:
/opt/hadoop-3.0.x/bin/hadoop dfs -put /etc/passwd /passwd
/opt/hadoop-3.0.x/bin/hadoop jar /opt/hadoop-3.0.x/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.2.jar wordcount /passwd /results
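When the job completes, the word counts are written to the /results directory in HDFS. A sketch of how to inspect the output follows; the part file name shown assumes a single reducer and can vary.
# List the job output and print the counted words.
/opt/hadoop-3.0.x/bin/hadoop fs -ls /results
/opt/hadoop-3.0.x/bin/hadoop fs -cat /results/part-r-00000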
Note: The successful execution of the word count job indicates that the YARN and MapReduce services are working properly.