Activating HA for the HDFS NameNode
Follow this procedure to activate HA for the HDFS NameNode in the default mode.
Before you begin
Before activating HA, ensure the following:
- Stop all processes running on the NameNode, Secondary NameNode, and all DataNodes. Once HA is active, EGOSC manages the starting and stopping of the NameNode, Secondary NameNode, and all DataNodes.
- The HDFS NameNode and SecondaryNameNode are configured properly in HDFS.
- The HDFS NameNode and SecondaryNameNode join the IBM® Spectrum Symphony cluster as management hosts, that is, in the ManagementHosts (mg) group. To configure a host as a management host, use the egoconfig mghost shared_dir command. For details, refer to egoconfig.
- Set the PMR_HDFS_PORT environment variable to specify the port that the HDFS nodes use to start. The default port is 8020. For example:
  - (bsh): export PMR_HDFS_PORT=9000
  - (tcsh): setenv PMR_HDFS_PORT 9000

  You can also set the HDFS port in the pmr-env.sh file: in $PMR_HOME/conf/pmr-env.sh, add export PMR_HDFS_PORT=port_number.
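As a sketch of the pmr-env.sh change (the real file lives under $PMR_HOME/conf; a scratch copy stands in for it here so the example is self-contained):

```shell
# Sketch: persist the HDFS port in pmr-env.sh and verify it takes effect.
# A temporary directory stands in for the real $PMR_HOME.
PMR_HOME=$(mktemp -d)
mkdir -p "$PMR_HOME/conf"
touch "$PMR_HOME/conf/pmr-env.sh"

# Append the export only if it is not already present.
if ! grep -q '^export PMR_HDFS_PORT=' "$PMR_HOME/conf/pmr-env.sh"; then
    echo 'export PMR_HDFS_PORT=9000' >> "$PMR_HOME/conf/pmr-env.sh"
fi

# Source the file, as the MapReduce framework would, and confirm the setting.
. "$PMR_HOME/conf/pmr-env.sh"
echo "PMR_HDFS_PORT=$PMR_HDFS_PORT"
```

The guard against duplicate `export` lines keeps the file clean if the script is run more than once.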
- Restart the application manager:

  soamcontrol app disable MapReduceversion -f
  soamcontrol app enable MapReduceversion
- If you have enabled Kerberos-based authentication, set the execution user account of the existing data node consumer to root. Open ConsumerTrees.xml under $EGO_CONFDIR and replace the value of the <ExecutionUser> element under DataNodeConsumer; for example:

  <Consumer ConsumerName="DataNodeConsumer">
    <ConsumerProperties>
      <Admin>Admin</Admin>
      <User>User1</User>
      <ExecutionUser>root</ExecutionUser>
    </ConsumerProperties>
  </Consumer>

  With Kerberos enabled for HDFS (secure HDFS), only one instance of the SecondaryNameNode can run at a time. This restriction is caused by a Kerberos limitation; as a result, you can start the SecondaryNameNode on only one host at a time.
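One way to script this edit (a sketch using sed; the element and consumer names come from the example above, while the sample file path is hypothetical — in a real cluster you would edit $EGO_CONFDIR/ConsumerTrees.xml):

```shell
# Sketch: set <ExecutionUser> to root in a DataNodeConsumer entry.
# A sample fragment stands in for the real ConsumerTrees.xml.
cat > /tmp/ConsumerTrees.sample.xml <<'EOF'
<Consumer ConsumerName="DataNodeConsumer">
  <ConsumerProperties>
    <Admin>Admin</Admin>
    <User>User1</User>
    <ExecutionUser>User1</ExecutionUser>
  </ConsumerProperties>
</Consumer>
EOF

# Replace the ExecutionUser value. In a full ConsumerTrees.xml you would
# restrict the substitution to the DataNodeConsumer block rather than
# applying it file-wide as done here.
sed -i 's|<ExecutionUser>[^<]*</ExecutionUser>|<ExecutionUser>root</ExecutionUser>|' \
    /tmp/ConsumerTrees.sample.xml

grep '<ExecutionUser>' /tmp/ConsumerTrees.sample.xml
```

The character class `[^<]*` matches any existing user name without crossing the closing tag, so the replacement stays inside the element.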
- If you are using Cloudera HDFS, set the execution user account in the ConsumerTrees.xml configuration file to the value specified in the hadoop-env.sh environment settings file.
- In the Cloudera HDFS, stop all processes running on the NameNode, Secondary NameNode, and all DataNodes.
- On the primary or management host, open the ConsumerTrees.xml file located under $EGO_CONFDIR.
- Change the value of the <ExecutionUser> element to the value defined in the default HDFS hadoop-env.sh file (located under $HADOOP_HOME/conf/hadoop-env.sh):
  - Replace the value of the <ExecutionUser> element under NameNodeConsumer with the value of HADOOP_NAMENODE_USER. For example:

    <Consumer ConsumerName="NameNodeConsumer">
      <ConsumerProperties>
        <Admin>Admin</Admin>
        <User>User1</User>
        <ExecutionUser>hdfs</ExecutionUser>
      </ConsumerProperties>
    </Consumer>

  - Replace the value of the <ExecutionUser> element under SecondaryNodeConsumer with the value of HADOOP_SECONDARYNAMENODE_USER. For example:

    <Consumer ConsumerName="SecondaryNodeConsumer">
      <ConsumerProperties>
        <Admin>Admin</Admin>
        <User>User1</User>
        <ExecutionUser>hdfs</ExecutionUser>
      </ConsumerProperties>
    </Consumer>

  - Replace the value of the <ExecutionUser> element under DataNodeConsumer with the value of HADOOP_DATANODE_USER. For example:

    <Consumer ConsumerName="DataNodeConsumer">
      <ConsumerProperties>
        <Admin>Admin</Admin>
        <User>User1</User>
        <ExecutionUser>hdfs</ExecutionUser>
      </ConsumerProperties>
    </Consumer>

- Save the ConsumerTrees.xml file.
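The three replacements follow one pattern, so they can be scripted together. This is a sketch only: it assumes hadoop-env.sh defines the three HADOOP_*_USER variables shown above, and it uses sample stand-in files rather than the real $EGO_CONFDIR/ConsumerTrees.xml.

```shell
# Sketch: propagate HADOOP_*_USER settings from hadoop-env.sh into the
# matching ConsumerTrees.xml entries. Sample files stand in for the real ones.
WORK=$(mktemp -d)

cat > "$WORK/hadoop-env.sh" <<'EOF'
export HADOOP_NAMENODE_USER=hdfs
export HADOOP_SECONDARYNAMENODE_USER=hdfs
export HADOOP_DATANODE_USER=hdfs
EOF

# One consumer entry per line, so a per-line sed address keeps each
# substitution scoped to the right consumer.
cat > "$WORK/ConsumerTrees.xml" <<'EOF'
<Consumer ConsumerName="NameNodeConsumer"><ConsumerProperties><ExecutionUser>User1</ExecutionUser></ConsumerProperties></Consumer>
<Consumer ConsumerName="SecondaryNodeConsumer"><ConsumerProperties><ExecutionUser>User1</ExecutionUser></ConsumerProperties></Consumer>
<Consumer ConsumerName="DataNodeConsumer"><ConsumerProperties><ExecutionUser>User1</ExecutionUser></ConsumerProperties></Consumer>
EOF

# Pick up the HADOOP_*_USER values.
. "$WORK/hadoop-env.sh"

sed -i \
  -e "/NameNodeConsumer/ s|<ExecutionUser>[^<]*|<ExecutionUser>$HADOOP_NAMENODE_USER|" \
  -e "/SecondaryNodeConsumer/ s|<ExecutionUser>[^<]*|<ExecutionUser>$HADOOP_SECONDARYNAMENODE_USER|" \
  -e "/DataNodeConsumer/ s|<ExecutionUser>[^<]*|<ExecutionUser>$HADOOP_DATANODE_USER|" \
  "$WORK/ConsumerTrees.xml"

grep -c '<ExecutionUser>hdfs</ExecutionUser>' "$WORK/ConsumerTrees.xml"
```

Addressing each substitution by consumer name means a different user per daemon (should hadoop-env.sh define them differently) still lands in the right entry.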
- Restart the cluster:
  - Log on to the primary host as administrator and run:

    $ soamcontrol app disable all -f

  - If global standby services are enabled, run:

    $ egosh standby kill -GLOBAL all

  - Run:

    $ egosh service stop all
    $ egosh ego shutdown all
    $ egosh ego start
    $ soamcontrol app enable MapReduceversion

  - Log on to each compute host as administrator and run:

    $ egosh ego start

  - Log on to the primary host as administrator and run: