Activating HA for the HDFS NameNode

Follow this procedure to activate HA for the HDFS NameNode in the default mode.

Before you begin

Before activating the HA feature, ensure the following:
  • All processes running on the NameNode, Secondary NameNode, and all DataNodes are stopped.
  • EGOSC manages the starting and stopping of the NameNode, Secondary NameNode, and all DataNodes.
  • The HDFS NameNode and SecondaryNameNode are configured properly in HDFS.
  • The HDFS NameNode and SecondaryNameNode join the IBM® Spectrum Symphony cluster as management hosts, that is, in the ManagementHosts (mg) group. To configure a host as a management host, use the egoconfig mghost shared_dir command. For details, refer to egoconfig.
  • Set the PMR_HDFS_PORT environment variable to specify the port on which the HDFS nodes start. By default, this port is 8020. For example:
    • (bash): export PMR_HDFS_PORT=9000
    • (tcsh): setenv PMR_HDFS_PORT 9000
    You can also set the HDFS port in the pmr-env.sh file:
    1. In $PMR_HOME/conf/pmr-env.sh, add export PMR_HDFS_PORT=port_number
    2. Restart the application manager:
      soamcontrol app disable MapReduceversion -f
      soamcontrol app enable MapReduceversion
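Before exporting PMR_HDFS_PORT, it can help to confirm that nothing else is already listening on the chosen port. A minimal bash sketch; the port_free helper is hypothetical, not part of the product:

```shell
# Hypothetical helper: succeeds when nothing is listening on the given
# local TCP port, so the port is safe to use for PMR_HDFS_PORT.
# Uses bash's /dev/tcp; a failed connect means the port is free.
port_free() {
  ! (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

PMR_HDFS_PORT=9000   # example value from the text above
if port_free "$PMR_HDFS_PORT"; then
  export PMR_HDFS_PORT
  echo "using HDFS port $PMR_HDFS_PORT"
else
  echo "port $PMR_HDFS_PORT is already in use; pick another" >&2
fi
```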
  • If you have enabled Kerberos-based authentication, set the execution user account of the existing data node consumer to root. Open ConsumerTrees.xml under $EGO_CONFDIR and replace the value of the <ExecutionUser> element under DataNodeConsumer; for example:
    <Consumer ConsumerName="DataNodeConsumer">
      <ConsumerProperties>
        <Admin>Admin</Admin>
        <User>User1</User>
        <ExecutionUser>root</ExecutionUser>
      </ConsumerProperties>
    </Consumer>
    

    With Kerberos enabled for HDFS (secure HDFS), only one instance of the SecondaryNameNode can run at a time; that is, you can start the SecondaryNameNode on only one host at a time. This restriction comes from a Kerberos limitation.

  • If you are using a Cloudera HDFS, set the execution user account in the ConsumerTrees.xml configuration file to the value specified in the hadoop-env.sh environment settings file.
    1. In the Cloudera HDFS, stop all processes running on the NameNode, Secondary NameNode, and all data nodes.
    2. On the primary or management host, open the ConsumerTrees.xml file located under $EGO_CONFDIR.
    3. Change the value of the <ExecutionUser> element to the value defined in the default HDFS hadoop-env.sh file (located under $HADOOP_HOME/conf/hadoop-env.sh).
      Replace the value of the <ExecutionUser> element under NameNodeConsumer with the value of HADOOP_NAMENODE_USER. For example:
      <Consumer ConsumerName="NameNodeConsumer">
        <ConsumerProperties>
          <Admin>Admin</Admin>
          <User>User1</User>
          <ExecutionUser>hdfs</ExecutionUser>
        </ConsumerProperties>
      </Consumer>
      
      Replace the value of the <ExecutionUser> element under SecondaryNodeConsumer with the value of HADOOP_SECONDARYNAMENODE_USER. For example:
      <Consumer ConsumerName="SecondaryNodeConsumer">
        <ConsumerProperties>
          <Admin>Admin</Admin>
          <User>User1</User>
          <ExecutionUser>hdfs</ExecutionUser>
        </ConsumerProperties>
      </Consumer>
      
      Replace the value of the <ExecutionUser> element under DataNodeConsumer with the value of HADOOP_DATANODE_USER. For example:
      <Consumer ConsumerName="DataNodeConsumer">
        <ConsumerProperties>
          <Admin>Admin</Admin>
          <User>User1</User>
          <ExecutionUser>hdfs</ExecutionUser>
        </ConsumerProperties>
      </Consumer>
      
    4. Save the ConsumerTrees.xml file.
    5. Restart the cluster:
      1. Log on to the primary host as administrator and run:
        $ soamcontrol app disable all -f
      2. If global standby services are enabled, run:
        $ egosh standby kill -GLOBAL all
      3. Run:
        $ egosh service stop all
        $ egosh ego shutdown all
        $ egosh ego start
        $ soamcontrol app enable MapReduceversion
        $ soamcontrol app enable MapReduceversion
      4. Log on to each compute host as administrator and run:
        $ egosh ego start
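The hand edits to ConsumerTrees.xml described above can also be scripted. A minimal sketch using sed against a sample fragment; the file name and content here are illustrative only — on a real cluster, point CONSUMER_TREES at $EGO_CONFDIR/ConsumerTrees.xml and take NEW_USER from your hadoop-env.sh:

```shell
# Sample fragment standing in for $EGO_CONFDIR/ConsumerTrees.xml
CONSUMER_TREES=ConsumerTrees.sample.xml
NEW_USER=hdfs   # e.g. the value of HADOOP_NAMENODE_USER in hadoop-env.sh

cat > "$CONSUMER_TREES" <<'EOF'
<Consumer ConsumerName="NameNodeConsumer">
  <ConsumerProperties>
    <Admin>Admin</Admin>
    <User>User1</User>
    <ExecutionUser>User1</ExecutionUser>
  </ConsumerProperties>
</Consumer>
EOF

cp "$CONSUMER_TREES" "$CONSUMER_TREES.bak"   # keep a backup before editing
# Rewrite every <ExecutionUser> value in place
sed -i "s|<ExecutionUser>[^<]*</ExecutionUser>|<ExecutionUser>${NEW_USER}</ExecutionUser>|g" "$CONSUMER_TREES"
grep ExecutionUser "$CONSUMER_TREES"
```

Remember that the cluster must still be restarted afterward, as described in step 5.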

Procedure

  1. After you have configured the prerequisites for high availability that are described in the previous section, install and configure a Hadoop 2.7.2 HDFS cluster.
    Ensure that you configure the HDFS NameNode to store metadata in an NFS shared file system (set the dfs.name.dir property in $HADOOP_HOME/conf/hdfs-site.xml on the HDFS server and all standby primary hosts).
    Note: The MapReduce framework in IBM Spectrum Symphony is pre-configured with the following resource groups: NameNodeRG, SecondaryNodeRG, and DataNodeRG. By default, DataNodeRG shares slots with ComputeHosts on the same host. ComputeHosts have MapReduce compute slots (for example, slots equal to the number of CPUs) while DataNodeRG has only one overlapped slot to run the DataNode daemon. NameNode and SecondaryNode groups only include the primary host and management hosts. NameNode and SecondaryNode groups share metadata in the NFS shared file system.
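The dfs.name.dir setting from step 1 would look like the following in hdfs-site.xml; the NFS mount point shown is an example, not a required path:

```xml
<property>
  <name>dfs.name.dir</name>
  <!-- Example NFS-shared directory, visible to the HDFS server and all standby primary hosts -->
  <value>/nfs/hadoop/namenode</value>
</property>
```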
  2. Verify that the configuration property with the HDFS URL is present in $HADOOP_HOME/conf/core-site.xml on all primary hosts and data nodes.
    Note: When you activate the HA feature for the first time, the core-site.xml.store file is created in the $HADOOP_HOME/conf directory. You must modify the core-site.xml.store file, not the core-site.xml file, to change your configuration. The changes made to core-site.xml.store are propagated to core-site.xml.
    For example:
    <property>
      <name>fs.default.name</name>
      <value>hdfs://NameNode.ego:8020/</value>
    </property>
    
    where, by default:
    • NameNode is the default NameNode service name.
    • .ego is the default EGO service domain name.
    • 8020 is the default HDFS port.
    For example:
       <value>hdfs://mapredhost01.mydomain.com:8020/</value>
    
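To confirm the URL without starting HDFS, you can extract the value directly from the file. A sketch against a generated sample core-site.xml; on a real cluster, point CORE_SITE at $HADOOP_HOME/conf/core-site.xml instead of creating the sample:

```shell
CORE_SITE=core-site.sample.xml   # stands in for $HADOOP_HOME/conf/core-site.xml
cat > "$CORE_SITE" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://NameNode.ego:8020/</value>
  </property>
</configuration>
EOF

# Print the <value> on the line following the fs.default.name <name> element
awk '/<name>fs\.default\.name<\/name>/ {getline; gsub(/.*<value>|<\/value>.*/, ""); print}' "$CORE_SITE"
```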
  3. In the System Services panel of the cluster management console, check the NameNode, SecondaryNode, and DataNode services to ensure that HADOOP_HOME points to the Hadoop HDFS installation path. Start the NameNode service with the following command (the other two services start automatically):

    $ egosh service start NameNode

    The HDFS cluster starts up using services for fully automated failover.

  4. Submit MapReduce jobs using the NameNode service name and your configured HDFS port (9000 in the earlier PMR_HDFS_PORT example):

    mrsh ... hdfs://NameNode.ego:9000/dir1/file1
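The URL in this example is assembled from the service name, the EGO service domain, and the HDFS port. A sketch of the pattern, using the defaults and the example port from this page:

```shell
SERVICE_NAME=NameNode     # default NameNode service name
EGO_DOMAIN=ego            # default EGO service domain
PMR_HDFS_PORT=9000        # example port set earlier on this page

HDFS_URL="hdfs://${SERVICE_NAME}.${EGO_DOMAIN}:${PMR_HDFS_PORT}"
echo "$HDFS_URL/dir1/file1"
```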