Follow this procedure to activate high availability (HA) for the HDFS NameNode in virtual IP mode.
Before you activate HA in virtual IP mode, verify the following configuration:
- The NameNode must join the IBM® Spectrum Symphony cluster as a management host; that is, it must be part of the ManagementHosts (mg) group. To configure a host as a management host, use the egoconfig mghost shared_dir command. For details, refer to the Reference Guide.
- All IBM Spectrum Symphony configuration data and NameNode metadata must be stored on an NFS shared file system that is accessible to all primary-candidate hosts. To configure this location, set the dfs.name.dir property in $HADOOP_HOME/conf/hdfs-site.xml on the HDFS server and all standby primary hosts.
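As a rough illustration of the second prerequisite, the following Python sketch reads dfs.name.dir from an hdfs-site.xml document and checks that the value lies under a shared mount point. The sample file content, the /share mount point, and the helper names read_property and is_under are assumptions made for this example; they are not part of the product.

```python
import posixpath
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml content. /share/hdfs/name stands in for a
# directory on the NFS shared file system.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/share/hdfs/name</value>
  </property>
</configuration>
"""


def read_property(xml_text, name):
    """Return the value of a Hadoop property, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def is_under(path, shared_root):
    """Return True if an absolute path lies inside shared_root."""
    if not (posixpath.isabs(path) and posixpath.isabs(shared_root)):
        return False
    path = posixpath.normpath(path)
    shared_root = posixpath.normpath(shared_root)
    return posixpath.commonpath([path, shared_root]) == shared_root


name_dir = read_property(SAMPLE_HDFS_SITE, "dfs.name.dir")
print(name_dir, is_under(name_dir, "/share"))
```

A path check of this kind confirms only that the setting points below the expected mount point. It does not confirm that the directory is actually NFS-mounted on every primary-candidate host.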
Note: Virtual IP mode is available only for the HDFS NameNode. The Secondary NameNode must use a static IP address and must not run on either the primary NameNode host or the backup NameNode host. With virtual IP configured for the NameNode, the static virtual IP for the NameNode daemon is automatically reassigned. The Secondary NameNode address in the NameNode configuration (set as dfs.secondary.http.address in $HADOOP_HOME/conf/hdfs-site.xml), by contrast, is not dynamically updated. Ensure the following configuration for the Secondary NameNode:
- Configure a static IP address for the Secondary NameNode daemon.
- Restrict the HA SecondaryService to run only on a specific host by defining only one host in the SecondaryNodeRG resource group in IBM Spectrum Symphony.
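The static-address requirement can be checked with a few lines of Python. This is a minimal sketch, not a product tool: the function name check_secondary_address and the sample addresses are hypothetical, and the rule that the Secondary NameNode address must differ from the NameNode virtual IP is inferred from the requirement that the Secondary NameNode not run on a NameNode host. A host name is reported only because the sketch cannot verify what it resolves to.

```python
import ipaddress


def check_secondary_address(address, virtual_ip):
    """Return a list of problems found in a dfs.secondary.http.address value."""
    host, sep, port = address.rpartition(":")
    if not sep or not port.isdigit():
        return ["expected host:port, got %r" % address]
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return ["%r is not a literal IP address; cannot verify it" % host]

    problems = []
    if ip.is_unspecified:
        # 0.0.0.0 binds to every interface and names no specific host.
        problems.append("wildcard address is not a fixed address")
    if ip == ipaddress.ip_address(virtual_ip):
        # Assumption: the virtual IP belongs to the NameNode only.
        problems.append("address must differ from the NameNode virtual IP")
    return problems


# Hypothetical values: 10.0.0.60 is the Secondary NameNode host,
# 10.0.0.42 is the NameNode virtual IP.
print(check_secondary_address("10.0.0.60:50090", "10.0.0.42"))
```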
About this task
Follow these steps to configure HA for the HDFS NameNode in virtual IP mode.
- From the cluster management console, configure the NameNode (NameNodeRG), SecondaryNode (SecondaryNodeRG), and DataNode (DataNodeRG) resource groups.
Note: By default, DataNodeRG shares slots with ComputeHosts on the same host. ComputeHosts have MapReduce compute slots (for example, as many slots as the number of CPUs), while DataNodeRG has only one overlapped slot to run the DataNode daemon. The NameNode and SecondaryNode groups include only the primary host and management hosts, and they share metadata in the NFS shared file system.
  - Start the cluster management console, which is available by default at http://host_name:8080/platform.
  - Log in with your credentials.
  - From the Dashboard's Common Tasks menu, click .
  - Click NameNodeRG from the list.
  - Choose Static (List of Names) from the Resource Selection Method drop-down list. The page refreshes to display a list of possible hosts.
  - Select the hosts that you want to add and click Apply.
  - Repeat steps e and f (choosing the resource selection method and selecting the hosts) for SecondaryNodeRG and DataNodeRG.
- Configure the NameNode service profile.
  - From the cluster management console, go to .
  - Click the NameNode service. The Service Profile editor opens.
  - Locate the sc::ActivityDescription section.
  - In the Actions drop-down list of the ego:ActivitySpecification parameter, click Insert "ego:EnvironmentVariable", then set the name to SYM_HA_HDFS_VIRTUAL_IP and the value to the virtual IP that you have chosen.
  - In the Actions drop-down list of the ego:ActivitySpecification parameter, click Insert "ego:ExecutionUser" and set its value to the HDFS administrative OS user.
  - In the ego:EnvironmentVariable parameters, add or modify the values for the following variables:
    - HADOOP_HOME: Set this value to $HADOOP_HOME.
    - HADOOP_CONF_DIR: Set this value to $HADOOP_CONF_DIR.
    - PMR_HDFS_PORT: Set this value to the HDFS port, which is 8020 by default.
  - Click Save and OK.
- Repeat step 2 (configuring the service profile) for the SecondaryNode and DataNode service profiles.
Note: Set SYM_HA_HDFS_VIRTUAL_IP in all three HA services to the same static, accessible virtual IP.
- (Optional) Add the following environment variables to customize the virtual IP network alias configuration:
| Environment variable | Description | Default |
|---|---|---|
| SYM_HA_HDFS_BROADCAST | Broadcast address for the virtual IP. | x.y.z.255 for virtual IP x.y.z.q |
| SYM_HA_HDFS_NETMASK | Netmask for the virtual IP. | 255.255.255.0 |
| SYM_HA_HDFS_ETH | Ethernet device for the virtual IP. | eth0 |
| SYM_HA_HDFS_ETH_ALIAS | Ethernet alias index for the virtual IP. | 0 (the Ethernet alias is created as eth0:0) |
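To see how these defaults combine, the following Python sketch resolves the alias settings for a given virtual IP. It is an illustration only: the functions alias_settings and broadcast_matches_netmask and the address 10.0.0.42 are hypothetical, and the sketch assumes that the default broadcast address is always formed by replacing the last octet with 255, as the table states, even when the netmask is customized.

```python
import ipaddress


def alias_settings(virtual_ip, env=None):
    """Resolve the alias settings, applying the documented defaults."""
    env = env or {}
    device = env.get("SYM_HA_HDFS_ETH", "eth0")
    index = env.get("SYM_HA_HDFS_ETH_ALIAS", "0")
    netmask = env.get("SYM_HA_HDFS_NETMASK", "255.255.255.0")
    # Documented default: x.y.z.255 for virtual IP x.y.z.q.
    default_broadcast = ".".join(virtual_ip.split(".")[:3] + ["255"])
    broadcast = env.get("SYM_HA_HDFS_BROADCAST", default_broadcast)
    return {
        "alias": "%s:%s" % (device, index),
        "netmask": netmask,
        "broadcast": broadcast,
    }


def broadcast_matches_netmask(virtual_ip, settings):
    """Return True if the broadcast address agrees with the netmask."""
    iface = ipaddress.IPv4Interface("%s/%s" % (virtual_ip, settings["netmask"]))
    return str(iface.network.broadcast_address) == settings["broadcast"]


defaults = alias_settings("10.0.0.42")
print(defaults["alias"], defaults["broadcast"])

# A customized netmask without a matching broadcast address is flagged.
custom = alias_settings("10.0.0.42", {"SYM_HA_HDFS_NETMASK": "255.255.0.0"})
print(broadcast_matches_netmask("10.0.0.42", custom))
```

With the default netmask, the documented broadcast default equals the broadcast address of the /24 network. Under the assumption above, changing SYM_HA_HDFS_NETMASK alone leaves the two inconsistent, so setting SYM_HA_HDFS_BROADCAST at the same time is the safer choice.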