Adding DataNodes manually

The CES HDFS NameNodes and DataNodes do not need to be stopped when adding or deleting DataNodes from the cluster.

The following are the two ways to add DataNodes to an existing CES HDFS cluster:
  • Add new nodes that are not yet part of the IBM Storage Scale cluster as DataNodes.
  • Add existing IBM Storage® Scale nodes that are already part of the IBM Storage Scale cluster as new DataNodes.

Adding new DataNodes into an existing CES HDFS cluster

  1. If you have a new node, install IBM Storage Scale on it by following the Manually installing the installation toolkit topic, and then add the new node to the existing IBM Storage Scale cluster by using the mmaddnode -N command.

    If you have existing IBM Storage Scale nodes that already have the IBM Storage Scale packages installed and configured, go to the next step.
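    For example, step 1's add could look like the following sketch, which assumes a hypothetical new node named dn3 (replace it with your own hostname). By default the sketch only prints the command (dry run); set RUN= to execute it on an admin node.

    ```shell
    # Add the new node to the IBM Storage Scale cluster.
    # "dn3" is a hypothetical hostname; RUN defaults to "echo" so the
    # command is printed, not executed. Set RUN= to run it for real.
    RUN="${RUN-echo}"
    NEW_DN="dn3"

    $RUN mmaddnode -N "$NEW_DN"
    ```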

  2. Log in to the new DataNode as root.
  3. Install the HDFS Transparency package on the new DataNode.
    On Red Hat® Enterprise Linux®, issue the following command:
       # rpm -ivh gpfs.hdfs-protocol-<version>.<arch>.rpm
  4. On the NameNode as root, edit the workers configuration file to add the new DataNode.
    # vi /var/mmfs/hadoop/etc/hadoop/workers
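    The workers file lists one DataNode hostname per line. For example, with two existing DataNodes and a hypothetical new node dn3 (all hostnames here are placeholders), the file would look like this after the edit:

    ```
    dn1
    dn2
    dn3
    ```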
  5. On the NameNode as root, upload the modified configuration.
    # mmhdfs config upload
  6. On the NameNode as root, copy the init directory to the DataNode.
    # scp -r /var/mmfs/hadoop/init [datanode]:/var/mmfs/hadoop/
  7. If the CES HDFS cluster is Kerberos enabled, configure Kerberos for the new DataNode by following the Setting up Kerberos for HDFS Transparency nodes topic.
  8. On the DataNode as root, start the DataNode.
    # /usr/lpp/mmfs/hadoop/sbin/mmhdfs datanode start
  9. On the NameNode, confirm that the new DataNode appears in the DataNode list with the correct status by running the following command:
    # /usr/lpp/mmfs/hadoop/sbin/mmhdfs hdfs-dn status
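Steps 4 through 9 above can be sketched as a single runbook that is run on the NameNode. This is a minimal sketch, not a supported script: it assumes a hypothetical new DataNode named dn3 and passwordless ssh/scp from the NameNode to it. By default it only prints each command (dry run); set RUN= to execute the commands for real.

```shell
#!/bin/sh
# Runbook sketch for steps 4-9, to be run as root on the NameNode.
# "dn3" is a hypothetical hostname; replace it with your DataNode.
# RUN defaults to "echo" so every command is printed, not executed.
RUN="${RUN-echo}"

NEW_DN="dn3"                                  # hypothetical new DataNode
WORKERS=/var/mmfs/hadoop/etc/hadoop/workers   # worker list on the NameNode

# Step 4: add the new DataNode to the workers file (append only once).
$RUN sh -c "grep -qx '$NEW_DN' $WORKERS || echo '$NEW_DN' >> $WORKERS"

# Step 5: upload the modified configuration.
$RUN mmhdfs config upload

# Step 6: copy the init directory to the new DataNode.
$RUN scp -r /var/mmfs/hadoop/init "$NEW_DN":/var/mmfs/hadoop/

# Step 8: start the DataNode remotely from the NameNode.
$RUN ssh "$NEW_DN" /usr/lpp/mmfs/hadoop/sbin/mmhdfs datanode start

# Step 9: confirm that the new DataNode reports the correct status.
$RUN /usr/lpp/mmfs/hadoop/sbin/mmhdfs hdfs-dn status
```

If the cluster is Kerberos enabled, complete step 7 for the new DataNode before starting it.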