Firewall recommendations for HDFS Transparency

Firewall settings for open systems are specific to each deployment and operating system, and they vary from customer to customer. The system administrator or Lab Service (LBS) is responsible for setting the firewall accordingly, as is currently the practice with Linux® distributions. For information on the IBM Storage Scale firewall, see the IBM Storage Scale system using firewall section in the IBM Storage Scale: Administration Guide.

This section describes only the recommendations for HDFS Transparency firewall settings.

Table 1. Recommended port number settings for HDFS Transparency
HDFS Transparency property    Port number      Comments
dfs.namenode.rpc-address      nn-host1:8020    RPC address that handles all client requests. In the case of HA/Federation, where multiple NameNodes exist, the name service ID is added to the property name, for example dfs.namenode.rpc-address.ns1 or dfs.namenode.rpc-address.EXAMPLENAMESERVICE. The value of this property takes the form nn-host1:rpc-port. The default RPC port for the NameNode is 8020.
dfs.namenode.http-address     0.0.0.0:9870     The address and base port on which the NameNode web UI listens.
dfs.datanode.address          0.0.0.0:9866     The DataNode server address and port for data transfer.
dfs.datanode.http.address     0.0.0.0:9864     The DataNode HTTP server address and port.
dfs.datanode.ipc.address      0.0.0.0:9867     The DataNode IPC server address and port.
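
The firewall rules below need only the port part of each value in Table 1. As a minimal sketch, the following shell helper splits a host:port value and prints the matching firewall-cmd port argument for each default. The function name port_of is hypothetical and is not part of HDFS Transparency; the addresses are the defaults from the table.

```shell
# Print the port part of a host:port value (everything after the last colon).
# port_of is an illustrative helper name, not an HDFS Transparency command.
port_of() {
  printf '%s\n' "${1##*:}"
}

# Default values copied from Table 1. Prints one <port>/tcp argument per line,
# in the form that firewall-cmd --add-port expects.
for addr in nn-host1:8020 0.0.0.0:9870 0.0.0.0:9866 0.0.0.0:9864 0.0.0.0:9867; do
  printf '%s/tcp\n' "$(port_of "$addr")"
done
```

If the ports were changed from the defaults, the configured values could be passed in instead, for instance from hdfs getconf -confKey with the property name, assuming that subcommand is available in your HDFS Transparency installation.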

Setting the firewall policies for HDFS Transparency

  1. Run firewall-cmd to add the recommended ports, and then reload the firewall.

    On each of the HDFS Transparency NameNodes, set the NameNode server port.

    The following example uses 8020:
    # firewall-cmd --add-port=8020/tcp --permanent
    On each of the HDFS Transparency NameNodes, set the NameNode web UI port:
    # firewall-cmd --add-port=9870/tcp --permanent
    On each of the HDFS Transparency DataNodes, set the following ports:
    # firewall-cmd --add-port=9864/tcp --permanent
    # firewall-cmd --add-port=9866/tcp --permanent
    # firewall-cmd --add-port=9867/tcp --permanent
    
    On every HDFS Transparency node where you ran --add-port, reload the firewall and check the ports:
    # firewall-cmd --reload
    # firewall-cmd --zone=public --list-ports
    For example:
    [root@c8f2n01 webhdfs]# firewall-cmd --zone=public --list-ports
    1191/tcp 60000-61000/tcp 8020/tcp 9870/tcp 9864/tcp 9866/tcp 9867/tcp
  2. For the changes to take effect, restart HDFS Transparency.

    If HDFS Transparency is running, restart the standby NameNode first, fail over to it, and then restart the original NameNode.

    1. Get the standby NameNode.
      # /usr/lpp/mmfs/hadoop/bin/hdfs haadmin -getAllServiceState
      For example:
      [root@c8f2n01 webhdfs]# /usr/lpp/mmfs/hadoop/bin/hdfs haadmin -getAllServiceState
      c8f2n01:8020                                       active
      c8f2n05:8020                                       standby
      
    2. Restart the Standby NameNode (for example, on c8f2n05).
      For HDFS Transparency 3.1.0 and earlier, run the following command:
      # /usr/lpp/mmfs/hadoop/sbin/mmhadoopctl connector restart
      For HDFS Transparency 3.1.1 and later, run the following commands:
      # /usr/lpp/mmfs/bin/mmces service stop HDFS
      # /usr/lpp/mmfs/bin/mmces service start HDFS
    3. Transition the standby NameNode to active.

      In this example, nn1 is c8f2n01 and nn2 is c8f2n05.

      For HDFS Transparency 3.1.0 and earlier, run the following commands:
      # /usr/lpp/mmfs/hadoop/bin/hdfs haadmin -transitionToActive nn2
      # /usr/lpp/mmfs/hadoop/bin/hdfs haadmin -getAllServiceState
      For HDFS Transparency 3.1.1 and later, run the following commands:
      # /usr/lpp/mmfs/bin/mmces address move --ces-ip x.x.x.x --ces-node nn2
      # /usr/lpp/mmfs/hadoop/bin/hdfs haadmin -getAllServiceState
      
    4. The original NameNode is now the standby NameNode.

      Restart the new Standby NameNode (for example, c8f2n01).

      For HDFS Transparency 3.1.0 and earlier, run the following command:
      # /usr/lpp/mmfs/hadoop/sbin/mmhadoopctl connector restart
      For HDFS Transparency 3.1.1 and later, run the following commands:
      # /usr/lpp/mmfs/bin/mmces service stop HDFS
      # /usr/lpp/mmfs/bin/mmces service start HDFS
    5. Optionally, transition back to the original NameNode.
      For HDFS Transparency 3.1.0 and earlier, run the following commands:
      # /usr/lpp/mmfs/hadoop/bin/hdfs haadmin -transitionToActive nn1
      # /usr/lpp/mmfs/hadoop/bin/hdfs haadmin -getAllServiceState
      For HDFS Transparency 3.1.1 and later, run the following commands:
      # /usr/lpp/mmfs/bin/mmces address move --ces-ip x.x.x.x --ces-node nn1
      # /usr/lpp/mmfs/hadoop/bin/hdfs haadmin -getAllServiceState
  3. Restart all Hadoop services on all the nodes.
    For example, on a node with the YARN service:
    # /opt/hadoop-3.1.3/sbin/stop-yarn.sh
    # /opt/hadoop-3.1.3/sbin/start-yarn.sh
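
The two checks in the procedure above (confirming the open ports after the reload, and identifying the standby NameNode before the restart) can be scripted. The following is a minimal sketch, not part of HDFS Transparency: the function names missing_ports and standby_hosts are hypothetical, and the sample data is copied from the example output shown in steps 1 and 2. The port check matches exact <port>/tcp entries only and does not expand ranges such as 60000-61000/tcp.

```shell
# Print each required port that is absent from a --list-ports output line.
# $1: output of "firewall-cmd --zone=public --list-ports"
# remaining arguments: required TCP ports, for example 8020 9870
missing_ports() {
  open_ports=" $1 "
  shift
  for port in "$@"; do
    case "$open_ports" in
      *" $port/tcp "*) ;;            # port is open, nothing to report
      *) printf '%s\n' "$port" ;;    # port is not listed
    esac
  done
}

# Print the host name of every NameNode reported as standby.
# Reads "hdfs haadmin -getAllServiceState" output on standard input,
# one "host:port   state" pair per line.
standby_hosts() {
  awk '$2 == "standby" { sub(/:[0-9]+$/, "", $1); print $1 }'
}

# Sample data copied from the example output in the procedure.
sample="1191/tcp 60000-61000/tcp 8020/tcp 9870/tcp 9864/tcp 9866/tcp 9867/tcp"
missing_ports "$sample" 8020 9870          # NameNode ports: prints nothing
missing_ports "$sample" 9864 9866 9867     # DataNode ports: prints nothing
printf '%s\n' "c8f2n01:8020   active" "c8f2n05:8020   standby" | standby_hosts
```

On a live node, the sample values would be replaced by the real command output, for example by passing "$(firewall-cmd --zone=public --list-ports)" as the first argument to missing_ports and piping the haadmin -getAllServiceState output into standby_hosts.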