ESS networking considerations

This topic describes the networking requirements for installing ESS.

Note: The references to HMC are not applicable for the PPC64LE platform.

Networking requirements

The following networks are required:
  • Service network

    This network connects the flexible service processor (FSP) on the management server and I/O server nodes (with or without the HMC, depending on the platform) as shown in blue in Figures 1 and 2 on the following pages.

  • Management and provisioning network

    This network connects the management server to the I/O server nodes (and HMCs, if available) as shown in yellow in Figures 1 and 2 on the following pages. The management server runs DHCP on the management and provisioning network; a quick check of this service is sketched after this list. If a management server is not included in the solution order, a customer-supplied management server is used.

  • Clustering network

    This high-speed network is used for clustering and client node access. It can be a 10 Gigabit Ethernet (GbE), 25 GbE, 40 GbE, 100 GbE, or InfiniBand network. It might not be included in the solution order.

  • External and campus management network

    This public network is used for external and campus management of the management server, the HMC (if available), or both.

  • IBM Elastic Storage® Server networking with Mellanox adapters

    Mellanox ConnectX-2 adapter cards improve network performance by increasing available CPU bandwidth, which enhances performance in virtualized server environments. Mellanox ConnectX-2 adapter cards provide:
    • Data Center Bridging (DCB)
    • Fibre Channel over Ethernet (FCoE)
    • SR-IOV

    For information on using Mellanox adapter cards, see: http://www.mellanox.com/page/ethernet_cards_overview
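
Because the management server provides DHCP for the management and provisioning network, it can be useful to confirm that the service is active and scoped to that network only. The following is a minimal sketch, assuming a Red Hat based management server running the standard dhcpd service and the common default configuration file path (not an ESS-specific value):

# systemctl status dhcpd
# grep -A 3 "subnet" /etc/dhcp/dhcpd.conf

The subnet declarations shown by the second command should cover only the management and provisioning network.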

Figure 1, Network Topology, is a high-level logical view of the management and provisioning network and the service network for an ESS building block (on PPC64BE).

Figure 1. The management and provisioning network and the service network: a logical view (on PPC64BE)

Figure 2, Network Topology, is a high-level logical view of the management and provisioning network and the service network for an ESS building block (on PPC64LE).

Figure 2. The management and provisioning network and the service network: a logical view (on PPC64LE)

The management and provisioning network and the service network must run as two non-overlapping networks, implemented either as two separate physical networks or as two separate virtual local-area networks (VLANs).
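
For example, non-overlap can be achieved by placing each network in its own VLAN with a distinct private subnet. The VLAN IDs and addresses below are illustrative assumptions only, not shipped defaults:

VLAN 101   Management and provisioning network   192.168.45.0/24
VLAN 102   Service network                       10.0.0.0/24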
Tip: HMC 2 is an optional third cable on the management server node that can be connected either to the management network or to any other external network provided by the customer. Add this connection if the ability to service or control the management server node remotely is required.

The HMC, the management server, and the switches (1 GbE switches and high-speed switches) might not be included in a solution order in which an existing or customer-supplied HMC or management server is used. Perform any advance planning tasks that might be needed to access and use these solution components.

Customer networking considerations

Review the information about switches and switch firmware that were used to validate this ESS release. For information about available IBM® networking switches, see the IBM networking switches page on IBM Knowledge Center.

If two switches are used in a high availability (HA) configuration, it is recommended that both switches be at the same firmware level.
To check the firmware version, do the following:
  1. SSH to the switch.
  2. Issue the following commands.
    # en
    # show version
    For example:
    
    login as: admin
    Mellanox MLNX-OS Switch Management
    Using keyboard-interactive authentication.
    Password:
    Last login: Mon Mar 5 12:03:14 2018 from 9.3.17.119
    Mellanox Switch
    io232 [master] >
    io232 [master] > en
    io232 [master] # show version
    Example output:
    
    Product name: MLNX-OS
    Product release: 3.4.3002
    Build ID: #1-dev
    Build date: 2015-07-30 20:13:19
    Target arch: x86_64
    Target hw: x86_64
    Built by: jenkins@fit74
    Version summary: X86_64 3.4.3002 2015-07-30 20:13:19 x86_64
    Product model: x86
    Host ID: E41D2D52A040
    System serial num: Defined in system VPD
    System UUID: 03000200-0400-0500-0006-000700080009
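
If two switches form an HA pair, run the same check against both switches and compare the reported firmware levels. The following is a minimal sketch; the switch host names are examples only, and it assumes that the switch CLI accepts commands over a non-interactive SSH session:

# ssh admin@ioswitch1 "show version" | grep "Product release"
# ssh admin@ioswitch2 "show version" | grep "Product release"

Both switches should report the same product release before they are used together in an HA configuration.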

InfiniBand with multiple fabrics

In a multiple-fabric network, the InfiniBand fabric ID might not be properly appended to the verbsPorts configuration statement during cluster creation. An incorrect verbsPorts setting might cause an outage of the IB network. To ensure that the verbsPorts setting is accurate, do the following:
  1. Use gssgennetworks to properly set up IB or Ethernet bonds on the ESS system.
  2. Create the cluster. During cluster creation, the verbsPorts setting is applied; if multiple fabrics are set up during cluster deployment, the IB network might become unreachable.
  3. Ensure that the GPFS daemon is running and then run the mmfsadm test verbs config | grep verbsPorts command.
The output of this command shows the fabric ID found for each link.
For example:
# mmfsadm test verbs config | grep verbsPorts
mmfs verbsPorts: mlx5_0/1/4 mlx5_1/1/7
In this example, port 1 of adapter mlx5_0 is connected to fabric 4, and port 1 of adapter mlx5_1 is connected to fabric 7. Now run the following command to check whether the verbsPorts setting is correctly configured in the GPFS cluster.
# mmlsconfig | grep verbsPorts
verbsPorts mlx5_0/1 mlx5_1/1
Here, the fabric IDs are missing from the verbsPorts setting even though IB was configured with multiple fabrics. This is a known issue.
Now, using mmchconfig, modify the verbsPorts setting for each node or node class so that it takes the fabric ID into account.
[root@gssio1 ~]# verbsPorts="$(echo $(mmfsadm test verbs config | \
grep verbsPorts | awk '{ $1=""; $2=""; $3=""; print $0 }'))"
# echo $verbsPorts
mlx5_0/1/4 mlx5_1/1/7
# mmchconfig verbsPorts="$verbsPorts" -N gssio1
mmchconfig: Command successfully completed
mmchconfig: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
Here, the node can be any GPFS node or node class. Once the verbsPorts setting is changed, make sure that the new, correct verbsPorts setting is listed in the output of the mmlsconfig command.
# mmlsconfig | grep verbsPorts
verbsPorts mlx5_0/1/4 mlx5_1/1/7
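
If several nodes are attached to multiple fabrics, the same correction can be applied per node by deriving each node's own fabric-aware value and passing it to mmchconfig. The following is a minimal sketch that reuses the pipeline shown above; the node names gssio1 and gssio2 are examples only, and GPFS must be running on each node:

for node in gssio1 gssio2; do
  # collect this node's fabric-aware ports, then apply them to that node only
  ports="$(echo $(ssh $node "mmfsadm test verbs config" | \
    grep verbsPorts | awk '{ $1=""; $2=""; $3=""; print $0 }'))"
  mmchconfig verbsPorts="$ports" -N $node
done

As before, verify the result for each node in the output of mmlsconfig | grep verbsPorts.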

Switch information

ESS release updates are independent of switch updates. Therefore, it is recommended that the Ethernet and InfiniBand switches used with the ESS cluster be at the latest switch firmware levels. Customers are responsible for upgrading their switches to the latest switch firmware.
Table 1. Network switch firmware
Type                     IBM MTM              Switch model      Description                         Latest validated switch OS (June 2020)
IB - FDR                 8828-F36, 8828-F37   SX6036            36-port FDR switch                  MLNX-OS 3.6.8012
IB - EDR                 8828-E36, 8828-E37   SB7700            36-port EDR switch (Switch-IB)      MLNX-OS 3.9.0300
IB - EDR                 8828-G36, 8828-G37   SB7800            36-port EDR switch (Switch-IB 2)    MLNX-OS 3.9.0300
ETH - 1GbE               8831-S52             Edgecore AS4610   48-port 1G + 4-port 10G SFP+        Cumulus 3.7.12a
ETH - 40GbE              8831-NF2             SX1710            36-port 40G switch                  Onyx 3.6.8012
ETH - 10GbE              8831-S48             SX1410            48-port 10G + 12-port 40G           Onyx 3.6.8012
ETH - 100GbE             8831-00M             SN2700            32-port 40G/100G                    Onyx 3.9.0300
ETH - 10/25/40/100GbE    8831-25M             SN2410            48-port 10G/25G + 8-port 40G/100G   Onyx 3.9.0300