ESS software deployment preparation
Install an ESS software package and deploy the storage servers by using the following information. The goal is to create a cluster that allows client or protocol nodes to access the file systems.
Item | ESS 3500 | ESS 3200 | ESS 3000 | ESS 5000 | ESS Legacy
---|---|---|---|---|---
Runs on | POWER9™ EMS | POWER9 EMS | POWER8® or POWER9 EMS | POWER9 EMS | POWER8 or POWER9 EMS
I/O node OS | Red Hat® Enterprise Linux® 8.6 x86_64 | Red Hat Enterprise Linux 8.6 x86_64 | Red Hat Enterprise Linux 8.6 x86_64 | Red Hat Enterprise Linux 8.6 PPC64LE | Red Hat Enterprise Linux 7.9 PPC64LE
Architecture | x86_64 | x86_64 | x86_64 | PPC64LE | PPC64LE
IBM Spectrum® Scale | 5.1.4.1 efix17 | 5.1.4.1 efix17 | 5.1.4.1 efix17 | 5.1.4.1 efix17 | 5.1.4.1 efix17
Kernel | 4.18.0-372.16.1.el8 | 4.18.0-372.16.1.el8 | 4.18.0-372.16.1.el8 | 4.18.0-372.16.1.el8 | 3.10.0-1160.71.1.el7.ppc64le
Systemd | 239-58.el8.x86_64 | 239-58.el8.x86_64 | 239-58.el8.x86_64 | 239-58.el8.ppc64le | 219-78.el7_9.5.ppc64le
Network manager | 1.36.0-7.el8_6.x86_64 | 1.36.0-7.el8_6.x86_64 | 1.36.0-7.el8_6.x86_64 | 1.36.0-7.el8_6.ppc64le | 1.18.8-2.el7_9.ppc64le
GNU C Library | glibc-1.36.0-7.el8_6 | glibc-1.36.0-7.el8_6 | glibc-1.36.0-7.el8_6 | glibc-1.36.0-7.el8_6 | glibc-2.17-326.el7_9.ppc64le
OFED | MLNX_OFED_LINUX-5.6-2.0.9.0-rhel8.6-x86_64.iso | MLNX_OFED_LINUX-5.6-2.0.9.0; separate binary for firmware (mlxfwmanager_sriov_dis_x86_64) | MLNX_OFED_LINUX-5.6-2.0.9.0-rhel8.6-x86_64.iso; separate firmware binary | MLNX_OFED_LINUX-5.6-2.0.9.0-rhel8.6-ppc64le.iso; separate firmware binary | MLNX_OFED_LINUX-4.9-5.1.0.2-rhel7.9-ppc64le.iso; firmware binary included
Firmware RPM | 6.0.0.51 | 6.0.0.51 | 6.0.0.51 | 6.0.0.51 | 6.0.0.51
SAS adapter firmware | N/A | N/A | N/A | 16.00.11.00 (4U106 and 5U92) | 16.00.11.00
Mpt3sas | N/A | N/A | N/A | 38.00.00.00 for 5U92 (not in box); 41.00.00.00 for 4U106 (not in box) | 34.00.00.00 (not in box)
Platform RPM | gpfs.ess.platform.ess3500-5.1.4-1.17.x86_64.rpm | gpfs.ess.platform.ess3200-5.1.4-1.17.x86_64.rpm | gpfs.ess.platform.ess3000-5.1.4-1.17.x86_64.rpm | N/A | N/A
Drive format | 4 KiB + 0 B (non-FCM); 512 KiB + 0 (FCM) | 4 KiB + 0 B | 4 KiB + 0 B | 4 KiB + 0 B | 4 KiB + 0 B
Support RPM | gpfs.gnr.support-ess3500-6.1.4-1.noarch.rpm | gpfs.gnr.support-ess3200-6.1.4-1.noarch.rpm | gpfs.gnr.support-ess3000-6.1.4-1.noarch.rpm | gpfs.gnr.support-ess5000-6.1.4-1.noarch.rpm | gpfs.gnr.support-essbase-6.1.4-1.noarch.rpm
Podman | 1.6.4-11 | 1.6.4-11 | 1.6.4 RHEL6 | 1.6.4-11 | 1.6.4-11 (1.4.4 on RHEL 7)
Container version | Red Hat UBI 8.6 | Red Hat UBI 8.6 | Red Hat UBI 8.6 | Red Hat UBI 8.6 | Red Hat Enterprise Linux 7.9
Ansible® | 2.9.27-1 | 2.9.27-1 | 2.9.27-1 | 2.9.27-1 | 2.9.27-1
xCAT | 2.16.3 (for internal use only; not on IBM Fix Central) | 2.16.3 (not in the customer-shipped image; SCT only) | 2.16.3 | 2.16.3 (SCT only) | 2.16.3 (SCT only)
PEMS | 1111 | N/A | N/A | |
ndctl | N/A | N/A | ndctl-65-1.el8 | N/A |
OPAL | opal-prd-ess.v4-1.el8.x86_64.rpm | N/A | opal-prd-3000.0-1.el8; opal-prd-ess.v4.1-1.el8.ppc64le.rpm | N/A |
System firmware | Canister firmware | RWH1-12.16.00_12.52_0140_0140_0343_0343_0343_0326_954300P0_954300P0 | 2.02.000_0B0G_1.73_FB30005 | FW950.50 (FW950.105) | FW860.B1 (SV860_243)
Boot drive | | | | 9F23 | E700
Enclosure firmware | E11G | E114 | N/A | 5U92: E558; 4U106: 5266 | PPC64LE Slider 2U24: 4230; 5U84: 4087; 4U106: 5284
NVMe firmware | | | SN1M | N/A | N/A
Network adapter | | | CX5-VPI | | MT4120 CX-5 EN 01FT741; MT4121 CX-5 VPI 01LL584; MT4122 CX-5 SRIOV VF 01LL584
ESA | esagent.pLinux-4.5.7-0 | esagent.pLinux-4.5.7-0 | esagent.pLinux-4.5.7-0 | esagent.pLinux-4.5.7-0 | esagent.pLinux-4.5.7-0
BIOS | RWH3LJ-12.07.00 | 12.16.00 | 52 | N/A | N/A
HAL | ibm.ess-hal-2.1.1.0-5.1.x86_64.rpm | ibm.ess-hal-2.1.1.0-5.1.x86_64.rpm | N/A | N/A | N/A
Changes in this release
- Support for IBM Spectrum Scale 5.1.4.1 efix17
- OFED 5.6.x (ESS 3000/ESS 5000/ESS 3200)
- Support for a new P8/P9 firmware (ESS Legacy/ESS 5000)
- Support for Red Hat Enterprise Linux 8.6 (ESS 3000/ESS 3200/ESS 3500)
- Support for a new glibc
- Support for a new kernel
POWER9 EMS stack
Item | Version |
---|---|
IBM Spectrum Scale | IBM Spectrum Scale 5.1.4.1 efix17 |
Operating system | Red Hat Enterprise Linux 8.6 |
ESS | ESS 6.1.4.1 |
Kernel | 4.18.0-372.16.1.el8 |
Systemd | 239-58.el8 |
Network Manager | 1.36.0-7.el8_6.ppc64le |
GNU C Library | 1.36.0-7.el8_6 |
Mellanox OFED | MLNX_OFED_LINUX-5.6-2.0.9.0 Separate firmware binary (mlxfwmanager_sriov_dis_ppc64le) |
ESA | 4.5.7-0 |
Ansible | 2.9.27-1 |
Podman | 1.6.4 |
Container OS | Red Hat UBI 8.6 |
xCAT | 2.16.3 (Not used in customer-shipped image; only for SCT) |
Firmware RPM | gpfs.ess.firmware-6.1.4-02.ppc64le.rpm |
System firmware | FW950.50 (FW950.105) |
Boot drive adapter | IPR 19512c00 |
Boot drive firmware | |
1Gb NIC firmware | |
Support RPM | |
Network adapter | |
Support matrix
Release | OS | Runs on | Can upgrade or deploy
---|---|---|---
ESS 3500 6.1.4 | Red Hat Enterprise Linux 8.6 (x86_64) | POWER9 EMS |
ESS 3200 6.1.4 | Red Hat Enterprise Linux 8.6 (x86_64) | POWER9 EMS |
ESS 3000 6.1.4 | Red Hat Enterprise Linux 8.6 (x86_64) | POWER8 or POWER9 EMS |
ESS 5000 6.1.4 | Red Hat Enterprise Linux 8.6 (PPC64LE) | POWER9 EMS |
ESS Legacy 6.1.4 | | POWER8 or POWER9 EMS |
Prerequisites
- This document (ESS Software Quick Deployment Guide)
- SSR completes physical hardware installation and code 20.
- SSR uses Worldwide Customized Installation Instructions (WCII) for racking, cabling, and disk placement information.
- SSR uses the respective ESS Hardware Guide (ESS 3000 or ESS 5000 or ESS 3200 or ESS 3500) for hardware checkout and setting IP addresses.
- Worksheet notes from the SSR
- Latest ESS xz downloaded to the EMS node from Fix Central (if a newer version is available).
- Data Access Edition or Data Management Edition: must match the order. If the edition does not match your order, open a ticket with IBM® Service.
- High-speed switch and cables have been run and configured.
- Low-speed host names are ready to be defined based on the IP addresses that the SSR has configured.
- High-speed host names (suffix of low speed) and IP addresses are ready to be defined.
- Container host name and IP address are ready to be defined in the /etc/hosts file.
- Host and domain name (FQDN) are defined in the /etc/hosts file. (Example /etc/hosts entries are shown after this list.)
- ESS Legacy 6.1.x.x only: You must convert to mmvdisk before deploying the ESS Legacy 6.1.x.x container if you are coming from a non-container version such as ESS 5.3.x.x. If you have not done so already, convert to mmvdisk by using the following steps:
  - Check whether there are any mmvdisk node classes.
    mmvdisk nodeclass list
    There should be one node class per ESS Legacy building block. If the command output does not show mmvdisk node classes for your ESS Legacy nodes, convert to mmvdisk before running the ESS Legacy 6.1.0.x container.
  - Convert to mmvdisk by running the following command from one of the POWER8 I/O nodes or from the POWER8 EMS node.
    gssgenclusterrgs -G gss_ppc64 --suffix=-hs --convert
    You can also use -N with a comma-separated list of nodes.
    Note: Wait for 5 minutes for the daemons to recycle. The file system remains up.
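For reference, the following is a minimal sketch of /etc/hosts entries on the EMS node covering the low-speed (management), high-speed (low-speed name plus -hs suffix), and container host names described in the list above. The host names, IP addresses, and high-speed subnet are placeholders; the domain (gpfs.net), container name (cems0), and management subnet follow the sample essmgr.yml used later in this document. Substitute your own values.

192.168.45.20  ems1.gpfs.net       ems1
192.168.45.21  essio1.gpfs.net     essio1
192.168.45.22  essio2.gpfs.net     essio2
10.0.11.21     essio1-hs.gpfs.net  essio1-hs
10.0.11.22     essio2-hs.gpfs.net  essio2-hs
192.168.45.80  cems0.gpfs.net      cems0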
What is in the /home/deploy directory on the EMS node?
- ESS 3500 tgz used in manufacturing (may not be the latest)
- ESS 5000 tgz used in manufacturing (may not be the latest)
- ESS 3000 tgz used in manufacturing (may not be the latest)
- ESS Legacy tgz used in manufacturing (may not be the latest)
- ESS 3200 tgz used in manufacturing (may not be the latest)
Support for signed RPMs
ESS and IBM Spectrum Scale RPMs are signed by IBM. The public key is shipped in /opt/ibm/ess/tools/conf:
-rw-r-xr-x 1 root root 907 Dec 1 07:45 SpectrumScale_public_key.pgp
- Import the PGP key.
  rpm --import /opt/ibm/ess/tools/conf/SpectrumScale_public_key.pgp
- Verify the RPM.
rpm -K RPMFile
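For example, after the key is imported, verifying one of the support RPMs listed earlier might look like the following. This is a sketch; the RPM file name and its location depend on what you extracted, and the exact output wording varies by rpm version.

# rpm -K gpfs.gnr.support-ess3500-6.1.4-1.noarch.rpm
gpfs.gnr.support-ess3500-6.1.4-1.noarch.rpm: digests signatures OK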
ESS 3000, ESS 5000, ESS 3500, and ESS Legacy networking requirements
- Management VLAN
- Service/FSP VLAN

Note: To future-proof your environment for ESS 3200, modify any existing management switches to the new VLAN configuration. For more information, see Switch VLAN configuration instructions.
ESS 3000
POWER8 or POWER9 EMS
- If you are adding ESS 3000 to a POWER8 EMS:
- An additional connection for the container to the management VLAN must be added. A C10-T2 cable must be run to this VLAN.
- A public/campus connection is required in C10-T3.
- A management connection must be run from C10-T1 (this should already be in place if you are adding to an existing POWER8 EMS with legacy nodes).
- Port 1 on each ESS 3000 canister must be connected to the management VLAN.
- If you are using an ESS 3000 with a POWER9 EMS:
- C11-T1 must be connected on the EMS to the management VLAN.
- Port 1 on each ESS 3000 canister must be connected to the management VLAN.
- C11-T2 must be connected on the EMS to the FSP VLAN.
- HMC1 must be connected on the EMS to the FSP VLAN.
ESS 5000 or ESS 3200
POWER9 EMS support only
- C11-T1 to the management VLAN
- C11-T2 to the FSP VLAN
- C11-T3 to the campus network
- HMC1 to the FSP VLAN
- C11-T1 to the management VLAN
- HMC1 to the FSP VLAN
- Single management connection per canister:
- Each connection is split between 2 MAC addresses:
- BMC
- Operating system
- The BMC connection requires a VLAN tag to be set for proper communication with the EMS node.
- Each connection is split between 2 MAC addresses:
- ESS 3200 requirements
  - Management connections
    - Shared management port (visible to the OS)
  - BMC connection
    - Shared management port (visible to the BMC)
  - High-speed connections
    - InfiniBand or Ethernet
- Management switch
  - Typically, a 48-port switch
  - Two VLANs required
    - Management VLAN (VLAN 102)
    - FSP/BMC VLAN (VLAN 101)
  - ESS 3200 dedicated trunk ports
    - Route BMC traffic to VLAN 101
Note: The VLANs shown here are default for the IBM Cumulus switch. The VLAN value can be modified according to your environment.
IBM racked orders have the switch preconfigured. Only the VLAN tag needs to be set. If you have an existing IBM Cumulus switch or customer supplied switch, it needs to be modified to accommodate the ESS 3200 trunk port requirement. For more information, see Switch VLAN configuration instructions.
ESS Legacy
POWER8 or POWER9 EMS supported
- C10-T1 to the management VLAN
- C10-T4 to the FSP/Service VLAN
- C10-T2 to the management VLAN
- C10-T3 optional campus connection
- HMC1 to the FSP/Service VLAN
- C11-T1 to the management VLAN
- C11-T2 to the FSP VLAN
- HMC1 to the FSP VLAN
- C11-T3 to the campus or management network/VLAN
- C12-T1 to the management VLAN
- HMC1 to the FSP VLAN
- Before creating the network bridges:

  # ip a | grep "enP3\|bridge"
  2: enP3p9s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      inet 192.168.45.20/24 brd 192.168.45.255 scope global noprefixroute enP3p9s0f0
  3: enP3p9s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  4: enP3p9s0f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      inet 9.155.113.184/20 brd 9.155.127.255 scope global noprefixroute enP3p9s0f2
  5: enP3p9s0f3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000

  # Sample essmgr.yml
  CONTAINER:
    BKUP: /home/backup
    CONTAINER_DOMAIN_NAME: gpfs.net
    CONTAINER_HOSTNAME: cems0
    FSP_BRIDGE_IP: 10.0.0.2
    FSP_BRIDGE_NAME: fsp_bridge
    FSP_CONTAINER_IP: 10.0.0.5
    FSP_INTERFACE: enP3p9s0f3
    FSP_SUBNET: 10.0.0.0/24
    INSTALLER_HOSTNAME: ems1
    LOG: /home/log
    MGMT_BRIDGE_IP: 192.168.45.2
    MGMT_BRIDGE_NAME: mgmt_bridge
    MGMT_CONTAINER_IP: 192.168.45.80
    MGMT_INTERFACE: enP3p9s0f1
    MGMT_SUBNET: 192.168.45.0/24
- After creating the bridges (./essmgr -n):

  # ip a | grep "enP3\|bridge"
  2: enP3p9s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      inet 192.168.45.20/24 brd 192.168.45.255 scope global noprefixroute enP3p9s0f0
  3: enP3p9s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt_bridge state UP group default qlen 1000
  4: enP3p9s0f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      inet 9.155.113.184/20 brd 9.155.127.255 scope global noprefixroute enP3p9s0f2
  5: enP3p9s0f3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master fsp_bridge state UP group default qlen 1000
  65: mgmt_bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
      inet 192.168.45.2/24 brd 192.168.45.255 scope global noprefixroute mgmt_bridge
  67: fsp_bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
      inet 10.0.0.2/24 brd 10.0.0.255 scope global noprefixroute fsp_bridge
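To inspect an individual bridge after ./essmgr -n completes, you can also query it directly. This is a quick check; the bridge names come from the sample essmgr.yml above.

# ip addr show mgmt_bridge
# ip addr show fsp_bridge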
Code version
ess_6.1.4.1_0919-18_dme_ppc64le.tar.xz
ess_6.1.4.1_0919-18_dae_ppc64le.tar.xz
- The versions shown here might not be the GA version available on IBM FixCentral. It is recommended to go to IBM FixCentral and download the latest code.
- ppc64le in the package name implies that each container runs on a POWER®-based EMS. For details about functions supported by respective containers, see Support matrix.
Unified container (Data Access and Data Management versions):
ESS_DAE_BASEIMAGE-6.1.4.1-ppc64LE-Linux.tgz
ESS_DME_BASEIMAGE-6.1.4.1-ppc64LE-Linux.tgz
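If you download a newer package from Fix Central, place it in the /home/deploy directory on the EMS node and extract it there. A minimal sketch follows; the archive name depends on the edition and level that you downloaded.

# cd /home/deploy
# tar -xvf ess_6.1.4.1_0919-18_dme_ppc64le.tar.xz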
POWER8 considerations
- You must add an additional management network connection to C10-T2.
- A public or additional management connection is mandatory in C10-T3.
- You must stop and uninstall xCAT and all xCAT dependencies before installing the container.
Remote management considerations
- Always add a campus connection to the EMS (POWER8 and POWER9).
- Consider adding campus connections to the HMC2 ports on all POWER servers (ESS Legacy, ESS 5000, POWER8 or POWER9 EMS). Consider cabling this port to a public network and setting a campus IP. This will allow remote recovery or debug of the EMS in case of an outage.
- Consider adding campus connections to C11-T3 (POWER9 nodes) or C10-T3 (POWER8 nodes).
- Consult with IBM Service about adding a USB-to-Ethernet dongle to enable campus connections on the ESS 3200 system.
- Add a campus connection to a free port on each ESS 3000 canister. Also consider adding smart PDUs on ESS 3000 frames to help remotely power cycle the system.
POWER8 + POWER9 considerations
- If both POWER8 and POWER9 EMS nodes are in an environment, it is recommended that you use only the POWER9 EMS for management functions (containers, GUI, ESA, collector).
- Only a single instance of all management services is recommended and solely on the POWER9 EMS.
- POWER8 only needs to exist as a management node if you are mixing a non-container-based release (5.3.x) with a container-based release (6.x.x.x).
- It is recommended that all nodes in the storage cluster contain the same ESS release and IBM Spectrum Scale version.
- It is recommended that you upgrade to the latest level before adding a building block.
- You cannot upgrade the EMS node from the ESS 3000 container.
- ESS 3000 detects if xCAT is installed on the host EMS node. If xCAT is installed, it stops the upgrade.
- You must upgrade the EMS node by using the legacy deployment procedure outlined in ESS 5.3.x Quick Deployment Guide.
Migrating from an ESS Legacy environment (xCAT-based 5.3.x) to an ESS Legacy container-based environment (6.1.x.x)
- POWER9 EMS
- You cannot run both POWER8 and POWER9 EMS nodes in the same environment for ESS Legacy. If you are moving to a POWER9 EMS, migrate all services from the POWER8 EMS and uninstall xCAT. You can then reuse the POWER8 EMS for other purposes such as a quorum node, a client node, or a spare EMS. The preference is to always use a POWER9 EMS if possible, and you must not run multiple instances of the GUI, performance monitoring collectors, and so on in the same cluster. There are exceptions to this requirement for certain stretch cluster environments and for mixing ESS Legacy and container-based deployments, such as ESS 5.3.7 on POWER8 and ESS 6.0.2.x on POWER9.
- POWER8 EMS
- If you are migrating from ESS 5.3.x to ESS 6.1.0.x on a POWER8 EMS, do the following steps.
- Stop and uninstall xCAT by doing the following steps on a POWER8 EMS, outside of the container.
- Stop xCAT.
systemctl stop xcatd
- Uninstall xCAT.
yum remove xCAT*
- Remove dependencies.
  yum remove dbus-devel dhcp bind java-1.8.0-openjdk
- Add a container connection to C10-T2.
- Add a campus connection to C10-T3, if it is not done already.
- Update /etc/hosts with the desired container host name and IP address. (A consolidated example of these steps is shown below.)
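The following is a minimal sketch of the whole sequence on the POWER8 EMS. The container host name (cems0), domain (gpfs.net), and IP address (192.168.45.80) are example values taken from the sample essmgr.yml shown earlier; substitute your own.

# systemctl stop xcatd
# yum remove xCAT*
# yum remove dbus-devel dhcp bind java-1.8.0-openjdk
# echo "192.168.45.80 cems0.gpfs.net cems0" >> /etc/hosts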
Other notes
- The following tasks must be complete before starting a new installation (tasks done by manufacturing and the SSR):
- SSR has ensured all hardware is clean, and IP addresses are set and pinging over the proper networks (through the code 20 operation).
- /etc/hosts is blank.
- The ESS tgz file (for the correct edition) is in the /home/deploy directory. If upgrade is needed, download from Fix Central and replace.
- Network bridges are cleared.
- Images and containers are removed.
- SSH keys are cleaned up and regenerated.
- All code levels are at the latest at time of manufacturing ship.
- Customer must make sure that the high-speed connections are cabled and the switch is ready before starting.
- All node names and IP addresses in this document are examples.
- A changed root password should be the same on each node, if possible. The default password is ibmesscluster. It is recommended to change the password after deployment is completed.
- Each server's IPMI and ASMI passwords (POWER nodes only) are set to the server serial number. Consider changing these passwords when the deployment is complete.
- Check whether the SSSD service is running on the EMS and other nodes. Shut down the SSSD service on those nodes manually before you upgrade the nodes. (A sample check is shown after this list.)
- RHEL server nodes might be communicating to root DNS directly and are not routed through internal DNS. If this is not permitted in the environment, you might override the default service configuration or disable it. For more information about background and resolution options, see https://access.redhat.com/solutions/3553031.
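A minimal sketch of the SSSD check and shutdown follows. The node names are examples, and it assumes that mmdsh can reach all nodes over the management network.

# mmdsh -N ems1,essio1,essio2 "systemctl is-active sssd"
# mmdsh -N ems1,essio1,essio2 "systemctl stop sssd"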
ESS best practices
- ESS 6.x.x.x uses a new embedded license. It is important to know that installation of any Red Hat packages outside of the deployment upgrade flow is not supported. The container image provides everything required for a successful ESS deployment. If additional packages are needed, contact IBM for possible inclusion in future versions.
- For ESS 3000, consider enabling TRIM support. This is outlined in detail in IBM Spectrum Scale RAID Administration. By default, ESS 3000 only allocates 80% of available space. Consult with IBM development about whether going beyond 80% makes sense for your environment, that is, if you are not concerned about the performance implications of this change.
- You must set up a campus or additional management connection before deploying the container.
- If running with a POWER8 and a POWER9 EMS in the same environment, it is best to move all containers to the POWER9 EMS. If there is a legacy PPC64LE system in the environment, it is best to migrate all nodes to ESS 6.1.x.x and decommission the POWER8 EMS altogether. This way you do not need to run multiple ESS GUI instances.
- If you have a POWER8 EMS, you must upgrade the EMS by using the legacy flow if there are xCAT based PPC64LE nodes in the environment (including protocol nodes). If there are just an ESS 3000 system and a POWER8 EMS, you can upgrade the EMS from the ESS 3000 container.
- If you are migrating the legacy nodes to ESS 6.1.x.x on the POWER8 EMS, you must first uninstall xCAT and all dependencies. It is best to migrate over to the POWER9 EMS if applicable.
- You must be at ESS 5.3.7 (Red Hat Enterprise Linux 7.7 / Python3) or later to run the ESS 3000 container on the POWER8 EMS.
- You must run the essrun config load command against all the storage nodes (including EMS and protocol nodes) in the cluster before enabling admin mode central or deploying the protocol nodes by using the installation toolkit.
- If you are running a stretch cluster, you must ensure that each node has a unique hostid. The hostid might be non-unique if the same IP addresses and host names are being used on both sides of the stretch cluster. Run gnrhealthcheck before creating recovery groups when adding nodes in a stretch cluster environment. You can manually check the hostid on all nodes as follows:
  mmdsh -N { NodeClass | CommaSeparatedListofNodes } hostid
  If the hostid on any node is not unique, you must fix it by running genhostid. These steps must be done when creating a recovery group in a stretch cluster. (A check-and-fix sketch is shown after this list.)
- Consider placing your protocol nodes in file system maintenance mode before upgrades. This is not a requirement, but you should strongly consider doing it. For more information, see File system maintenance mode.
- Do not try to update the EMS node while you are logged in over the high-speed network. Update the EMS node only through the management or the campus connection.
- After adding an I/O node to the cluster, run the gnrhealthcheck command to ensure that there are no issues before creating vdisk sets. For example, duplicate host IDs. Duplicate host IDs cause issues in the ESS environment.
- Run the container from a direct SSH connection. Do not SSH from an I/O node or any node that might be rebooted by the container.
- Do not log in and run the container over the high-speed network. You must log in through the campus connection.
- You must stop Spectrum Scale tracing (mmtrace | mmtracectl) before starting the container or deploying any node. The container attempts to block the operation if tracing is detected; it is recommended to manually inspect each ESS node before attempting to deploy.
- Heavy IBM Spectrum Scale and I/O operations must be suspended before upgrading an ESS environment. Wait for any of the following commands that are performing file system maintenance tasks to complete:
- mmadddisk
- mmapplypolicy
- mmcheckquota
- mmdeldisk
- mmfsck
- mmlssnapshot
- mmrestorefs
- mmrestripefile
- mmrestripefs
- mmrpldisk
  Do not create or delete snapshots by using the mmcrsnapshot and mmdelsnapshot commands during the upgrade.
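A check-and-fix sketch for the stretch cluster hostid item above follows. The node names are examples, and removing /etc/hostid before running genhostid is an assumption about how to force a new value to be generated.

# mmdsh -N essio1-hs,essio2-hs,essio3-hs,essio4-hs hostid    # look for duplicate values
# mmdsh -N essio3-hs "rm -f /etc/hostid; genhostid"          # regenerate the hostid on a node that reported a duplicate
# mmdsh -N essio1-hs,essio2-hs,essio3-hs,essio4-hs hostid    # confirm that all values are now unique
# gnrhealthcheck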
Support notes and rules
- Multiple EMS nodes are not supported in the same cluster. If you are adding a POWER9 EMS to an existing cluster run by a POWER8 EMS, the POWER9 EMS must be the only one used for management functions such as GUI, performance monitoring collector, etc.
- Multiple GUI instances are not supported in the same cluster.
- Only one collector node must be running at a time in the cluster. It must be on the same node as the GUI.
- You cannot mix major IBM Spectrum Scale versions in the storage cluster. All nodes must be updated to the latest level.
- ESA must be running on the EMS.
- You can run call home on the EMS.
- If possible, run the client nodes in a separate cluster from the storage cluster.
- The essrun (ESS deployment Ansible wrapper tool run within the container) tool does not use the GPFS admin network. It uses the management network only to communicate from the container to each of the nodes.
- If you have only a POWER8 EMS, consolidate potential xCAT and non-xCAT offerings to container versions.
  Example: If you have ESS 5.3.7.x (legacy POWER8 offering on Scale 5.0.5.x) and ESS 3000 (containerized support for ESS 3000 on Scale 5.x.x.x and above), convert the legacy 5.3.7.x to 6.1.x.x so that only containers are running on the POWER8 EMS.
  Note: This only applies to situations where Scale 5.1.x.x or later was already in the environment.
  Note: There is no container offering for BE, so environments with BE nodes must remain at the 5.0.5 release level (but the POWER8 EMS could still move to all-container versions).
- If both a POWER8 EMS and a POWER9 EMS are owned by the customer, it is recommended to consolidate to the POWER9 EMS (all container versions).
  Example: If the POWER8 EMS was running 5.1.x.x (ESS 3000, ESS Legacy, or both) and the customer has a POWER9 EMS (running ESS 5000 or ESS 3200), migrate the containers from the POWER8 EMS to the POWER9 EMS and discard the POWER8 EMS (single management node).
- If you are migrating from the xCAT-based legacy offering to the container-based offering, you must start from ESS 5.3.7.x.
- When you update ESS to 6.1.2.x for the first time, you must consider the implications of moving to MOFED 5.x. Review the following flash carefully for more information Mellanox OFED 5.x considerations in IBM ESS V6.1.2.x.
- IBM Spectrum Fusion, IBM Spectrum Scale Container Native, and IBM Spectrum Scale CSI utilize the GUI rest-api server for provisioning of storage to container applications. Persistent Volume (PV) provisioning will halt when the ESS GUI is shut down and remain halted for the duration of the ESS upgrade, until the GUI is restarted. Ensure that the OpenShift and Kubernetes administrators are aware of this impact before proceeding.
- For ESS 3500, you must keep 1.5 TB or more space free if future capacity MES is planned (performance to hybrid). Thus, it is recommended to not use all available space when you create a file system for the performance model. The default allocation is 80% of available space when you use the essrun filesystem command (for x86 nodes).
Client nodes
Client nodes need to be at MOFED 4.9.x or higher and converted to verbsRDMA core libs after the ESS cluster is moved to 6.1.2.x or higher. Moving to verbsRDMA core libs is especially important if verbsRDMA is in use in the storage cluster.
Upgrade guidance
Further legacy container migration guidance
You must migrate first to ESS 5.3.7.x before you upgrade to ESS 6.1.x.x (container version).
- You can upgrade to 5.3.7.x from 5.3.5.x (online) or 5.3.6.x (online).
- For an online upgrade you can jump one OS version; for an offline upgrade you can jump two OS versions. The only exception is the RHEL 7.7 to RHEL 7.9 upgrade, because there is no RHEL 7.8. An online upgrade to RHEL 7.7 from RHEL 7.6 can be done. An upgrade to RHEL 7.7 from RHEL 7.5 must be done offline.
- It is recommended to convert from ESS 5.3.7.x to ESS 6.1.2.x and follow the normal N-X rules. To convert to ESS 6.1.2.x, use the following table (based on the RHEL 7.9 kernel):

  Table 1. RHEL kernels

  ESS | Kernel
  ---|---
  6.1.2.4 | 3.10.0-1160.71.1.el7
  6.1.2.3 | 3.10.0-1160.62.1.el7
  6.1.2.2 | 3.10.0-1160.49.1.el7
  5.3.7.6 | 3.10.0-1160.62.1.el7
  5.3.7.5 | 3.10.0-1160.59.1.el7
  5.3.7.4 | 3.10.0-1160.49.1.el7
  5.3.7.3 | 3.10.0-1160.45.1.el7
  5.3.7.2 | 3.10.0-1160.31.1.el7
  5.3.7.1 | 3.10.0-1160.24.1.el7
  5.3.7.0 | 3.10.0-1160.11.1.el7

  An example of an upgrade jump is as follows:
  - To upgrade to ESS 6.1.2.2, you can only upgrade from 5.3.7.4 or lower versions (that is, less than or equal to 5.3.7.4).
- To upgrade to ESS 6.1.2.3, you can only upgrade from 5.3.7.6 or lower versions.
- It is not recommended to upgrade from ESS 5.3.7.x to ESS 6.1.1.2 anymore. Upgrade directly to ESS 6.1.2.3 or ESS 6.1.2.4. If you are updating from ESS 6.1.1.2, upgrade to 6.1.2.3 or higher (do not upgrade to 6.1.2.2).
- For ESS 5.3.7.3, consider downgrading MOFED to MLNX_OFED_LINUX-4.9-3.1.5.3, and then convert to 6.1.2.3 or 6.1.2.4. This is to obtain full support for online upgrade when converting to RDMA core libs.
- When upgrading from 5.3.x.x, first upgrade to ESS 5.3.7.2 or ESS 5.3.7.3, and then upgrade to 6.1.2.3 or 6.1.2.4. This upgrade is to obtain full support for online upgrade when converting to RDMA core libs.
- You might need to modify the container to unblock jumps from a specific 5.3.7.x level. Issue the following command to edit the supported legacy ESS level in the container:
  vim /opt/ibm/ess/deploy/ansible/vars.yml
  Change LEGACY_SUPPORTED_VERSION: "5.3.7.3" to LEGACY_SUPPORTED_VERSION: "5.3.7.1" (an example if you want to convert from ESS 5.3.7.1 or higher). A non-interactive sketch of the same edit follows.
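A minimal non-interactive sketch of the vars.yml edit, assuming the variable appears exactly once and currently reads "5.3.7.3":

# sed -i 's/LEGACY_SUPPORTED_VERSION: "5.3.7.3"/LEGACY_SUPPORTED_VERSION: "5.3.7.1"/' /opt/ibm/ess/deploy/ansible/vars.yml
# grep LEGACY_SUPPORTED_VERSION /opt/ibm/ess/deploy/ansible/vars.yml    # confirm the new value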