ESS software deployment preparation
Install an ESS software package and deploy the storage servers by using the following information. The goal is to create a cluster that allows client or protocol nodes to access the file systems.
Item | ESS 3500 | ESS 3200 | ESS 3000 | ESS 5000 | ESS Legacy
---|---|---|---|---|---
Runs on | POWER9™ EMS | POWER9 EMS | POWER8® or POWER9 EMS | POWER9 EMS | POWER8 or POWER9 EMS
I/O node OS | Red Hat® Enterprise Linux® 8.6 x86_64 | Red Hat Enterprise Linux 8.6 x86_64 | Red Hat Enterprise Linux 8.6 x86_64 | Red Hat Enterprise Linux 8.6 PPC64LE | Red Hat Enterprise Linux 7.9 PPC64LE
Architecture | x86_64 | x86_64 | x86_64 | PPC64LE | PPC64LE
IBM Spectrum® Scale | 5.1.4.1 efix17 | 5.1.4.1 efix17 | 5.1.4.1 efix17 | 5.1.4.1 efix17 | 5.1.4.1 efix17
Kernel | 4.18.0-372.16.1.el8 | 4.18.0-372.16.1.el8 | 4.18.0-372.16.1.el8 | 4.18.0-372.16.1.el8 | 3.10.0-1160.71.1.el7.ppc64le
Systemd | 239-58.el8.x86_64 | 239-58.el8.x86_64 | 239-58.el8.x86_64 | 239-58.el8.ppc64le | 219-78.el7_9.5.ppc64le
Network manager | 1.36.0-7.el8_6.x86_64 | 1.36.0-7.el8_6.x86_64 | 1.36.0-7.el8_6.x86_64 | 1.36.0-7.el8_6.ppc64le | 1.18.8-2.el7_9.ppc64le
GNU C Library | glibc-1.36.0-7.el8_6 | glibc-1.36.0-7.el8_6 | glibc-1.36.0-7.el8_6 | glibc-1.36.0-7.el8_6 | glibc-2.17-326.el7_9.ppc64le
OFED | MLNX_OFED_LINUX-5.6-2.0.9.0-rhel8.6-x86_64.iso | MLNX_OFED_LINUX-5.6-2.0.9.0; separate binary for firmware (mlxfwmanager_sriov_dis_x86_64) | MLNX_OFED_LINUX-5.6-2.0.9.0-rhel8.6-x86_64.iso; separate firmware binary | MLNX_OFED_LINUX-5.6-2.0.9.0-rhel8.6-ppc64le.iso; separate firmware binary | MLNX_OFED_LINUX-4.9-5.1.0.2-rhel7.9-ppc64le.iso; firmware binary included
Firmware RPM | 6.0.0.51 | 6.0.0.51 | 6.0.0.51 | 6.0.0.51 | 6.0.0.51
SAS adapter firmware | N/A | N/A | N/A | 16.00.11.00 (4U106 and 5U92) | 16.00.11.00
Mpt3sas | N/A | N/A | N/A | 38.00.00.00 for 5U92 (not in box); 41.00.00.00 for 4U106 (not in box) | 34.00.00.00 (not in box)
Platform RPM | gpfs.ess.platform.ess3500-5.1.4-1.17.x86_64.rpm | gpfs.ess.platform.ess3200-5.1.4-1.17.x86_64.rpm | gpfs.ess.platform.ess3000-5.1.4-1.17.x86_64.rpm | N/A | N/A
Drive format | 4 KiB + 0 B (non-FCM); 512 KiB + 0 (FCM) | 4 KiB + 0 B | 4 KiB + 0 B | 4 KiB + 0 B | 4 KiB + 0 B
Support RPM | gpfs.gnr.support-ess3500-6.1.4-1.noarch.rpm | gpfs.gnr.support-ess3200-6.1.4-1.noarch.rpm | gpfs.gnr.support-ess3000-6.1.4-1.noarch.rpm | gpfs.gnr.support-ess5000-6.1.4-1.noarch.rpm | gpfs.gnr.support-essbase-6.1.4-1.noarch.rpm
Podman | 1.6.4-11 | 1.6.4-11 | 1.6.4 RHEL6 | 1.6.4-11 | 1.6.4-11 (1.4.4 on RHEL 7)
Container version | Red Hat UBI 8.6 | Red Hat UBI 8.6 | Red Hat UBI 8.6 | Red Hat UBI 8.6 | Red Hat Enterprise Linux 7.9
Ansible® | 2.9.27-1 | 2.9.27-1 | 2.9.27-1 | 2.9.27-1 | 2.9.27-1
xCAT | 2.16.3 (for internal use only; not on IBM Fix Central) | 2.16.3 (not in the customer-shipped image; SCT only) | 2.16.3 | 2.16.3 (SCT only) | 2.16.3 (SCT only)
PEMS | 1111 | N/A | N/A | |
ndctl | N/A | N/A | ndctl-65-1.el8 | N/A |
OPAL | opal-prd-ess.v4-1.el8.x86_64.rpm | N/A | opal-prd-3000.0-1.el8; opal-prd-ess.v4.1-1.el8.ppc64le.rpm | N/A |
System firmware | Canister firmware | RWH1-12.16.00_12.52_0140_0140_0343_0343_0343_0326_954300P0_954300P0 | 2.02.000_0B0G_1.73_FB30005 | FW950.50 (FW950.105) | FW860.B1 (SV860_243)
Boot drive | | | | 9F23 | E700
Enclosure firmware | E11G | E114 | N/A | 5U92: E558; 4U106: 5266 | PPC64LE Slider 2U24: 4230; 5U84: 4087; 4U106: 5284
NVMe firmware | | | SN1M | N/A | N/A
Network adapter | | | CX5-VPI | | MT4120 CX-5 EN 01FT741; MT4121 CX-5 VPI 01LL584; MT4122 CX-5 SRIOV VF 01LL584
ESA | esagent.pLinux-4.5.7-0 | esagent.pLinux-4.5.7-0 | esagent.pLinux-4.5.7-0 | esagent.pLinux-4.5.7-0 | esagent.pLinux-4.5.7-0
BIOS | RWH3LJ-12.07.00 | 12.16.00 | 52 | N/A | N/A
HAL | ibm.ess-hal-2.1.1.0-5.1.x86_64.rpm | ibm.ess-hal-2.1.1.0-5.1.x86_64.rpm | N/A | N/A | N/A
Changes in this release
- Support for IBM Spectrum Scale 5.1.4.1 efix17
- OFED 5.6.x (ESS 3000/ESS 5000/ESS 3200)
- Support for a new P8/P9 firmware (ESS Legacy/ESS 5000)
- Support for Red Hat Enterprise Linux 8.6 (ESS 3000/ESS 3200/ESS 3500)
- Support for a new glibc
- Support for a new kernel
POWER9 EMS stack
Item | Version |
---|---|
IBM Spectrum Scale | IBM Spectrum Scale 5.1.4.1 efix17 |
Operating system | Red Hat Enterprise Linux 8.6 |
ESS | ESS 6.1.4.1 |
Kernel | 4.18.0-372.16.1.el8 |
Systemd | 239-58.el8 |
Network Manager | 1.36.0-7.el8_6.ppc64le |
GNU C Library | 1.36.0-7.el8_6 |
Mellanox OFED | MLNX_OFED_LINUX-5.6-2.0.9.0 Separate firmware binary (mlxfwmanager_sriov_dis_ppc64le) |
ESA | 4.5.7-0 |
Ansible | 2.9.27-1 |
Podman | 1.6.4 |
Container OS | Red Hat UBI 8.6 |
xCAT | 2.16.3 (Not used in customer-shipped image; only for SCT) |
Firmware RPM | gpfs.ess.firmware-6.1.4-02.ppc64le.rpm |
System firmware | FW950.50 (FW950.105) |
Boot drive adapter | IPR 19512c00 |
Boot drive firmware | |
1Gb NIC firmware | |
Support RPM | |
Network adapter | |
Support matrix
Release | OS | Runs on | Can upgrade or deploy
---|---|---|---
ESS 3500 6.1.4 | Red Hat Enterprise Linux 8.6 (x86_64) | POWER9 EMS |
ESS 3200 6.1.4 | Red Hat Enterprise Linux 8.6 (x86_64) | POWER9 EMS |
ESS 3000 6.1.4 | Red Hat Enterprise Linux 8.6 (x86_64) | POWER8 or POWER9 EMS |
ESS 5000 6.1.4 | Red Hat Enterprise Linux 8.6 (PPC64LE) | POWER9 EMS |
ESS Legacy 6.1.4 | | POWER8 or POWER9 EMS |
Prerequisites
- This document (ESS Software Quick Deployment Guide)
- SSR completes physical hardware installation and code 20.
- SSR uses Worldwide Customized Installation Instructions (WCII) for racking, cabling, and disk placement information.
- SSR uses the respective ESS Hardware Guide (ESS 3000 or ESS 5000 or ESS 3200 or ESS 3500) for hardware checkout and setting IP addresses.
- Worksheet notes from the SSR
- Latest ESS xz downloaded to the EMS node from Fix Central (if a newer version is available).
- Data Access Edition or Data Management Edition: must match the order. If the edition does not match your order, open a ticket with IBM® Service.
- High-speed switch and cables have been run and configured.
- Low-speed host names are ready to be defined based on the IP addresses that the SSR has configured.
- High-speed host names (suffix of low speed) and IP addresses are ready to be defined.
- Container host name and IP address are ready to be defined in the /etc/hosts file.
- Host and domain name (FQDN) are defined in the /etc/hosts file. (Example /etc/hosts entries are shown after this list.)
- ESS Legacy 6.1.x.x only: You must convert to mmvdisk before deploying the ESS Legacy 6.1.x.x container if you are coming from a non-container version such as ESS 5.3.x.x. If you have not done so already, convert to mmvdisk by using the following steps:
  - Check whether there are any mmvdisk node classes.
    mmvdisk nodeclass list
    There should be one node class per ESS Legacy building block. If the command output does not show mmvdisk node classes for your ESS Legacy nodes, convert to mmvdisk before running the ESS Legacy 6.1.0.x container.
  - Convert to mmvdisk by running the following command from one of the POWER8 I/O nodes or from the POWER8 EMS node.
    gssgenclusterrgs -G gss_ppc64 --suffix=-hs --convert
    You can also use -N with a comma-separated list of nodes.
    Note: Wait for 5 minutes for the daemons to recycle. The file system remains up.
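For reference, the following is a minimal sketch of /etc/hosts entries on the EMS node covering the low-speed (management), high-speed (low-speed name plus -hs suffix), and container host names described in the list above. The host names, IP addresses, and high-speed subnet are placeholders; the domain (gpfs.net), container name (cems0), and management subnet follow the sample essmgr.yml used later in this document. Substitute your own values.

192.168.45.20  ems1.gpfs.net       ems1
192.168.45.21  essio1.gpfs.net     essio1
192.168.45.22  essio2.gpfs.net     essio2
10.0.11.21     essio1-hs.gpfs.net  essio1-hs
10.0.11.22     essio2-hs.gpfs.net  essio2-hs
192.168.45.80  cems0.gpfs.net      cems0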
What is in the /home/deploy directory on the EMS node?
- ESS 3500 tgz used in manufacturing (may not be the latest)
- ESS 5000 tgz used in manufacturing (may not be the latest)
- ESS 3000 tgz used in manufacturing (may not be the latest)
- ESS Legacy tgz used in manufacturing (may not be the latest)
- ESS 3200 tgz used in manufacturing (may not be the latest)
Support for signed RPMs
ESS and IBM Spectrum Scale RPMs are signed by IBM. The public key is shipped in /opt/ibm/ess/tools/conf:
-rw-r-xr-x 1 root root 907 Dec 1 07:45 SpectrumScale_public_key.pgp
- Import the PGP key.
  rpm --import /opt/ibm/ess/tools/conf/SpectrumScale_public_key.pgp
- Verify the RPM.
rpm -K RPMFile
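For example, after the key is imported, verifying one of the support RPMs listed earlier might look like the following. This is a sketch; the RPM file name and its location depend on what you extracted, and the exact output wording varies by rpm version.

# rpm -K gpfs.gnr.support-ess3500-6.1.4-1.noarch.rpm
gpfs.gnr.support-ess3500-6.1.4-1.noarch.rpm: digests signatures OK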
ESS 3000, ESS 5000, ESS 3500, and ESS Legacy networking requirements
- Management VLAN
- Service/FSP VLAN

Note: To future-proof your environment for ESS 3200, modify any existing management switches to the new VLAN configuration. For more information, see Switch VLAN configuration instructions.
ESS 3000
POWER8 or POWER9 EMS
- If you are adding ESS 3000 to a POWER8 EMS:
- An additional connection for the container to the management VLAN must be added. A C10-T2 cable must be run to this VLAN.
- A public/campus connection is required in C10-T3.
- A management connection must be run from C10-T1 (this should already be in place if you are adding to an existing POWER8 EMS with legacy nodes).
- Port 1 on each ESS 3000 canister must be connected to the management VLAN.
- If you are using an ESS 3000 with a POWER9 EMS:
- C11-T1 must be connected on the EMS to the management VLAN.
- Port 1 on each ESS 3000 canister must be connected to the management VLAN.
- C11-T2 must be connected on the EMS to the FSP VLAN.
- HMC1 must be connected on the EMS to the FSP VLAN.
ESS 5000 or ESS 3200
POWER9 EMS support only
- C11-T1 to the management VLAN
- C11-T2 to the FSP VLAN
- C11-T3 to the campus network
- HMC1 to the FSP VLAN
- C11-T1 to the management VLAN
- HMC1 to the FSP VLAN
- Single management connection per canister:
- Each connection is split between 2 MAC addresses:
- BMC
- Operating system
- The BMC connection requires a VLAN tag to be set for proper communication with the EMS node.
- Each connection is split between 2 MAC addresses:
- ESS 3200 requirements
  - Management connections
    - Shared management port (visible to the OS)
  - BMC connection
    - Shared management port (visible to the BMC)
  - High-speed connections
    - InfiniBand or Ethernet
- Management switch
  - Typically, a 48-port switch
  - Two VLANs required
    - Management VLAN (VLAN 102)
    - FSP/BMC VLAN (VLAN 101)
  - ESS 3200 dedicated trunk ports
    - Route BMC traffic to VLAN 101
Note: The VLANs shown here are default for the IBM Cumulus switch. The VLAN value can be modified according to your environment.
IBM racked orders have the switch preconfigured. Only the VLAN tag needs to be set. If you have an existing IBM Cumulus switch or customer supplied switch, it needs to be modified to accommodate the ESS 3200 trunk port requirement. For more information, see Switch VLAN configuration instructions.
ESS Legacy
POWER8 or POWER9 EMS supported
- C10-T1 to the management VLAN
- C10-T4 to the FSP/Service VLAN
- C10-T2 to the management VLAN
- C10-T3 optional campus connection
- HMC1 to the FSP/Service VLAN
- C11-T1 to the management VLAN
- C11-T2 to the FSP VLAN
- HMC1 to the FSP VLAN
- C11-T3 to the campus or management network/VLAN
- C12-T1 to the management VLAN
- HMC1 to the FSP VLAN
- Before creating the network bridges:

  # ip a | grep "enP3\|bridge"
  2: enP3p9s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      inet 192.168.45.20/24 brd 192.168.45.255 scope global noprefixroute enP3p9s0f0
  3: enP3p9s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
  4: enP3p9s0f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      inet 9.155.113.184/20 brd 9.155.127.255 scope global noprefixroute enP3p9s0f2
  5: enP3p9s0f3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000

  # Sample essmgr.yml
  CONTAINER:
    BKUP: /home/backup
    CONTAINER_DOMAIN_NAME: gpfs.net
    CONTAINER_HOSTNAME: cems0
    FSP_BRIDGE_IP: 10.0.0.2
    FSP_BRIDGE_NAME: fsp_bridge
    FSP_CONTAINER_IP: 10.0.0.5
    FSP_INTERFACE: enP3p9s0f3
    FSP_SUBNET: 10.0.0.0/24
    INSTALLER_HOSTNAME: ems1
    LOG: /home/log
    MGMT_BRIDGE_IP: 192.168.45.2
    MGMT_BRIDGE_NAME: mgmt_bridge
    MGMT_CONTAINER_IP: 192.168.45.80
    MGMT_INTERFACE: enP3p9s0f1
    MGMT_SUBNET: 192.168.45.0/24
- After creating the bridges (./essmgr -n):

  # ip a | grep "enP3\|bridge"
  2: enP3p9s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      inet 192.168.45.20/24 brd 192.168.45.255 scope global noprefixroute enP3p9s0f0
  3: enP3p9s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt_bridge state UP group default qlen 1000
  4: enP3p9s0f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
      inet 9.155.113.184/20 brd 9.155.127.255 scope global noprefixroute enP3p9s0f2
  5: enP3p9s0f3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master fsp_bridge state UP group default qlen 1000
  65: mgmt_bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
      inet 192.168.45.2/24 brd 192.168.45.255 scope global noprefixroute mgmt_bridge
  67: fsp_bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
      inet 10.0.0.2/24 brd 10.0.0.255 scope global noprefixroute fsp_bridge
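To inspect an individual bridge after ./essmgr -n completes, you can also query it directly. This is a quick check; the bridge names come from the sample essmgr.yml above.

# ip addr show mgmt_bridge
# ip addr show fsp_bridge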
Code version
ess_6.1.4.1_0919-18_dme_ppc64le.tar.xz
ess_6.1.4.1_0919-18_dae_ppc64le.tar.xz
- The versions shown here might not be the GA version available on IBM FixCentral. It is recommended to go to IBM FixCentral and download the latest code.
- ppc64le in the package name implies that each container runs on a POWER®-based EMS. For details about functions supported by respective containers, see Support matrix.
Unified container (Data Access and Data Management versions):
ESS_DAE_BASEIMAGE-6.1.4.1-ppc64LE-Linux.tgz
ESS_DME_BASEIMAGE-6.1.4.1-ppc64LE-Linux.tgz
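If you download a newer package from Fix Central, place it in the /home/deploy directory on the EMS node and extract it there. A minimal sketch follows; the archive name depends on the edition and level that you downloaded.

# cd /home/deploy
# tar -xvf ess_6.1.4.1_0919-18_dme_ppc64le.tar.xz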
POWER8 considerations
- You must add an additional management network connection to C10-T2.
- A public or additional management connection is mandatory in C10-T3.
- You must stop and uninstall xCAT and all xCAT dependencies before installing the container.
Remote management considerations
- Always add a campus connection to the EMS (POWER8 and POWER9).
- Consider adding campus connections to the HMC2 ports on all POWER servers (ESS Legacy, ESS 5000, POWER8 or POWER9 EMS). Consider cabling this port to a public network and setting a campus IP. This will allow remote recovery or debug of the EMS in case of an outage.
- Consider adding campus connections to C11-T3 (POWER9 nodes) or C10-T3 (POWER8 nodes).
- Consult with IBM Service about adding a USB-to-Ethernet dongle to enable campus connections on the ESS 3200 system.
- Add a campus connection to a free port on each ESS 3000 canister. Also consider adding smart PDUs on ESS 3000 frames to help remotely power cycle the system.
POWER8 + POWER9 considerations
- If both POWER8 and POWER9 EMS nodes are in an environment, it is recommended that you use only the POWER9 EMS for management functions (containers, GUI, ESA, collector).
- Only a single instance of all management services is recommended and solely on the POWER9 EMS.
- POWER8 only needs to exist as a management node if you are mixing a non-container-based release (5.3.x) with a container-based release (6.x.x.x).
- It is recommended that all nodes in the storage cluster contain the same ESS release and IBM Spectrum Scale version.
- It is recommended that you upgrade to the latest level before adding a building block.
- You cannot upgrade the EMS node from the ESS 3000 container.
- ESS 3000 detects if xCAT is installed on the host EMS node. If xCAT is installed, it stops the upgrade.
- You must upgrade the EMS node by using the legacy deployment procedure outlined in ESS 5.3.x Quick Deployment Guide.
Migrating from an ESS Legacy environment (xCAT-based 5.3.x) to an ESS Legacy container-based environment (6.1.x.x)
- POWER9 EMS
- You cannot run both POWER8 and POWER9 EMS nodes in the same environment for ESS Legacy. If you are moving to a POWER9 EMS, migrate all services from the POWER8 EMS and uninstall xCAT. You can then reuse the POWER8 EMS for other purposes such as a quorum node, a client node, or a spare EMS. The preference is to always use a POWER9 EMS if possible, and you must not run multiple instances of the GUI, performance monitoring collectors, and so on in the same cluster. There are exceptions to this requirement for certain stretch cluster environments and for mixing ESS Legacy and container-based deployments, such as ESS 5.3.7 on POWER8 and ESS 6.0.2.x on POWER9.
- POWER8 EMS
- If you are migrating from ESS 5.3.x to ESS 6.1.0.x on a POWER8 EMS, do the following steps.
- Stop and uninstall xCAT by doing the following steps on a POWER8 EMS, outside of the container.
- Stop xCAT.
systemctl stop xcatd
- Uninstall xCAT.
yum remove xCAT*
- Remove dependencies.
  yum remove dbus-devel dhcp bind java-1.8.0-openjdk
- Add a container connection to C10-T2.
- Add a campus connection to C10-T3, if it is not done already.
- Update /etc/hosts with the desired container host name and IP address. (A consolidated example of these steps is shown below.)
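The following is a minimal sketch of the whole sequence on the POWER8 EMS. The container host name (cems0), domain (gpfs.net), and IP address (192.168.45.80) are example values taken from the sample essmgr.yml shown earlier; substitute your own.

# systemctl stop xcatd
# yum remove xCAT*
# yum remove dbus-devel dhcp bind java-1.8.0-openjdk
# echo "192.168.45.80 cems0.gpfs.net cems0" >> /etc/hosts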
Other notes
- The following tasks must be complete before starting a new installation (tasks done by manufacturing and the SSR):
- SSR has ensured all hardware is clean, and IP addresses are set and pinging over the proper networks (through the code 20 operation).
- /etc/hosts is blank.
- The ESS tgz file (for the correct edition) is in the /home/deploy directory. If upgrade is needed, download from Fix Central and replace.
- Network bridges are cleared.
- Images and containers are removed.
- SSH keys are cleaned up and regenerated.
- All code levels are at the latest at time of manufacturing ship.
- Customer must make sure that the high-speed connections are cabled and the switch is ready before starting.
- All node names and IP addresses in this document are examples.
- A changed root password should be the same on each node, if possible. The default password is ibmesscluster. It is recommended to change the password after deployment is completed.
- Each server's IPMI and ASMI passwords (POWER nodes only) are set to the server serial number. Consider changing these passwords when the deployment is complete.
- Check whether the SSSD service is running on the EMS and other nodes. Shut down the SSSD service on those nodes manually before you upgrade the nodes. (A sample check is shown after this list.)
- RHEL server nodes might be communicating to root DNS directly and are not routed through internal DNS. If this is not permitted in the environment, you might override the default service configuration or disable it. For more information about background and resolution options, see https://access.redhat.com/solutions/3553031.
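A minimal sketch of the SSSD check and shutdown follows. The node names are examples, and it assumes that mmdsh can reach all nodes over the management network.

# mmdsh -N ems1,essio1,essio2 "systemctl is-active sssd"
# mmdsh -N ems1,essio1,essio2 "systemctl stop sssd"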
ESS best practices
- ESS 6.x.x.x uses a new embedded license. It is important to know that installation of any Red Hat packages outside of the deployment upgrade flow is not supported. The container image provides everything required for a successful ESS deployment. If additional packages are needed, contact IBM for possible inclusion in future versions.
- For ESS 3000, consider enabling TRIM support. This is outlined in detail in IBM Spectrum Scale RAID Administration. By default, ESS 3000 only allocates 80% of available space. Consult with IBM development about whether going beyond 80% makes sense for your environment, that is, if you are not concerned about the performance implications of this change.
- You must set up a campus or additional management connection before deploying the container.
- If running with a POWER8 and a POWER9 EMS in the same environment, it is best to move all containers to the POWER9 EMS. If there is a legacy PPC64LE system in the environment, it is best to migrate all nodes to ESS 6.1.x.x and decommission the POWER8 EMS altogether. This way you do not need to run multiple ESS GUI instances.
- If you have a POWER8 EMS, you must upgrade the EMS by using the legacy flow if there are xCAT based PPC64LE nodes in the environment (including protocol nodes). If there are just an ESS 3000 system and a POWER8 EMS, you can upgrade the EMS from the ESS 3000 container.
- If you are migrating the legacy nodes to ESS 6.1.x.x on the POWER8 EMS, you must first uninstall xCAT and all dependencies. It is best to migrate over to the POWER9 EMS if applicable.
- You must be at ESS 5.3.7 (Red Hat Enterprise Linux 7.7 / Python3) or later to run the ESS 3000 container on the POWER8 EMS.
- You must run the essrun config load command against all the storage nodes (including EMS and protocol nodes) in the cluster before enabling admin mode central or deploying the protocol nodes by using the installation toolkit.
- If you are running a stretch cluster, you must ensure that each node has a unique hostid. The hostid might be non-unique if the same IP addresses and host names are being used on both sides of the stretch cluster. Run gnrhealthcheck before creating recovery groups when adding nodes in a stretch cluster environment. You can manually check the hostid on all nodes as follows:
  mmdsh -N { NodeClass | CommaSeparatedListofNodes } hostid
  If the hostid on any node is not unique, you must fix it by running genhostid. These steps must be done when creating a recovery group in a stretch cluster. (A check-and-fix sketch is shown after this list.)
- Consider placing your protocol nodes in file system maintenance mode before upgrades. This is not a requirement, but you should strongly consider doing it. For more information, see File system maintenance mode.
- Do not try to update the EMS node while you are logged in over the high-speed network. Update the EMS node only through the management or the campus connection.
- After adding an I/O node to the cluster, run the gnrhealthcheck command to ensure that there are no issues before creating vdisk sets. For example, duplicate host IDs. Duplicate host IDs cause issues in the ESS environment.
- Run the container from a direct SSH connection. Do not SSH from an I/O node or any node that might be rebooted by the container.
- Do not log in and run the container over the high-speed network. You must log in through the campus connection.
- You must stop Spectrum Scale tracing (mmtrace | mmtracectl) before starting the container or deploying any node. The container attempts to block the operation if tracing is detected; it is recommended to manually inspect each ESS node before attempting to deploy.
- Heavy IBM Spectrum Scale and I/O operations must be suspended before upgrading an ESS environment. Wait for any of the following commands that are performing file system maintenance tasks to complete:
- mmadddisk
- mmapplypolicy
- mmcheckquota
- mmdeldisk
- mmfsck
- mmlssnapshot
- mmrestorefs
- mmrestripefile
- mmrestripefs
- mmrpldisk
  Do not create or delete snapshots by using the mmcrsnapshot and mmdelsnapshot commands during the upgrade.
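A check-and-fix sketch for the stretch cluster hostid item above follows. The node names are examples, and removing /etc/hostid before running genhostid is an assumption about how to force a new value to be generated.

# mmdsh -N essio1-hs,essio2-hs,essio3-hs,essio4-hs hostid    # look for duplicate values
# mmdsh -N essio3-hs "rm -f /etc/hostid; genhostid"          # regenerate the hostid on a node that reported a duplicate
# mmdsh -N essio1-hs,essio2-hs,essio3-hs,essio4-hs hostid    # confirm that all values are now unique
# gnrhealthcheck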
Support notes and rules
- Multiple EMS nodes are not supported in the same cluster. If you are adding a POWER9 EMS to an existing cluster run by a POWER8 EMS, the POWER9 EMS must be the only one used for management functions such as GUI, performance monitoring collector, etc.
- Multiple GUI instances are not supported in the same cluster.
- Only one collector node must be running at a time in the cluster. It must be on the same node as the GUI.
- You cannot mix major IBM Spectrum Scale versions in the storage cluster. All nodes must be updated to the latest level.
- ESA must be running on the EMS.
- You can run call home on the EMS.
- If possible, run the client nodes in a separate cluster from the storage cluster.
- The essrun (ESS deployment Ansible wrapper tool run within the container) tool does not use the GPFS admin network. It uses the management network only to communicate from the container to each of the nodes.
- If you have only a POWER8 EMS, consolidate potential xCAT and non-xCAT offerings to container versions.
  Example: If you have ESS 5.3.7.x (legacy POWER8 offering on Scale 5.0.5.x) and ESS 3000 (containerized support for ESS 3000 on Scale 5.x.x.x and above), convert the legacy 5.3.7.x to 6.1.x.x so that only containers are running on the POWER8 EMS.
  Note: This only applies to situations where Scale 5.1.x.x or later was already in the environment.
  Note: There is no container offering for BE, so environments with BE nodes must remain at the 5.0.5 release level (but the POWER8 EMS could still move to all-container versions).
- If both a POWER8 EMS and a POWER9 EMS are owned by the customer, it is recommended to consolidate to the POWER9 EMS (all container versions).
  Example: If the POWER8 EMS was running 5.1.x.x (ESS 3000, ESS Legacy, or both) and the customer has a POWER9 EMS (running ESS 5000 or ESS 3200), migrate the containers from the POWER8 EMS to the POWER9 EMS and discard the POWER8 EMS (single management node).
- If you are migrating from the xCAT-based legacy offering to the container-based offering, you must start from ESS 5.3.7.x.
- When you update ESS to 6.1.2.x for the first time, you must consider the implications of moving to MOFED 5.x. Review the following flash carefully for more information Mellanox OFED 5.x considerations in IBM ESS V6.1.2.x.
- IBM Spectrum Fusion, IBM Spectrum Scale Container Native, and IBM Spectrum Scale CSI utilize the GUI rest-api server for provisioning of storage to container applications. Persistent Volume (PV) provisioning will halt when the ESS GUI is shut down and remain halted for the duration of the ESS upgrade, until the GUI is restarted. Ensure that the OpenShift and Kubernetes administrators are aware of this impact before proceeding.
- For ESS 3500, you must keep 1.5 TB or more space free if future capacity MES is planned (performance to hybrid). Thus, it is recommended to not use all available space when you create a file system for the performance model. The default allocation is 80% of available space when you use the essrun filesystem command (for x86 nodes).
Client nodes
Client nodes need to be at MOFED 4.9.x or higher and converted to verbsRDMA core libs after the ESS cluster is moved to 6.1.2.x or higher. Moving to verbsRDMA core libs is especially important if verbsRDMA is in use in the storage cluster.
Upgrade guidance
Further legacy container migration guidance
You must migrate first to ESS 5.3.7.x before you upgrade to ESS 6.1.x.x (container version).
- You can upgrade to 5.3.7.x from 5.3.5.x (online) or 5.3.6.x (online).
- For an online upgrade you can jump one OS version; for an offline upgrade you can jump two OS versions. The only exception is the RHEL 7.7 to RHEL 7.9 upgrade, because there is no RHEL 7.8. An online upgrade to RHEL 7.7 from RHEL 7.6 can be done. An upgrade to RHEL 7.7 from RHEL 7.5 must be done offline.
- It is recommended to convert from ESS 5.3.7.x to ESS 6.1.2.x and follow the normal N-X rules. To convert to ESS 6.1.2.x, use the following table (based on the RHEL 7.9 kernel):

  Table 1. RHEL kernels

  ESS | Kernel
  ---|---
  6.1.2.4 | 3.10.0-1160.71.1.el7
  6.1.2.3 | 3.10.0-1160.62.1.el7
  6.1.2.2 | 3.10.0-1160.49.1.el7
  5.3.7.6 | 3.10.0-1160.62.1.el7
  5.3.7.5 | 3.10.0-1160.59.1.el7
  5.3.7.4 | 3.10.0-1160.49.1.el7
  5.3.7.3 | 3.10.0-1160.45.1.el7
  5.3.7.2 | 3.10.0-1160.31.1.el7
  5.3.7.1 | 3.10.0-1160.24.1.el7
  5.3.7.0 | 3.10.0-1160.11.1.el7

  An example of an upgrade jump is as follows:
  - To upgrade to ESS 6.1.2.2, you can only upgrade from 5.3.7.4 or lower versions (that is, less than or equal to 5.3.7.4).
- To upgrade to ESS 6.1.2.3, you can only upgrade from 5.3.7.6 or lower versions.
- It is not recommended to upgrade from ESS 5.3.7.x to ESS 6.1.1.2 anymore. Upgrade directly to ESS 6.1.2.3 or ESS 6.1.2.4. If you are updating from ESS 6.1.1.2, upgrade to 6.1.2.3 or higher (do not upgrade to 6.1.2.2).
- For ESS 5.3.7.3, consider downgrading MOFED to MLNX_OFED_LINUX-4.9-3.1.5.3, and then convert to 6.1.2.3 or 6.1.2.4. This is to obtain full support for online upgrade when converting to RDMA core libs.
- When upgrading from 5.3.x.x, first upgrade to ESS 5.3.7.2 or ESS 5.3.7.3, and then upgrade to 6.1.2.3 or 6.1.2.4. This upgrade is to obtain full support for online upgrade when converting to RDMA core libs.
- You might need to modify the container to unblock jumps from a specific 5.3.7.x level. Issue the following command to edit the supported legacy ESS level in the container:
  vim /opt/ibm/ess/deploy/ansible/vars.yml
  Change LEGACY_SUPPORTED_VERSION: "5.3.7.3" to LEGACY_SUPPORTED_VERSION: "5.3.7.1" (an example if you want to convert from ESS 5.3.7.1 or higher). A non-interactive sketch of the same edit follows.
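A minimal non-interactive sketch of the vars.yml edit, assuming the variable appears exactly once and currently reads "5.3.7.3":

# sed -i 's/LEGACY_SUPPORTED_VERSION: "5.3.7.3"/LEGACY_SUPPORTED_VERSION: "5.3.7.1"/' /opt/ibm/ess/deploy/ansible/vars.yml
# grep LEGACY_SUPPORTED_VERSION /opt/ibm/ess/deploy/ansible/vars.yml    # confirm the new value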