
Installing the ESS software
This topic describes the installation and configuration procedure for an ESS 3.0 system with one or more building blocks. To complete this procedure, you need a working knowledge of Power Systems servers, GPFS, and xCAT.
For information about known issues, mitigation, and workarounds, see Table 1. Depending on which fix level you are installing, these might or might not apply to you.
For information about upgrading to ESS 3.0 from ESS 2.5, see Upgrading the Elastic Storage Server.
Networking requirements
- Service network
This network connects the flexible service processor (FSP) on the management server and I/O server nodes with the HMC, as shown in yellow in Figure 1 and Figure 2. The HMC runs the Dynamic Host Configuration Protocol (DHCP) server on this network. If the HMC is not included in the solution order, a customer-supplied HMC is used.
- Management and provisioning network
This network connects the management server to the I/O server nodes and HMCs, as shown in blue in Figure 1 and Figure 2. The management server runs DHCP on the management and provisioning network. If a management server is not included in the solution order, a customer-supplied management server is used.
- Clustering network
This high-speed network is used for clustering and client node access. It can be a 10 Gigabit Ethernet (GbE), 40 GbE, or InfiniBand network. It might not be included in the solution order.
- External and campus management network
This public network is used for external and campus management of the management server, the HMC, or both.
The management and provisioning network and the service network must run as two non-overlapping networks implemented as two separate physical networks or two separate virtual local-area networks (VLANs).
The HMC, the management server, and the switches (1 GbE switches and high-speed switches) might not be included in a solution order in which an existing or customer-supplied HMC or management server is used. Perform any advance planning tasks that might be needed to access and use these solution components.
Figure 1. The management and provisioning network and the service network for an ESS building block
Figure 2. The management and provisioning network and the service network: a logical view
Installing the ESS 3.0 software
Preparing for the installation
- Obtain the current ESS 3.0 installation code from
the Fix Central website.
To download from Fix Central, you must have entitlement for the given installation package. Check with your IBM representative if you have questions.
- Obtain a Red Hat Enterprise Linux 7.1 ISO image file or DVD for
64-bit IBM Power Systems architecture, for example:
RHEL-7.1-20150219.1-Server-ppc64-dvd1.iso
For more information, see Creating a temporary repository for Red Hat Enterprise Linux 7.1 and the Red Hat Enterprise Linux website.
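The referenced topic has the authoritative steps; as a rough sketch of the general approach only (the repository file name is an example, and the directories match the defaults used by the gssdeploy sample script later in this topic), a local repository can be created from the ISO like this:
mkdir -p /opt/ibm/gss/iso /opt/ibm/gss/mnt
mount -o loop /opt/ibm/gss/iso/RHEL-7.1-20150219.1-Server-ppc64-dvd1.iso /opt/ibm/gss/mnt
cat > /etc/yum.repos.d/rhel71-local.repo <<'EOF'
[rhel71-local]
name=RHEL 7.1 local media
baseurl=file:///opt/ibm/gss/mnt
enabled=1
gpgcheck=0
EOF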
Perform the following tasks and gather all required information before starting the installation process. Table 1 includes information about components that must be set up before you start installing the ESS 3.0 software.
For tips about how to name nodes, see Node name considerations.
| ESS component | Description | Required actions | System settings |
|---|---|---|---|
| 1. Service network | This private network connects the HMC with the management server's FSP and the I/O server nodes. The service network must not be seen by the OS running on the node being managed (that is, the management server or the I/O server node). The HMC uses this network to discover the management server and the I/O server nodes and perform such hardware management tasks as creating and managing logical partitions, allocating resources, controlling power, and rebooting. | Perform any advance planning tasks that might be needed to access and use the HMC if it is not part of the solution order and a customer-supplied HMC will be used. Set up this network if it has not been set up already. | Set the HMC to be the DHCP server for the service network. |
| 2. Management and provisioning network | This network connects the management server node with the HMC and the I/O server nodes. It typically runs over 1 GbE. | Perform any advance planning tasks that might be needed to access and use the management server if it is not part of the solution order and a customer-supplied management server will be used. Set up this network if it has not been set up already. | |
| 3. Clustering network | This network is for high-performance data access. In most cases, this network is also part of the clustering network. It is typically composed of 10GbE, 40GbE, or InfiniBand networking components. | Set up this network if it has not been set up already. | |
| 4. Management network domain | The management server uses this domain for the proper resolution of hostnames. | Set the domain name using lowercase characters. Do not use any uppercase characters. | Example: |
| 5. HMC node (IP address and hostname) | The IP address of the HMC node on the management network has a console name, which consists of the hostname and a domain name. | Set the fully-qualified domain name (FQDN) and the hostname using lowercase characters. Do not use any uppercase characters. Do not use a suffix of -enx, where x is any character. | Example: |
| 6. Management server node (IP address) | The IP address of the management server node has an FQDN and a hostname. | Set the FQDN and hostname using lowercase characters. Do not use any uppercase characters. Do not use a suffix of -enx, where x is any character. | Example: |
| 7. I/O server nodes (IP addresses) | The IP addresses of the I/O server nodes have FQDNs and hostnames. | Set the FQDN and hostname using lowercase characters. These names must match the name of the partition created for these nodes using the HMC. Do not use any uppercase characters. Do not use a suffix of -enx, where x is any character. | Example: I/O server 1: I/O server 2: |
| 8. Management server node (management network interface) | The management network interface of the management server node must have the IP address that you set in item 6 assigned to it. This interface must have only one IP address assigned. | To obtain this address, run: | Example: |
| 9. HMC (hscroot password) | | Set the password for the hscroot user ID. | Example: This is the default password. |
| 10. I/O servers (user IDs and passwords) | The user IDs and passwords of the I/O servers are assigned during deployment. | | Example: User ID: root; Password: cluster (this is the default password) |
| 11. Clustering network (hostname prefix or suffix) | This high-speed network is implemented on a 10Gb Ethernet, 40Gb Ethernet, or InfiniBand network. | Set a hostname for this network. It is customary to use hostnames for the high-speed network that use the prefix and suffix of the actual hostname. Do not use a suffix of -enx, where x is any character. | Examples: Suffixes: -ib, -10G, -40G; Hostnames with a suffix: gssio1-ib, gssio2-ib |
| 12. High-speed cluster network (IP address) | The IP addresses of the management server nodes and I/O server nodes on the high-speed cluster network have FQDNs and hostnames. In the example, 172.10.0.11 is the IP address that the GPFS daemon uses for clustering. | Set the FQDNs and hostnames. Do not make changes in the /etc/hosts file for the high-speed network until the deployment is complete. Do not create or enable the high-speed network interface until the deployment is complete. | Example: Management server: I/O server 1: I/O server 2: |
| 13. Red Hat Enterprise Linux 7.1 | The Red Hat Enterprise Linux 7.1 DVD or ISO file is used to create a temporary repository for the xCAT installation. xCAT uses it to create a Red Hat Enterprise Linux repository on the management server node. | Obtain this DVD or ISO file and download it. For more information, see the Red Hat Enterprise Linux website. | Example: |
| 14. Management network switch | The switch that implements the management network must allow the Bootstrap Protocol (BOOTP) to go through. | Obtain the IP address and access credentials (user ID and password) of this switch. Some switches generate many Spanning Tree Protocol (STP) messages, which interfere with the network boot process. Disable STP to mitigate this. | |
| 15. Target file system | You need to provide information about the target file system that is created using storage in the ESS building blocks. | Set the target file system name, the mount point, the block size, the number of data NSDs, and the number of metadata NSDs. | Example: |
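As an illustration of how the naming items above fit together, the following is a hypothetical /etc/hosts layout for a single building block. The domain and most addresses are examples only (the 172.10.0.11 address and the -hs suffix echo examples used elsewhere in this topic), and the high-speed entries must not be added or enabled until deployment is complete (see items 11 and 12).
# Management and provisioning network (example addresses)
192.168.45.20   ems1.gpfs.net       ems1
192.168.45.21   gssio1.gpfs.net     gssio1
192.168.45.22   gssio2.gpfs.net     gssio2
192.168.45.9    hmc1.gpfs.net       hmc1
# High-speed cluster network (add only after deployment is complete)
172.10.0.11     ems1-hs.gpfs.net    ems1-hs
172.10.0.12     gssio1-hs.gpfs.net  gssio1-hs
172.10.0.13     gssio2-hs.gpfs.net  gssio2-hs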
Set up the HMC and the management server (MS)
For information about setting up the HMC network for use by xCAT, see the xCAT website.
- Connect the ESS I/O server nodes and management server (if it is part of the order) to the HMC. If the HMC is not part of the order, you will need to provide it.
- Verify that the partitions of the I/O server and management server (if it is part of the order) are visible on the HMC. (The HMC might prompt you for the FSP password. The default password is abc123.) The HMC discovers the I/O server and management server nodes automatically when the nodes are powered on. If this does not happen, power cycle the nodes.
- Typically, server names, or central processor complex (CPC) names, are derived from the serial number. It is recommended that you do not change the server name. Make sure the server name and logical partition (LPAR) name are not identical.
- The default partition names follow.
- Management server: ems1
- I/O server 1: gssio1
- I/O server 2: gssio2
- If there are more building blocks in the same order, the additional I/O server node partition names are: gssio3, gssio4, gssio5, ... gssion, where n is the total number of I/O servers.
- The management server nodes and I/O server nodes are shipped from IBM with Red Hat Enterprise Linux 7.1
installed in an R10 disk array. The I/O server nodes are redeployed (including
reinstallation of Red Hat Enterprise Linux 7.1) at the customer location from the management server.
Typically, this process takes approximately 30 minutes to complete. Completion of this process ensures that
the installation is consistent with various site-specific parameters.
It also minimizes
configuration mismatches and incompatibilities
between the management server nodes and I/O server nodes.
There is no need to reinstall the management server. It is reinstalled only if the OS cannot boot any more due to hardware damage or failure. See Installing Red Hat Enterprise Linux on the management server to reinstall the management server if needed.
- Verify that you can access the management server console using the HMC. After network connectivity is established to the management server node (see the next section), it is recommended that you access the management server over the network using an available secure shell (SSH) client such as PuTTY.
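For example, after the xCAT management network address is configured in the next section, you might connect with an SSH client like this (using the example address from that section):
ssh root@192.168.45.20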
Configure an IP address for the xCAT network on the management server using the HMC console
- Log in to the system as root. The default root password from IBM is cluster.
- List the available interfaces, which should begin with a prefix of enP7:
ip link show | egrep "P7.*state UP"
If you do not see any interfaces with a state of UP, check your network connections before proceeding.
- Select the interface that ends with a suffix of f0. Example:
enP7p128s0f0
By default, enP7p128s0f0 is C10-port 0 and is configured at IBM with an IP address of 192.168.45.10, 192.168.45.11, or 192.168.45.20.
- Edit the network configuration for this interface and change it as needed. The file name is:
/etc/sysconfig/network-scripts/ifcfg-enP7p128s0f0
In this file, change the value of BOOTPROTO from dhcp to static and set the value of ONBOOT to yes if it is not set already:
BOOTPROTO=static
ONBOOT=yes
- Add or change the management server's IP address and netmask as needed:
IPADDR=192.168.45.20
NETMASK=255.255.255.0
- Restart network services if the address is changed:
systemctl restart network
- Verify that the management server's management network interface is up. Example:
ping 192.168.45.20
- After the interface is configured, you can log in to the management server node using an SSH client.
Command sequence overview
- Obtain the packed, compressed ESS 3.0 software. Unpack and uncompress the software. For example, run:
tar zxvf gss_install-3.0.0_ppc64_advanced_1506081846.tgz
The name of your ESS 3.0 software tar (.tgz) file could differ based on the GPFS edition you are using and the fix levels of the ESS release you are installing.
- Check the MD5 checksum:
md5sum -c gss_install-3.0.0_ppc64_advanced_1506081846.md5
- Obtain the ESS 3.0 license, accept the license, and run this command to extract the software:
/bin/sh gss_install-3.0.0_ppc64_standard_1506081846 --text-only
- Clean up the current xCAT installation and associated configuration:
gssdeploy -c
- Customize the gssdeploy script and run it to configure xCAT:
gssdeploy -x
In this case, the gssdeploy script runs one step at a time, which is recommended, and waits for user responses.
- Install the ESS 3.0 packages on the management server node:
gssinstall -m manifest -u
- Update the management server node:
updatenode ems1 -P gss_updatenode
If indicated by the previous step, reboot the management server node to reflect changes from the management server node update. After rebooting, run updatenode again if instructed to do so.
- Update OFED on the management server node:
updatenode ems1 -P gss_ofed
- Reboot the management server node to reflect changes for the OFED update.
- Deploy on the I/O server nodes:
gssdeploy -d

Obtain the ESS 3.0 installation software and install it on the management server node
- Obtain the software from the Fix Central website. The name of your ESS 3.0 software tar (.tgz) file could differ based on the GPFS edition you are using and the fix levels of the ESS release you are installing.
- Unpack and uncompress the file to create the installation software and the MD5 checksum of the installation software file. To unpack and uncompress the file, run this command:
tar zxvf gss_install-3.0.0_ppc64_advanced_1506081846.tgz
The system displays output similar to this:
[root@gems5 deploy]# tar zxvf gss_install-3.0.0_ppc64_advanced_1506081846.tgz
gss_install-3.0.0_ppc64_advanced_1506081846
gss_install-3.0.0_ppc64_advanced_1506081846.md5
- To verify the MD5 checksum of the software, run:
md5sum -c gss_install-3.0.0_ppc64_advanced_1506081846.md5
The system displays output similar to this:
[root@gems5 deploy]# md5sum -c gss_install-3.0.0_ppc64_advanced_1506081846.md5
gss_install-3.0.0_ppc64_advanced_1506081846: OK
- Use the gss_install* command to accept the ESS 3.0 product license and install the ESS 3.0 software package. The ESS 3.0 installation software is integrated with the product license acceptance tool. To install the ESS 3.0 software, you must accept the product license. To accept the license and install the package, run the gss_install* command with the appropriate options. The gss_install* command you run could differ based on the GPFS edition you are using and the fix levels of the ESS release you are installing. For example, run:
/bin/sh gss_install-3.0.0_ppc64_advanced_1506081846 --text-only
See gss_install* command for more information about this command.
By default, the product license acceptance tool places the code in the following directory:
/opt/ibm/gss/install
You can use the -dir option to specify a different directory.
- Clean the current xCAT installation and associated configuration:
gssdeploy -c
- Run the change directory command:
cd /opt/ibm/gss/install
- Use the gssinstall script to install the ESS 3.0 packages on the management server node. This script is in the /opt/ibm/gss/install/installer directory. For example, run:
installer/gssinstall -m manifest -u
The system displays output similar to this:
[root@ems1 install]# installer/gssinstall -m manifest -u
========================================================================
GSS package installer
Log: /var/log/gss/gssinstall.log
========================================================================
[EMS] Audit Summary:
[EMS] Installer Ver: 3.0.0_rc2-1506151808_ppc64_standard
[EMS] Group gpfs RPMs: Not Inst: 0, Current: 9, New: 0, Old: 0
[EMS] Group gss RPMs: Not Inst: 2, Current: 0, New: 0, Old: 0
[EMS] Group gui RPMs: Not Inst: 0, Current: 3, New: 0, Old: 0
[EMS] Group ofed RPMs: Not Inst: 0, Current: 1, New: 0, Old: 0
[EMS] Group xcat-core RPMs: Not Inst: 5, Current: 0, New: 0, Old: 0
[EMS] Group xcat-dfm RPMs: Not Inst: 2, Current: 0, New: 0, Old: 0
Update EMS software repositories? [y/n]: y
Updating EMS software repository
Loaded plugins: product-id, versionlock
========================================================================
GSS package installer - Update complete.
========================================================================
See gssinstall script for more information about this script.
Configure the installed packages on the management server node and prepare for deployment
The sample gssdeploy script is in the /opt/ibm/gss/install/samples directory.
- Copy the gssdeploy script from the /opt/ibm/gss/install/samples directory to another directory and then customize the copy to match your environment. You need to change several lines at the top of your copy of this script for the target configuration, as shown in the following example. For ESS 3.0, DEPLOY_OSIMAGE must be set to rhels7.1-ppc64-install-gss. You might see other OSIMAGE values that correspond to earlier releases (xCAT command lsdef -l osimage, for example).
#########################################################################
#
# Customize/change following to your environment
#
#########################################################################
#[RHEL]
# Set to Y if RHEL DVD is used otherwise iso is assumed.
RHEL_USE_DVD="N"
# Device location of RHEL DVD used instead of iso
RHEL_DVD="/dev/cdrom"
# Mount point to use for RHEL media.
RHEL_MNT="/opt/ibm/gss/mnt"
# Directory containing ISO.
RHEL_ISODIR=/opt/ibm/gss/iso
# Name of ISO file.
RHEL_ISO="RHEL-7.1-20150219.1-Server-ppc64-dvd1.iso"
#[EMS]
# Hostname of EMS
EMS_HOSTNAME="ems1"
# Network interface for xCAT management network
EMS_MGTNETINTERFACE="enP7p128s0f0"
#[HMC]
# Hostname of HMC
HMC_HOSTNAME="hmc1"
# Default userid of HMC
HMC_ROOTUID="hscroot"
# Default password of HMC
HMC_PASSWD="Passw0rd"
#[IOSERVERS]
# Default userid of IO Server.
IOSERVERS_UID="root"
# Default password of IO Server.
IOSERVERS_PASSWD="cluster"
# Array of IO servers to provision and deploy.
IOSERVERS_NODES=(gssio1 gssio2)
#[DEPLOY]
# OSIMAGE stanza to deploy to IO servers.
DEPLOY_OSIMAGE="rhels7.1-ppc64-install-gss"
########################################################################
#
# End of customization
#
########################################################################
- Run the gssdeploy script.
The gssdeploy script can be run in interactive mode or non-interactive ("silent") mode. Running gssdeploy in interactive mode is recommended.
The gssdeploy script is run in two phases. In the first phase, it is run with the -x option to set up the management server and xCAT. In the second phase, it is run with the -d option to deploy on the I/O server node.
See gssdeploy script for more information about this script.
Every step of the gssdeploy script shows the current step to be run and a brief description of the step, followed by the command to be run and the command's response. For example:
[STEP]: Deploy 4 of 7, Set osimage attributes for the nodes so current values will be used for rnetboot or updatenode
[CMD]: => nodeset gss_ppc64 osimage=rhels7.1-ppc64-install-gss
Enter 'r' to run [CMD]:
Enter 's' skip this step, or 'e' to exit this script
Enter response: r
[CMD_RESP]: gssio1: install rhels7.1-ppc64-gss
[CMD_RESP]: gssio2: install rhels7.1-ppc64-gss
[CMD_RESP]: RC: 0
- Configure xCAT and the management server node.
To configure xCAT and the management server node, you will run gssdeploy -x. If xCAT is installed on the node already, the script will fail. If it fails, clean the previous xCAT installation by running gssdeploy -c.
Suppose your modified gssdeploy script is in the /home/deploy directory. Run:
/home/deploy/gssdeploy -x
The script goes through several steps and configures xCAT on the management server node. See gssdeploy -x steps for details of these steps. Some of the steps (those in which copycds or getmacs is run, for example) take some time to complete.
- Update the management server node.
In this step, the GPFS RPMs, kernel, and OFED updates are installed on the node. This step prepares the node to run as a cluster member node in the GNR cluster.
- Run the updatenode ManagementServerNodeName -P gss_updatenode command. For example, run:
updatenode ems1 -P gss_updatenode
This step could take some time to complete if vpdupdate is run before the actual update. The system displays output similar to this:
[root@ems1 deploy]# updatenode ems1 -P gss_updatenode
ems1: Mon Jun 15 18:02:50 CDT 2015 Running postscript: gss_updatenode
ems1: gss_updatenode [INFO]: Using LOG: /var/log/xcat/xcat.log
ems1: gss_updatenode [INFO]: Performing update on ems1
ems1: gss_updatenode [INFO]: Erasing gpfs rpms
ems1: gss_updatenode [INFO]: Erase complete
ems1: gss_updatenode [INFO]: Updating ospkgs on ems1 (Please wait...)
ems1: gss_updatenode [INFO]: Version unlocking kernel for the update
ems1: gss_updatenode [INFO]: Disabling repos:
ems1: gss_updatenode [INFO]: Updating otherpkgs on ems1 (Please wait...)
ems1: gss_updatenode [INFO]: Enabling repos:
ems1: gss_updatenode [INFO]: Version locking kernel
ems1: gss_updatenode [INFO]: Checking that GPFS GPL layer matches running kernel
ems1: gss_updatenode [INFO]: GPFS GPL layer matches running kernel
ems1: gss_updatenode [INFO]: Checking that OFED ISO supports running kernel
ems1: gss_updatenode [INFO]: Upgrade complete
ems1: Postscript: gss_updatenode exited with code 0
ems1: Running of postscripts has completed.
- To determine whether you are waiting for vpdupdate, run this command:
ps -ef | grep vpd
The system displays output similar to this:
[root@ems1 ~]# ps -ef | grep vpd
root 75272 75271 0 17:05 ? 00:00:00 /usr/sbin/lsvpd
root 75274 75272 0 17:05 ? 00:00:00 sh -c /sbin/vpdupdate >/dev/null 2>&1
root 75275 75274 2 17:05 ? 00:00:03 /sbin/vpdupdate
root 76106 73144 0 17:08 pts/0 00:00:00 grep --color=auto vpd
- Reboot the node after the updatenode command completes.
- Run the OFED update using updatenode ManagementServerNodeName -P gss_ofed. For example, run:
updatenode ems1 -P gss_ofed
The system displays output similar to this:
[root@ems1 deploy]# updatenode ems1 -P gss_ofed
ems1: Mon Jun 15 18:20:54 CDT 2015 Running postscript: gss_ofed
ems1: Starting to install OFED.....
ems1: Mellanox controller found, install Mellanox OFED
ems1: Unloading HCA driver:[ OK ]
ems1: Mounting OFED ISO...
ems1: /tmp //xcatpost
ems1: mount: /dev/loop0 is write-protected, mounting read-only
ems1: Loaded plugins: product-id, subscription-manager, versionlock
ems1: This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
ems1: Error: Error: versionlock delete: no matches
ems1: Installing OFED stack...
ems1: TERM environment variable not set.
ems1: Logs dir: /tmp/MLNX_OFED_LINUX-2.4-1.0.2.6012.logs
ems1:
ems1: Log File: /tmp/MLNX_OFED_LINUX-2.4-1.0.2.6012.logs/fw_update.log
ems1: Unloading HCA driver:[ OK ]
ems1: Loading HCA driver and Access Layer:[ OK ]
ems1: Loaded plugins: product-id, subscription-manager, versionlock
ems1: This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
ems1: Adding versionlock on: 0:dapl-devel-2.1.3mlnx-OFED.2.4.37.gb00992f
ems1: Adding versionlock on: 0:srptools-1.0.1-OFED.2.4.40.g68b353c-OFED.2.3.47.gc8011c5
.
.
.
ems1: Adding versionlock on: 0:opensm-devel-4.3.0.MLNX20141222.713c9d5-0.1
ems1: versionlock added: 60
ems1: //xcatpost
ems1: Postscript: gss_ofed exited with code 0
ems1: Running of postscripts has completed.
- If instructed to do so, reboot the node after the OFED update is complete.
- After rebooting, run the updatenode ManagementServerNodeName -P gss_updatenode command again. For example, run:
updatenode ems1 -P gss_updatenode
- To make sure the OFED is updated and reflects the installed kernel, run this command:
ofed_info | grep -e kernel | grep ppc64
The system displays output similar to this:
[root@ems1 deploy]# ofed_info | grep -e kernel | grep ppc64
kernel-mft-3.8.0-3.10.0_229.el7.ppc64.ppc64
kernel-ib-devel-2.4-3.10.0_229.el7.ppc64_OFED.2.4.1.0.2.1.ge234f2b.ppc64
kernel-ib-2.4-3.10.0_229.el7.ppc64_OFED.2.4.1.0.2.1.ge234f2b.ppc64
Deploy the nodes
- Close all console (rcons) sessions on the management server and on the HMC.
- If the switch is supplied by the customer (that is, not shipped from IBM), make sure all nodes can communicate using BOOTP and there are no excessive STP messages. BOOTP could fail in the presence of excessive STP messages. You might consider enabling PortFast on the ports that are connected to the I/O server node.
- Make sure no other DHCP server is acting on the network.
- Make sure the external JBOD storage is powered off or disconnected.
After completing these checks, run the gssdeploy script with the -d option to deploy on the I/O server nodes:
gssdeploy -d
At this point, the I/O server nodes are restarted and the OS and other software packages are installed on them.
Monitoring the I/O server node installation process
Use the remote console feature of xCAT to monitor the installation process. The preferred method for monitoring the progress is to watch the console logs using the Linux tailf command.
tailf /var/log/consoles/gssio1
You can also open a remote console to a node by running:
rcons NodeName
If you connect to the console when the Red Hat installer, called Anaconda, is running, you are sent to a menu system. To display various menus, press <Ctrl-b> n, where n is the number of the menu you want to view. For example, if you press <Ctrl-b> 2, you are placed in the Anaconda shell. It is recommended that you not perform any actions using the Anaconda menu unless instructed to do so.
If the remote console does not respond, regenerate the console server configuration and try again:
makeconservercf
To check the installation status of the I/O server nodes, run:
nodestat gss_ppc64
The system displays output similar to this:
[root@ems1 ~]# nodestat gss_ppc64
gssio1: installing post
gssio2: installing post
[root@ems1 ~]# nodestat gss_ppc64
gssio1: sshd
gssio2: sshd
To verify that the xCAT post-installation scripts have finished, run:
xdsh gss_ppc64 "ps -eaf | grep -v grep | grep xcatpost"
If there are any processes still running, wait for them to complete.
It is possible that the installation could fail due to network boot issues. If the installation fails, run makeconservercf before trying it again. Retry the installation at least three times and see if that fixes the issue.
To restart the installation on a node (gssio2 in this example), run:
nodeset gssio2 osimage=rhels7.1-ppc64-install-gss
rnetboot gssio2 -V
This command sequence restarts the installation process on gssio2. Monitor the console using tailf or rcons. Check the messages that are displayed during the initial phase of the boot process. Most issues will occur during this phase.
Check for synchronization files
As part of the operating system and I/O server code installation, xCAT runs post-installation scripts. These scripts install the required RPMs, upgrade and configure the networks (10 GbE, 40GbE, and InfiniBand), and configure the SAS adapters.
xdsh gss_ppc64 "ls /install/gss/sync"
The system displays output similar to this:
gssio1: mofed
gssio2: mofed
updatenode gss_ppc64 -F
Check for post-installation scripts
updatenode gss_ppc64 -V -P gss_ofed,gss_sashba
The updatenode command could take some time to complete. This is because
updatenode calls vpdupdate on the node.
You can check by running ps -ef | grep vpd on each node. If you see vpdupdate
running, the updatenode command is waiting for it to complete. Apply Red Hat updates
After deployment is complete, you can apply Red Hat updates as needed. Note that kernel and OFED components are matched with the ESS software stack and are therefore locked during deployment to prevent unintended changes during update.
There is a known issue with systemd in Red Hat Enterprise Linux 7.1 that could prevent nodes that are running Red Hat Enterprise Linux 7.1 from mounting the GPFS file system. The systemd fix should be prepared and be ready to be applied before the file system is created. See item 5 in Table 1 for more information.
See Red Hat Enterprise Linux update considerations for additional considerations.
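As a minimal sketch of applying updates while respecting those locks (assuming the yum versionlock plugin configured during deployment is active; package names vary by fix level):
yum versionlock list           # confirm that kernel and OFED packages are locked
yum update --exclude=kernel*   # apply other errata; locked packages stay at their current level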
Check the system hardware
- gssstoragequickcheck checks the server, adapter, and storage configuration quickly.
- gssfindmissingdisks checks the disk paths and connectivity.
- gsscheckdisks checks for disk errors under various I/O operations.
Power on JBODs
After the I/O server nodes have been installed successfully, power on the JBODs. Wait approximately 5 to 10 minutes after power-on for the disks to be discovered before moving on to the next step.
System check 1: run gssstoragequickcheck
gssstoragequickcheck -G gss_ppc64
The system displays output similar to this:
[root@ems1 deploy]# gssstoragequickcheck -G gss_ppc64
2015-06-15T20:17:07.036867 Start of storage quick configuration check
2015-06-15T20:17:08.745084 nodelist: gssio1 gssio2
gssio1: Machine Type: 8247-22L
gssio2: Machine Type: 8247-22L
gssio1: Valid SAS Adapter Configuration. Number of Adapter(s) found 3
gssio1: Valid Network Adapter Configuration. Number of Adapter(s) found: 3
gssio2: Valid SAS Adapter Configuration. Number of Adapter(s) found 3
gssio2: Valid Network Adapter Configuration. Number of Adapter(s) found: 3
gssio1: Enclosure DCS3700 found 2
gssio1: Disk ST2000NM0023 found 116
gssio1: SSD PX02SMF040 found 2
gssio1: Total disk found 116, expected 116
gssio1: Total SSD found 2, expected 2
gssio2: Enclosure DCS3700 found 2
gssio2: Disk ST2000NM0023 found 116
gssio2: SSD PX02SMF040 found 2
gssio2: Total disk found 116, expected 116
gssio2: Total SSD found 2, expected 2
2015-06-15T20:17:25.670645 End of storage quick configuration check
xdsh gss_ppc64 "modprobe mpt2sas"
After running modprobe, run gssstoragequickcheck again.See gssstoragequickcheck command for more information about this command.
System check 1a: run lsifixnv
xdsh gss_ppc64 "/xcatpost/gss_sashba"
System check 1b: Check the RAID firmware
xdsh ems1,gss_ppc64 "for IOA in \$(lsscsi -g | grep SISIOA | awk '{print \$NF}');
do iprconfig -c query-ucode-level \$IOA; done"
The system displays output similar to this:
[root@ems1 deploy]# xdsh ems1,gss_ppc64 "for IOA in \$(lsscsi -g | grep SISIOA |
awk '{print \$NF}'); do iprconfig -c query-ucode-level \$IOA; done"
ems1: 12511700
gssio2: 12511700
gssio1: 12511700
If this system is upgraded from a previous version, you might see a RAID firmware level of 12511400. If the RAID adapter firmware is not at the correct level, contact the IBM Support Center for update instructions.
System check 1c: Make sure 64-bit DMA is enabled for InfiniBand slots
xdsh gss_ppc64,bgqess-mgt1 journalctl -b | grep 64-bit | grep -v dma_rw | grep mlx
The system displays output similar to this:
[root@ems1 gss]# xdsh gss_ppc64,bgqess-mgt1 journalctl -b | grep 64-bit | grep -v dma_rw | grep mlx
gssio1: Feb 13 09:28:34 bgqess-gpfs02.scinet.local kernel: mlx5_core 0000:01:00.0: Using 64-bit direct DMA at offset 800000000000000
gssio1: Feb 13 09:29:02 bgqess-gpfs02.scinet.local kernel: mlx5_core 0004:01:00.0: Using 64-bit direct DMA at offset 800000000000000
gssio1: Feb 13 09:29:30 bgqess-gpfs02.scinet.local kernel: mlx5_core 0009:01:00.0: Using 64-bit direct DMA at offset 800000000000000
gssio2: Jan 30 16:46:55 bgqess-gpfs01.scinet.local kernel: mlx5_core 0000:01:00.0: Using 64-bit direct DMA at offset 800000000000000
gssio2: Jan 30 16:47:23 bgqess-gpfs01.scinet.local kernel: mlx5_core 0004:01:00.0: Using 64-bit direct DMA at offset 800000000000000
gssio2: Jan 30 16:47:50 bgqess-gpfs01.scinet.local kernel: mlx5_core 0009:01:00.0: Using 64-bit direct DMA at offset 800000000000000
mgt1: Jan 26 16:55:41 bgqess-mgt1 kernel: mlx5_core 0004:01:00.0: Using 64-bit direct DMA at offset 800000000000000
Make sure you see all of the InfiniBand devices in this list.
This sample output includes the following device numbers:
0000:01:00.0, 0004:01:00.0, and 0009:01:00.0.
The slot-to-device assignments for the Connect-IB adapter follow:
| Slot | Device |
|---|---|
| C5 | 0009:01:00.0 |
| C6 | 0004:01:00.0 |
| C7 | 0000:01:00.0 |
If a device for a slot where the Connect-IB adapter is installed is not displayed
in the xdsh output, follow these steps:
- Make sure the OS or partition is shut down.
- On the HMC GUI, click the server and select Operations -> Launch ASM.
- On the Welcome pane, specify your user ID and password. The default user ID is admin. The default password is abc123.
- In the navigation area, expand System Configuration -> System -> I/O Adapter Enlarged Capacity.
- Select Enable and specify I/O Adapter Enlarged Capacity 11. This specifies all slots, because the I/O server nodes have 11 slots.
- Save your settings.
- Restart the server so the changes will take effect.
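After the restart, you can repeat the earlier journalctl check to confirm that the expected mlx5_core devices now appear, for example:
xdsh gss_ppc64 journalctl -b | grep 64-bit | grep -v dma_rw | grep mlx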
System check 2: run gssfindmissingdisks
Run the gssfindmissingdisks command to verify that the I/O server nodes are cabled properly. This command reports the status of the disk paths. See gssfindmissingdisks command for more information about this command. Run:
gssfindmissingdisks -G gss_ppc64
The system displays output similar to this:
[root@ems1 deploy]# gssfindmissingdisks -G gss_ppc64
2015-06-15T20:27:18.793026 Start find missing disk paths
2015-06-15T20:27:20.556384 nodelist: gssio1 gssio2
2015-06-15T20:27:20.556460 May take long time to complete search of all drive paths
2015-06-15T20:27:20.556501 Checking missing disk paths from node gssio1
gssio1 Enclosure SV45221140 (number 1):
gssio1 Enclosure SV45222733 (number 2):
gssio1: GSS configuration: 2 enclosures, 2 SSDs, 2 empty slots, 118 disks total, 6 NVRAM partitions
2015-06-15T20:27:37.698284 Checking missing disk paths from node gssio2
gssio2 Enclosure SV45221140 (number 1):
gssio2 Enclosure SV45222733 (number 2):
gssio2: GSS configuration: 2 enclosures, 2 SSDs, 2 empty slots, 118 disks total, 6 NVRAM partitions
2015-06-15T20:27:54.827175 Finish search for missing disk paths. Number of missing disk paths: 0
When there are missing drive paths, the command reports possible configuration or
hardware errors:
[root@ems1 setuptools]# ./gssfindmissingdisks -G gss_ppc64
2014-10-28T04:23:45.714124 Start finding missing disks
2014-10-28T04:23:46.984946 nodelist: gssio1 gssio2
2014-10-28T04:23:46.985026 Checking missing disks from node gssio1
gssio1: Enclosure SV24819545 (number undetermined): 4-7
gssio1: Enclosure SV24819545 (number undetermined): 4-9
gssio1: Enclosure SV32300072 (number undetermined): 5-5
2014-10-28T04:25:10.587857 Checking missing disks from node gssio2
gssio2: Enclosure SV24819545 (number undetermined): 2-9
gssio2: Enclosure SV24819545 (number undetermined): 3-4
gssio2: Enclosure SV24819545 (number undetermined): 4-6
2014-10-28T04:26:33.253075 Finish search for missing disks. Number of missing disks: 6
In this example, the missing disk paths seen from each I/O server node are different. Because the missing drives differ between the node views, it is most likely not a physical drive issue, but rather a cable or other subsystem issue.
scsi3[19.00.00.00] U78CB.001.WZS0043-P1-C2-T1
scsi4[19.00.00.00] U78CB.001.WZS0043-P1-C2-T2 [P1 SV32300072 ESM A (sg67)] [P2 SV24819545 ESM B (sg126)]
scsi5[19.00.00.00] U78CB.001.WZS0043-P1-C3-T1
scsi6[19.00.00.00] U78CB.001.WZS0043-P1-C3-T2 [P2 SV24819545 ESM A (sg187)]
scsi1[19.00.00.00] U78CB.001.WZS0043-P1-C11-T1
scsi2[19.00.00.00] U78CB.001.WZS0043-P1-C11-T2 [P2 SV32300072 ESM B (sg8)]
For information about hardware ports, cabling, PCIe adapter installation, and SSD placement, see Cabling the Elastic Storage Server.
System check 2a: run mmgetpdisktopology
Use the gssfindmissingdisks command to verify the I/O server JBOD disk topology. If gssfindmissingdisks shows one or more errors, run the mmgetpdisktopology and topsummary commands to obtain more detailed information about the storage topology for further analysis. These commands are run from the I/O server nodes. It is a best-practice recommendation to run these commands once on each I/O server node.
For more information about mmgetpdisktopology, see GPFS™: Administration and Programming Reference. For more information about topsummary, see GPFS Native RAID: Administration.
mmgetpdisktopology | topsummary
The system displays output similar to this:
[root@gssio1 ~]# mmgetpdisktopology | topsummary
/usr/lpp/mmfs/bin/topsummary: reading topology from standard input
GSS enclosures found: SV45221140 SV45222733
Enclosure SV45221140 (number 1):
Enclosure SV45221140 ESM A sg188[039A][scsi6 port 2] ESM B sg127[039A][scsi4 port 2]
Enclosure SV45221140 Drawer 1 ESM sg188 12 disks diskset "10026" ESM sg127 12 disks diskset "10026"
Enclosure SV45221140 Drawer 2 ESM sg188 12 disks diskset "51918" ESM sg127 12 disks diskset "51918"
Enclosure SV45221140 Drawer 3 ESM sg188 12 disks diskset "64171" ESM sg127 12 disks diskset "64171"
Enclosure SV45221140 Drawer 4 ESM sg188 12 disks diskset "02764" ESM sg127 12 disks diskset "02764"
Enclosure SV45221140 Drawer 5 ESM sg188 12 disks diskset "34712" ESM sg127 12 disks diskset "34712"
Enclosure SV45221140 sees 60 disks
Enclosure SV45222733 (number 2):
Enclosure SV45222733 ESM A sg68[039A][scsi4 port 1] ESM B sg9[039A][scsi2 port 2]
Enclosure SV45222733 Drawer 1 ESM sg68 11 disks diskset "28567" ESM sg9 11 disks diskset "28567"
Enclosure SV45222733 Drawer 2 ESM sg68 12 disks diskset "04142" ESM sg9 12 disks diskset "04142"
Enclosure SV45222733 Drawer 3 ESM sg68 12 disks diskset "29724" ESM sg9 12 disks diskset "29724"
Enclosure SV45222733 Drawer 4 ESM sg68 12 disks diskset "31554" ESM sg9 12 disks diskset "31554"
Enclosure SV45222733 Drawer 5 ESM sg68 11 disks diskset "13898" ESM sg9 11 disks diskset "13898"
Enclosure SV45222733 sees 58 disks
GSS configuration: 2 enclosures, 2 SSDs, 2 empty slots, 118 disks total, 6 NVRAM partitions
scsi3[20.00.02.00] U78CB.001.WZS06M2-P1-C2-T1
scsi4[20.00.02.00] U78CB.001.WZS06M2-P1-C2-T2 [P1 SV45222733 ESM A (sg68)] [P2 SV45221140 ESM B (sg127)]
scsi5[20.00.02.00] U78CB.001.WZS06M2-P1-C3-T1
scsi6[20.00.02.00] U78CB.001.WZS06M2-P1-C3-T2 [P2 SV45221140 ESM A (sg188)]
scsi0[20.00.02.00] U78CB.001.WZS06M2-P1-C11-T1
scsi2[20.00.02.00] U78CB.001.WZS06M2-P1-C11-T2 [P2 SV45222733 ESM B (sg9)]
Depending on the model and configuration, you may see references to enclosure numbers up to 6. This summary is produced by analyzing the SAS physical topology.
- The first line is a list of the enclosure mid-plane serial numbers for the enclosure type (DCS3700, for example). This serial number does not appear anywhere on the enclosure itself.
- The second line shows the enclosure ordering based on the cabling. A system with incorrect cabling will show that the enclosure number is undetermined.
- The third line shows the enclosure's serial number, then ESM A and ESM B, each followed by a SCSI generic device number that is assigned by the host:
Enclosure SV45221140 ESM A sg188[039A][scsi6 port 2] ESM B sg127[039A][scsi4 port 2]
The number in the first set of brackets is the code level of the ESM. The ports of the SCSI device are enclosed in the second set of brackets. The SCSI generic device number (sg188 or sg127, for example) is also shown in the gsscheckdisks path output of drive performance and error counters.
- Enclosures are numbered physically from bottom to top within a building block. Enclosure 1 is the bottom enclosure; enclosure 6 is the top enclosure.
- Analyze the output. Each drawer line shows two disk-set numbers, one from ESM A and the other from ESM B:
Enclosure SV45221140 (number 1):
Enclosure SV45221140 ESM A sg188[039A][scsi6 port 2] ESM B sg127[039A][scsi4 port 2]
Enclosure SV45221140 Drawer 1 ESM sg188 12 disks diskset "10026" ESM sg127 12 disks diskset "10026"
The disk-set number is the checksum of the serial numbers of the drives seen on that path. Checksums that don't match indicate an issue with that path involving an adapter, SAS cable, enclosure ESM, or expanders in the enclosures. If only one disk set is shown, this indicates a complete lack of path, such as a missing cable or ESM.
scsi3[20.00.02.00] U78CB.001.WZS06M2-P1-C2-T1
scsi4[20.00.02.00] U78CB.001.WZS06M2-P1-C2-T2 [P1 SV45222733 ESM A (sg68)] [P2 SV45221140 ESM B (sg127)]
scsi5[20.00.02.00] U78CB.001.WZS06M2-P1-C3-T1
scsi6[20.00.02.00] U78CB.001.WZS06M2-P1-C3-T2 [P2 SV45221140 ESM A (sg188)]
scsi0[20.00.02.00] U78CB.001.WZS06M2-P1-C11-T1
scsi2[20.00.02.00] U78CB.001.WZS06M2-P1-C11-T2 [P2 SV45222733 ESM B (sg9)]
The first two lines represent the SAS adapter in slot C2. There are two SAS 2300 SCSI controllers in each adapter card, indicated by T1 and T2:
T1 P1 = Port 0
T1 P2 = Port 1
T2 P1 = Port 2
T2 P2 = Port 3
This shows that Port 2 of the adapter in slot C2 is connected to ESM A of enclosure SV45222733. Similarly, Port 3 of the adapter in slot C11 is connected to ESM B of enclosure SV45222733.
See Figure 1 and Figure 2 for the physical location of ports and ESMs.
System check 3: run gsscheckdisks
The gsscheckdisks command initiates I/O to the drives and can be used to identify marginal drives. This command must be run on a system where there is no GPFS cluster configured. If it is run with a write test on a system where a GPFS cluster is already configured, it will overwrite the cluster configuration data stored in the disk, resulting in cluster and data loss. This command can be run from the management server node or from an I/O server node. The default duration is to run for 30 seconds for each I/O test for each path. For a more thorough test, set the duration to run for 5 minutes (300 seconds) or more.
For example, if gsscheckdisks is run without indicating the install or manufacturing environment, it exits with a warning:
[root@ems1 deploy]# gsscheckdisks -G gss_ppc64 --disk-list sdx,sdc --iotest a --write-enable
2015-06-15T20:35:53.408621 Start running check disks
gsscheckdisks must run in INSTALL or MFG environment. It may result in data loss
if run in a configured system.
Please rerun with environment GSSENV=INSTALL or GSSENV=MFG to indicate that it is
run in install or manufacturing environment.
Example:
GSSENV=INSTALL gsscheckdisks -N gss_ppc64 --show-enclosure-list
Run gsscheckdisks to verify that disks are in a good state.
GSSENV=INSTALL gsscheckdisks -G gss_ppc64 --encl all --iotest a --write-enable
The system displays output similar to this:
[root@gssio1 ~]# GSSENV=INSTALL gsscheckdisks -G gss_ppc64 --encl all --iotest a --write-enable
2014-11-26T05:30:42.401514 Start running check disks
List of Enclosures found
SV32300072
SV24819545
Taking inventory of disks in enclosure SV32300072.
Taking inventory of disks in enclosure SV24819545.
2014-11-26T05:34:48.317358 Starting r test for 118 of 118 disks. Path: 0, duration 30 secs
2014-11-26T05:35:25.216815 Check disk analysis for r test Complete
2014-11-26T05:35:25.218802 Starting w test for 118 of 118 disks. Path: 0, duration 30 secs
2014-11-26T05:36:02.247192 Check disk analysis for w test Complete
2014-11-26T05:36:02.249225 Starting R test for 118 of 118 disks. Path: 0, duration 30 secs
2014-11-26T05:36:39.384888 Check disk analysis for R test Complete
2014-11-26T05:36:39.386868 Starting W test for 118 of 118 disks. Path: 0, duration 30 secs
2014-11-26T05:37:16.515254 Check disk analysis for W test Complete
2014-11-26T05:37:16.517218 Starting r test for 118 of 118 disks. Path: 1, duration 30 secs
2014-11-26T05:37:53.407486 Check disk analysis for r test Complete
2014-11-26T05:37:53.409601 Starting w test for 118 of 118 disks. Path: 1, duration 30 secs
2014-11-26T05:38:30.421883 Check disk analysis for w test Complete
2014-11-26T05:38:30.423763 Starting R test for 118 of 118 disks. Path: 1, duration 30 secs
2014-11-26T05:39:07.548179 Check disk analysis for R test Complete
2014-11-26T05:39:07.550328 Starting W test for 118 of 118 disks. Path: 1, duration 30 secs
2014-11-26T05:39:44.675574 Check disk analysis for W test Complete
gsscheckdisks displays an error count if any of the drives under test (and path) experience I/O errors. If there are errors on any disks, the output identifies the failing disks. The output details the performance and errors seen by the drives and is saved in the /tmp/checkdisk directory of the management server node (or I/O server node if it is called from there) for further analysis. There are three files in this directory.
- hostdiskana[0-1].csv contains summary results of disk I/O throughput of each device every second and a one-line summary of each device showing throughput and error count.
- diskiostat.csv contains details of the /proc/iostat data for every second for offline detailed analysis of disk performance. The format of the data is: column 1: time epoch, column 2: node where run, column 3: device. Columns 4 through 11 are a dump of /proc/iostat.
- deviceerr.csv contains the drive error count. The format of the data is: column 1: time epoch, column 2: node where run, column 3: device, column 4: I/O issued, column 5: I/O completed, column 6: I/O error.
Note: With a default test duration of 30 seconds for each test case and a batch size of 60 drives, it can take up to 20 minutes per node for a GL4 system.
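As a rough idea of where that figure comes from: each batch runs four I/O tests (r, w, R, W) over two paths at about 30 seconds each, or roughly 4 minutes per batch of 60 drives; a GL4 building block has four enclosures (about double the 118 drives shown in the examples above), so roughly four batches per node, which with inventory overhead approaches 20 minutes.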
See gsscheckdisks command for more information about this command.
Set up high-speed networking
Set up the high-speed network that will be used for cluster data communication. See Networking: creating a bonded interface for more information.
Choose the hostname that will be associated with the high-speed network IP address. Typically, the hostname associated with the high-speed network is derived from the xCAT hostname using a prefix or suffix. Before creating the cluster in the next step, high-speed networking must be configured with the proper IP address and hostname. See Node name considerations for more information.
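The referenced topic describes the supported procedure for creating a bonded interface. The following is only a minimal sketch on RHEL 7.1 using nmcli; the connection name, member interface names, bonding mode, and address are assumptions that must be adapted to your environment and switch configuration.
# create the bond and add two example member ports
nmcli connection add type bond con-name bond-hs ifname bond0 mode 802.3ad
nmcli connection add type bond-slave ifname enP3p9s0f0 master bond0
nmcli connection add type bond-slave ifname enP3p9s0f1 master bond0
# assign the high-speed address and bring the bond up
nmcli connection modify bond-hs ipv4.method manual ipv4.addresses 172.10.0.12/24
nmcli connection up bond-hs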
Create the GPFS cluster
Run the gssgencluster command on the management server to create the cluster. This command creates a GPFS cluster using all of the nodes in the node group if you specify the -G option. You can also provide a list of names using the -N option. The command assigns server licenses to each I/O server node, so it prompts for license acceptance (or use the -accept-license option). It applies the best-practice GPFS configuration parameters for a GNR-based NSD server. At the end of cluster creation, the SAS adapter firmware, storage enclosure firmware, and drive firmware are upgraded if needed. To bypass the firmware update, specify the --no-fw-update option.
Note: This command could take some time to run.
See gssgencluster command for more information about this command.
To verify the cluster configuration, run:
mmlscluster
The system displays output similar to this:
[root@gssio1 ~]# mmlscluster
GPFS cluster information
========================
GPFS cluster name: test01.gpfs.net
GPFS cluster id: 14599547031220361759
GPFS UID domain: test01.gpfs.net
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
Repository type: CCR
Node Daemon node name IP address Admin node name Designation
-------------------------------------------------------------------------
1 gssio1-hs.gpfs.net 172.45.45.23 gssio1-hs.gpfs.net quorum-manager
2 gssio2-hs.gpfs.net 172.45.45.24 gssio2-hs.gpfs.net quorum-manager
Verify that the GPFS cluster is active
mmgetstate -a
The system displays output similar to this:
[root@gssio1 ~]# mmgetstate -a
Node number Node name GPFS state
------------------------------------------
1 gssio1-hs active
2 gssio2-hs active
Create the recovery groups
The gssgenclusterrgs command creates the recovery groups (RGs) and the associated log tip vdisk, log backup vdisk, and log home vdisk. This command can create NSDs and file systems for simple configurations that require one file system. More flexibility can be achieved using gssgencluster to create the recovery groups only and using gssgenvdisks (the preferred method) to create data vdisks, metadata vdisks, NSDs, and file systems. For backward compatibility, the gssgenclusterrgs command continues to support vdisk, NSD, and file system creation.
gssgenclusterrgs creates and saves the stanza files for the data and metadata vdisks and NSD. The stanza files are located in the /tmp directory of the first node of the first building block with names node1_node2_vdisk.cfg.save and node1_node2_nsd.cfg.save. These files can be edited for further customization.
If a customized recovery stanza file is available, it can be used to create the recovery group. The files must be located on the first node (in the node list) of each building block in /tmp. Their names must be in the format xxxxL.stanza and yyyyR.stanza, where L is for the left recovery group and R is for the right recovery group. The name of the recovery group is derived from the I/O server node's short name (with prefix and suffix) by adding a prefix of rg_. When the --create-nsds option is specified, by default, 1% of the space is left as reserved and the remaining space is used to create the NSDs. The amount of reserved space is user-selectable and the default is 1% of the total raw space. Note that the percentage of reserved space is based on the total raw space (not on the available space) before any redundancy overhead is applied.
If the system already contains recovery groups and log vdisks (created in the previous steps), their creation can be skipped using the appropriate options. This can be useful when NSDs are recreated (for a change in the number of NSDs or block size, for example).
Note 1: This command could take some time to complete.
Note 2: NSDs in a building block are assigned to the same failure group by default. If you have multiple building blocks, the NSDs defined in each building block will have a different failure group for each building block. Carefully consider this information and change the failure group assignment when you are configuring the system for metadata and data replication.
gssgenclusterrgs -G gss_ppc64 --suffix=-hs
The system displays output similar to this:
[root@ems1 ~]# gssgenclusterrgs -G gss_ppc64 --suffix=-hs
2015-06-16T00:12:22.176357 Determining peer nodes
2015-06-16T00:12:23.786661 nodelist: gssio1 gssio2
2015-06-16T00:12:23.786749 Getting pdisk topology from node to create partner list gssio1
2015-06-16T00:12:38.933425 Getting pdisk topology from node to create partner list gssio2
2015-06-16T00:12:54.049202 Getting pdisk topology from node for recoverygroup creation. gssio1
2015-06-16T00:13:06.466809 Getting pdisk topology from node for recoverygroup creation. gssio2
2015-06-16T00:13:25.289541 Stanza files for node pairs gssio1 gssio2
/tmp/SV45221140L.stanza /tmp/SV45221140R.stanza
2015-06-16T00:13:25.289604 Creating recovery group rg_gssio1-hs
2015-06-16T00:13:48.556966 Creating recovery group rg_gssio2-hs
2015-06-16T00:14:17.627686 Creating log vdisks in recoverygroup rg_gssio1-hs
2015-06-16T00:15:14.117554 Creating log vdisks in recoverygroup rg_gssio2-hs
2015-06-16T00:16:30.267607 Task complete.
See gssgenclusterrgs command for more information about this command.
Verify the recovery group configuration
mmlsrecoverygroup
The system displays output similar to this:
[root@gssio1 ~]# mmlsrecoverygroup
declustered
arrays with
recovery group vdisks vdisks servers
------------------ ----------- ------ -------
rg_gssio1-hs 3 3 gssio1-hs.gpfs.net,gssio2-hs.gpfs.net
rg_gssio2-hs 3 3 gssio2-hs.gpfs.net,gssio1-hs.gpfs.net
Each recovery group contains the following declustered arrays:
- NVR contains the NVRAM devices used for the log tip vdisk
- SSD contains the SSD devices used for the log backup vdisk
- DA1 contains the SSD or HDD devices used for the log home vdisk and file system data
- DAn, where n > 1 (depending on the ESS model), contains the SSD or HDD devices used for file system data.
To view the details of a recovery group, run:
mmlsrecoverygroup rg_gssio1-hs -L
The system displays output similar to this:
[root@gssio1 ~]# mmlsrecoverygroup rg_gssio1-hs -L
declustered
recovery group arrays vdisks pdisks format version
----------------- ----------- ------ ------ --------------
rg_gssio1-hs 3 3 61 4.1.0.1
declustered needs replace scrub background activity
array service vdisks pdisks spares threshold free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---------- -------- -------------------------
SSD no 1 1 0,0 1 372 GiB 14 days scrub 4% low
NVR no 1 2 0,0 1 3648 MiB 14 days scrub 4% low
DA1 no 1 58 2,31 2 101 TiB 14 days scrub 0% low
declustered checksum
vdisk RAID code array vdisk size block size granularity state remarks
------------------ ------------------ ----------- ---------- ---------- ----------- ----- -------
rg_gssio1_hs_logtip 2WayReplication NVR 48 MiB 2 MiB 4096 ok logTip
rg_ssio1_hs_logtipbackup Unreplicated SSD 48 MiB 2 MiB 4096 ok logTipBackup
rg_gssio1_hs_loghome 4WayReplication DA1 20 GiB 2 MiB 4096 ok log
config data declustered array VCD spares actual rebuild spare space remarks
------------------ ------------------ ------------- --------------------------------- ----------------
rebuild space DA1 31 35 pdisk
config data max disk group fault tolerance actual disk group fault tolerance remarks
------------------ --------------------------------- --------------------------------- ----------------
rg descriptor 4 drawer 4 drawer limiting fault tolerance
system index 1 enclosure + 1 drawer 4 drawer limited by rg descriptor
vdisk max disk group fault tolerance actual disk group fault tolerance remarks
------------------ --------------------------------- --------------------------------- ----------------
rg_gssio1_hs_logtip 1 pdisk 1 pdisk
rg_gssio1_hs_logtipbackup 0 pdisk 0 pdisk
rg_gssio1_hs_loghome 1 enclosure + 1 drawer 3 drawer limited by rg descriptor
active recovery group server servers
----------------------------------------------- -------
gssio1-hs.gpfs.net gssio1-hs.gpfs.net,gssio2-hs.gpfs.net
Create the vdisk stanza
Use gssgenvdisks to create the vdisk stanza file. By default, the vdisk stanza is stored in /tmp/vdisk1.cfg. Optionally, gssgenvdisks can be used to create vdisks, NSDs, and the file system on existing recovery groups. If no recovery groups are specified, all available recovery groups are used. If the command is run on the management server node (or any other node) that is not part of the cluster, a contact node that is part of the cluster must be specified. The contact node must be reachable from the node (the management server node, for example) where the command is run.
You can use this command to add a suffix to vdisk names, which can be useful when creating multiple file systems. A unique suffix can be used with a vdisk name to associate it with a different file system (examples follow). The default reserve capacity is set to 1%. If the vdisk data block size is less than 8M, the reserved space should be increased as the data vdisk block size decreases.
See gssgenvdisks command for more information about this command.
This command can be used to create a shared-root file system for IBM Spectrum Scale protocol nodes. See Adding IBM Spectrum Scale nodes to an ESS cluster for more information.
Note: NSDs that are in the same building block are given the same failure group by default. If file system replication is set to 2 (m=2 or r=2), there should be more than one building block or the failure group of the NSDs must be adjusted accordingly.
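If you need to adjust failure groups, one place to do it is in the generated vdisk/NSD stanza file (/tmp/vdisk1.cfg by default) before the NSDs are created. The fragment below is only a sketch using the generic GPFS NSD stanza keywords; the names are modeled on the examples in this topic plus a hypothetical second building block (gssio3/gssio4), and the exact stanzas produced by gssgenvdisks may differ.
%nsd:
  nsd=rg_gssio1_hs_Data_8M_2p_1
  servers=gssio1-hs,gssio2-hs
  usage=dataOnly
  pool=data
  failureGroup=1
%nsd:
  nsd=rg_gssio3_hs_Data_8M_2p_1
  servers=gssio3-hs,gssio4-hs
  usage=dataOnly
  pool=data
  failureGroup=2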
In ESS 3.0, gssgenvdisks supports an option for specifying vdisk size in GiB. Note that the vdisk created size can be slightly higher than the size provided as input due to rounding up of the capacity of vdisks.
Reserved space considerations
When all available space is allocated, the reserved space should be increased with decreasing data vdisk block size. A default reserved space of 1% works well for a block size of up to 4 MB. For a 2 MB block size, 2% should be reserved. For a 1 MB block size, reserved space should be increased to 3%.
Example 1:
Create two file systems, one with 20 TB (two vdisks, 10 TB each), and the other with 40 TB (two vdisks, 20 TB each) with a RAID code of 8+3p.
gssgenvdisks --contact-node gssio1 --create-vdisk --create-nsds --create-filesystem
--vdisk-suffix=_fs1 --filesystem-name fs1 --data-vdisk-size 10240
The system displays output similar to this:
[root@ems1 ~]# gssgenvdisks --contact-node gssio1 --create-vdisk --create-nsds --create-filesystem
--vdisk-suffix=_fs1 --filesystem-name fs1 --data-vdisk-size 10240
2015-06-16T00:50:37.254906 Start creating vdisk stanza
vdisk stanza saved in gssio1:/tmp/vdisk1.cfg
2015-06-16T00:50:51.809024 Generating vdisks for nsd creation
2015-06-16T00:51:27.409034 Creating nsds
2015-06-16T00:51:35.266776 Creating filesystem
Filesystem successfully created. Verify failure group of nsds and change as needed.
2015-06-16T00:51:46.688937 Applying data placement policy
2015-06-16T00:51:51.637243 Task complete.
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 246G 2.9G 244G 2% /
devtmpfs 60G 0 60G 0% /dev
tmpfs 60G 0 60G 0% /dev/shm
tmpfs 60G 43M 60G 1% /run
tmpfs 60G 0 60G 0% /sys/fs/cgroup
/dev/sda2 497M 161M 336M 33% /boot
/dev/fs1 21T 160M 21T 1% /gpfs/fs1
The last line shows that file system fs1 was created.
To create the second file system, run:
gssgenvdisks --contact-node gssio1 --create-vdisk --create-nsds --create-filesystem
--vdisk-suffix=_fs2 --filesystem-name fs2 --data-vdisk-size 20480 --raid-code 8+3p
The system displays output similar to this:
[root@ems1 ~]# gssgenvdisks --contact-node gssio1 --create-vdisk --create-nsds --create-filesystem
--vdisk-suffix=_fs2 --filesystem-name fs2 --data-vdisk-size 20480 --raid-code 8+3p
2015-06-16T01:06:59.929580 Start creating vdisk stanza
vdisk stanza saved in gssio1:/tmp/vdisk1.cfg
2015-06-16T01:07:13.019100 Generating vdisks for nsd creation
2015-06-16T01:07:56.688530 Creating nsds
2015-06-16T01:08:04.516814 Creating filesystem
Filesystem successfully created. Verify failure group of nsds and change as needed.
2015-06-16T01:08:16.613198 Applying data placement policy
2015-06-16T01:08:21.637298 Task complete.
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 246G 2.9G 244G 2% /
devtmpfs 60G 0 60G 0% /dev
tmpfs 60G 0 60G 0% /dev/shm
tmpfs 60G 43M 60G 1% /run
tmpfs 60G 0 60G 0% /sys/fs/cgroup
/dev/sda2 497M 161M 336M 33% /boot
/dev/fs1 21T 160M 21T 1% /gpfs/fs1
/dev/fs2 41T 160M 41T 1% /gpfs/fs2
The last line shows that file system fs2 was created.
To list the vdisks, run:
mmlsvdisk
The system displays output similar to this:
[root@gssio1 ~]# mmlsvdisk
declustered block size
vdisk name RAID code recovery group array in KiB remarks
------------------ --------------- ------------------ ----------- ---------- -------
rg_gssio1_hs_Data_8M_2p_1_fs1 8+2p rg_gssio1-hs DA1 8192
rg_gssio1_hs_Data_8M_3p_1_fs2 8+3p rg_gssio1-hs DA1 8192
rg_gssio1_hs_MetaData_8M_2p_1_fs1 3WayReplication rg_gssio1-hs DA1 1024
rg_gssio1_hs_MetaData_8M_3p_1_fs2 4WayReplication rg_gssio1-hs DA1 1024
rg_gssio1_hs_loghome 4WayReplication rg_gssio1-hs DA1 2048 log
rg_gssio1_hs_logtip 2WayReplication rg_gssio1-hs NVR 2048 logTip
rg_gssio1_hs_logtipbackup Unreplicated rg_gssio1-hs SSD 2048 logTipBackup
rg_gssio2_hs_Data_8M_2p_1_fs1 8+2p rg_gssio2-hs DA1 8192
rg_gssio2_hs_Data_8M_3p_1_fs2 8+3p rg_gssio2-hs DA1 8192
rg_gssio2_hs_MetaData_8M_2p_1_fs1 3WayReplication rg_gssio2-hs DA1 1024
rg_gssio2_hs_MetaData_8M_3p_1_fs2 4WayReplication rg_gssio2-hs DA1 1024
rg_gssio2_hs_loghome 4WayReplication rg_gssio2-hs DA1 2048 log
rg_gssio2_hs_logtip 2WayReplication rg_gssio2-hs NVR 2048 logTip
rg_gssio2_hs_logtipbackup Unreplicated rg_gssio2-hs SSD 2048 logTipBackup
Example 2a:
To create a file system with a 1 MB data block size, 3% reserved space, and the default settings for all of the other options, run:
gssgenvdisks --contact-node gssio1 --create-vdisk --create-filesystem --data-blocksize 1M
--reserved-space 3
The system displays output similar to this:
[root@ems1 ~]# gssgenvdisks --contact-node gssio1 --create-vdisk --create-filesystem
--data-blocksize 1M --reserved-space 3
2015-06-16T01:49:07.963323 Start creating vdisk stanza
vdisk stanza saved in gssio1:/tmp/vdisk1.cfg
2015-06-16T01:49:21.210383 Generating vdisks for nsd creation
2015-06-16T01:52:19.688953 Creating nsds
2015-06-16T01:52:27.766494 Creating filesystem
Filesystem successfully created. Verify failure group of nsds and change as needed.
2015-06-16T01:52:47.249103 Applying data placement policy
2015-06-16T01:52:51.896720 Task complete.
Example 2b:
To create a file system with a 4 MB data block size, 2% reserved space, and the default settings for all of the other options, run:
gssgenvdisks --contact-node gssio1 --create-vdisk --create-filesystem --data-blocksize 4M --reserved-space 2
The system displays output similar to this:
[root@ems1 ~]# gssgenvdisks --contact-node gssio1 --create-vdisk --create-filesystem --data-blocksize 4M
--reserved-space 2
2015-06-16T01:25:54.455588 Start creating vdisk stanza
vdisk stanza saved in gssio1:/tmp/vdisk1.cfg
2015-06-16T01:26:07.443263 Generating vdisks for nsd creation
2015-06-16T01:27:46.671050 Creating nsds
2015-06-16T01:27:54.296765 Creating filesystem
Filesystem successfully created. Verify failure group of nsds and change as needed.
2015-06-16T01:28:07.279192 Applying data placement policy
2015-06-16T01:28:11.836822 Task complete.
Example 3:
Suppose you want to create three file systems. The first file system is called fsystem0; keep 66% of the space reserved for future file system creation. For the second file system, fsystem1, keep 33% reserved. For the third file system, fsystem2, keep 1% reserved. Because you are creating multiple file systems, you must specify a unique suffix for vdisk creation. Specify _fs0 as the suffix of the vdisk names for the first file system, and specify a RAID code of 8+3p for the data vdisks. To create the first file system, run:
gssgenvdisks --create-vdisk --vdisk-suffix _fs0 --raid-code 8+3p --create-filesystem
--filesystem-name fsystem0 --reserved-space-percent 66
The system displays output similar to this:
[root@ems1 ~]# gssgenvdisks --create-vdisk --vdisk-suffix _fs0 --raid-code 8+3p --create-filesystem
--filesystem-name fsystem0 --reserved-space-percent 66
2015-03-13T07:04:12.703294 Start creating vdisk stanza
2015-03-13T07:04:12.703364 No contact node provided. Using current node. ems1
vdisk stanza saved in ems1:/tmp/vdisk1.cfg
2015-03-13T07:04:33.088067 Generating vdisks for nsd creation
2015-03-13T07:05:44.648360 Creating nsds
2015-03-13T07:05:53.517659 Creating filesystem
Filesystem successfully created. Verify failure group of nsds and change as needed.
2015-03-13T07:06:07.416392 Applying data placement policy
2015-03-13T07:06:12.748168 Task complete.
To create the second file system, fsystem1, run:
gssgenvdisks --create-vdisk --vdisk-suffix _fs1 --raid-code 8+3p --create-filesystem --filesystem-name
fsystem1 --reserved-space-percent 33
The system displays output similar to this:
[root@ems1 ~]# gssgenvdisks --create-vdisk --vdisk-suffix _fs1 --raid-code 8+3p --create-filesystem --filesystem-name
fsystem1 --reserved-space-percent 33
2015-03-13T07:11:14.649102 Start creating vdisk stanza
2015-03-13T07:11:14.649189 No contact node provided. Using current node. ems1
vdisk stanza saved in ems1:/tmp/vdisk1.cfg
2015-03-13T07:11:34.998352 Generating vdisks for nsd creation
2015-03-13T07:12:46.858365 Creating nsds
2015-03-13T07:12:55.416322 Creating filesystem
Filesystem successfully created. Verify failure group of nsds and change as needed.
2015-03-13T07:13:09.488075 Applying data placement policy
2015-03-13T07:13:14.756651 Task complete.
To create the third file system, fsystem2, run:
gssgenvdisks --create-vdisk --vdisk-suffix _fs2 --raid-code 8+3p --create-filesystem --filesystem-name
fsystem2 --reserved-space-percent 1
The system displays output similar to this:
[root@ems1 ~]# gssgenvdisks --create-vdisk --vdisk-suffix _fs2 --raid-code 8+3p --create-filesystem --filesystem-name
fsystem2 --reserved-space-percent 1
2015-03-13T07:13:37.191809 Start creating vdisk stanza
2015-03-13T07:13:37.191886 No contact node provided. Using current node. ems1
vdisk stanza saved in ems1:/tmp/vdisk1.cfg
2015-03-13T07:13:57.548238 Generating vdisks for nsd creation
2015-03-13T07:15:08.838311 Creating nsds
2015-03-13T07:15:16.666115 Creating filesystem
Filesystem successfully created. Verify failure group of nsds and change as needed.
2015-03-13T07:15:30.532905 Applying data placement policy
2015-03-13T07:15:35.876333 Task complete.
To list the vdisks, run:
mmlsvdisk
The system displays output similar to this:
[root@ems1 ~]# mmlsvdisk
declustered block size
vdisk name RAID code recovery group array in KiB remarks
------------------ --------------- ------------------ ----------- ---------- -------
rg_gssio1_hs_Data_8M_3p_1_fs0 8+3p rg_gssio1-hs DA1 8192
rg_gssio1_hs_Data_8M_3p_1_fs1 8+3p rg_gssio1-hs DA1 8192
rg_gssio1_hs_Data_8M_3p_1_fs2 8+3p rg_gssio1-hs DA1 8192
rg_gssio1_hs_MetaData_8M_3p_1_fs0 4WayReplication rg_gssio1-hs DA1 1024
rg_gssio1_hs_MetaData_8M_3p_1_fs1 4WayReplication rg_gssio1-hs DA1 1024
rg_gssio1_hs_MetaData_8M_3p_1_fs2 4WayReplication rg_gssio1-hs DA1 1024
rg_gssio1_hs_loghome 4WayReplication rg_gssio1-hs DA1 2048 log
rg_gssio1_hs_logtip 2WayReplication rg_gssio1-hs NVR 2048 logTip
rg_gssio1_hs_logtipbackup Unreplicated rg_gssio1-hs SSD 2048 logTipBackup
rg_gssio2_hs_Data_8M_3p_1_fs0 8+3p rg_gssio2-hs DA1 8192
rg_gssio2_hs_Data_8M_3p_1_fs1 8+3p rg_gssio2-hs DA1 8192
rg_gssio2_hs_Data_8M_3p_1_fs2 8+3p rg_gssio2-hs DA1 8192
rg_gssio2_hs_MetaData_8M_3p_1_fs0 4WayReplication rg_gssio2-hs DA1 1024
rg_gssio2_hs_MetaData_8M_3p_1_fs1 4WayReplication rg_gssio2-hs DA1 1024
rg_gssio2_hs_MetaData_8M_3p_1_fs2 4WayReplication rg_gssio2-hs DA1 1024
rg_gssio2_hs_loghome 4WayReplication rg_gssio2-hs DA1 2048 log
rg_gssio2_hs_logtip 2WayReplication rg_gssio2-hs NVR 2048 logTip
rg_gssio2_hs_logtipbackup Unreplicated rg_gssio2-hs SSD 2048 logTipBackup
To list the NSDs and the file systems they belong to, run:
mmlsnsd
The system displays output similar to this:
[root@ems1 ~]# mmlsnsd
File system Disk name NSD servers
---------------------------------------------------------------------------
fsystem0 rg_gssio1_hs_Data_8M_3p_1_fs0 gssio1-hs,gssio2-hs
fsystem0 rg_gssio1_hs_MetaData_8M_3p_1_fs0 gssio1-hs,gssio2-hs
fsystem0 rg_gssio2_hs_Data_8M_3p_1_fs0 gssio2-hs,gssio1-hs
fsystem0 rg_gssio2_hs_MetaData_8M_3p_1_fs0 gssio2-hs,gssio1-hs
fsystem1 rg_gssio1_hs_Data_8M_3p_1_fs1 gssio1-hs,gssio2-hs
fsystem1 rg_gssio1_hs_MetaData_8M_3p_1_fs1 gssio1-hs,gssio2-hs
fsystem1 rg_gssio2_hs_Data_8M_3p_1_fs1 gssio2-hs,gssio1-hs
fsystem1 rg_gssio2_hs_MetaData_8M_3p_1_fs1 gssio2-hs,gssio1-hs
fsystem2 rg_gssio1_hs_Data_8M_3p_1_fs2 gssio1-hs,gssio2-hs
fsystem2 rg_gssio1_hs_MetaData_8M_3p_1_fs2 gssio1-hs,gssio2-hs
fsystem2 rg_gssio2_hs_Data_8M_3p_1_fs2 gssio2-hs,gssio1-hs
fsystem2 rg_gssio2_hs_MetaData_8M_3p_1_fs2 gssio2-hs,gssio1-hs
Check the file system configuration
mmlsfs all
The system displays output similar to this:
[root@gssio1 ~]# mmlsfs all
File system attributes for /dev/gpfs0:
======================================
flag value description
------------------- ------------------------ -----------------------------------
-f 32768 Minimum fragment size in bytes (system pool)
262144 Minimum fragment size in bytes (other pools)
-i 4096 Inode size in bytes
-I 32768 Indirect block size in bytes
-m 1 Default number of metadata replicas
-M 2 Maximum number of metadata replicas
-r 1 Default number of data replicas
-R 2 Maximum number of data replicas
-j scatter Block allocation type
-D nfs4 File locking semantics in effect
-k all ACL semantics in effect
-n 32 Estimated number of nodes that will mount file system
-B 1048576 Block size (system pool)
8388608 Block size (other pools)
-Q none Quotas accounting enabled
none Quotas enforced
none Default quotas enabled
--perfileset-quota No Per-fileset quota enforcement
--filesetdf No Fileset df enabled?
-V 14.10 (4.1.0.4) File system version
--create-time Tue Jun 16 02:49:45 2015 File system creation time
-z No Is DMAPI enabled?
-L 4194304 Logfile size
-E Yes Exact mtime mount option
-S No Suppress atime mount option
-K whenpossible Strict replica allocation option
--fastea Yes Fast external attributes enabled?
--encryption No Encryption enabled?
--inode-limit 134217728 Maximum number of inodes
--log-replicas 0 Number of log replicas
--is4KAligned Yes is4KAligned?
--rapid-repair Yes rapidRepair enabled?
--write-cache-threshold 0 HAWC Threshold (max 65536)
-P system;data Disk storage pools in file system
-d rg_gssio1_hs_Data_8M_2p_1; Disks in file system
rg_gssio1_hs_MetaData_8M_2p_1;
rg_gssio2_hs_Data_8M_2p_1;
rg_gssio2_hs_MetaData_8M_2p_1
-A yes Automatic mount option
-o none Additional mount options
-T /gpfs/gpfs0 Default mount point
--mount-priority 0 Mount priority
Mount the file system
mmmount device -a
where device is the name of the file system.
The default file system name is gpfs0.
For example, run:
mmmount gpfs0 -a
To check whether the file system is mounted properly, run:
mmlsmount gpfs0 -L
The system displays output similar to this:
[root@gssio1 ~]# mmlsmount gpfs0 -L
File system gpfs0 is mounted on 2 nodes:
172.45.45.23 gssio1-hs
172.45.45.24 gssio2-hs
To check file system space usage, run:
df
The system displays output similar to this:
[root@gssio1 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda3 257922000 2943152 254978848 2% /
devtmpfs 62265728 0 62265728 0% /dev
tmpfs 62302080 0 62302080 0% /dev/shm
tmpfs 62302080 43584 62258496 1% /run
tmpfs 62302080 0 62302080 0% /sys/fs/cgroup
/dev/sda2 508588 164580 344008 33% /boot
/dev/gpfs0 154148405248 163840 154148241408 1% /gpfs/gpfs0
Immediately after creation, the file system usage might temporarily show as 99%.
Test the file system using gpfsperf
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /gpfs/gpfs0/testfile1 -n 200G -r 16M -th 32
The system displays output similar to this:
[root@gssio1 ~]# /usr/lpp/mmfs/samples/perf/gpfsperf create seq /gpfs/gpfs0/testfile1 -n 200G -r 16M -th 32
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /gpfs/gpfs0/testfile1
recSize 16M nBytes 200G fileSize 16G
nProcesses 1 nThreadsPerProcess 32
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 4689394.83 Kbytes/sec, thread utilization 0.925
The block size must match the data vdisk block size.
To verify that the ESS is operating as expected, you can use gpfsperf (/usr/lpp/mmfs/samples/perf/gpfsperf) to run other I/O tests, such as sequential read and write tests.
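For example, a sequential read test of the same file (a sketch; it assumes testfile1 still exists from the create test above):
/usr/lpp/mmfs/samples/perf/gpfsperf read seq /gpfs/gpfs0/testfile1 -r 16M -th 32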
Add nodes to the cluster
Update the management server node by running the gss_updatenode and gss_ofed postscripts with updatenode, rebooting after each one:
updatenode ManagementServerNodeName -P gss_updatenode
reboot
updatenode ManagementServerNodeName -P gss_ofed
reboot
The I/O server nodes must be deployed properly and the high-speed network configured before gssaddnode can be used to add these nodes to the ESS cluster. gssaddnode adds the nodes to the cluster, runs the product license acceptance tool, configures the nodes (using gssServerConfig.sh or gssClientConfig.sh), and updates the host adapter, enclosure, and drive firmware. Do not use gssaddnode to add non-ESS (I/O server or management server) nodes to the cluster. Use mmaddnode instead.
On the gssaddnode command, the -N ADD-NODE-LIST option specifies the list of nodes that are being added. For the management server node, this is that node's host name. The --nodetype option specifies the type of node that is being added; for the management server node, the value is ems. This command must be run on the management server node when that node is being added. It can also be used to add I/O server nodes to an existing cluster.
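A minimal sketch that uses only the options described above (it assumes ems1 is the management server host name; additional options might be required in your environment, so check the gssaddnode command reference):
gssaddnode -N ems1 --nodetype ems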
See gssaddnode command for more information about this command, including an example.
To verify that the nodes were added to the cluster, run:
mmlscluster
The system displays output similar to this:
[root@ems1 ~]# mmlscluster
GPFS cluster information
========================
GPFS cluster name: test01.gpfs.net
GPFS cluster id: 14599547031220361759
GPFS UID domain: test01.gpfs.net
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
Repository type: CCR
Node Daemon node name IP address Admin node name Designation
-------------------------------------------------------------------------
1 gssio1-hs.gpfs.net 172.45.45.23 gssio1-hs.gpfs.net quorum-manager
2 gssio2-hs.gpfs.net 172.45.45.24 gssio2-hs.gpfs.net quorum-manager
5 ems1-hs.gpfs.net 172.45.45.22 ems1-hs.gpfs.net quorum
Check the installed software
Verify that the key components are installed correctly. See Checking the code levels.
Run a stress test
After the system is configured correctly and all marginal components are out of the system, run a stress test to stress the disk and network elements. Use the gssstress command to run a stress test on the system.
Note: gssstress is not a performance tool, so performance numbers shown should not be interpreted as performance of the system.
gssstress /gpfs/gpfs0 gssio1 gssio2
The system displays output similar to this:
[root@ems1 ~]# gssstress /gpfs/gpfs0 gssio1 gssio2
1 gssio1 create
1 gssio2 create
Waiting for 1 create to finish
create seq /gpfs/gpfs0/stressFile.1.gssio1 16777216 214748364800 214748364800 1 16 0 1 0 0 1 1 0 0 0 1728569.28 0.980
create seq /gpfs/gpfs0/stressFile.1.gssio2 16777216 214748364800 214748364800 1 16 0 1 0 0 1 1 0 0 0 1706918.52 0.981
1 gssio1 read
1 gssio2 read
Waiting for 1 read to finish
read seq /gpfs/gpfs0/stressFile.1.gssio1 16777216 214748364800 214748364800 1 16 0 1 0 0 1 1 0 0 0 2776149.11 0.997
read seq /gpfs/gpfs0/stressFile.1.gssio2 16777216 214748364800 214748364800 1 16 0 1 0 0 1 1 0 0 0 2776185.62 0.998
1 gssio1 write
1 gssio2 write
Waiting for 1 write to finish
write seq /gpfs/gpfs0/stressFile.1.gssio2 16777216 214748364800 214748364800 1 16 0 1 0 0 1 1 0 0 0 1735661.04 0.971
write seq /gpfs/gpfs0/stressFile.1.gssio1 16777216 214748364800 214748364800 1 16 0 1 0 0 1 1 0 0 0 1733622.96 0.971
1 gssio1 read
1 gssio2 read
Waiting for 1 read to finish
read seq /gpfs/gpfs0/stressFile.1.gssio1 16777216 214748364800 214748364800 1 16 0 1 0 0 1 1 0 0 0 2774776.83 0.997
read seq /gpfs/gpfs0/stressFile.1.gssio2 16777216 214748364800 214748364800 1 16 0 1 0 0 1 1 0 0 0 2770247.35 0.998
gpfsperf is run with the nolabels option, which produces one line of output for each test. The format of the output is: operation, I/O pattern, file name, record size, number of bytes, file size, number of processes, number of threads, stride records, inv, dio, shm, fsync, cycle, reltoken, aio, osync, rate, util. Throughput (the rate field) is shown in the second field from the end of the line. While gssstress is running, you can log on to each node and run dstat to view the disk and network load on the node.
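For example (a sketch; dstat options can vary by version), on an I/O server node you might run:
dstat --disk --net 5
to sample disk and network throughput every 5 seconds while the stress test runs.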

Note: By default, each iteration reads and writes 800 GB. With 20 iterations, the test performs a total of 16 TB of I/O from each node and therefore can take some time to complete. For a shorter completion time, specify a lower iteration count, a shorter operation list, or both. The test can be interrupted by pressing <Ctrl-c>.
While the stress test is running, watch the system log on the I/O server nodes for disk errors. Errors similar to the following indicate a failing disk:
Dec 28 18:38:16 gssio5 kernel: sd 4:0:74:0: [sdin] CDB:
Dec 28 18:38:16 gssio5 kernel: Read(32): 7f 00 00 00 00 00 00 18 00 09 20 00 00 00 00 00 10 24 b4 90 10 24 b4 90 00 00 00 00 00 00 04 10
Dec 28 18:38:16 gssio5 kernel: end_request: critical medium error, dev sdin, sector 270840976
Dec 28 18:38:16 gssio5 mmfs: [E] Pdisk e1d2s03 of RG gssio5-hs path /dev/sdin: I/O error on read: sector 270840976 length 4112 err 5.
At the end of the stress test, check the enclosures and disks for any errors.
Check the enclosures
mmlsenclosure all
The system displays output similar to this:
[root@gssio1 gpfs0]# mmlsenclosure all
needs
serial number service nodes
------------- ------- ------
SV24819545 no gssio1-ib0.data.net.gpfs.net
SV32300072 no gssio1-ib0.data.net.gpfs.net
mmlsenclosure SV24819545 -L -N all
The system displays output similar to this:
[root@gssio1 gpfs0]# mmlsenclosure SV24819545 -L -N all
needs
serial number service nodes
------------- ------- ------
SV24819545 no gssio1-ib0.data.net.gpfs.net,gssio2-ib0.data.net.gpfs.net
component type serial number component id failed value unit properties
-------------- ------------- ------------ ------ ----- ---- ----------
dcm SV24819545 DCM_0A no
dcm SV24819545 DCM_0B no
dcm SV24819545 DCM_1A no
dcm SV24819545 DCM_1B no
dcm SV24819545 DCM_2A no
dcm SV24819545 DCM_2B no
dcm SV24819545 DCM_3A no
dcm SV24819545 DCM_3B no
dcm SV24819545 DCM_4A no
dcm SV24819545 DCM_4B no
component type serial number component id failed value unit properties
-------------- ------------- ------------ ------ ----- ---- ----------
enclosure SV24819545 ONLY no
component type serial number component id failed value unit properties
-------------- ------------- ------------ ------ ----- ---- ----------
esm SV24819545 ESM_A no REPORTER
esm SV24819545 ESM_B no NOT_REPORTER
component type serial number component id failed value unit properties
-------------- ------------- ------------ ------ ----- ---- ----------
fan SV24819545 0_TOP_LEFT no 4890 RPM
fan SV24819545 1_BOT_LEFT no 4940 RPM
fan SV24819545 2_BOT_RGHT no 4890 RPM
fan SV24819545 3_TOP_RGHT no 5040 RPM
component type serial number component id failed value unit properties
-------------- ------------- ------------ ------ ----- ---- ----------
powerSupply SV24819545 0_TOP no
powerSupply SV24819545 1_BOT no
component type serial number component id failed value unit properties
-------------- ------------- ------------ ------ ----- ---- ----------
tempSensor SV24819545 DCM_0A no 46 C
tempSensor SV24819545 DCM_0B no 38 C
tempSensor SV24819545 DCM_1A no 47 C
tempSensor SV24819545 DCM_1B no 40 C
tempSensor SV24819545 DCM_2A no 45 C
tempSensor SV24819545 DCM_2B no 40 C
tempSensor SV24819545 DCM_3A no 45 C
tempSensor SV24819545 DCM_3B no 37 C
tempSensor SV24819545 DCM_4A no 45 C
tempSensor SV24819545 DCM_4B no 40 C
tempSensor SV24819545 ESM_A no 39 C
tempSensor SV24819545 ESM_B no 41 C
tempSensor SV24819545 POWERSUPPLY_BOT no 39 C
tempSensor SV24819545 POWERSUPPLY_TOP no 36 C
component type serial number component id failed value unit properties
-------------- ------------- ------------ ------ ----- ---- ----------
voltageSensor SV24819545 12v no 12 V
voltageSensor SV24819545 ESM_A_1_0v no 0.98 V
voltageSensor SV24819545 ESM_A_1_2v no 1.19 V
voltageSensor SV24819545 ESM_A_3_3v no 3.31 V
voltageSensor SV24819545 ESM_A_5v no 5.04 V
voltageSensor SV24819545 ESM_B_1_0v no 1 V
voltageSensor SV24819545 ESM_B_1_2v no 1.19 V
voltageSensor SV24819545 ESM_B_3_3v no 3.31 V
voltageSensor SV24819545 ESM_B_5v no 5.07 V
Check for failed disks
mmlspdisk all --not-ok
The system displays output similar to this:
[root@gssio1]# mmlspdisk all --not-ok
pdisk:
replacementPriority = 7.34
name = "e1d2s01"
device = ""
recoveryGroup = "gssio1"
declusteredArray = "DA1"
state = "failing/noPath/systemDrain/noRGD/noVCD/noData"
capacity = 2000381018112
freeSpace = 1999307276288
fru = "42D0768"
location = "SV12616682-2-1"
WWN = "naa.5000C500262630DF"
server = "gssio1.gpfs.net"
reads = 295
writes = 915
bytesReadInGiB = 0.576
bytesWrittenInGiB = 1.157
IOErrors = 0
IOTimeouts = 0
mediaErrors = 0
checksumErrors = 0
pathErrors = 0
relativePerformance = 1.003
dataBadness = 0.000
rgIndex = 9
userLocation = "Enclosure SV12616682 Drawer 2 Slot 1"
userCondition = "replaceable"
hardware = "IBM-ESXS ST32000444SS BC2B 9WM40AQ10000C1295TH8"
hardwareType = Rotating 7200
nPaths = 0 active 0 total
mmlspdisk displays the details of the failed or failing disk, including the pdisk name, the enclosure serial number, and the location of the disk.
Replacing a disk
If a disk fails and needs to be replaced, follow the proper disk replacement procedure. Improper disk replacement can greatly increase the possibility of data loss. Use the mmchcarrier command to replace a failed pdisk. See GPFS Native RAID: Administration for more information.
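A hedged outline of the replacement flow for the failing pdisk shown in the example above (pdisk e1d2s01 in recovery group gssio1); confirm the exact steps in GPFS Native RAID: Administration before acting:
mmchcarrier gssio1 --release --pdisk "e1d2s01"
After the drive is physically replaced in the reported enclosure, drawer, and slot, run:
mmchcarrier gssio1 --replace --pdisk "e1d2s01"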
Run gnrhealthcheck
gnrhealthcheck
The system displays output similar to this:
[root@gssio1 gpfs0]# gnrhealthcheck
################################################################
# Beginning topology checks.
################################################################
Topology checks successful.
################################################################
# Beginning enclosure checks.
################################################################
Enclosure checks successful.
################################################################
# Beginning recovery group checks.
################################################################
Recovery group checks successful.
################################################################
# Beginning pdisk checks.
################################################################
Pdisk checks successful.
Collecting data
Use the gsssnap command to collect vdisk information.
Save the output with an identifier so that it can be mapped to the installed system.
Run the following command from any I/O server node:

gsssnap
The configuration and service data collected at the end of the installation can be very
valuable during future problem determination and troubleshooting.
Send the collected service data to your IBM representative. See gsssnap command for more information.
Cleaning up the system
If you need to perform a quick cleanup of the system, follow these steps:
- ssh to any I/O server node.
- To delete the file system and the associated NSDs and vdisks, run:
/opt/ibm/gss/tools/samples/gssdelvdisks
- To shut down GPFS and delete the cluster, run:
mmshutdown -a
mmdelnode -N all
