
Troubleshooting
This topic includes troubleshooting information for ESS.
If I/O server node installation fails
makeconservercf
nodeset gss_ppc64 osimage=rhels7.1-ppc64-install-gss
rnetboot gss_ppc64 -V
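The three commands above can be wrapped so that the sequence stops at the first failure and can be previewed before anything is changed. This is a minimal sketch, not part of ESS or xCAT: the `run` helper, the `DRY_RUN` variable, and the `reinstall_io_nodes` function are hypothetical names introduced for illustration, and the script assumes it runs on the management server.

```shell
#!/bin/sh
# Sketch only: run(), DRY_RUN, and reinstall_io_nodes() are illustrative
# helpers, not ESS or xCAT components.
DRY_RUN="${DRY_RUN:-1}"   # default to preview mode; set DRY_RUN=0 to execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"        # print the command instead of running it
    else
        "$@"
    fi
}

# reinstall_io_nodes GROUP OSIMAGE
# Runs the three commands from the text in order, stopping on the first error.
reinstall_io_nodes() {
    group="$1"
    osimage="$2"
    run makeconservercf &&
    run nodeset "$group" "osimage=$osimage" &&
    run rnetboot "$group" -V
}

reinstall_io_nodes gss_ppc64 rhels7.1-ppc64-install-gss
```

With the default `DRY_RUN=1` the script only prints the commands it would run.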
Red Hat Enterprise Linux update considerations
ESS 3.0 supports Red Hat Enterprise Linux 7.1 (kernel release 3.10.0-229.el7.ppc64). You can update Red Hat Enterprise Linux as needed to address security updates. It is highly recommended that you limit errata updates applied to the Red Hat Enterprise Linux operating system used in the ESS solution to security errata or errata updates requested by service.
Information about a possible update issue follows.
Issue: A yum update command upgrades the Red Hat Enterprise Linux version.
If you are subscribed to Red Hat updates and run yum update, the redhat-release-server package might be updated as well. This could cause issues with OFED installation (mlnxofedinstall, for example) on ESS nodes, because the Red Hat version is not in the installer's supported distribution list.
See Red Hat's solution articles for information about this behavior:
https://access.redhat.com/solutions/33807
https://access.redhat.com/solutions/10185
Resolution:
yum downgrade redhat-release-server
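To tell whether a node has drifted from the supported release, before or after the downgrade, the contents of /etc/redhat-release can be checked. The following is a minimal sketch: the `release_matches` function and the `EXPECTED_RELEASE` variable are hypothetical names, and the check assumes the usual "release 7.1" wording in that file.

```shell
#!/bin/sh
# Sketch only: release_matches() and EXPECTED_RELEASE are illustrative names.
EXPECTED_RELEASE="7.1"    # release supported by ESS 3.0, per the text above

# release_matches TEXT
# Succeeds if TEXT (the contents of /etc/redhat-release) names the
# expected release.
release_matches() {
    case "$1" in
        *"release ${EXPECTED_RELEASE} "*|*"release ${EXPECTED_RELEASE}") return 0 ;;
        *) return 1 ;;
    esac
}

if [ -r /etc/redhat-release ]; then
    if release_matches "$(cat /etc/redhat-release)"; then
        echo "Release matches ${EXPECTED_RELEASE}."
    else
        echo "Release differs from ${EXPECTED_RELEASE}; consider: yum downgrade redhat-release-server"
    fi
fi
```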
Prevention:
If possible, limit running yum update to security-related errata.
See Red Hat's solution article about applying security updates only:
https://access.redhat.com/solutions/10021
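One way to apply the prevention advice is sketched below. It lists the pending security errata and then applies only those, while also excluding the redhat-release-server package. The exclusion is an assumption added here, not a step from the Red Hat article, and might not suit every site. The `run` helper, `DRY_RUN` variable, and `security_update` function are hypothetical names.

```shell
#!/bin/sh
# Sketch only: run(), DRY_RUN, and security_update() are illustrative helpers.
DRY_RUN="${DRY_RUN:-1}"   # default to preview mode; set DRY_RUN=0 to execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"        # print the command instead of running it
    else
        "$@"
    fi
}

# security_update
# Lists pending security errata, then applies security updates only.
# Excluding redhat-release-server is an assumption, meant to keep the
# release package at its current version.
security_update() {
    run yum updateinfo list security &&
    run yum update --security "--exclude=redhat-release-server*"
}

security_update
```

Review the preview output before setting `DRY_RUN=0`, and confirm with service that the exclusion is acceptable for your fix level.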
ESS 3.0 issues
Depending on which fix level you are installing, the following issues might or might not apply to you.
| Issue | Environment affected | Description | Resolution or action |
|---|---|---|---|
| 1. The gssaddnode command fails when trying to add a new node to a cluster so that the total number of nodes is greater than seven. | Clustering. Deployment type: installation or upgrade; GPFS edition: Advanced or Standard; Affected nodes: I/O server, management server | The gssaddnode command allows users to easily add the management server and any new I/O server nodes to a cluster. If you already have an existing GPFS cluster in which the addition of nodes causes the total number of nodes in the cluster to exceed seven, the command will fail due to an array index out of bounds exception. | Here is a workaround for this issue: |
| 2. The gssgenclusterrgs command might fail in configurations with multiple building blocks. | Recovery group creation with the gssgenclusterrgs command. Deployment type: installation or upgrade; GPFS edition: Advanced or Standard; Affected nodes: I/O server | In configurations in which building block host names do not follow a sorted order (a sorted order would be, for example: gssio1, gssio2, gssio3, gssio4), the gssgenclusterrgs command might fail with messages that the partner node cannot be found. | Run the gssgenclusterrgs command. If it fails, rerun it with the -N option, specifying one node of one building block at a time. |
| 3. The sg module fails to load after an upgrade to ESS 3.0. | Hardware validation, firmware updates, ESS GUI. Deployment type: upgrade; GPFS edition: Advanced or Standard; Affected nodes: I/O server | The Linux SCSI Generic (sg) kernel module is required by various commands and components to send SCSI commands to devices that understand them. During the upgrade to ESS 3.0, this module is not loaded, which can prevent firmware updates and hardware topology validation and can limit GUI functionality. | Here is a workaround for this issue: Before updating host adapter firmware on each I/O server node as part of the upgrade, run these commands from the management server: xdsh IoNode "modprobe sg" and then xdsh IoNode "echo sg > /etc/modules-load.d/gss.conf". These commands load the sg module and add it to the gss.conf file so that it is loaded automatically at reboot. To validate, run these commands from the I/O server node: lsmod \| grep sg and then cat /etc/modules-load.d/gss.conf |
| 4. An update of the management server node fails due to a conflict with an older version of the ESS GUI RPM. | Cluster software upgrade. Deployment type: upgrade; GPFS edition: Advanced; Affected nodes: management server | In the GPFS Advanced Edition on ESS 3.0, the gpfs.gss.gui RPM from previous releases causes a conflict when installing the latest version (GPFS 4.1.0.8). The result is a failed update of the management server node to the latest software release. After updating your management server node using the updatenode MgtServerNode -V -P gss_updatenode command, you will see an error similar to the following: If you see this issue, apply the workaround. | To work around this issue, run updatenode again. This will fix the conflict, install the new gpfs.gui RPM, and upgrade the node to the latest ESS 3.0 code. You will then be prompted to reboot in order for the kernel update to complete. After rebooting, run updatenode again to complete the upgrade process. |
| 5. GPFS 4.1.0.8 file systems fail to mount on Red Hat Enterprise Linux 7.1 nodes due to a systemd issue. | Cluster file system. Deployment type: installation or upgrade; GPFS edition: Advanced or Standard; Affected nodes: I/O server, management server | ESS 3.0 contains GPFS 4.1.0.8, which has a problem mounting on Red Hat Enterprise Linux 7.1 due to an issue with systemd. Updating cluster nodes to the latest systemd packages will correct the issue. | To work around this issue, follow these steps: The advisory can be found here: https://rhn.redhat.com/errata/RHBA-2015-0738.html |

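For issue 3 in the table, the step that makes the sg module load at boot can also be written so that it is safe to repeat. The sketch below is meant to run locally on an I/O server node rather than through xdsh. The `ensure_module_listed` function is a hypothetical helper, and unlike the echo command in the table, which overwrites gss.conf, it appends the module name only when it is missing.

```shell
#!/bin/sh
# Sketch only: ensure_module_listed() is an illustrative helper.

# ensure_module_listed MODULE FILE
# Appends MODULE to FILE (one module name per line, as read from
# /etc/modules-load.d at boot) unless an identical line is already present.
ensure_module_listed() {
    module="$1"
    file="$2"
    if [ -f "$file" ] && grep -qxF "$module" "$file"; then
        return 0           # already listed; nothing to do
    fi
    echo "$module" >> "$file"
}

# Example usage on an I/O server node (requires root, so left commented out):
# modprobe sg && ensure_module_listed sg /etc/modules-load.d/gss.conf
```

Appending keeps any other entries that might already be in the file, at the cost of differing slightly from the documented command.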