GPFS commands are unsuccessful
GPFS commands can be unsuccessful for various reasons. Unsuccessful GPFS commands typically produce one or more of the following indications:
- Return codes indicating the GPFS daemon is no longer running.
- Command-specific problems indicating you are unable to access the disks.
- A nonzero return code from the GPFS command.
Some reasons that GPFS commands can be unsuccessful include:
- If all commands are generically unsuccessful, this may be due to a daemon failure. Verify that the GPFS daemon is active. Issue:
mmgetstate
If the daemon is not active, check /var/adm/ras/mmfs.log.latest and /var/adm/ras/mmfs.log.previous on the local node and on the file system manager node. These files enumerate the failing sequence of the GPFS daemon.
If there is a communication failure with the file system manager node, you receive an error and the errno global variable may be set to EIO (I/O error).
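For example, a minimal sketch of this check verifies the daemon state on the local node and across the cluster, and identifies the current file system manager node:
# Check the GPFS daemon state on the local node
mmgetstate
# Check the GPFS daemon state on every node in the cluster
mmgetstate -a
# Identify the current file system manager node
mmlsmgr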
- Verify the GPFS cluster configuration data files are not locked and are accessible. To determine if the GPFS cluster configuration data files are locked, see GPFS cluster configuration data file issues.
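As a basic accessibility check, a sketch like the following can be used; it assumes the default location of the GPFS cluster configuration data file (/var/mmfs/gen/mmsdrfs):
# Confirm the cluster configuration data file is present and readable on this node
ls -l /var/mmfs/gen/mmsdrfs
# Commands that read the configuration data should complete without errors
mmlsconfig
mmlscluster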
- The
ssh command is not functioning correctly. See Authorization problems. If ssh is not functioning properly on a node in the GPFS cluster, a GPFS administration command that needs to run on that node fails with a 'permission is denied' error. The system displays information similar to:
mmlscluster
sshd: 0826-813 Permission is denied.
mmdsh: 6027-1615 k145n02 remote shell process had return code 1.
mmlscluster: 6027-1591 Attention: Unable to retrieve GPFS cluster files from node k145n02
sshd: 0826-813 Permission is denied.
mmdsh: 6027-1615 k145n01 remote shell process had return code 1.
mmlscluster: 6027-1592 Unable to retrieve GPFS cluster files from node k145n01
These messages indicate that ssh is not working properly on nodes k145n01 and k145n02.
If you encounter this type of failure, determine why ssh is not working on the identified node. Then fix the problem.
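A quick way to confirm whether ssh is working is to run a trivial command against the identified nodes; the sketch below uses the node names from the example output above:
# Verify passwordless ssh from the node where the GPFS command was issued
ssh k145n01 date
ssh k145n02 date
# Each command should print the remote node's date without prompting for a password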
- Most problems encountered during file system creation
fall into three classes:
- You did not create the network shared disks (NSDs) that are required to build the file system.
- The creation operation cannot access the disk.
Follow the procedures for checking access to the disk. This can result from a number of factors including those described in NSD and underlying disk subsystem failures.
- Unsuccessful attempt to communicate with the file system manager.
The file system creation runs on the file system manager node. If that node goes down, the mmcrfs command may not succeed.
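A sketch of the usual order of operations follows; the device name gpfs1 and the stanza file /tmp/nsd.stanza are placeholders, not values taken from this topic:
# Create the network shared disks from a stanza file before creating the file system
mmcrnsd -F /tmp/nsd.stanza
# Verify that the NSDs exist and are accessible
mmlsnsd
# Create the file system from the same stanza file
mmcrfs gpfs1 -F /tmp/nsd.stanza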
- If the mmdelnode command was unsuccessful
and you plan to permanently de-install GPFS
from a node, you should first remove the node from the cluster. If this is not done and you run the
mmdelnode command after the mmfs code is
removed, the command fails and displays a message similar to this example:
Verifying GPFS is stopped on all affected nodes ...
k145n05: ksh: /usr/lpp/mmfs/bin/mmremote: not found.
If this happens, power off the node and run the mmdelnode command again.
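A sketch of the recommended order, using the node name from the example message above:
# Stop GPFS on the node that is being removed
mmshutdown -N k145n05
# Remove the node from the cluster while the GPFS code is still installed
mmdelnode -N k145n05
# Only after the node has been removed should the GPFS code be uninstalled from it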
- If you have successfully installed and are operating with the latest level of GPFS, but cannot use the new functions that are available,
it is probable that you have not issued the mmchfs -V full or
mmchfs -V compat command to change the version of the file system. This
command must be issued for each of your file systems.
In addition to mmchfs -V, you may need to run the mmmigratefs command.
Note: You must ensure that all nodes in the cluster have been migrated to the latest level of GPFS code and that you have successfully run the mmchconfig release=LATEST command. Make sure you have operated with the new level of code for some time and are certain you want to migrate to the latest level of GPFS. Issue the mmchfs -V full command only after you have definitely decided to accept the latest level, as this causes disk changes that are incompatible with previous levels of GPFS. For more information, see Upgrading.
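A sketch of the sequence, assuming a file system device named gpfs1:
# Commit the cluster configuration to the new code level
mmchconfig release=LATEST
# Check the current file system format version
mmlsfs gpfs1 -V
# Enable the new functions; this is irreversible and incompatible with earlier GPFS levels
mmchfs gpfs1 -V full
# If required, migrate the remaining on-disk structures
mmmigratefs gpfs1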
For more information about the mmchfs command, see the IBM Storage Scale: Command and Programming Reference Guide.