Troubleshooting by symptom
You might encounter some common problems while using the IBM® Pattern for IBM Storage Scale.
The mount point gets unmounted upon restarting the IBM Storage Scale Client
Symptom: After you restart the IBM Storage Scale Client, the mount point gets unmounted.
Resolution: Run the following command to start the IBM Storage Scale daemon service:
su - gpfsprod -c 'sudo /usr/lpp/mmfs/bin/mmstartup'
Storage Scale active node might show as "Passive Node" type after restart
After you restart IBM Storage Scale nodes, the active primary node might show up as a Passive node. This behavior can cause issues while you use the nodes or add new nodes to the cluster.
Symptom: Actions such as adding a new member on the Passive node type fail.
Resolution:
- Check whether the IBM Storage Scale daemon service is up and running. If not, run the following command to start the daemon service:
su - gpfsprod -c 'sudo /usr/lpp/mmfs/bin/mmstartup'
- After the services are started successfully, make sure that the GPFS file system is mounted by using the df -hk or mmlsmount command.
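The state check above can be scripted. The following sketch extracts a node's state from mmgetstate -a output so a script can decide whether mmstartup is needed; the column layout and the gpfsprod user are assumptions from this pattern's defaults.

```shell
# gpfs_node_state: print the state column for the named node from
# "mmgetstate -a" output fed on stdin. The column order
# (node number, node name, state) is an assumption; verify it
# against your IBM Storage Scale release.
gpfs_node_state() {
    awk -v n="$1" '$2 == n {print $3}'
}

# Intended usage on a node (shown as comments because it needs a
# live cluster):
#   su - gpfsprod -c 'sudo /usr/lpp/mmfs/bin/mmgetstate -a' | gpfs_node_state node1
#   # if the result is not "active":
#   su - gpfsprod -c 'sudo /usr/lpp/mmfs/bin/mmstartup'
```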
IBM Storage Scale Server block volume attachment fails with mmdsh errors
Symptom: Block volume attachment fails with one of the following mmdsh error messages:
Block volume attachment failed with error : mmdsh: Invalid or missing remote shell command: /usr/bin/sshwrap.pl
Block volume attachment failed with error : mmdsh: Invalid or missing remote shell command: /usr/bin/scpwrap.pl
Resolution:
- Copy sshwrap.pl from /usr/lpp/mmfs/bin to /usr/bin/:
cp /usr/lpp/mmfs/bin/sshwrap.pl /usr/bin/
- Copy scpwrap.pl from /usr/lpp/mmfs/bin to /usr/bin/:
cp /usr/lpp/mmfs/bin/scpwrap.pl /usr/bin/
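Both copy steps can be done in one pass. This is a sketch; the source and destination paths default to the ones named in the error messages above and can be overridden, which also makes the function easy to try safely.

```shell
# copy_gpfs_wrappers: copy the sshwrap.pl and scpwrap.pl wrapper
# scripts from the IBM Storage Scale bin directory (default
# /usr/lpp/mmfs/bin) to /usr/bin (default). Returns nonzero if
# either copy fails.
copy_gpfs_wrappers() {
    src="${1:-/usr/lpp/mmfs/bin}"
    dst="${2:-/usr/bin}"
    for f in sshwrap.pl scpwrap.pl; do
        cp "$src/$f" "$dst/" || return 1
    done
}
```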
Download of client private key and client key from mirror node might fail
Symptom: An attempt to retrieve a client private key or client key from the mirror node by using the Retrieve key operation might fail with the following error message:
Retrieve Client Key: The Client key was not found for this configuration.
Resolution: Retrieve the client private key and the client key from the primary node.
Network Shared Disk (NSD) on node goes down after IBM Storage Scale auto revert
Symptom: After the IBM Storage Scale auto revert, Network Shared Disk (NSD) on the node goes down.
Resolution: Run the following command to start the NSD:
su - gpfsprod -c 'sudo /usr/lpp/mmfs/bin/mmchdisk <Name of file system> start -d <Name of NSD>'
Troubleshooting issues in GPFS/IBM Storage Scale pattern type
- Compile the GPFS portability layer for this kernel version in a different virtual machine by using the steps in the Building IBM Storage Scale portability layer after Linux kernel updates topic. Note: In that topic, you can skip the sub-steps of step 3 that start the node and check for the node active state.
- Copy the content that is available in the /lib/modules/<upgraded kernel version>/extra folder from the system where the GPFS portability layer is successful and paste it in the /lib/modules/<upgraded kernel version>/extra folder of the virtual machine where the upgrade failed.
- Run the following command to start GPFS:
su - gpfsprod -c 'sudo /usr/lpp/mmfs/bin/mmstartup'
- Run the following command to check whether all the nodes are in active state:
su - gpfsprod -c 'sudo /usr/lpp/mmfs/bin/mmgetstate -aL'
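The copy step above depends on the upgraded kernel version. As a sketch, a small helper can build the modules path from the running kernel (this assumes both machines run the same upgraded kernel; "buildhost" is a placeholder):

```shell
# gpfs_extra_dir: path that holds the GPFS portability layer modules
# for a given kernel version (defaults to the running kernel).
gpfs_extra_dir() {
    printf '/lib/modules/%s/extra' "${1:-$(uname -r)}"
}

# Intended usage (shown as comments because it needs two machines):
#   extra=$(gpfs_extra_dir)
#   scp -r "buildhost:$extra/." "$extra/"
#   su - gpfsprod -c 'sudo /usr/lpp/mmfs/bin/mmstartup'
#   su - gpfsprod -c 'sudo /usr/lpp/mmfs/bin/mmgetstate -aL'
```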
Mixing of IP address formats
Never mix IPv4 and IPv6 IBM Storage Scale pattern deployments, whether they are client, primary, mirror, or tiebreaker deployments. This scenario is not supported.
IBM Storage Scale Clients - File set names
Do not include blank spaces in file set names.
IBM Storage Scale Clients - Link directories
Do not include blank spaces in link directory names.
IBM Storage Scale Clients - Page pool memory not available
If you see a message in /var/adm/ras/mmfs.log.latest saying that the page pool memory cannot be obtained, the virtual machine does not have enough memory to support IBM Storage Scale, and you most likely must allocate more memory to the client pattern in its configuration values. Ensure that the allocated memory is at least 4 GB.
If your client fails to deploy, run the Status operation. You might see an IBM Storage Scale error that prevents the page pool from being allocated. Correct any errors and try the deployment again.
IBM Storage Scale Clients - File set already exists
Check whether a file set name used by your client deployment already exists. If it does, unintentional file overwriting might occur. Use the Cluster status operation on the server to list the existing file sets.
IBM Storage Scale Clients - File set quota size is not as expected
If you find that the quota size is not what you expected, use the Cluster status operation on the server to list the existing file sets. If the size is not what the client expects, the reason most likely is that some other client created the file set. If you need a different value, contact the original owner. If a change is agreed to, run the Change File Set operation on the Primary IBM Storage Scale instance to change the size of the quota.
IBM Storage Scale Client - Quota Size Constraints on the file set are ignored
Only non-root users are affected by the file set quota settings.
IBM Storage Scale Client - Connect to server operation fails to update the remote file system information
The Connect to server operation might fail, resulting in an error message similar to the following example:
Web_Application-was.11406729401441.GPFSClient: Connect to server: Failed to update the remote file system information for kent ['/usr/lpp/mmfs/bin/mmremotefs', 'update', 'kent', '-f', 'kent', '-C', 'testClusterPassive_pass.purescale.raleigh.ibm.com', '-A', 'yes', '-T', '/gpfs/kent']
mmremotefs: Command was unable to determine whether file system is mounted.
The IBM Storage Scale product documentation notes that when this type of problem occurs, message 6027-1996 is issued with similar wording.
If you encounter this message, perform problem determination, resolve the problem, and reissue the command. If you cannot determine or resolve the problem, you might be able to run the command successfully by first shutting down the IBM Storage Scale daemon on all nodes of the cluster (by using mmshutdown -a), which ensures that the file system is not mounted. Then, do the following steps:
- Log in to the IBM Storage Scale Client virtual machine instance.
- Navigate to /usr/lpp/mmfs/bin/ and run the mmshutdown -a command.
- Run the mmstartup command.
- Perform the Connect to server operation again.
IBM Storage Scale Client - Connect to server operation fails to unmount the file system
The Connect to server operation might fail, resulting in an error message indicating that the device or resource is busy, similar to the following example:
Web_Application-was.11407238943746.GPFSClient: Connect to server: Failed to unmount the testFSys file system ['/usr/lpp/mmfs/bin/mmumount', 'testFSys', '-f']
umount2: Device or resource busy
umount: /gpfs/testFSys: device is busy.
(In some cases useful info about processes that use the device is found by lsof(8) or fuser(1))
umount2: Device or resource busy
Refer to the IBM Storage Scale Problem Determination Guide for actions to take when the file system will not unmount. Ensure that all processes finish accessing the file system, and then run the Connect to server operation again.
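To act on the lsof(8)/fuser(1) hint in the message, the PIDs that hold the file system open can be extracted from lsof output. This sketch assumes the standard lsof column layout, with the PID in the second column:

```shell
# lsof_pids: read "lsof +D <mountpoint>" output on stdin and print
# the unique PIDs (second column, skipping the header line).
lsof_pids() {
    awk 'NR > 1 {print $2}' | sort -un
}

# Intended usage:
#   lsof +D /gpfs/testFSys | lsof_pids
# Let those processes finish (or stop them) before rerunning the
# Connect to server operation.
```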
IBM Storage Scale Server - Disk volume limit exceeded
A maximum of 14 storage volumes can be added to any IBM Storage Scale configuration.
IBM Storage Scale Server - Disk volume not in list
Ensure that the correct storage volume is attached.
IBM Storage Scale Server - sudo error: sorry, you must have a tty to run sudo
Ensure that the requiretty option is disabled on the virtual machine. requiretty is an option in the /etc/sudoers file that prevents sudo operations from non-TTY sessions. The IBM Storage Scale nodes must be able to run sudo commands from scripts.
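A quick way to check for the option before editing (always edit sudoers with visudo, never directly) is to grep for an uncommented requiretty line. This sketch assumes the usual Defaults syntax:

```shell
# requiretty_enabled: return success if the given sudoers file has an
# uncommented "Defaults requiretty" line.
requiretty_enabled() {
    grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty' "$1"
}

# Intended usage:
#   requiretty_enabled /etc/sudoers && echo "disable requiretty via visudo"
```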
IBM shared service for IBM Storage Scale - Not deployed before IBM Storage Scale clients
The IBM shared service for IBM Storage Scale must be deployed to a cloud group before you deploy any IBM Storage Scale Clients to that same cloud group, unless you specified an IBM Storage Scale server at deployment or through the Connect to server operation for virtual application patterns and virtual system patterns. If an IBM Storage Scale Client is deployed to a cloud group that does not have an instance of the IBM shared service for IBM Storage Scale, the deployment concludes with an error similar to the following example:
[2015-02-26 14:22:23.243192] GPFSAgent - Retrieve Manager Info from shared service
[2015-02-26 14:22:23.705671] Failed to retrieve values from the IBM Shared Service for GPFS. Ensure that the IBM Shared Service for GPFS is deployed in the same cloud group with this deployment. If IBM Shared Service for GPFS is deployed, ensure that the input value is valid.
IBM Storage Scale portability failures are not reported promptly on Linux
During the IBM Storage Scale installation process, the build of the IBM Storage Scale portability layer might fail. You usually encounter IBM Storage Scale portability failures if the base image that is used to deploy your IBM Storage Scale instance does not have all of the required IBM Storage Scale dependencies.
This problem can occur when you are deploying an IBM Storage Scale Client or an IBM Storage Scale Primary configuration, or when you are attaching an IBM Storage Scale Mirror or Tiebreaker instance to an IBM Storage Scale Primary configuration.
When this failure occurs, the error is reported in the IBM Storage Scale logs, but the execution is not aborted. The installation or add member operation continues, but eventually fails because IBM Storage Scale was not configured properly due to the portability layer build failure.
To identify any IBM Storage Scale portability failures after you deploy your instance or add new members to the cluster, ensure that the cluster is configured properly by running the Get Cluster Status operation, and verify that all IBM Storage Scale nodes and NSDs are reported as up and running.
To help debug the problem and identify the root cause, open the IBM Storage Scale trace log (IWD trace.log for the GPFSMainServer role or GPFSClient role) and search for a Build GPFS portability FAILED message.
Primary instance remains in maintenance mode after an auto-revert operation
Problem: After a primary instance completes an auto-revert operation, it remains in maintenance mode.
Resolution: Manually resume the instance from the Instance management page to bring it to a Running state. You can then perform other IBM Storage Scale operations on that primary instance.
Some IBM Storage Scale operations show up in languages other than English
Symptom: When you use IBM Storage Scale, some operations show up in languages other than English.
Resolution: Set the locale to en_US to make the operations show up in English. Use the following commands for the IBM Storage Scale Server and manager instances of the IBM Storage Scale server cluster.
- Check the language value with the following command:
bash-4.2# echo $LANG
- Check the locale on the instance with the following command:
bash-4.2# locale
- Check the environment on the instance with the following command:
bash-4.2# env | grep -e LANG -e LC
- Change the locale:
  - Change the LANG value on the instance with the following command:
    bash-4.2# export LANG=en_US.UTF-8
  - Change the LC_ALL value on the instance with the following command:
    bash-4.2# export LC_ALL="en_US.UTF-8"
- Check the /etc/locale.conf file on the instance with the following command. Ensure that the file has an entry for the en_US locale:
bash-4.2# cat /etc/locale.conf
- Modify the .bash_profile file on the instance. Add the following statements at the end of the file to set and export the LANG value on the instances:
LANG=en_US.UTF-8
export LANG
- Restart the instances.
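The profile edit in the step above can be made idempotent so repeated runs do not duplicate the lines. This is a sketch; the profile path is a parameter so it can be tried safely on a scratch file first.

```shell
# ensure_lang: append the LANG export to a profile file only if an
# identical LANG line is not already present.
ensure_lang() {
    profile="$1"
    if ! grep -q '^LANG=en_US.UTF-8$' "$profile" 2>/dev/null; then
        printf 'LANG=en_US.UTF-8\nexport LANG\n' >> "$profile"
    fi
}

# Intended usage on each instance, then restart it:
#   ensure_lang ~/.bash_profile
```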
Client key is not accepted on the IBM Cloud Pak System user interface when installing the IBM Storage Scale client on AIX 7.2
Symptom: On AIX 7.2, the client keys are generated in OpenSSH format, but the IBM Cloud Pak System user interface requires an RSA key.
Resolution: Convert the OpenSSH key to PEM format with the following command:
ssh-keygen -p -m PEM -f <opensshkeyfile>
Provide that converted key file in the IBM Storage Scale Manager IP and Client Key field along with the IP address of the manager node.
GPFS service sometimes does not autostart after you restart the IBM Storage Scale client node
- Symptom
- After you restart the IBM Storage Scale client node, the GPFS service sometimes does not autostart.
- Resolution
- To address this issue, do these steps:
- Log in to the IBM Storage Scale client virtual machine instance.
- Go to the /usr/lpp/mmfs/bin/ location.
- Run the following command:
mmstartup