Suboptimal performance due to failover of NSDs to secondary server - NSD server failure
In a shared storage configuration, failure of an NSD server can cause its NSDs to fail over to the secondary server, if that server is active. This reduces the number of NSD servers that actively serve the file system, which in turn degrades file system performance.
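To confirm which servers are defined to serve each NSD, and therefore which node takes over when the primary fails, you can list the NSD server configuration, for example with the mmlsnsd command. The output below is only an illustrative sketch; the file system and disk names are assumptions, and the server listed first for each disk is the preferred server:
# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------------------------
 gpfs1         nsd1         c25m3n07-ib,c25m3n08-ib
 gpfs1         nsd2         c25m3n08-ib,c25m3n07-ib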
Problem identification
In IBM Storage Scale, the system-defined node class nsdnodes contains all the NSD server nodes in the IBM Storage Scale cluster. Issue the mmgetstate -N nsdnodes command to verify the state of the GPFS daemon on those nodes. File system performance might degrade if one or more NSD servers are in the down, arbitrating, or unknown state.
The following example displays two nodes: one in the active state and the other in the down state:
# mmgetstate -N nsdnodes

 Node number  Node name     GPFS state
------------------------------------------
       1      c25m3n07-ib   active
       2      c25m3n08-ib   down
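If you check this state regularly, a small shell sketch such as the following can flag NSD servers whose daemon is not active. It only parses the mmgetstate output layout shown above, which is an assumption; adjust the parsing for your code level:
#!/bin/bash
# Sketch: report NSD server nodes whose GPFS daemon is not in the active state.
# Assumes the three-column mmgetstate output shown in the example above.
mmgetstate -N nsdnodes | awk '$3 ~ /^(down|arbitrating|unknown)$/ {print "NSD server", $2, "is", $3}'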
Problem resolution and verification
Resolve any system-level or software issues that exist. For example, confirm that the NSD server has no network connectivity problems, and that the GPFS portability layer modules are correctly built for the kernel that is running. Also, perform the necessary low-level tests to ensure that both the NSD server and the communication to the node are healthy and stable.
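The exact checks depend on your environment; the following sketch only illustrates the kind of low-level verification that is meant here. The node name is an example, and the mmhealth and mmbuildgpl commands are available only at code levels that include them:
# Check basic network reachability of the failed NSD server (node name is an example)
ping -c 3 c25m3n08-ib

# On the failed NSD server, check overall component health
mmhealth node show

# On the failed NSD server, rebuild the GPFS portability layer for the running kernel if required
mmbuildgpl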
After you verify that no system or software issues remain, start GPFS on the NSD server by using the mmstartup -N <NSD_server_to_revive> command. Then use the mmgetstate -N nsdnodes command to verify that the GPFS daemon is in the active state, as shown:
# mmgetstate -N nsdnodes

 Node number  Node name     GPFS state
------------------------------------------
       1      c25m3n07-ib   active
       2      c25m3n08-ib   active
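If you prefer to script the restart and verification, a minimal sketch such as the following starts GPFS on the recovered node and waits for it to report active. The node name and polling interval are assumptions:
#!/bin/bash
# Sketch: start GPFS on a recovered NSD server and wait until mmgetstate reports active.
NODE=c25m3n08-ib              # NSD server to revive (assumption)
mmstartup -N "$NODE"
until mmgetstate -N "$NODE" | grep -qw "active"; do
    sleep 10                  # polling interval (assumption)
done
echo "GPFS daemon on $NODE is active"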