Suboptimal performance due to failover of NSDs to secondary server - Disk connectivity failure

In a shared storage configuration, disk connectivity failure on an NSD server might result in failover of its NSDs to the secondary server, if the secondary server is active. This can reduce the total number of NSD servers actively serving the file system, which in turn impacts the overall performance of the file system.

Problem identification

The mmlsnsd command displays information about the currently defined disks in a cluster. In the following sample output, the NSD client is configured to perform file system I/O on the primary NSD server c25m3n07-ib for odd-numbered NSDs such as DMD_NSD01 and DMD_NSD03. In this case, c25m3n08-ib acts as the secondary server.

For even-numbered NSDs such as DMD_NSD02 and DMD_NSD04, the NSD client is configured to perform file system I/O on the NSD server c25m3n08-ib. In this case, c25m3n08-ib is the primary server, while c25m3n07-ib acts as the secondary server.

Issue the mmlsnsd command to display the NSD server information for the disks in a file system. The following sample output shows the disks in the gpfs1b file system and the NSD servers that are configured to act as primary and secondary servers for these disks.

# mmlsnsd
 File system   Disk name    NSD servers
---------------------------------------------------------------------------
 gpfs1b        DMD_NSD01    c25m3n07-ib,c25m3n08-ib
 gpfs1b        DMD_NSD02    c25m3n08-ib,c25m3n07-ib
 gpfs1b        DMD_NSD03    c25m3n07-ib,c25m3n08-ib
 gpfs1b        DMD_NSD04    c25m3n08-ib,c25m3n07-ib
 gpfs1b        DMD_NSD05    c25m3n07-ib,c25m3n08-ib
 gpfs1b        DMD_NSD06    c25m3n08-ib,c25m3n07-ib
 gpfs1b        DMD_NSD07    c25m3n07-ib,c25m3n08-ib
 gpfs1b        DMD_NSD08    c25m3n08-ib,c25m3n07-ib
 gpfs1b        DMD_NSD09    c25m3n07-ib,c25m3n08-ib
 gpfs1b        DMD_NSD10    c25m3n08-ib,c25m3n07-ib

However, the mmlsdisk <fsdevice> -m command that is issued on the NSD client indicates that the NSD client is currently performing all the file system I/O on a single NSD server, c25m3n07-ib.

# mmlsdisk <fsdevice> -m
 Disk name     IO performed on node     Device             Availability
 ------------  -----------------------  -----------------  ------------
 DMD_NSD01     c25m3n07-ib              -                  up
 DMD_NSD02     c25m3n07-ib              -                  up
 DMD_NSD03     c25m3n07-ib              -                  up
 DMD_NSD04     c25m3n07-ib              -                  up
 DMD_NSD05     c25m3n07-ib              -                  up
 DMD_NSD06     c25m3n07-ib              -                  up
 DMD_NSD07     c25m3n07-ib              -                  up
 DMD_NSD08     c25m3n07-ib              -                  up
 DMD_NSD09     c25m3n07-ib              -                  up
 DMD_NSD10     c25m3n07-ib              -                  up
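
With many NSDs, the imbalance shown above is easier to spot as a per-node count. The following is a minimal sketch, not part of the product: count_io_nodes is a hypothetical helper name, and it assumes that the mmlsdisk <fsdevice> -m output has the four-column layout of the sample above (disk name, I/O node, device, availability).

```shell
# Hypothetical helper: count how many NSDs each node is currently serving.
# Reads "mmlsdisk <fsdevice> -m" output on stdin.
# Assumption: data rows have exactly four whitespace-separated fields, as in
# the sample output; the header row has more fields and the separator row
# starts with a dash, so both are skipped.
count_io_nodes() {
    awk 'NF == 4 && $1 !~ /^-/ { count[$2]++ }
         END { for (node in count) print node, count[node] }' | sort
}

# Usage on the NSD client (gpfs1b is the file system from the sample):
#   mmlsdisk gpfs1b -m | count_io_nodes
```

If a single node accounts for all the NSDs although several NSD servers are configured, as in the sample, the NSDs have likely failed over.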

Problem resolution and verification

1. Resolve any system-level or disk-level software issues that exist, for example storage connectivity issues on the NSD server, or driver issues.
2. Rediscover the NSD disk paths by using the mmnsddiscover -a -N all command.
3. On the NSD client, issue the mmlsnsd command to obtain the primary NSD server that is configured for each NSD of the file system. The echo "NSD-Name Primary-NSD-Server"; mmlsnsd | grep <fsdevice> | awk command parses the output that is generated by the mmlsnsd command and displays the primary NSD server for each of the NSDs.
4. Perform file I/O on the NSD client and issue the mmlsdisk <fsdevice> -m command to verify that the NSD client is performing file system I/O by using all the configured NSD servers.

The following sample shows the command from step 3 and its output for the gpfs1b file system:

# echo "NSD-Name Primary-NSD-Server"; mmlsnsd | grep gpfs1b | awk -F ',' '{print $1}' | awk '{print $2 " " $3}'
NSD-Name   Primary-NSD-Server
DMD_NSD01   c25m3n07-ib
DMD_NSD02   c25m3n08-ib
DMD_NSD03   c25m3n07-ib
DMD_NSD04   c25m3n08-ib
DMD_NSD05   c25m3n07-ib
DMD_NSD06   c25m3n08-ib
DMD_NSD07   c25m3n07-ib
DMD_NSD08   c25m3n08-ib
DMD_NSD09   c25m3n07-ib
DMD_NSD10   c25m3n08-ib
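
The verification can also be done by comparing the configured primary server of each NSD with the node that is currently performing its I/O. The following is a minimal sketch under stated assumptions: report_failed_over is a hypothetical helper name, the two command outputs are saved to files first, and both outputs are assumed to follow the column layouts of the samples in this topic. It is meant to be run on the NSD client, as in the steps above.

```shell
# Hypothetical helper: list NSDs whose I/O is not going through the
# configured primary NSD server.
#   $1  file that holds the output of "mmlsnsd"
#   $2  file that holds the output of "mmlsdisk <fsdevice> -m"
#   $3  file system device name, for example gpfs1b
report_failed_over() {
    awk -v fsname="$3" '
        # First file: the first server in the comma-separated list is
        # treated as the primary server of the NSD.
        NR == FNR {
            if ($1 == fsname && NF == 3) {
                split($3, servers, ",")
                primary[$2] = servers[1]
            }
            next
        }
        # Second file: data rows have four fields and start with an NSD name.
        NF == 4 && ($1 in primary) && $2 != primary[$1] {
            print $1, "primary=" primary[$1], "actual=" $2
        }
    ' "$1" "$2"
}

# Usage on the NSD client:
#   mmlsnsd > /tmp/mmlsnsd.out
#   mmlsdisk gpfs1b -m > /tmp/mmlsdisk.out
#   report_failed_over /tmp/mmlsnsd.out /tmp/mmlsdisk.out gpfs1b
```

No output from the helper indicates that every NSD is being served by its configured primary server.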