Hi GPFS Forum,
We are implementing some new GPFS NSD servers and want to configure Linux multipath (on RHEL6) to handle failures effectively.
Since all our GPFS LUNs (NSDs) are visible via two GPFS NSD servers, we are currently assuming that we should configure Linux multipath on each NSD server to fail I/Os as soon as possible when all paths to a LUN fail (this is controlled by the no_path_retry device parameter in multipath.conf). We are bravely assuming that GPFS will retry failed I/Os via the other NSD server(s) when this happens.
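For illustration, here is a minimal sketch of the kind of setting we mean. Placing it in the defaults section is an assumption made for brevity; on our servers it would more likely go in a device section matching our storage, and a storage-specific device section would override whatever defaults says.

```
# /etc/multipath.conf (fragment, illustrative only)
defaults {
    # Fail I/O immediately once all paths to a LUN are down,
    # rather than queueing it until a path returns.
    no_path_retry fail
}
```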
Is this assumption correct? That is, when a LUN (NSD) is accessible via multiple NSD servers and an I/O fails when initially processed via one NSD server, will GPFS retry the failed I/O via the other NSD server(s) that can access that LUN (NSD)? If not, what has to happen to force GPFS to use the other NSD server(s) rather than the NSD server that has lost all paths to the LUN (NSD)?