IBM Support

Cannot mount GPFS volumes on PDA/Netezza Appliance

Troubleshooting


Problem

Cannot mount GPFS volume on the appliance using the command:

$ /usr/lpp/mmfs/bin/mmmount all
Tue Oct 25 12:07:54 EST 2016: mmmount: Mounting file systems ...
mount: Stale NFS file handle
mount: Stale NFS file handle

gpfs1 could not be mounted on the system. The following errors appear in /var/adm/ras/mmfs.log:

Tue Oct 25 12:07:07.841 2016: Command: mount gpfs1
Tue Oct 25 12:07:07.935 2016: Global NSD disk, nsd03, not found.
Tue Oct 25 12:07:07.934 2016: Global NSD disk, nsd01, not found.
Tue Oct 25 12:07:07.935 2016: Global NSD disk, nsd04, not found.
Tue Oct 25 12:07:07.934 2016: Global NSD disk, nsd02, not found.
Tue Oct 25 12:07:07.934 2016: Disk failure. Volume gpfs1. rc = 19. Physical volume nsd01.
Tue Oct 25 12:07:07.935 2016: Disk failure. Volume gpfs1. rc = 19. Physical volume nsd02.
Tue Oct 25 12:07:07.934 2016: Disk failure. Volume gpfs1. rc = 19. Physical volume nsd03.
Tue Oct 25 12:07:07.935 2016: Disk failure. Volume gpfs1. rc = 19. Physical volume nsd04.
Tue Oct 25 12:07:07.936 2016: File System gpfs1 unmounted by the system with return code 19 reason code 0
Tue Oct 25 12:07:07.937 2016: No such device
Tue Oct 25 12:07:07.936 2016: Failed to open gpfs1.
Tue Oct 25 12:07:07.937 2016: No such device
Tue Oct 25 12:07:07.936 2016: Command: err 666: mount gpfs1
Tue Oct 25 12:07:07.937 2016: No such device
Tue Oct 25 12:07:08 EST 2016: mmcommon preunmount invoked. File system: gpfs1 Reason: SGPanic

Resolving The Problem

The General Parallel File System (GPFS) is a high-performance clustered file system.
It can be deployed in shared-disk or shared-nothing distributed parallel modes.

Authentication between nodes must be set up so that they can communicate without passwords AND without extraneous messages (such as the SSH host key prompt seen in this case).
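A minimal sketch of setting up passwordless SSH for GPFS, assuming root access on every node and the sanitized hostnames used in this article; adapt the node list to your own cluster:

```shell
# Run as root. Generate a key pair without a passphrase if one
# does not already exist.
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -b 2048 -N "" -f ~/.ssh/id_rsa

# Node list taken from the 'mmlscluster' output below.
NODES="gpfs-server-1.local gpfs-server-2.local gpfs-node-1.local gpfs-node-2.local"

# Install the public key on every node; ssh-copy-id prompts for the
# password once per node, after which logins are passwordless.
for node in $NODES; do
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@"$node"
done
```

Note that every node must be able to reach every other node this way, not just the node you are working from.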

Check the configuration

$ /usr/lpp/mmfs/bin/mmlscluster

GPFS cluster information
========================
GPFS cluster name: cluster.local
GPFS cluster id: 9976600478437553639
GPFS UID domain: cluster.local
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
Primary server: gpfs-server-1.local
Secondary server: gpfs-server-2.local

Node Daemon node name IP address Admin node name Designation
-----------------------------------------------------------------------------------------------------
1 gpfs-server-1.local 10.10.10.129 gpfs-server-1.local
2 gpfs-server-2.local 10.10.10.131 gpfs-server-1.local
3 gpfs-node-1.local 10.10.10.65 gpfs-node-1.local
4 gpfs-node-2.local 10.10.10.243 gpfs-node-2.local

Check on how many nodes the file system is mounted
$ /usr/lpp/mmfs/bin/mmlsmount all -L

File system gpfs1 is mounted on 4 nodes:
10.10.10.129 gpfs-server-1
10.10.10.131 gpfs-server-2
10.10.10.65 gpfs-node-1
10.10.10.243 gpfs-node-2

Check if nodes are pingable

$ ping -c 5 10.10.10.129
$ ping -c 5 10.10.10.131
$ ping -c 5 10.10.10.65
$ ping -c 5 10.10.10.243
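Pinging proves only network reachability. To confirm that SSH itself is passwordless and prompt-free (the two conditions GPFS needs), a non-interactive check such as the following sketch can be run from each node, assuming the hostnames shown in the mmlscluster output above:

```shell
# BatchMode=yes makes ssh fail instead of prompting for a password or
# host-key confirmation, so any node that would block mmdsh shows up here.
for node in gpfs-server-1.local gpfs-server-2.local gpfs-node-1.local gpfs-node-2.local; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 root@"$node" true; then
        echo "$node: OK"
    else
        echo "$node: ssh would prompt or failed"
    fi
done
```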

Run 'mmmount <volume> -a' to check whether communication between nodes is passwordless

$ /usr/lpp/mmfs/bin/mmmount gpfs1 -a
Tue Oct 25 12:52:33 EST 2016: mmmount: Mounting file systems ...
Are you sure you want to continue connecting (yes/no)? The authenticity of host 'gpfs-server-2.local (10.10.10.131)' can't be established.
RSA key fingerprint is d7:ac:e0:c8:fd:37:bb:a4:aa:19:9e:2d:7b:4c:69:ac.
Are you sure you want to continue connecting (yes/no)? The authenticity of host 'gpfs-node-1.local (10.10.10.65)' can't be established.
RSA key fingerprint is b7:3b:e0:e9:1d:ae:26:c9:43:4a:7f:f0:f1:27:89:9d.
Are you sure you want to continue connecting (yes/no)? The authenticity of host 'gpfs-server-1.local (10.10.10.129)' can't be established.
RSA key fingerprint is b4:54:5b:58:f1:a3:fc:5d:b4:db:b4:a1:03:9c:0f:65.
Are you sure you want to continue connecting (yes/no)?
gpfs-node-2.local: Red Hat Enterprise Linux Server release 5.11 (Tikanga)
gpfs-node-2.local: Kernel \r on an \m
gpfs-node-2.local:
gpfs-node-2.local: mount: Stale NFS file handle
mmdsh: gpfs-node-2.local remote shell process had return code 32.
yes

mmmount: Interrupt received.

Establishing passwordless, prompt-free SSH connections between all nodes will resolve the mounting issue.
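The host-key prompts shown above can be eliminated in advance by pre-populating known_hosts on every node; a sketch, assuming the same node names, followed by retrying the mount:

```shell
# Record each node's host key so ssh never asks "Are you sure you want
# to continue connecting?". Run on every node in the cluster.
ssh-keyscan gpfs-server-1.local gpfs-server-2.local \
            gpfs-node-1.local gpfs-node-2.local >> ~/.ssh/known_hosts

# Retry the mount on all nodes once SSH is silent.
/usr/lpp/mmfs/bin/mmmount gpfs1 -a
```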


Document Information

Modified date:
17 October 2019

UID

swg21995016