10 replies Latest Post - 2013-02-11T22:38:43Z by truongv
SystemAdmin
2092 Posts

Pinned topic mmcrfs is failing with "No such device", and "Error accessing disks"

2013-01-30T19:15:47Z
Hey all, I'm pretty new to GPFS, so bear with me.

I'm trying to set up a 2-node cluster (both nodes are quorum nodes). Both machines have the OS installed on sda and a free disk on sdb. Pretty simple, right? I've been following guides such as http://www.ibm.com/developerworks/wikis/display/hpccentral/gpfs+quick+start+guide+for+linux and everything goes well right up until I need to run mmcrfs. Both machines have passwordless ssh access set up, and I've even tried chmod'ing the block device under /dev (I don't know whether that was a good idea).

Anyway, here is the output of anything that may help. I've been on this for about 2 days with no luck so anything is a huge help. I hope it's just some simple issue.

Output of mmcrfs

mmcrfs gpfs1 -F diskdef.txt -A yes -T /gpfs
Unable to open disk 'gpfs4nsd' on node ITE5.
No such device
Error accessing disks.
mmcrfs: tscrfs failed.  Cannot create gpfs1
mmcrfs: Command failed.  Examine previous error messages to determine cause.


Output of mmlsnsd -M

Disk name    NSD volume ID      Device         Node name                Remarks
---------------------------------------------------------------------------------------
gpfs3nsd     C0A87A1A5106F8BB   /dev/sdb       ITE5                     server node
gpfs4nsd     000000005106F8A6   /dev/sdb       ITE6                     server node


Output of tspreparedisk -S from both nodes

C0A87A1A5106F8BB /dev/sdb generic
tspreparedisk:0::::0:0::

000000005106F8A6 /dev/sdb generic
tspreparedisk:0::::0:0::


Output of diskdef.txt

# /dev/sdb:ITE5::::
gpfs3nsd:::dataAndMetadata:4001::system
# /dev/sdb:ITE6::::
gpfs4nsd:::dataAndMetadata:4002::system


That's all I can think of. If anything else is needed please let me know.
Updated on 2013-02-11T22:38:43Z by truongv
  • SystemAdmin
    2092 Posts

    Re: mmcrfs is failing with "No such device", and "Error accessing disks"

    2013-01-30T19:22:01Z  in response to SystemAdmin
    Is GPFS actually up on both nodes, per mmgetstate -a?

    Whenever something goes wrong with GPFS, the first place to look is the GPFS log, /var/adm/ras/mmfs.log.latest, on all nodes involved. Attach the logs here when asking for help.

    yuri
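
As a rough aid for the log check suggested above, here is a minimal Python sketch that pulls failed commands out of mmfs.log.latest lines and translates the numeric error code. The line format (`<timestamp>: Command: err N: <command>`) is assumed from the log entries quoted in this thread, not taken from any GPFS specification, and `find_failed_commands` is a hypothetical helper name.

```python
import os
import re

# Pattern assumed from the log excerpts in this thread, e.g.
# "Wed Jan 30 14:09:39.621 2013: Command: err 19: tscrfs /dev/gpfs1 ..."
FAILED_CMD = re.compile(
    r"^(?P<stamp>.+? \d{4}): Command: err (?P<errno>\d+): (?P<cmd>.+)$"
)


def find_failed_commands(lines):
    """Return (timestamp, errno, command) for each 'Command: err N:' line."""
    failures = []
    for line in lines:
        match = FAILED_CMD.match(line.strip())
        if match:
            failures.append(
                (match.group("stamp"), int(match.group("errno")), match.group("cmd"))
            )
    return failures


# Sample lines copied from the log posted in this thread.
sample_log = [
    "Wed Jan 30 14:09:38.779 2013: Command: tscrfs /dev/gpfs1 -F /var/mmfs/tmp/tsddFile.mmcrfs.4134 -I 16384 -i 512 -M 2 -n 32 -R 2 -w 0",
    "Wed Jan 30 14:09:39.621 2013: Command: err 19: tscrfs /dev/gpfs1 -F /var/mmfs/tmp/tsddFile.mmcrfs.4134 -I 16384 -i 512 -M 2 -n 32 -R 2 -w 0",
    "Wed Jan 30 14:09:39.622 2013: No such device",
]

for stamp, code, cmd in find_failed_commands(sample_log):
    # os.strerror maps the numeric code to the platform's message text.
    print(f"{stamp}: errno {code} ({os.strerror(code)}): {cmd.split()[0]}")
```

To run it against a real log, pass an open file object instead of `sample_log`; running it on every node involved makes it easier to see which node reported the failure first.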
    • SystemAdmin
      2092 Posts

      Re: mmcrfs is failing with "No such device", and "Error accessing disks"

      2013-01-30T19:26:34Z  in response to SystemAdmin
      Hi there, here is the info you need.

      Output of mmgetstate -a
      
       Node number  Node name        GPFS state
      ------------------------------------------
             1      ITE5             active
             2      ITE6             active
      


      Most recent entries in mmfs.log.latest
      
      Wed Jan 30 14:09:38.778 2013: Node 192.168.122.26 (ITE5) appointed as manager for gpfs1.
      Wed Jan 30 14:09:38.779 2013: Command: tscrfs /dev/gpfs1 -F /var/mmfs/tmp/tsddFile.mmcrfs.4134 -I 16384 -i 512 -M 2 -n 32 -R 2 -w 0
      Wed Jan 30 14:09:39.621 2013: Command: err 19: tscrfs /dev/gpfs1 -F /var/mmfs/tmp/tsddFile.mmcrfs.4134 -I 16384 -i 512 -M 2 -n 32 -R 2 -w 0
      Wed Jan 30 14:09:39.622 2013: No such device
      Wed Jan 30 14:09:39.625 2013: Node 192.168.122.26 (ITE5) resigned as manager for gpfs1.
      Wed Jan 30 14:09:39.626 2013: File system has been deleted.
      
      • SystemAdmin
        2092 Posts

        Re: mmcrfs is failing with "No such device", and "Error accessing disks"

        2013-01-30T19:32:35Z  in response to SystemAdmin
        Sorry, overlooked this in the original post: it looks like you don't have NSD servers defined for either NSD, which GPFS takes to mean that both disks are visible directly on both nodes (as would be the case had you had a SAN). Use mmchnsd or mmdelnsd/mmcrnsd to re-define NSDs with a primary server.

        yuri
        • SystemAdmin
          2092 Posts

          Re: mmcrfs is failing with "No such device", and "Error accessing disks"

          2013-01-30T19:37:57Z  in response to SystemAdmin
          I'm not quite sure what you mean. Here is some output from mmlscluster; it says ITE5 is the primary node, with ITE6 being a secondary node. Is this different from a primary NSD server? If I do have a primary and secondary set up, is something maybe wrong in diskdef.txt?
          
          GPFS cluster information
          ========================
            GPFS cluster name:         ITE5
            GPFS cluster id:           13882480104816360238
            GPFS UID domain:           ITE5
            Remote shell command:      /usr/bin/ssh
            Remote file copy command:  /usr/bin/scp

          GPFS cluster configuration servers:
          -----------------------------------
            Primary server:    ITE5
            Secondary server:  ITE6

           Node  Daemon node name            IP address       Admin node name             Designation
          -----------------------------------------------------------------------------------------------
             1   ITE5                        192.168.122.26   ITE5                        quorum
             2   ITE6                        192.168.122.25   ITE6                        quorum
          
          • SystemAdmin
            2092 Posts

            Re: mmcrfs is failing with "No such device", and "Error accessing disks"

            2013-01-30T20:01:35Z  in response to SystemAdmin
            Primary/secondary configuration server nodes listed by mmlscluster are entirely unrelated to NSD server definitions. You need to specify an NSD server when creating an NSD for a disk device that is only visible from some but not all nodes, using the second field of the NSD descriptor. Please see 'mmcrnsd' man page.

            yuri
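
To make the "second field" concrete, here is a small Python sketch that splits colon-separated descriptor lines in the form quoted in this thread and reports which ones leave the second field empty. Treating the first field as the disk or NSD name and the second as a comma-separated NSD server list is an assumption based on the description above; `parse_descriptor` and `missing_server` are hypothetical helper names. Check the mmcrnsd man page for your release before relying on the layout.

```python
def parse_descriptor(line):
    """Split one colon-separated descriptor line into (name, server list)."""
    fields = line.strip().split(":")
    name = fields[0]
    # Second field: comma-separated server list (assumed layout, see above).
    servers = [s for s in fields[1].split(",") if s] if len(fields) > 1 else []
    return name, servers


def missing_server(lines):
    """Return names of descriptors whose second (server) field is empty."""
    missing = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and commented-out descriptors
        name, servers = parse_descriptor(stripped)
        if not servers:
            missing.append(name)
    return missing


# Lines copied from the diskdef.txt posted in this thread.
diskdef = [
    "# /dev/sdb:ITE5::::",
    "gpfs3nsd:::dataAndMetadata:4001::system",
    "# /dev/sdb:ITE6::::",
    "gpfs4nsd:::dataAndMetadata:4002::system",
]

print(missing_server(diskdef))
print(parse_descriptor("gpfs4nsd:ITE6::dataAndMetadata:4002::system"))
```

The sketch only inspects the descriptor file. What GPFS has actually recorded for each NSD is what mmlsnsd reports, so that output remains the thing to verify.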
          • truongv
            77 Posts

            Re: mmcrfs is failing with "No such device", and "Error accessing disks"

            2013-01-30T20:02:33Z  in response to SystemAdmin
            What Yuri means is your mmlsnsd output should look similar to the below:
            
            File system   Disk name    NSD servers
            ---------------------------------------------------------------------------
            gpfs1         gpfs3nsd     ITE5
            gpfs1         gpfs4nsd     ITE6
            

            You can use the mmchnsd command to change the NSD servers:

            mmchnsd "gpfs3nsd:ITE5;gpfs4nsd:ITE6"
            
            • SystemAdmin
              2092 Posts

              Re: mmcrfs is failing with "No such device", and "Error accessing disks"

              2013-01-30T20:22:57Z  in response to truongv
              Thanks for your help, everyone, but I still seem to be hitting the same issue. After running

              mmchnsd "gpfs3nsd:ITE5;gpfs4nsd:ITE6"

              as truongv suggested, I tried to recreate the filesystem, but I am receiving the same error. My first thought was that the diskdef.txt file might be wrong (since some config was changed), but even with the NSD servers in diskdef.txt it's still trying to open gpfs4nsd on ITE5, which is obviously impossible.

              Here is the output of mmlsnsd after the NSDs have been given appropriate servers:
              
              File system   Disk name    NSD servers
              ---------------------------------------------------------------------------
              (free disk)   gpfs3nsd     ITE5
              (free disk)   gpfs4nsd     ITE6
              


              Here is once again the error I'm getting, followed by the most recent entry in mmfs.log.latest
              
              mmcrfs gpfs1 -F diskdef.txt -A yes -T /gpfs
              Unable to open disk 'gpfs4nsd' on node ITE5.
              No such device
              Error accessing disks.
              mmcrfs: tscrfs failed.  Cannot create gpfs1
              mmcrfs: Command failed.  Examine previous error messages to determine cause.
              


              
              Wed Jan 30 15:21:22.489 2013: Node 192.168.122.26 (ITE5) appointed as manager for gpfs1.
              Wed Jan 30 15:21:22.490 2013: Command: tscrfs /dev/gpfs1 -F /var/mmfs/tmp/tsddFile.mmcrfs.6607 -I 16384 -i 512 -M 2 -n 32 -R 2 -w 0
              Wed Jan 30 15:21:23.329 2013: Command: err 19: tscrfs /dev/gpfs1 -F /var/mmfs/tmp/tsddFile.mmcrfs.6607 -I 16384 -i 512 -M 2 -n 32 -R 2 -w 0
              Wed Jan 30 15:21:23.330 2013: No such device
              Wed Jan 30 15:21:23.333 2013: Node 192.168.122.26 (ITE5) resigned as manager for gpfs1.
              Wed Jan 30 15:21:23.334 2013: File system has been deleted.
              


              What I think is weird is that the log seems to be complaining that no such device "gpfs1" exists, when the man page for mmcrfs specifically says that the device cannot already exist under /dev. Maybe I'm reading it wrong.
              • truongv
                77 Posts

                Re: mmcrfs is failing with "No such device", and "Error accessing disks"

                2013-01-30T22:34:15Z  in response to SystemAdmin
                I would try:
                Use dd to read the raw device /dev/sdb and make sure you can see the disk.
                Create the filesystem on just one disk at a time and see which one is good or bad.
                Run each command on the respective NSD server to see if it makes any difference.
                Do you have a /var/mmfs/etc/nsddevices user exit script?
                Can you show the output of mmdevdiscover on both nodes? Also the /var/mmfs/gen/mmsdrfs file.
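
For the raw-read check, a rough Python equivalent of a short dd read is sketched below. It assumes 512-byte sectors, and `read_sectors` is a hypothetical helper name. The demo reads a scratch file rather than /dev/sdb so that it runs without root; the `TAG` bytes are a made-up stand-in, not the real GPFS on-disk marker.

```python
import os
import tempfile


def read_sectors(path, count=2, sector_size=512):
    """Read the first `count` sectors from a block device or regular file."""
    with open(path, "rb") as dev:
        return dev.read(count * sector_size)


# Demo on a scratch file: sector 0 is all zeros, sector 1 starts with a
# made-up marker so there is something visible to find.
with tempfile.TemporaryDirectory() as scratch:
    image = os.path.join(scratch, "disk.img")
    with open(image, "wb") as out:
        out.write(b"\x00" * 512 + b"TAG" + b"\x00" * 509)

    sectors = read_sectors(image, count=2)
    second = sectors[512:1024]
    print(len(sectors), second[:3])
```

On a real node the call would be `read_sectors("/dev/sdb")`, run as root on the node that physically owns the disk. A short read or an exception there points at the device itself rather than at GPFS.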
                • SystemAdmin
                  2092 Posts

                  Re: mmcrfs is failing with "No such device", and "Error accessing disks"

                  2013-02-11T16:29:07Z  in response to truongv
                  Hey there, sorry for the super late response. I was sick recently.

                  I managed to create a filesystem on one node. I then tried adding the disk from the other node to the filesystem, but it's failing in a similar way.

                  
                  Unable to open disk 'gpfs4nsd' on node ITE5.
                  No such device
                  Error processing disks.
                  mmadddisk: tsadddisk failed.
                  Verifying file system configuration information ...
                  mmadddisk: Propagating the cluster configuration data to all affected nodes.  This is an asynchronous process.
                  mmadddisk: Command failed.  Examine previous error messages to determine cause.
                  


                  What I find weird is that the diskdef.txt file doesn't even say gpfs4nsd is on ITE5; it says:
                  
                  gpfs4nsd:ITE6::dataAndMetadata:4002::system
                  


                  If I try running the disk add command from the other server it fails to obtain a lock from ITE5.

                  I can dd read the raw devices, in fact I've verified that the gpfs tag thing is written to the second sector (like it says in the documentation).

                  There is no nsddevices exit script.

                  mmdevdiscover on ITE5
                  
                  sdb generic
                  sda generic
                  sda1 generic
                  


                  mmdevdiscover on ITE6
                  
                  sda generic
                  sda1 generic
                  sdb generic
                  


                  I don't know if it's still helpful, but here are the contents of the mmsdrfs file:
                  
                  %%9999%%:00_VERSION_LINE::1210:3:13::lc:ITE5:ITE6:4:/usr/bin/ssh:/usr/bin/scp:13882480104816360238:lc2:1359067957::ITE5:0:0:0:0::::central:0.0:
                  %%home%%:03_COMMENT::1:
                  %%home%%:03_COMMENT::2:    This is a machine generated file.  Do not edit!
                  %%home%%:03_COMMENT::3:
                  %%home%%:03_COMMENT::4:2013.01.24.17.52.37:2:1:mmcrcluster -N ITE5:quorum,ITE6:quorum -p ITE5 -s ITE6 -r /usr/bin/ssh -R /usr/bin/scp
                  %%home%%:03_COMMENT::5:2013.01.24.17.54.15:3:1:mmchlicense server --accept -N ITE5,ITE6
                  %%home%%:03_COMMENT::6:2013.01.24.18.13.26:4:1:mmcrnsd -F /root/diskdef.txt
                  %%home%%:03_COMMENT::7:2013.01.28.16.49.40:5:1:mmdelnsd gpfs1nsd
                  %%home%%:03_COMMENT::8:2013.01.28.16.49.56:6:1:mmdelnsd gpfs2nsd
                  %%home%%:03_COMMENT::9:2013.01.28.17.16.30:7:1:mmcrnsd -F diskdef.txt
                  %%home%%:03_COMMENT::10:2013.01.30.15.08.00:8:1:mmchnsd gpfs3nsd:ITE5;gpfs4nsd:ITE6
                  %%home%%:03_COMMENT::11:2013.01.30.18.16.56:9:1:mmcrfs gpfs1 -F diskdef.txt -A yes -T /gpfs
                  %%home%%:03_COMMENT::12:2013.01.30.20.37.01:10:1:mmadddisk gpfs1 -F diskdef.txt:pre-commit
                  %%home%%:03_COMMENT::13:2013.01.30.20.37.11:11:1:mmadddisk gpfs1 -F diskdef.txt:ts_failed
                  %%home%%:03_COMMENT::14:2013.02.11.11.21.57:12:1:mmadddisk gpfs1 -F diskdef.txt:pre-commit
                  %%home%%:03_COMMENT::15:2013.02.11.11.22.09:13:1:mmadddisk gpfs1 -F diskdef.txt:ts_failed
                  %%home%%:10_NODESET_HDR:::2:TCP::1191::::1214:1214:L::::::::::::::
                  %%home%%:20_MEMBER_NODE::1:1:ITE5:192.168.122.26:ITE5:client::::::ITE5:ITE5:1214:3.4.0.11:Linux:Q::::::server::
                  %%home%%:20_MEMBER_NODE::2:2:ITE6:192.168.122.25:ITE6:client::::::ITE6:ITE6:1214:3.4.0.11:Linux:Q::::::server::
                  %%home%%:70_MMFSCFG::1:#   :::::::::::::::::::::::
                  %%home%%:70_MMFSCFG::2:#   WARNING:   This is a machine generated file.  Do not edit!      ::::::::::::::::::::::
                  %%home%%:70_MMFSCFG::3:#   Use the mmchconfig command to change configuration parameters.  :::::::::::::::::::::::
                  %%home%%:70_MMFSCFG::4:#   :::::::::::::::::::::::
                  %%home%%:70_MMFSCFG::5:clusterName ITE5:::::::::::::::::::::::
                  %%home%%:70_MMFSCFG::6:clusterId 13882480104816360238:::::::::::::::::::::::
                  %%home%%:70_MMFSCFG::7:autoload no:::::::::::::::::::::::
                  %%home%%:70_MMFSCFG::8:minReleaseLevel 1210 3.4.0.7:::::::::::::::::::::::
                  %%home%%:70_MMFSCFG::9:dmapiFileHandleSize 32:::::::::::::::::::::::
                  %%home%%:30_SG_HEADR:gpfs1::150:no:::0::::no:::::::::::::::
                  %%home%%:40_SG_ETCFS:gpfs1:1:%2Fgpfs:
                  %%home%%:40_SG_ETCFS:gpfs1:2:   dev             = /dev/gpfs1
                  %%home%%:40_SG_ETCFS:gpfs1:3:   vfs             = mmfs
                  %%home%%:40_SG_ETCFS:gpfs1:4:   nodename        = -
                  %%home%%:40_SG_ETCFS:gpfs1:5:   mount           = mmfs
                  %%home%%:40_SG_ETCFS:gpfs1:6:   type            = mmfs
                  %%home%%:40_SG_ETCFS:gpfs1:7:   account         = false
                  %%home%%:50_SG_MOUNT:gpfs1::rw:mtime:atime:::::::::::::::::::::
                  %%home%%:60_SG_DISKS:gpfs1:1:gpfs3nsd:976773168:4001:dataAndMetadata:C0A87A1A5106F8BB:nsd:ITE5::other::generic:cmd::::ready::system:ITE5:::::
                  ~%BBBB%%:60_SG_DISKS:~/~:0:gpfs4nsd:976773168:4002:dataAndMetadata:000000005106F8A6:nsd:ITE6::other::generic:cmd::::::system:ITE6:::::
                  
                  • truongv
                    77 Posts

                    Re: mmcrfs is failing with "No such device", and "Error accessing disks"

                    2013-02-11T22:38:43Z  in response to SystemAdmin
                    The NSD ID of gpfs4nsd doesn't look right. The first part should contain an IP address instead of zeros.
                    
                    gpfs4nsd     000000005106F8A6   /dev/sdb       ITE6                     server node
                    

                    However, this doesn't explain why ITE5 can't access ITE6's NSD disk, gpfs4nsd. I suggest you start everything from scratch, but follow everything by the book this time. Make sure your hostnames resolve: check /etc/hosts or DNS on all nodes. Since you can't obtain a lock, make sure you don't have any issues with ssh/rsh.
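
The NSD ID observation can be checked mechanically. The Python sketch below assumes, as the post above says, that the first eight hex digits of the 16-digit NSD volume ID encode an IPv4 address; `nsd_id_address` is a hypothetical helper name, and the two IDs are the ones from the mmlsnsd -M output in this thread.

```python
import ipaddress


def nsd_id_address(volume_id):
    """Decode the first 8 hex digits of an NSD volume ID as an IPv4 address."""
    if len(volume_id) != 16:
        raise ValueError(f"expected 16 hex digits, got {volume_id!r}")
    return ipaddress.IPv4Address(int(volume_id[:8], 16))


nsds = [
    ("gpfs3nsd", "C0A87A1A5106F8BB"),
    ("gpfs4nsd", "000000005106F8A6"),
]

for name, volume_id in nsds:
    addr = nsd_id_address(volume_id)
    # An all-zero address part is the symptom pointed out above.
    flag = "suspicious (all zeros)" if int(addr) == 0 else "ok"
    print(f"{name}: {addr} {flag}")
```

For gpfs3nsd the prefix C0A87A1A decodes to 192.168.122.26, which is ITE5's address in the mmlscluster output, while gpfs4nsd decodes to 0.0.0.0. That is consistent with the hostname resolution check suggested above being worth doing on ITE6.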