Just_Being_Frank

NSD in ready / unrecovered state, mmfsck runs clean

2012-08-13T17:58:13Z
At some point, not sure when, one of the two NSDs for device gpfs1 got into this condition:

#lxvcm1-1> mmlsdisk gpfs1 -L
disk         driver   sector failure holds    holds                             storage
name         type       size   group metadata data  status availability disk id pool    remarks
gpfs1nsd     nsd         512    4001 yes      yes   ready  unrecovered        1 system  desc
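
For anyone who wants to spot this condition from a script, here is a minimal sketch that filters mmlsdisk -L style output for disks whose availability is not "up". It assumes the column layout shown above (disk name in field 1, driver type in field 2, availability in field 8, single-word status values). The function name disks_not_up is made up for illustration, so check the layout on your own release before relying on it.

```shell
# Hypothetical helper (not part of GPFS): reads mmlsdisk -L style output on
# stdin and prints "name availability" for every disk that is not "up".
# Assumption: field 2 is the driver type "nsd" and field 8 is availability,
# as in the output shown in this thread. Header lines are skipped because
# their second field is not "nsd".
disks_not_up() {
    awk '$2 == "nsd" && $8 != "up" { print $1, $8 }'
}

# Usage on a live cluster (gpfs1 is the device from this thread):
#   mmlsdisk gpfs1 -L | disks_not_up
```

With the output above, this would print the one affected disk and its availability; an empty result would mean every listed NSD is up.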

From a previous support call, we were told that we would most likely need to run mmfsck to clear this up.

Over the last weekend we had a chance to take down all 5 nodes in our GPFS replication setup (two nodes are the main write-only nodes, two are read-only replicas, and we have tiebreaker disks and a tiebreaker site node as well).

We ran mmfsck on our 2.3 TB filesystem and it completed cleanly after about 25-30 minutes, reporting no issues at all.

We brought everything back online and the disk still shows its availability as unrecovered. Trying to start it just fails with the same error as before:

#lxvcm1-1> mmchdisk gpfs1 start -d gpfs1nsd
Scanning file system metadata, phase 1 ...
Error migrating log.
Inconsistency in file system metadata.
Initial disk state was updated successfully, but another error may have changed the state again.
mmchdisk: Command failed. Examine previous error messages to determine cause.

The errpt output gave us nothing useful either. The DS5000 unit and the SAN show no issues; everything is green.

Not a seasoned pro with GPFS, learning and willing student though... Any ideas on what to check next? Or any commands that would shed more light on potential issues?

Looked through the forum but it's hard to find a clear start point with so much information. I'm still going through the docs to learn more as well.

Thanks!

Frank
Updated on 2012-12-11T16:05:36Z by Just_Being_Frank
  • dlmcnabb

    Re: NSD in ready / unrecovered state, mmfsck runs clean

    ‏2012-08-13T20:46:44Z  in response to Just_Being_Frank
    You need to run "mmchdisk $fsname start -a" to move disks from unrecovered to up.
    • Just_Being_Frank

      Re: NSD in ready / unrecovered state, mmfsck runs clean

      ‏2012-08-13T21:09:06Z  in response to dlmcnabb
      I get the same error if I run that or the command I mentioned.

      Scanning file system metadata, phase 1 ...
      Error migrating log.
      Inconsistency in file system metadata.
      Initial disk state was updated successfully, but another error may have changed the state again.
      mmchdisk: Command failed. Examine previous error messages to determine cause.

      So either one, mmchdisk gpfs1 start -a or mmchdisk gpfs1 start -d gpfs1nsd, returns the same error.

      And again, mmfsck ran clean on the system and there are no errors in the error log at all...?

      I'm stumped. It seems something is inconsistent in the metadata, or is this possibly a bug? It hasn't prevented us from using the filesystem or the disk as far as I can tell. We hoped that shutting down GPFS, rebooting all the nodes, running mmfsck, etc. would make it go away, but no success.
      • dlmcnabb

        Re: NSD in ready / unrecovered state, mmfsck runs clean

        ‏2012-08-14T06:07:47Z  in response to Just_Being_Frank
        There have been a few log migration fixes recently. Please upgrade to the latest service level for your release.
        • FelipeKnop

          Re: NSD in ready / unrecovered state, mmfsck runs clean

          ‏2012-08-15T15:41:30Z  in response to dlmcnabb
          This seems to match a problem which has been fixed in 3.4.0.15.

          Felipe
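
To check whether an installed level is at or above the 3.4.0.15 level mentioned here, a plain string comparison is not enough (3.4.0.9 would sort after 3.4.0.15). The sketch below compares dotted version strings field by field. The helper name version_ge is hypothetical, and how you obtain the installed level depends on your platform's package tooling, so that part is left as a placeholder.

```shell
# Hypothetical helper (not part of GPFS): exits 0 if dotted version $1 is
# greater than or equal to dotted version $2, comparing each field as a
# number from left to right. Missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Usage: set "installed" from your own package query (placeholder value here).
#   installed="3.4.0.17"
#   if version_ge "$installed" "3.4.0.15"; then
#       echo "at or above the fix level mentioned in this thread"
#   fi
```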
        • Just_Being_Frank

          Re: NSD in ready / unrecovered state, mmfsck runs clean

          ‏2012-08-15T17:01:27Z  in response to dlmcnabb
          Thanks! We will try the upgrade on our next maintenance window in Sept.
  • Just_Being_Frank

    Re: NSD in ready / unrecovered state, mmfsck runs clean

    ‏2012-12-11T16:05:36Z  in response to Just_Being_Frank
    The answer to the issue was to upgrade to fix level 15 (3.4.0.15) or above. We finally upgraded to fix level 17 this last weekend and were able to resolve the unrecovered disk problem, which was really only a bug in the metadata logs. There was never a real issue with the data or metadata. However, keep in mind that the filesystem will have problems if issues with other disks in the filesystem add up to a quorum majority. The filesystem will then be unusable. In our case there were only two disks, and when the other disk had an issue, it took the filesystem down.

    Thanks for the info Dan McNabb and Dr. Felipe Knop!!