mmfsck command

Checks and repairs a GPFS™ file system.

Synopsis

mmfsck Device [-n | -y] [-s | -v | -V] 
       [-c | -m | -o | --skip-inode-check | --skip-directory-check] 
       [-t Directory]
       [-N {Node[,Node...] | NodeFile | NodeClass}]
       [--patch-file Path [--patch]] [--qos QosClass]
       [--threads ThreadLevel]

The file system must be unmounted before you can run the mmfsck command with any option other than -o.

Availability

Available on all IBM Spectrum Scale™ editions.

Description

Use the mmfsck command in offline mode only when disk or communications failures have caused MMFS_FSSTRUCT error log entries to be issued, or when disks are known to have been forcibly removed or to be otherwise permanently unavailable to the file system, and users see other unexpected symptoms. In general, it is unnecessary to run mmfsck in offline mode except under the direction of the IBM® Support Center.

If neither the -n nor the -y flag is specified, the mmfsck command runs interactively, prompting you for permission to repair each consistency error as it is reported. It is suggested that you run the mmfsck command interactively (the default) on all but the most severely damaged file systems.

The occurrence of I/O errors, or the appearance of a message telling you to run the mmfsck command, may indicate file system inconsistencies. If either situation occurs, use the mmfsck command to check file system consistency and interactively repair the file system.

For information about file system maintenance and repair, see the topic Checking and repairing a file system in the IBM Spectrum Scale: Advanced Administration Guide.

The mmfsck command checks for these inconsistencies:
  • Blocks marked allocated that do not belong to any file. The corrective action is to mark the block free in the allocation map.
  • Files for which an inode is allocated and no directory entry exists (orphaned files). The corrective action is to create directory entries for these files in a lost+found subdirectory of the fileset to which the orphaned file or directory belongs. The index number of the inode is assigned as the name. If you do not allow the mmfsck command to reattach an orphaned file, it asks for permission to delete the file.
  • Directory entries pointing to an inode that is not allocated. The corrective action is to remove the directory entry.
  • Incorrectly formed directory entries. A directory entry contains the inode number and the generation number of the file to which it refers. When the generation number in the directory entry does not match the generation number stored in the file's inode, the corrective action is to remove the directory entry.
  • Incorrect link counts on files and directories. The corrective action is to update them with accurate counts.
  • Policy files that are not valid. The corrective action is to delete the file.
  • Various problems related to filesets: missing or corrupted fileset metadata, inconsistencies in directory structure related to filesets, a missing or corrupted fileset root directory, and other problems in internal data structures. Repaired filesets are renamed as FilesetFilesetId and put into the unlinked state.

If you are repairing a file system due to node failure and the file system has quotas enabled, it is suggested that you run the mmcheckquota command to recreate the quota files.

Indications that you should run the mmfsck command include:
  • An MMFS_FSSTRUCT error log entry along with an MMFS_SYSTEM_UNMOUNT error log entry on any node, indicating that some critical piece of the file system is inconsistent.
  • Disk media failures
  • Partial disk failure
  • EVALIDATE=214, Invalid checksum or other consistency check failure on a disk data structure, reported in error logs or returned to an application.
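
The first indication in the list above can be checked against a saved copy of a node's error log. The following is a minimal sketch, assuming the log entries have been exported to a plain-text file. The function name count_fsstruct_entries is hypothetical, and the location of the error log depends on the operating system, so the file is passed as an argument.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a plain-text log export that
# mention MMFS_FSSTRUCT. Where the log lives is an assumption left to the
# caller; pass whichever file holds the node's error log entries.
count_fsstruct_entries() {
    log_file=$1
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # grep -c prints the number of matching lines. It exits with 1 when the
    # count is 0, so that status is masked to keep "no entries" from looking
    # like a failure.
    grep -c 'MMFS_FSSTRUCT' "$log_file" || true
}
```

A nonzero count is only a prompt to investigate further. It does not by itself show that the file system is inconsistent.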

For further information on recovery actions and how to contact the IBM Support Center, see the IBM Spectrum Scale: Problem Determination Guide.

If you are running the online mmfsck command to free allocated blocks that do not belong to any files, plan to make file system repairs when system demand is low. This activity is I/O-intensive and can affect system performance.

Results

If the file system is inconsistent, the mmfsck command displays information about the inconsistencies and (depending on the option entered) may prompt you for permission to repair them. The mmfsck command tries to avoid actions that may result in loss of data. In some cases, however, it may indicate the destruction of a damaged file.

All corrective actions, with the exception of recovering lost disk blocks (blocks that are marked as allocated but do not belong to any file), require that the file system be unmounted on all nodes. If the mmfsck command is run on a mounted file system, lost blocks are recovered but any other inconsistencies are only reported, not repaired.

If a bad disk is detected, the mmfsck command stops the disk and writes an entry to the error log. The operator must manually start and resume the disk when the problem is fixed.

The file system must be unmounted on all nodes before the mmfsck command can repair file system inconsistencies.
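
Because repairs require the file system to be unmounted, it can be useful to run a report-only pass first and keep the output for review. The following is a minimal sketch under these assumptions: the file system is already unmounted on all nodes, and the shell exit status of mmfsck is nonzero when inconsistencies remain. The function name report_only_check and the report path are hypothetical.

```shell
#!/bin/sh
# Sketch: run a report-only check and save the output for later review.
# Assumes the file system is already unmounted on all nodes.
report_only_check() {
    device=$1
    report=$2
    # -n answers "no" to every repair prompt, so the file system is not changed.
    mmfsck "$device" -n > "$report" 2>&1
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "no inconsistencies reported for $device"
    else
        echo "mmfsck exited with status $status; review $report before repairing"
    fi
    return "$status"
}
```

For example, report_only_check fs1 /tmp/fs1.fsck.report writes the full report to the named file and prints a one-line summary.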

Parameters

Device
The device name of the file system to be checked and repaired. File system names need not be fully qualified. For example, fs0 is as acceptable as /dev/fs0.

This must be the first parameter.

-n
Specifies a no response to all file system error repair prompts from the mmfsck command. The option reports inconsistencies but it does not change the file system. To save this information, redirect it to an output file when you issue the mmfsck command.
-y
Specifies a yes response to all file system error repair prompts from the mmfsck command. Use this option only on severely damaged file systems. It allows the mmfsck command to take any action necessary for repairs.
-s
Specifies that the output is semi-verbose.
-v
Specifies that the output is verbose.
-V
Specifies that the output is verbose and contains information for debugging purposes.
-c
When the file system log has been lost and the file system is replicated, this option specifies that the mmfsck command attempt corrective action by comparing the replicas of metadata and data. If this error condition occurs, it is indicated by an error log entry.
-m
Has the same meaning as -c, except that mmfsck checks only the metadata replica blocks. It therefore runs faster than with -c.
-o
Runs the command in online mode, in which the file system can remain mounted. Online mode does not perform a full file system consistency check, but blocks marked as allocated that do not belong to a file are recovered. Lost blocks do not constitute file system corruption.
--skip-inode-check
Causes the command to run faster by skipping its inode-check phase. Include this option only if you know that the inodes are valid and that only directories need to be checked. In this mode, the product does not scan all parts of the file system and therefore might not detect all corruptions in the file system.
--skip-directory-check
Causes the command to run faster by skipping its directory-check phase. Include this option if you want to check only the inodes. In this mode, the product does not scan all parts of the file system and therefore might not detect all corruptions in the file system.
-t Directory
Specifies the directory that GPFS uses for temporary storage during mmfsck command processing. This directory must be available on all nodes that are participating in mmfsck and that are designated as either a manager node or a quorum node. In addition to the location requirement, the storage directory must have at least 4 GB of space. The default directory for mmfsck processing is /tmp.
-N {Node[,Node...] | NodeFile | NodeClass}
Specifies the nodes that participate in the check and repair of the file system. This command supports all defined node classes. The default is all or the current value of the defaultHelperNodes parameter of the mmchconfig command.

For information on how to specify node names, see the topic Specifying nodes as inputs to GPFS commands in the IBM Spectrum Scale: Advanced Administration Guide.

--patch-file Path
Specifies the name of a patch file. When the --patch parameter is not specified, information about file system inconsistencies (detected during an mmfsck run with the -n parameter) is stored in the patch file that is specified by Path. Path should be accessible from the file system manager node. The information stored in the patch file can be viewed as a report of the problems in the file system. For more information about patch files, see the topic Checking and repairing a file system in the IBM Spectrum Scale: Advanced Administration Guide.

When this option is specified with the --patch parameter, the information in the patch file is read and used to repair the file system.

--patch
Specifies that the file system will be repaired using the information stored in the patch file that is specified with --patch-file Path.
--qos QosClass
Specifies the Quality of Service for I/O operations (QoS) class to which the instance of the command is assigned. If you do not specify this parameter, the instance of the command is assigned by default to the maintenance QoS class. This parameter has no effect unless the QoS service is enabled. For more information, see the topic mmchqos command in the IBM Spectrum Scale: Administration and Programming Reference. Specify one of the following QoS classes:
maintenance
This QoS class is typically configured to have a smaller share of file system IOPS. Use this class for I/O-intensive, potentially long-running GPFS commands, so that they contribute less to reducing overall file system performance.
other
This QoS class is typically configured to have a larger share of file system IOPS. Use this class for administration commands that are not I/O-intensive.
For more information, see the topic Setting the Quality of Service for I/O operations (QoS) in the IBM Spectrum Scale: Advanced Administration Guide.
--threads ThreadLevel
Specifies the number of threads that are created to run mmfsck. The default is 16.
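
The space requirement described for -t Directory can be checked before the command is run. The following is a minimal sketch that tests only the local node; the directory must also be available on the other participating manager and quorum nodes, which this check does not cover. The function name has_mmfsck_tmp_space is hypothetical.

```shell
#!/bin/sh
# Sketch: check that a candidate -t directory has at least 4 GB available
# on the local node. The 4 GB minimum comes from the -t description.
has_mmfsck_tmp_space() {
    dir=$1
    min_kb=${2:-4194304}   # 4 GB expressed in 1024-byte units
    # df -P produces the portable POSIX layout and -k reports 1024-byte units.
    # The fourth field of the second line is the available space.
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$min_kb" ]
}
```

For example, has_mmfsck_tmp_space /tmp returns success only when /tmp has at least 4 GB available. An optional second argument overrides the minimum, in kilobytes.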

Exit status

0
Successful completion.
2
The command was interrupted before it completed checks or repairs.
4
The command changed the file system and it must now be restarted.
8
The file system contains damage that has not been repaired.
16
The problem cannot be fixed.

The exit string is a combination of three error indicators, displayed as colon-separated values (for example, Exit status 0:0:8):
  1. The first value is the exit errno value.
  2. The second value is an internal ancillary value that helps explain where the errno value came from.
  3. The third value is the OR of several status bits.
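
The status bits can be decoded in a script. The following is a minimal sketch, assuming that the third value of the exit string is the bitwise OR of the exit status values listed above (2, 4, 8, and 16). Both function names are hypothetical, and the messages paraphrase the descriptions in this section.

```shell
#!/bin/sh
# Sketch: describe each status bit that is set in an mmfsck status value.
# Assumes the value is the bitwise OR of the exit status values 2, 4, 8, 16.
describe_mmfsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "successful completion"
        return 0
    fi
    [ $((status & 2)) -ne 0 ]  && echo "interrupted before checks or repairs completed"
    [ $((status & 4)) -ne 0 ]  && echo "file system changed; restart needed"
    [ $((status & 8)) -ne 0 ]  && echo "damage remains unrepaired"
    [ $((status & 16)) -ne 0 ] && echo "problem cannot be fixed"
    return 0
}

# Extract the third value from a line such as "Exit status 0:0:8."
status_bits_from_exit_string() {
    echo "$1" | sed -n 's/^.*Exit status [0-9]*:[0-9]*:\([0-9]*\).*$/\1/p'
}
```

For example, describe_mmfsck_status "$(status_bits_from_exit_string 'Exit status 0:0:8.')" reports that damage remains unrepaired.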

Security

You must have root authority to run the mmfsck command.

The node on which the command is issued must be able to execute remote shell commands on any other node in the cluster without the use of a password and without producing any extraneous messages. For more information, see the topic Requirements for administering a GPFS file system in the IBM Spectrum Scale: Administration and Programming Reference.

Examples

  1. To run the mmfsck command on the fs1 file system, receive a report, but not fix inconsistencies, issue this command:
    mmfsck fs1 -n
    The system displays information similar to:
    Checking "fs1"
    Checking reserved files
    Checking inodes
    Checking inode map file
    Checking ACL file records
    Checking directories and files
    Checking log files
    Checking extended attributes file
    Checking allocation summary file
    Checking policy file
    Checking metadata of filesets
    Checking file reference counts
    Checking file system replication status 
    
                   500224 inodes
                       39   allocated
                        0   repairable
                        0   repaired
                        0   damaged
                        0   deallocated
                        0   orphaned
                        0   attached
                        0   corrupt ACL references
     
                 13107200 subblocks
                   142696   allocated
                        0   unreferenced
                        0   duplicates
                        0   deletable
                        0   deallocated
    
                     4209 addresses
                        0   suspended
                        0   duplicates
                        0   reserved file holes found
                        0   reserved file holes repaired
    File system is clean.
    mmfsck found no inconsistencies in this file system.
  2. To run the mmfsck command on the fs2 file system, receive a report, and fix inconsistencies, issue this command:
    mmfsck fs2 -y  
    The system displays information similar to:
    Checking "fs2"
    Checking inodes
    
    Lost blocks were found.
    Correct the allocation map? yes
    Checking inode map file
    
    Corrections are needed in the inode allocation map.
    Correct the allocation map? yes
    Root inode 32512 of fileset 'fset2' has been deleted.
    Delete the inode reference from fileset metadata? yes
    Checking directories and files
    
    Error in directory inode 3:  DirEntryBad DirLinkCountBad
    Directory entry "top_dir" is not an allocated inode.
    Patching will delete the directory entry.
    Remove directory entry? yes
    Directory entry "fset2" is not an allocated inode.
    Patching will delete the directory entry.
    Remove directory entry? yes
    Directory has an incorrect link count of 4.
    Corrected link count would be 2
    Correct link count? yes
    
    Error in directory inode 12032:  DirEntryBad
    Directory entry ".." is not an allocated inode.
    Cannot allow deletion of this directory entry.
    
    Error in directory inode 12034:  DirEntryBad BadFilesetId
    Directory entry ".." has a fileset id that does not match fileset id of the directory.
    Patching will reset fileset id of inode 32512 to the fileset id of the directory, 1
    Correct fileset id? yes
    Directory entry ".." is not an allocated inode.
    Cannot allow deletion of this directory entry.
    
    Checking log files
    Checking extended attributes file
    Checking allocation summary file
    Checking policy file
    Checking filesets metadata
    Root directory of fileset 'fset2' (inode -1) is invalid
    Recreate fileset root inode and directory? yes
    Checking file reference counts
    
    Directory inode 12032 is not referenced in any directory.
    Reattach inode to lost+found? yes
    
    Directory inode 12034 is not referenced in any directory.
    Reattach inode to lost+found? yes
    Checking file system replication status
                 10585856 inodes
                      369   allocated
                       42   repairable
                       42   repaired
                        0   damaged
                        0   deallocated
                        0   orphaned
                        0   attached
                        0   corrupt ACL references
    
                 89391104 subblocks
                   661908   allocated
                      262   unreferenced
                        0   duplicates
                        0   deletable
                      262   deallocated
    
                    20464 addresses
                        0   suspended
                        0   duplicates
                        0   reserved file holes found
                        0   reserved file holes repaired
    File system is clean.
  3. To run the mmfsck command on the FSchk file system, and create a patch file called path-towrite-patchfile that will store information about the file system inconsistencies, issue this command:
    mmfsck FSchk -nv --patch-file path-towrite-patchfile
    The system displays information similar to:
    Creating patch file "path-towrite-patchfile" on node "Node3"
    Checking "FSchk"
      fsckFlags                     0x8000009
      Stripe group manager          <c0n3>
      needNewLogs                   0
    ...
    
     Checking inode map segment 0 of 1
    Inode 329478 not in use but marked (0x0).
    
    Corrections are needed in the inode allocation map.
    Correct the allocation map? No
      Checking inode map segment 1 of 1
    Checking inode map for inode range 388608 to 518143
      Checking inode map segment 0 of 1
      Checking inode map segment 1 of 1
    Checking inode map for inode range 518144 to 624383
      Checking inode map segment 0 of 1
      Checking inode map segment 1 of 1
    Inode 527110 not in use but marked (0x0).
    2 inodes are not in use but marked.
    
    Error in directory inode 329477:  DirEntryBad DirLinkCountBad
    Directory entry "dir2" is not an allocated inode.
    Patching will delete the directory entry.
    Remove directory entry? No
    Directory has an incorrect link count of 3.
    Corrected link count would be 2
    Correct link count? No
    
    Error in directory inode 527104:  DirEntryBad DirLinkCountBad
    Directory entry "dir_2" is not an allocated inode.
    Patching will delete the directory entry.
    Remove directory entry? No
    Directory has an incorrect link count of 3.
    Corrected link count would be 2
    Correct link count? No
    
    ...
    
    Error in inode 527109:  SubblocksBad
    Inode 527109 has an incorrect subblock count of 40.
    Corrected subblock count would be 32.
    Correct count? No
    
    Error in inode 329474:  SubblocksBad
    Inode 329474 has an incorrect subblock count of 37.
    Corrected subblock count would be 32.
    Correct count? No
    
    Error in inode 329480:  RepCountBad
    Inode 329480 has an incorrect current metadata replica count of 3.
    Correct replication count? No
    
    ...
    
    Scanning directories for cycle
    Checking log files
    Checking extended attributes file
    Checking allocation summary file
    Checking policy file
    Checking metadata of filesets
    Checking file reference counts
    
    ...
    
    
               624384 inodes
                      67   allocated
                       5   repairable
                       0   repaired
                       0   damaged
                       0   deallocated
                       5   orphaned
                       0   attached
                       0   corrupt ACL references
    
                22347776 subblocks
                  191237   allocated
                       0   unreferenced
                       0   duplicates
                       0   deletable
                       0   deallocated
    
                    5860 addresses
                       0   suspended
                       0   duplicates
                       0   reserved file holes found
                       0   reserved file holes repaired
    
    InodeProblemList: 5 entries
    iNum          snapId     status keep delete noScan new error
    ------------- ---------- ------ ---- ------ ------ --- ------------------
           527109          0      1    0      0      0   0 0x00080000 SubblocksBad
           329474          0      1    0      0      0   0 0x00080000 SubblocksBad
           329480          0      1    0      0      0   0 0x00000800 RepCountBad
           329477          0      1    0      0      0   0 0x00009000 DirEntryBad DirLinkCountBad
           527104          0      1    0      0      0   0 0x00009000 DirEntryBad DirLinkCountBad
    File system contains unrepaired damage.
    Exit status 0:0:8.
    Patch file written to "Node3:path-towrite-patchfile" with 13 patch entries.
    mmfsck: 6027-1639 Command failed. Examine previous error messages to determine cause.   
    To use the information that was stored in the patch file (path-towrite-patchfile) to repair the file system, issue the following command:
    mmfsck FSchk -v --patch-file path-towrite-patchfile --patch
    The system displays information similar to:
    Checking "FSchk"
      fsckFlags                     0x18000008
      Stripe group manager          <c0n3>
      needNewLogs                   0
      nThreads                      16
      commited nodes                0
      clientTerm                    0
      fsckReady                     1
      fsckCreated                   0
      % pool allowed                50
    
    
    checkFilesets                 1
      checkFilesetsV2               1
    Checking patch file
    Scanning patch file for allocation map patches
    Patching inode_map block 7 in inode 2 snap 0
      Patching segment 234 index 6 from 0 to 3
    Patching inode_map block 8 in inode 2 snap 0
      Patching segment 259 index 6 from 0 to 3
    Completed patching 2 allocation map patches.
      52 % complete on Thu Apr 30 08:03:24 2015
    Scanning patch file for inode patches
    Patching inode block 2574 in inode 0 snap 0
      Patching record 329474 field inode_last_block_subblocks from 37 to 32
      Patching record 329480 field inode_curr_meta_replicas from 3 to 2
    Patching inode block 4118 in inode 0 snap 0
      Patching record 527109 field inode_last_block_subblocks from 40 to 32
    Patching directory block 0 in inode 527104 snap 0
      Deleting directory entry dir_2 at offset 352
    Patching inode block 4118 in inode 0 snap 0
      Patching record 527104 field inode_num_links from 3 to 2
    Patching directory block 0 in inode 329477 snap 0
      Deleting directory entry dir2 at offset 64
    Patching inode block 2574 in inode 0 snap 0
      Patching record 329477 field inode_num_links from 3 to 2
    Orphaning inode 329483 snap 0
    Orphaning inode 329484 snap 0
    Orphaning inode 329485 snap 0
    Orphaning inode 329486 snap 0
    Orphaning inode 527113 snap 0
    Completed patching 12 inode patches.
     100 % complete on Thu Apr 30 08:03:24 2015
    File system is clean.
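
When the output of an mmfsck run has been redirected to a file, individual counts can be pulled from the inode summary. The following is a minimal sketch, assuming the summary layout matches the sample output shown in the examples above. The function name inode_summary_count is hypothetical.

```shell
#!/bin/sh
# Sketch: print one count from the inode summary of a saved mmfsck report.
# Assumes the layout shown in the examples: a "<number> inodes" line followed
# by "<number> <label>" lines, ended by a blank line.
inode_summary_count() {
    report=$1
    field=$2    # for example: allocated, repairable, repaired, orphaned
    awk -v want="$field" '
        # The summary starts at a line such as "500224 inodes"; requiring a
        # numeric first field avoids matching the "Checking inodes" progress line.
        NF == 2 && $1 ~ /^[0-9]+$/ && $2 == "inodes" { in_block = 1; next }
        in_block && NF == 0 { exit }                  # blank line ends the block
        in_block && NF == 2 && $2 == want { print $1; exit }
    ' "$report"
}
```

For example, inode_summary_count report.txt repairable prints the number of repairable inodes, or nothing if the report contains no such line.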

Location

/usr/lpp/mmfs/bin