mmfsck command
Checks and repairs a GPFS™ file system.
Synopsis
mmfsck Device [-n | -y] [-s | -v | -V]
[-c | -m | -o | --skip-inode-check | --skip-directory-check]
[-t Directory]
[-N {Node[,Node...] | NodeFile | NodeClass}]
[--patch-file Path [--patch]] [--qos QosClass]
[--threads ThreadLevel]
The file system must be unmounted before you can run the mmfsck command with any option other than -o.
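The bracketed groups in the synopsis are mutually exclusive, and the note above restricts a mounted file system to -o. As a minimal sketch of those two rules, the following Python helper assembles an argument list and rejects conflicting options. The function build_mmfsck_args and its mounted argument are hypothetical names introduced for illustration; only the option names come from the synopsis, and the sketch covers only the simple flags, not options that take values.

```python
# Mutually exclusive option groups, copied from the synopsis above.
EXCLUSIVE_GROUPS = (
    ("-n", "-y"),
    ("-s", "-v", "-V"),
    ("-c", "-m", "-o", "--skip-inode-check", "--skip-directory-check"),
)


def build_mmfsck_args(device, flags=(), mounted=False):
    """Return an argument list for mmfsck, or raise ValueError.

    This helper and its `mounted` argument are hypothetical, not part of
    GPFS; only the option names are taken from the synopsis.
    """
    flags = list(flags)
    known = {flag for group in EXCLUSIVE_GROUPS for flag in group}
    for flag in flags:
        if flag not in known:
            raise ValueError(f"option not handled by this sketch: {flag}")
    # At most one option may be chosen from each bracketed group.
    for group in EXCLUSIVE_GROUPS:
        chosen = [flag for flag in flags if flag in group]
        if len(chosen) > 1:
            raise ValueError(f"mutually exclusive options: {' '.join(chosen)}")
    # Any option other than -o requires the file system to be unmounted.
    if mounted and "-o" not in flags:
        raise ValueError("file system must be unmounted unless -o is used")
    # Device must be the first parameter.
    return ["mmfsck", device, *flags]
```

For example, build_mmfsck_args("fs1", ["-n"]) returns the arguments for a report-only run, while passing both -n and -y raises ValueError.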
Availability
Available on all IBM Spectrum Scale™ editions.
Description
The mmfsck command in offline mode is intended for use only when disk or communications failures have caused MMFS_FSSTRUCT error log entries to be issued, or when disks are known to have been forcibly removed or to be otherwise permanently unavailable to the file system, and users see other unexpected symptoms. In general, it is unnecessary to run mmfsck in offline mode except under the direction of the IBM® Support Center.
If neither the -n nor the -y flag is specified, the mmfsck command runs interactively, prompting you for permission to repair each consistency error as it is reported. In all but the most severely damaged file systems, it is suggested that you run the mmfsck command interactively (the default).
The occurrence of I/O errors, or the appearance of a message telling you to run the mmfsck command, may indicate file system inconsistencies. If either situation occurs, use the mmfsck command to check file system consistency and interactively repair the file system.
The mmfsck command checks for these inconsistencies:
- Blocks marked allocated that do not belong to any file. The corrective action is to mark the block free in the allocation map.
- Files for which an inode is allocated and no directory entry exists (orphaned files). The corrective action is to create directory entries for these files in a lost+found subdirectory of the fileset to which the orphaned file or directory belongs. The index number of the inode is assigned as the name. If you do not allow the mmfsck command to reattach an orphaned file, it asks for permission to delete the file.
- Directory entries pointing to an inode that is not allocated. The corrective action is to remove the directory entry.
- Incorrectly formed directory entries. A directory file contains the inode number and the generation number of the file to which it refers. When the generation number in the directory does not match the generation number stored in the file's inode, the corrective action is to remove the directory entry.
- Incorrect link counts on files and directories. The corrective action is to update them with accurate counts.
- Policy files that are not valid. The corrective action is to delete the file.
- Various problems related to filesets: missing or corrupted fileset metadata, inconsistencies in directory structure related to filesets, missing or corrupted fileset root directory, other problems in internal data structures. The repaired filesets will be renamed as FilesetFilesetId and put into unlinked state.
If you are repairing a file system due to node failure and the file system has quotas enabled, it is suggested that you run the mmcheckquota command to recreate the quota files.
Indications that you should run the mmfsck command include:
- An MMFS_FSSTRUCT along with an MMFS_SYSTEM_UNMOUNT error log entry on any node indicating some critical piece of the file system is inconsistent.
- Disk media failures
- Partial disk failure
- EVALIDATE=214, Invalid checksum or other consistency check failure on a disk data structure, reported in error logs or returned to an application.
For further information on recovery actions and how to contact the IBM Support Center, see the IBM Spectrum Scale: Problem Determination Guide.
If you are running the online mmfsck command to free allocated blocks that do not belong to any files, plan to make file system repairs when system demand is low. This is an I/O intensive activity and it can affect system performance.
Results
If the file system is inconsistent, the mmfsck command displays information about the inconsistencies and (depending on the option entered) may prompt you for permission to repair them. The mmfsck command tries to avoid actions that may result in loss of data. In some cases, however, it may indicate the destruction of a damaged file.
All corrective actions, with the exception of recovering lost disk blocks (blocks that are marked as allocated but do not belong to any file), require that the file system be unmounted on all nodes. If the mmfsck command is run on a mounted file system, lost blocks are recovered but any other inconsistencies are only reported, not repaired.
If a bad disk is detected, the mmfsck command stops the disk and writes an entry to the error log. The operator must manually start and resume the disk when the problem is fixed.
The file system must be unmounted on all nodes before the mmfsck command can repair file system inconsistencies.
Parameters
- Device
- The device name of the file system to be checked and repaired.
File system names need not be fully-qualified. fs0 is as acceptable as /dev/fs0.
This must be the first parameter.
- -n
- Specifies a no response to all file system error repair prompts from the mmfsck command. The option reports inconsistencies but it does not change the file system. To save this information, redirect it to an output file when you issue the mmfsck command.
- -y
- Specifies a yes response to all file system error repair prompts from the mmfsck command. Use this option only on severely damaged file systems. It allows the mmfsck command to take any action necessary for repairs.
- -s
- Specifies that the output is semi-verbose.
- -v
- Specifies that the output is verbose.
- -V
- Specifies that the output is verbose and contains information for debugging purposes.
- -c
- When the file system log has been lost and the file system is replicated, this option specifies that the mmfsck command attempt corrective action by comparing the replicas of metadata and data. If this error condition occurs, it is indicated by an error log entry.
- -m
- Has the same meaning as -c, except that mmfsck checks only the metadata replica blocks. It therefore runs faster than with -c.
- -o
- Runs the command in online mode, in which the file system can remain mounted. Online mode does not perform a full file system consistency check, but blocks marked as allocated that do not belong to a file are recovered. Lost blocks do not constitute file system corruption.
- --skip-inode-check
- Causes the command to run faster by skipping its inode-check phase. Include this option only if you know that the inodes are valid and that only directories need to be checked. In this mode, the product does not scan all parts of the file system and therefore might not detect all corruptions in the file system.
- --skip-directory-check
- Causes the command to run faster by skipping its directory-check phase. Include this option if you want to check only the inodes. In this mode, the product does not scan all parts of the file system and therefore might not detect all corruptions in the file system.
- -t Directory
- Specifies the directory that GPFS uses for temporary storage during mmfsck command processing. This directory must be available on all nodes that participate in mmfsck and that are designated as either a manager or a quorum node. In addition to the location requirement, the storage directory must have at least 4 GB of space. The default directory for mmfsck processing is /tmp.
- -N {Node[,Node...] | NodeFile | NodeClass}
- Specifies the nodes that participate in the check and repair of the file system. This command supports all defined node classes. The default is all or the current value of the defaultHelperNodes parameter of the mmchconfig command.
For information on how to specify node names, see the topic Specifying nodes as inputs to GPFS commands in the IBM Spectrum Scale: Advanced Administration Guide.
- --patch-file Path
- Specifies the name of a patch file. When the --patch parameter is not specified, information about file system inconsistencies (detected during an mmfsck run with the -n parameter) is stored in the patch file that is specified by Path. Path should be accessible from the file system manager node. The information stored in the patch file can be viewed as a report of the problems in the file system. For more information about patch files, see the topic Checking and repairing a file system in the IBM Spectrum Scale: Advanced Administration Guide.
When this option is specified with the --patch parameter, the information in the patch file is read and used to repair the file system.
- --patch
- Specifies that the file system will be repaired using the information stored in the patch file that is specified with --patch-file Path.
- --qos QOSClass
- Specifies the Quality of Service for I/O operations (QoS) class to which the instance of the command is assigned. If you do not specify this parameter, the instance of the command is assigned by default to the maintenance QoS class. This parameter has no effect unless the QoS service is enabled. For more information, see the topic mmchqos command in the IBM Spectrum Scale: Administration and Programming Reference. Specify one of the following QoS classes:
- maintenance
- This QoS class is typically configured to have a smaller share of file system IOPS. Use this class for I/O-intensive, potentially long-running GPFS commands, so that they have less effect on overall file system performance.
- other
- This QoS class is typically configured to have a larger share of file system IOPS. Use this class for administration commands that are not I/O-intensive.
- --threads ThreadLevel
- The number of threads that are created to run mmfsck. The default is 16.
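The -t Directory parameter above states that the temporary storage directory needs at least 4 GB of space. A pre-flight check for that requirement might look like the following sketch. The function names are hypothetical, the check runs only on the local node (the directory must also be available on the participating manager and quorum nodes), and reading "4GB" as 4 GiB is an assumption.

```python
import shutil

# ASSUMPTION: the documented "4GB" minimum is read as 4 GiB (4 * 1024**3 bytes).
MIN_TEMP_BYTES = 4 * 1024**3


def has_enough_temp_space(free_bytes, required=MIN_TEMP_BYTES):
    """Return True if free_bytes meets the minimum temporary-space requirement."""
    return free_bytes >= required


def check_temp_dir(path="/tmp"):
    """Check free space in path on the local node only (default: /tmp).

    Hypothetical helper; it does not verify that the directory exists on
    the other nodes that participate in mmfsck.
    """
    return has_enough_temp_space(shutil.disk_usage(path).free)
```

Run the check on each participating node before starting mmfsck, passing the same directory that you intend to supply with -t.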
Exit status
- 0
- Successful completion.
- 2
- The command was interrupted before it completed checks or repairs.
- 4
- The command changed the file system and it must now be restarted.
- 8
- The file system contains damage that has not been repaired.
- 16
- The problem cannot be fixed.
The exit status string that the command prints (for example, Exit status 0:0:8) is a combination of three values:
- The first value is the Exit errno value.
- The second value is an internal ancillary value that helps explain where the errno value came from.
- The third value is the OR of several status bits.
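A script that wraps mmfsck can map these values to their meanings. The sketch below extracts the three-part string from command output and interprets the third value as a bit mask. It assumes that the status bits are the same values as the exit codes listed above (2, 4, 8, 16); the example output "Exit status 0:0:8" next to "File system contains unrepaired damage" is consistent with that reading, but the source does not state it. The function names are hypothetical.

```python
import re

# Meanings copied from the exit status list above.
EXIT_MEANINGS = {
    0: "Successful completion.",
    2: "The command was interrupted before it completed checks or repairs.",
    4: "The command changed the file system and it must now be restarted.",
    8: "The file system contains damage that has not been repaired.",
    16: "The problem cannot be fixed.",
}

EXIT_STRING = re.compile(r"Exit status (\d+):(\d+):(\d+)")


def parse_exit_status(output):
    """Return (errno_value, ancillary_value, status_bits) or None if absent."""
    match = EXIT_STRING.search(output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def describe_status_bits(bits):
    """List the meanings of the bits that are set.

    ASSUMPTION: the OR-ed status bits use the same values as the exit codes.
    """
    if bits == 0:
        return [EXIT_MEANINGS[0]]
    return [text for bit, text in EXIT_MEANINGS.items() if bit and bits & bit]
```

For the string "Exit status 0:0:8", parse_exit_status returns (0, 0, 8), and describe_status_bits(8) reports unrepaired damage.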
Security
You must have root authority to run the mmfsck command.
The node on which the command is issued must be able to execute remote shell commands on any other node in the cluster without the use of a password and without producing any extraneous messages. For more information, see the topic Requirements for administering a GPFS file system in the IBM Spectrum Scale: Administration and Programming Reference.
Examples
- To run the mmfsck command on the fs1 file system, receive a report, but not fix inconsistencies, issue this command:
    mmfsck fs1 -n
The system displays information similar to:
    Checking "fs1"
    Checking reserved files
    Checking inodes
    Checking inode map file
    Checking ACL file records
    Checking directories and files
    Checking log files
    Checking extended attributes file
    Checking allocation summary file
    Checking policy file
    Checking metadata of filesets
    Checking file reference counts
    Checking file system replication status

       500224 inodes
           39   allocated
            0   repairable
            0   repaired
            0   damaged
            0   deallocated
            0   orphaned
            0   attached
            0   corrupt ACL references

     13107200 subblocks
       142696   allocated
            0   unreferenced
            0   duplicates
            0   deletable
            0   deallocated

         4209 addresses
            0   suspended
            0   duplicates

            0 reserved file holes found
            0 reserved file holes repaired
    File system is clean.
mmfsck found no inconsistencies in this file system.
- To run the mmfsck command on the fs2 file system, receive a report, and fix inconsistencies, issue this command:
    mmfsck fs2 -y
The system displays information similar to:
    Checking "fs2"
    Checking inodes
    Lost blocks were found.
    Correct the allocation map? yes
    Checking inode map file
    Corrections are needed in the inode allocation map.
    Correct the allocation map? yes
    Root inode 32512 of fileset 'fset2' has been deleted.
    Delete the inode reference from fileset metadata? yes
    Checking directories and files
    Error in directory inode 3: DirEntryBad DirLinkCountBad
    Directory entry "top_dir" is not an allocated inode.
    Patching will delete the directory entry.
    Remove directory entry? yes
    Directory entry "fset2" is not an allocated inode.
    Patching will delete the directory entry.
    Remove directory entry? yes
    Directory has an incorrect link count of 4.
    Corrected link count would be 2
    Correct link count? yes
    Error in directory inode 12032: DirEntryBad
    Directory entry ".." is not an allocated inode.
    Cannot allow deletion of this directory entry.
    Error in directory inode 12034: DirEntryBad BadFilesetId
    Directory entry ".." has a fileset id that does not match fileset id of the directory.
    Patching will reset fileset id of inode 32512 to the fileset id of the directory, 1
    Correct fileset id? yes
    Directory entry ".." is not an allocated inode.
    Cannot allow deletion of this directory entry.
    Checking log files
    Checking extended attributes file
    Checking allocation summary file
    Checking policy file
    Checking filesets metadata
    Root directory of fileset 'fset2' (inode -1) is invalid
    Recreate fileset root inode and directory? yes
    Checking file reference counts
    Directory inode 12032 is not referenced in any directory.
    Reattach inode to lost+found? yes
    Directory inode 12034 is not referenced in any directory.
    Reattach inode to lost+found? yes
    Checking file system replication status

     10585856 inodes
          369   allocated
           42   repairable
           42   repaired
            0   damaged
            0   deallocated
            0   orphaned
            0   attached
            0   corrupt ACL references

     89391104 subblocks
       661908   allocated
          262   unreferenced
            0   duplicates
            0   deletable
          262   deallocated

        20464 addresses
            0   suspended
            0   duplicates

            0 reserved file holes found
            0 reserved file holes repaired
    File system is clean.
- To run the mmfsck command on the FSchk file system, and create a patch file called path-towrite-patchfile that will store information about the file system inconsistencies, issue this command:
    mmfsck FSchk -nv --patch-file path-towrite-patchfile
The system displays information similar to:
    Creating patch file "path-towrite-patchfile" on node "Node3"
    Checking "FSchk"
    fsckFlags 0x8000009
    Stripe group manager <c0n3>
    needNewLogs 0
    ...
    Checking inode map segment 0 of 1
    Inode 329478 not in use but marked (0x0).
    Corrections are needed in the inode allocation map.
    Correct the allocation map? No
    Checking inode map segment 1 of 1
    Checking inode map for inode range 388608 to 518143
    Checking inode map segment 0 of 1
    Checking inode map segment 1 of 1
    Checking inode map for inode range 518144 to 624383
    Checking inode map segment 0 of 1
    Checking inode map segment 1 of 1
    Inode 527110 not in use but marked (0x0).
    2 inodes are not in use but marked.
    Error in directory inode 329477: DirEntryBad DirLinkCountBad
    Directory entry "dir2" is not an allocated inode.
    Patching will delete the directory entry.
    Remove directory entry? No
    Directory has an incorrect link count of 3.
    Corrected link count would be 2
    Correct link count? No
    Error in directory inode 527104: DirEntryBad DirLinkCountBad
    Directory entry "dir_2" is not an allocated inode.
    Patching will delete the directory entry.
    Remove directory entry? No
    Directory has an incorrect link count of 3.
    Corrected link count would be 2
    Correct link count? No
    ...
    Error in inode 527109: SubblocksBad
    Inode 527109 has an incorrect subblock count of 40.
    Corrected subblock count would be 32.
    Correct count? No
    Error in inode 329474: SubblocksBad
    Inode 329474 has an incorrect subblock count of 37.
    Corrected subblock count would be 32.
    Correct count? No
    Error in inode 329480: RepCountBad
    Inode 329480 has an incorrect current metadata replica count of 3.
    Correct replication count? No
    ...
    Scanning directories for cycle
    Checking log files
    Checking extended attributes file
    Checking allocation summary file
    Checking policy file
    Checking metadata of filesets
    Checking file reference counts
    ...

       624384 inodes
           67   allocated
            5   repairable
            0   repaired
            0   damaged
            0   deallocated
            5   orphaned
            0   attached
            0   corrupt ACL references

     22347776 subblocks
       191237   allocated
            0   unreferenced
            0   duplicates
            0   deletable
            0   deallocated

         5860 addresses
            0   suspended
            0   duplicates

            0 reserved file holes found
            0 reserved file holes repaired

    InodeProblemList: 5 entries
    iNum          snapId     status keep delete noScan new error
    ------------- ---------- ------ ---- ------ ------ --- ------------------
           527109          0      1    0      0      0   0 0x00080000 SubblocksBad
           329474          0      1    0      0      0   0 0x00080000 SubblocksBad
           329480          0      1    0      0      0   0 0x00000800 RepCountBad
           329477          0      1    0      0      0   0 0x00009000 DirEntryBad DirLinkCountBad
           527104          0      1    0      0      0   0 0x00009000 DirEntryBad DirLinkCountBad

    File system contains unrepaired damage.
    Exit status 0:0:8.
    Patch file written to "Node3:path-towrite-patchfile" with 13 patch entries.
    mmfsck: 6027-1639 Command failed. Examine previous error messages to determine cause.
To use the information that was stored in the patch file (path-towrite-patchfile) to repair the file system, issue the following command:
    mmfsck FSchk -v --patch-file path-towrite-patchfile --patch
The system displays information similar to:
    Checking "FSchk"
    fsckFlags 0x18000008
    Stripe group manager <c0n3>
    needNewLogs 0
    nThreads 16
    commited nodes 0
    clientTerm 0
    fsckReady 1
    fsckCreated 0
    % pool allowed 50
    checkFilesets 1
    checkFilesetsV2 1
    Checking patch file
    Scanning patch file for allocation map patches
    Patching inode_map block 7 in inode 2 snap 0
    Patching segment 234 index 6 from 0 to 3
    Patching inode_map block 8 in inode 2 snap 0
    Patching segment 259 index 6 from 0 to 3
    Completed patching 2 allocation map patches.
    52 % complete on Thu Apr 30 08:03:24 2015
    Scanning patch file for inode patches
    Patching inode block 2574 in inode 0 snap 0
    Patching record 329474 field inode_last_block_subblocks from 37 to 32
    Patching record 329480 field inode_curr_meta_replicas from 3 to 2
    Patching inode block 4118 in inode 0 snap 0
    Patching record 527109 field inode_last_block_subblocks from 40 to 32
    Patching directory block 0 in inode 527104 snap 0
    Deleting directory entry dir_2 at offset 352
    Patching inode block 4118 in inode 0 snap 0
    Patching record 527104 field inode_num_links from 3 to 2
    Patching directory block 0 in inode 329477 snap 0
    Deleting directory entry dir2 at offset 64
    Patching inode block 2574 in inode 0 snap 0
    Patching record 329477 field inode_num_links from 3 to 2
    Orphaning inode 329483 snap 0
    Orphaning inode 329484 snap 0
    Orphaning inode 329485 snap 0
    Orphaning inode 329486 snap 0
    Orphaning inode 527113 snap 0
    Completed patching 12 inode patches.
    100 % complete on Thu Apr 30 08:03:24 2015
    File system is clean.
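When mmfsck output is saved to a file, as suggested for the -n option, the counters in the summary can be extracted for reporting. The following sketch parses them into a nested dictionary. It assumes that each counter appears on its own line as a number followed by a label, with inodes, subblocks, and addresses acting as section headings; the real layout may differ between releases. The function name parse_summary is hypothetical, and the sample text reuses figures from the first example.

```python
import re

# Labels that open a new section in the summary (assumed layout).
SECTION_LABELS = {"inodes", "subblocks", "addresses"}

# A counter line: optional indent, a number, then a label that starts with a letter.
COUNT_LINE = re.compile(r"^\s*(\d+)\s+([A-Za-z].*?)\s*$")

SAMPLE = """\
Checking file system replication status
   500224 inodes
       39   allocated
        0   repairable
        0   damaged
 13107200 subblocks
   142696   allocated
        0   unreferenced
     4209 addresses
        0   suspended
        0 reserved file holes found
File system is clean.
"""


def parse_summary(output):
    """Return {section: {"total": n, label: n, ...}} for the summary counters."""
    summary = {}
    section = None
    for line in output.splitlines():
        match = COUNT_LINE.match(line)
        if match is None:
            continue  # progress messages, prompts, and table rows are skipped
        count, label = int(match.group(1)), match.group(2)
        if label in SECTION_LABELS:
            section = label
            summary[section] = {"total": count}
        elif section is not None:
            # Counters are filed under the most recent section heading, so the
            # "reserved file holes" lines land under "addresses" in this sketch.
            summary[section][label] = count
    return summary


def is_clean(output):
    """Return True if the output contains the final clean message."""
    return "File system is clean." in output
```

Grouping by section matters because labels such as allocated and duplicates occur in more than one section.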