Checking and repairing a file system

The mmfsck command finds and repairs conditions that can cause problems in your file system. The mmfsck command operates in two modes: online and offline.

The online mode operates on a mounted file system and is selected by specifying the -o option. Conversely, the offline mode operates on an unmounted file system. In general, it is unnecessary to run mmfsck in offline mode unless the IBM® Support Center directs you to do so.

On a mounted file system, the online mode checks for and recovers only blocks that are marked allocated but do not belong to any file. If a GPFS™ file operation fails due to an out-of-space condition, the cause might be disk blocks that have become unavailable after repeated node failures. The corrective action is to mark each such block free in the allocation map. Any other inconsistencies found are only reported, not repaired.
Note:
  1. If you are running the online mmfsck command to free allocated blocks that do not belong to any files, plan to make file system repairs when system demand is low. This is an I/O-intensive activity that can affect system performance.
  2. If you are repairing a file system after a node failure and the file system has quotas enabled, it is suggested that you run the mmcheckquota command to re-create the quota files.

To repair any other inconsistencies, you must run the offline mode of the mmfsck command on an unmounted file system. The offline mode checks for the following inconsistencies that might cause problems:
  • Blocks marked allocated that do not belong to any file. The corrective action is to mark the block free in the allocation map.
  • Files and directories for which an inode is allocated but no directory entry exists, known as orphaned files. The corrective action is to create directory entries for these files in a lost+found subdirectory in the root directory of the fileset to which the file or directory belongs. Each entry is named with the inode number of the file. A fileset is a subtree of a file system namespace that in many respects behaves like an independent file system. If you do not allow the mmfsck command to reattach an orphaned file, it asks for permission to delete the file.
  • Directory entries pointing to an inode that is not allocated. The corrective action is to remove the directory entry.
  • Incorrectly formed directory entries. Each entry in a directory file contains the inode number and the generation number of the file to which it refers. When the generation number in the directory entry does not match the generation number stored in the file's inode, the corrective action is to remove the directory entry.
  • Incorrect link counts on files and directories. The corrective action is to update them with accurate counts.
  • Policy files that are not valid. The corrective action is to delete the file.
  • Various problems related to filesets: missing or corrupted fileset metadata, inconsistencies in directory structure related to filesets, a missing or corrupted fileset root directory, and other problems in internal data structures. Repaired filesets are renamed as Fileset FilesetId and put into the unlinked state.
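
The directory-related checks in the list above can be illustrated with a small toy model. The following Python sketch is not how GPFS implements mmfsck; the Inode and DirEntry classes, the check function, and the action names are all hypothetical, and link counts are simplified to the number of valid directory entries that refer to an inode.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    number: int
    generation: int
    link_count: int


@dataclass
class DirEntry:
    name: str
    inode_number: int
    generation: int


def check(inodes, entries):
    """Return (action, detail) pairs for a toy inode table and directory."""
    by_number = {inode.number: inode for inode in inodes}
    actions = []
    valid_refs = {}  # inode number -> count of valid directory entries

    for entry in entries:
        inode = by_number.get(entry.inode_number)
        if inode is None:
            # The entry points to an inode that is not allocated.
            actions.append(("remove_entry", entry.name))
        elif inode.generation != entry.generation:
            # Generation numbers disagree: incorrectly formed entry.
            actions.append(("remove_entry", entry.name))
        else:
            valid_refs[inode.number] = valid_refs.get(inode.number, 0) + 1

    for inode in inodes:
        refs = valid_refs.get(inode.number, 0)
        if refs == 0:
            # Allocated inode with no directory entry: an orphan,
            # reattached under its inode number.
            actions.append(("reattach_to_lost+found", str(inode.number)))
        elif refs != inode.link_count:
            detail = f"{inode.number}: {inode.link_count} -> {refs}"
            actions.append(("fix_link_count", detail))
    return actions
```

In this model, an entry removed because of a generation mismatch can leave its target inode without any valid entry, in which case the inode is then reported as an orphan.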

The mmfsck command performs other functions not listed here, as deemed necessary by GPFS.

The --patch-file parameter of the mmfsck command can be used to generate a report of file system inconsistencies. The following is an example of a patch file that is generated by mmfsck for a file system with a bad directory inode:
gpfs_fsck

<header>
  sgid = "C0A87ADC:5555C87F"
  disk_data_version = 1
  fs_name = "gpfsh0"
  #patch_file_version = 1
  #start_time = "Fri May 15 16:32:58 2015"
  #fs_manager_node = "h0"
  #fsck_flags = 150994957
</header>

<patch_inode>
  patch_type = "dealloc"
  snapshot_id = 0
  inode_number = 50432
</patch_inode>

<patch_block>
  snapshot_id = 0
  inode_number = 3
  block_num = 0
  indirection_level = 0
  generation_number = 1
  is_clone = false
  is_directory_block = true
  rebuild_block = false
  #num_patches = 1

  <patch_dir>
    entry_offset = 48
    entry_fold_value = 306661480
    delete_entry = true
  </patch_dir>
</patch_block>

<patch_block>
  snapshot_id = 0
  inode_number = 0
  block_num = 0
  indirection_level = 0
  generation_number = 4294967295
  is_clone = false
  is_directory_block = false
  rebuild_block = false
  #num_patches = 1

  <patch_field>
    record_number = 3
    field_id = "inode_num_links"
    new_value = 2
    old_value = 3
  </patch_field>
</patch_block>

<patch_inode>
  patch_type = "orphan"
  snapshot_id = 0
  inode_number = 50433
</patch_inode>

<footer>
  #stop_time = "Fri May 15 16:33:06 2015"
  #num_sections = 203
  #fsck_exit_status = 8
  need_full_fsck_scan = false
</footer>
The mmfsck command can be run with both the --patch-file and --patch parameters to repair a file system by using the information stored in the patch file. Using a patch file avoids a second scan of the file system before the repair actions begin.
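
To review a patch file before applying it, it can help to load the file into a simple data structure. The following Python sketch parses text laid out like the example above. The grammar is inferred only from that example and is an assumption, not a documented format: a gpfs_fsck marker line, sections delimited by paired tags that can nest, and key = value lines in which a leading # is treated as marking an informational field. The function name parse_patch_file is hypothetical.

```python
import re

# key = value, optionally prefixed with '#' (assumed to mark informational fields)
_KV = re.compile(r"^(#?)(\w+)\s*=\s*(.*)$")


def _convert(raw):
    """Convert a raw value to a string, boolean, or integer when possible."""
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        return raw[1:-1]
    if raw in ("true", "false"):
        return raw == "true"
    try:
        return int(raw)
    except ValueError:
        return raw


def parse_patch_file(text):
    """Parse patch-file text into a list of nested section dictionaries."""
    root = {"tag": None, "fields": {}, "children": []}
    stack = [root]
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "gpfs_fsck":
            continue
        if line.startswith("</"):
            tag = line[2:-1]
            if len(stack) == 1 or stack[-1]["tag"] != tag:
                raise ValueError(f"unexpected closing tag: {line}")
            stack.pop()
        elif line.startswith("<"):
            section = {"tag": line[1:-1], "fields": {}, "children": []}
            stack[-1]["children"].append(section)
            stack.append(section)
        else:
            match = _KV.match(line)
            if match is None or len(stack) == 1:
                raise ValueError(f"unrecognized line: {line}")
            _, key, raw = match.groups()
            stack[-1]["fields"][key] = _convert(raw.strip())
    if len(stack) != 1:
        raise ValueError("unclosed section: " + stack[-1]["tag"])
    return root["children"]
```

Each returned section is a dictionary with the keys tag, fields, and children, so a reviewer could, for example, list every patch_inode section and its patch_type before deciding whether to run the repair.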

You cannot run the mmfsck command on a file system that has disks in a down state. You must first run the mmchdisk command to change the state of the disks to unrecovered or up. To display the status of the disks in the file system, issue the mmlsdisk command.

To check the file system fs1 without making any changes to the file system, issue the following command:
mmfsck fs1
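
For scripted use, the invocations described in this topic can be assembled programmatically. The following Python sketch builds an argument list by using only the options named above (-o, --patch-file, and --patch). The helper name mmfsck_argv is hypothetical, and two details are assumptions to verify against the mmfsck command description: that --patch-file is followed by the file path as a separate argument, and that the options can appear in this order after the device name.

```python
def mmfsck_argv(device, online=False, patch_file=None, apply_patch=False):
    """Build an mmfsck argument list from the options named in this topic."""
    if apply_patch and patch_file is None:
        # --patch is described only in combination with --patch-file.
        raise ValueError("--patch requires --patch-file")
    argv = ["mmfsck", device]
    if online:
        argv.append("-o")
    if patch_file is not None:
        # Assumption: the path follows the option as a separate argument.
        argv += ["--patch-file", patch_file]
    if apply_patch:
        argv.append("--patch")
    return argv
```

The resulting list can be passed to subprocess.run on a node where GPFS is installed. Building a list rather than a single string avoids shell quoting problems with file paths.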

For complete usage information, see the descriptions of the mmchdisk, mmcheckquota, mmfsck, and mmlsdisk commands.