Creating a master GPFS log file

The GPFS™ log frequently shows problems on one node that actually originated on another node.

GPFS is a file system that runs on multiple nodes of a cluster. This means that problems originating on one node of a cluster often have effects that are visible on other nodes. When you are tracing such a problem, it is often valuable to merge the GPFS logs from all of the nodes. Accurate time stamps make it possible to reconstruct the sequence of events.

Before you follow any of the debug steps, IBM® suggests that you do the following:
  1. Synchronize the clocks of all nodes in the GPFS cluster. If the clocks on different nodes are out of sync, there is no way to establish the real timeline of events occurring on multiple nodes, and a merged error log is less useful for determining the origin of a problem and tracking its effects.
  2. Merge and chronologically sort all of the GPFS log entries from each node in the cluster. The --gather-logs option of the gpfs.snap command can be used to achieve this:
    gpfs.snap --gather-logs -d /tmp/logs -N all
    The system displays information similar to:
    gpfs.snap: Gathering mmfs logs ...
    gpfs.snap: The sorted and unsorted mmfs.log files are in /tmp/logs

    If the --gather-logs option is not available on your system, you can create your own script to achieve the same task; use /usr/lpp/mmfs/samples/gatherlogs.samples.sh as an example.
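
If you write your own script, the merge step in item 2 can be sketched roughly as follows. This is a minimal illustration, not the logic of gpfs.snap or of the sample script. It assumes that the log from each node has already been copied to a single directory as <node>.log (a hypothetical layout), that every entry starts with a sortable ISO-style time stamp such as 2024-03-01_10:00:01.000+0000:, and that all nodes log with the same time zone offset. The function name merge_gpfs_logs is made up for this example.

```shell
#!/bin/sh
# Sketch only: merge per-node GPFS logs into one chronologically sorted stream.
# Assumptions (not taken from the GPFS documentation):
#   - $1 is a directory holding one file per node, named <node>.log
#   - each line begins with an ISO-style time stamp that sorts lexically
#   - all nodes use the same time zone offset and have synchronized clocks
merge_gpfs_logs() {
    log_dir=$1
    for f in "$log_dir"/*.log; do
        [ -f "$f" ] || continue
        node=$(basename "$f" .log)
        # Insert the node name after the time stamp so that the time stamp
        # stays in the first field and each entry still shows its origin.
        awk -v node="$node" '{
            i = index($0, " ")
            if (i == 0) print $0 " " node ":"
            else print substr($0, 1, i - 1) " " node ":" substr($0, i)
        }' "$f"
    done | LC_ALL=C sort -k1,1   # order all entries by the time stamp field
}
```

For example, merge_gpfs_logs /tmp/collected > /tmp/master-gpfs.log writes the merged log to a file outside the input directory. Logs that use a different date format would first need their time stamps converted to a sortable key, and continuation lines that carry no time stamp are not handled by this sketch.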