IBM Support

How to determine which corrupt JFS file system caused an AIX crash

Troubleshooting


Problem

Customers who use the legacy AIX file system, JFS, can experience a system crash caused by file system corruption.

Environment

Most customers have migrated from JFS to JFS2 on their AIX systems, because JFS2 offers many advantages not featured in JFS, but some customers continue to use the legacy file system type. The purpose of this document is to help identify which JFS file system caused the server to crash. This procedure applies only to JFS.
JFS file systems are supported on all versions of AIX from 3.2 through 7.2.

Diagnosing The Problem

A typical stack trace for this type of crash dump:
CRASH INFORMATION:
CPU 4 CSA F1000815B016ED00 at time of crash, error code for LEDs: 70000000
pvthread+049700 STACK:
[002EE738]v_jfscorruption+000078 (0000000000000096, FFFFF0000800000B,
   F1000A00204767C0, 0000000000000000, 0000000000000000 [??])
[002FA194]v_findiblk+000514 (??, ??, ??)
[00178510]v_allocdisk+000530 (??, ??, ??, ??, ??)
[001808C8]v_fpagein+0000C8 (??, ??, ??, ??, ??, ??)
[001833B4]v_pagein+000094 (??, ??, ??, ??, ??, ??)
[002E5780]pfget+000940 (??, ??, ??, ??, ??, ??, ??)
[0030BD28]v_pfget+000848 (??, ??, ??, ??, ??, ??)
[001D2480]state_save_ret+000578 ()
____ Exception (F000000030019600) ____
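The presence of v_jfscorruption in the stack trace indicates that this crash was due to JFS file system corruption. To determine which file system is involved, display the first four words of jfserrlog in kdb and pass the last word (00000096 here, which also appears as the first argument to v_jfscorruption in the stack trace) to the pdt subcommand: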
(4)> dw @jfserrlog 4
_jfsdata+000000: 4A465345 5E418093 00000019 00000096  JFSE^A..........

(4)> pdt 96

PDT address F1000F00093087F0 entry 0096 of 1FFF, type: FILESYSTEM
eye catch             (_eyec)   : 706474564D4D0096
next pdt on i/o list  (nextio)  : FFFFFFFF
dev_t or strategy ptr (device)  : 8000003C00000001   >> 60,1
...
 
The device field (dev_t or strategy pointer) gives the major and minor numbers of the logical volume on which the file system resides. Here they are 0x3C and 0x1, which is 60,1 in decimal.
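The hexadecimal-to-decimal conversion can be sketched in POSIX shell. This is a minimal illustration, not an IBM-supplied tool: the function names decode_devt64 and decode_devt32 are hypothetical, and the bit layout (major number in the upper half with a high flag bit set in the 64-bit form, minor number in the lower half) is an assumption inferred from the values shown in this document, namely 8000003C00000001 from pdt and the shorter form 003C0001 that error report entries print.

```shell
# Decode a 64-bit dev_t as printed by the kdb pdt subcommand.
# Assumption: upper 32 bits = major number plus a high flag bit,
#             lower 32 bits = minor number.
decode_devt64() {
    high=$(printf '%s' "$1" | cut -c1-8)
    low=$(printf '%s' "$1" | cut -c9-16)
    # Mask off the top flag bit before printing the major number.
    echo "$(( 0x$high & 0x7FFFFFFF )),$(( 0x$low ))"
}

# Decode the 32-bit form printed in error report entries.
# Assumption: upper 16 bits = major number, lower 16 bits = minor number.
decode_devt32() {
    high=$(printf '%s' "$1" | cut -c1-4)
    low=$(printf '%s' "$1" | cut -c5-8)
    echo "$(( 0x$high )),$(( 0x$low ))"
}

decode_devt64 8000003C00000001   # prints 60,1
decode_devt32 003C0001           # prints 60,1
```

Both forms decode to the same major,minor pair, which is the value used in the lookups that follow.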
The error report may also show JFS corruption messages:
Feb 10 10:13:21 SYSPFS     U JFS_META_CORRUPTION Filename v_mapsubs.c Device 003C0001 (60,1)
Feb 10 10:14:13 errdemon   T ERRLOG_ON           ERROR LOGGING TURNED ON
Feb 10 08:30:10 SYSPFS     I JFS_FS_FRAGMENTED   Filesystem /dev/cicstnglv, /var/cics_regions/anstng Device 003C0001 (60,1)
Feb 10 08:13:56 SYSPFS     I JFS_FS_FRAGMENTED   Filesystem /dev/cicstnglv, /var/cics_regions/anstng Device 003C0001 (60,1)
Older versions of AIX do not identify the file system in the error report. In that case, look up the major and minor numbers in the CuDvDr.add ODM file collected with an AIX snap to find the underlying logical volume for the file system:
$ grep -p 'value1 = "60"' CuDvDr.add | grep -p 'value2 = "1"'
CuDvDr:
        resource = "devno"
        value1 = "60"
        value2 = "1"
        value3 = "cicstnglv"
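
The -p (paragraph) flag of grep is specific to AIX, so the command above may not work when the snap is examined on another platform. As a rough alternative, the same lookup can be sketched with awk in paragraph mode. The function name lookup_lv is hypothetical, and the sketch assumes that stanzas in CuDvDr.add are separated by blank lines and use the field layout shown above.

```shell
# Print the logical volume name (value3) for a given major and minor number.
# Arguments: major, minor, path to CuDvDr.add
lookup_lv() {
    awk -v maj="$1" -v min="$2" '
        BEGIN { RS = ""; FS = "\n" }      # one blank-line-separated stanza per record
        {
            res = ""; v1 = ""; v2 = ""; v3 = ""
            for (i = 1; i <= NF; i++) {
                line = $i
                gsub(/[ \t"]/, "", line)  # strip whitespace and double quotes
                split(line, kv, "=")
                if (kv[1] == "resource") res = kv[2]
                if (kv[1] == "value1")   v1  = kv[2]
                if (kv[1] == "value2")   v2  = kv[2]
                if (kv[1] == "value3")   v3  = kv[2]
            }
            # Only devno stanzas map a major,minor pair to a device name.
            if (res == "devno" && v1 == maj && v2 == min) print v3
        }' "$3"
}

# Example: lookup_lv 60 1 CuDvDr.add
```

With the stanza shown above, the example call would print cicstnglv.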

Resolving The Problem

Once the corrupt file system has been identified, unmount it and run fsck to clean it.
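
As an illustration only, the sequence can be written out for the example file system from the error report above. The helper plan_jfs_repair is hypothetical and deliberately prints the commands instead of running them, so that they can be reviewed before anything is unmounted. It assumes the file system has an /etc/filesystems entry, so that mount can be given the mount point alone.

```shell
# Print (do not run) the repair sequence for a JFS file system.
# Arguments: logical volume name, mount point.
plan_jfs_repair() {
    lv=$1
    mnt=$2
    echo "umount $mnt"      # the file system must be unmounted before the check
    echo "fsck /dev/$lv"    # interactive check and repair of the logical volume
    echo "mount $mnt"       # remount once fsck reports the file system clean
}

plan_jfs_repair cicstnglv /var/cics_regions/anstng
```

fsck also accepts flags such as -y or -p to run without prompting. Whether to use them depends on how much of the repair the administrator wants to confirm by hand.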

Document Location

Worldwide


Document Information

Modified date:
20 February 2020

UID

ibm13040989