APAR status
Closed as program error.
Error description
ABSTRACT: Race between InodeLkObj::change_token(rs, CTM_A_LITE) and token revoke causes: logAssertFailed: ofP->isInodeValid()
When a file is accessed from multiple nodes, GPFS may hit an assert due to a race between change_token and token revoke:
[X] logAssertFailed: ofP->isInodeValid()
[X] return code 0, reason code 0, log record tag 0
[I] Freezing overwrite mode tracing to preserve failure data
[X] *** Assert exp(ofP->isInodeValid()) in line 447 of file /project/spreltac505/build/rtac505s001a/src/avs/fs/mmfs/ts/fs/mnode.C
[E] *** Traceback:
[E] 2:0x558BB19032F8 logAssertFailed + 0x418 at ??:0
[E] 3:0x558BB16EEE92 FileMetadata::mnUpdateInode(Inode*) + 0x892 at ??:0
[E] 4:0x558BB16F7F0A InodeLkObj::change_token(CacheObj*, LkObj*, LkObj::LockModeEnum, int, LkObj::LockModeEnum*) + 0x51A at ??:0
[E] 5:0x558BB1404F0E LkObj::change_lock_shark_m(CacheObj*, LkObj::LockModeEnum, LkObj::LockModeEnum, LkObj::LockModeEnum*, int, int) + 0x7BE at ??:0
[E] 6:0x558BB1405497 LkObj::lock_shark_m(CacheObj*, LkObj::LockModeEnum, LkObj::LockModeEnum*, int, int) + 0x17 at ??:0
[E] 7:0x558BB16954EB FileHashTab::fetch(CacheObj*, unsigned short, LkObj::LockModeEnum, int, void*, int) + 0x25B at ??:0
[E] 8:0x558BB13F7A03 HandleMBHashFetch(MBHashFetchParms*) + 0x153 at ??:0
[E] 9:0x558BB13F3BA3 Mailbox::msgHandlerBody(void*) + 0x363 at ??:0
[E] 10:0x558BB13D7313 Thread::callBody(Thread*) + 0x63 at ??:0
[E] 11:0x558BB13C4262 Thread::callBodyWrapper(Thread*) + 0xA2 at ??:0
[E] 12:0x7F8E37070DD5 start_thread + 0xC5 at ??:0
[E] 13:0x7F8E3612302D __clone + 0x6D at ??:0
mmfsd: /project/spreltac505/build/rtac505s001a/src/avs/fs/mmfs/ts/fs/mnode.C:447: void logAssertFailed(UInt32, const char*, UInt32, Int32, Int32, UInt32, const char*, const char*): Assertion 'ofP->isInodeValid()' failed.
[E] Signal 6 at location 0x7F8E3605B2C7 in process 324632, link reg 0xFFFFFFFFFFFFFFFF.
[I] rax 0x0000000000000000 rbx 0x00007F8E376A3000
[I] rcx 0xFFFFFFFFFFFFFFFF rdx 0x0000000000000006
[I] rsp 0x00007F8E2C787CD8 rbp 0x00007F8E361AF020
[I] rsi 0x000000000004F504 rdi 0x000000000004F418
[I] r8 0x0000000000000001 r9 0xFEFEFF092D63646B
[I] r10 0x0000000000000008 r11 0x0000000000000206
[I] r12 0x0000558BB2597409 r13 0x0000558BB26254E0
[I] r14 0x0000558BB25CBEE8 r15 0x00000000000001BF
[I] rip 0x00007F8E3605B2C7 eflags 0x0000000000000206
[I] csgsfs 0x0000000000000033 err 0x0000000000000000
[I] trapno 0x0000000000000000 oldmsk 0x0000000010017807
[I] cr2 0x0000000000000000
Reported In: Spectrum Scale 5.0.5.1
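To check whether a daemon log shows the same failure, the assert text and the two distinctive traceback frames from the excerpt above can be matched mechanically. The following is a minimal sketch, not an IBM-provided tool: the function name matches_apar_signature and the regular expression are assumptions made for illustration, and the expression is derived only from the frame layout visible in this record.

```python
import re

# Assert text copied from the log excerpt in this APAR.
ASSERT_SIGNATURE = "logAssertFailed: ofP->isInodeValid()"

# Frame layout seen in the excerpt, for example:
#   [E] 3:0x558BB16EEE92 FileMetadata::mnUpdateInode(Inode*) + 0x892 at ??:0
# Group 1 is the frame number, group 2 is the symbol.
FRAME_RE = re.compile(
    r"\[E\]\s+(\d+):0x[0-9A-Fa-f]+\s+(.+?)\s+\+\s+0x[0-9A-Fa-f]+\s+at\s"
)


def matches_apar_signature(log_text):
    """Return True if log_text contains the assert plus both key frames.

    Hypothetical helper: it only compares text against the excerpt in this
    record and does not prove that the underlying race is the same.
    """
    if ASSERT_SIGNATURE not in log_text:
        return False
    symbols = [m.group(2) for m in FRAME_RE.finditer(log_text)]
    has_update = any("FileMetadata::mnUpdateInode" in s for s in symbols)
    has_change = any("InodeLkObj::change_token" in s for s in symbols)
    return has_update and has_change
```

A caller would read the daemon log into a string and pass it in. The log is commonly found at /var/adm/ras/mmfs.log.latest, but that path is an assumption here because this record does not state it.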
Local fix
Problem summary
logAssertFailed: ofP->isInodeValid() is raised in mnUpdateInode during stat() or gpfs_statlite() calls.
Problem conclusion
Benefits of the solution: No more assert
Work Around: None
Problem trigger: A file is actively written on one node while another node repeatedly calls stat() or gpfs_statlite() on it
Symptom: Abend/Crash
Platforms affected: ALL Operating System environments
Functional Area affected: All Scale Users
Customer Impact: High Importance
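The access pattern named in the problem trigger can be sketched as two small routines, one per node. This is an illustration of the pattern only, under stated assumptions: the function names append_records and poll_stat are hypothetical, Python's os.stat stands in for the stat() case (gpfs_statlite() is a GPFS C API and is not called here), and nothing in this record confirms that this sketch reproduces the assert. Running it against a file system at an affected level could bring down mmfsd, so it belongs on a test cluster only.

```python
import os
import time


def append_records(path, count, payload=b"x" * 4096, delay=0.0):
    """Writer node: append `count` records and sync each one.

    Returns the number of bytes appended by this call.
    """
    with open(path, "ab") as f:
        for _ in range(count):
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # push the update out so the file keeps changing
            if delay:
                time.sleep(delay)
    return count * len(payload)


def poll_stat(path, count, delay=0.0):
    """Reader node: call stat() `count` times and return the sizes observed."""
    sizes = []
    for _ in range(count):
        sizes.append(os.stat(path).st_size)
        if delay:
            time.sleep(delay)
    return sizes
```

In a two-node test, append_records would run on one node and poll_stat on another against the same path in the shared file system, with both loops active at the same time.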
Temporary fix
Comments
APAR Information
APAR number
IJ29209
Reported component name
SPEC SCALE ADV
Reported component ID
5737F35AP
Reported release
505
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-11-10
Closed date
2021-01-04
Last modified date
2021-01-04
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE ADV
Fixed component ID
5737F35AP
Applicable component levels
Product: IBM Spectrum Scale (STXKQY)
Version: 505
Platform: Platform Independent (PF025)
Business Unit: IBM Infrastructure w/TPS (BU058)
Line of Business: Storage (LOB26)
Document Information
Modified date:
06 January 2021