IBM Support

IJ29209: ASSERT EXP(OFP->ISINODEVALID())

APAR status

  • Closed as program error.

Error description

  • ABSTRACT:
    Race between InodeLkObj::change_token(rs, CTM_A_LITE)
    and token revoke causes:
    logAssertFailed: ofP->isInodeValid()
    
    
    When a file is accessed from multiple nodes, GPFS may hit an
    assert due to a race between change_token and token revoke:
     [X] logAssertFailed: ofP->isInodeValid()
     [X] return code 0, reason code 0, log record tag 0
     [I] Freezing overwrite mode tracing to preserve failure data
     [X] *** Assert exp(ofP->isInodeValid()) in line 447 of file /project/spreltac505/build/rtac505s001a/src/avs/fs/mmfs/ts/fs/mnode.C
     [E] *** Traceback:
     [E]         2:0x558BB19032F8 logAssertFailed + 0x418 at ??:0
     [E]         3:0x558BB16EEE92 FileMetadata::mnUpdateInode(Inode*) + 0x892 at ??:0
     [E]         4:0x558BB16F7F0A InodeLkObj::change_token(CacheObj*, LkObj*, LkObj::LockModeEnum, int, LkObj::LockModeEnum*) + 0x51A at ??:0
     [E]         5:0x558BB1404F0E LkObj::change_lock_shark_m(CacheObj*, LkObj::LockModeEnum, LkObj::LockModeEnum, LkObj::LockModeEnum*, int, int) + 0x7BE at ??:0
     [E]         6:0x558BB1405497 LkObj::lock_shark_m(CacheObj*, LkObj::LockModeEnum, LkObj::LockModeEnum*, int, int) + 0x17 at ??:0
     [E]         7:0x558BB16954EB FileHashTab::fetch(CacheObj*, unsigned short, LkObj::LockModeEnum, int, void*, int) + 0x25B at ??:0
     [E]         8:0x558BB13F7A03 HandleMBHashFetch(MBHashFetchParms*) + 0x153 at ??:0
     [E]         9:0x558BB13F3BA3 Mailbox::msgHandlerBody(void*) + 0x363 at ??:0
     [E]         10:0x558BB13D7313 Thread::callBody(Thread*) + 0x63 at ??:0
     [E]         11:0x558BB13C4262 Thread::callBodyWrapper(Thread*) + 0xA2 at ??:0
     [E]         12:0x7F8E37070DD5 start_thread + 0xC5 at ??:0
     [E]         13:0x7F8E3612302D __clone + 0x6D at ??:0
    mmfsd: /project/spreltac505/build/rtac505s001a/src/avs/fs/mmfs/ts/fs/mnode.C:447: void logAssertFailed(UInt32, const char*, UInt32, Int32, Int32, UInt32, const char*, const char*): Assertion 'ofP->isInodeValid()' failed.
    [E] Signal 6 at location 0x7F8E3605B2C7 in process 324632, link reg 0xFFFFFFFFFFFFFFFF.
    [I] rax    0x0000000000000000  rbx    0x00007F8E376A3000
    [I] rcx    0xFFFFFFFFFFFFFFFF  rdx    0x0000000000000006
    [I] rsp    0x00007F8E2C787CD8  rbp    0x00007F8E361AF020
    [I] rsi    0x000000000004F504  rdi    0x000000000004F418
    [I] r8     0x0000000000000001  r9     0xFEFEFF092D63646B
    [I] r10    0x0000000000000008  r11    0x0000000000000206
    [I] r12    0x0000558BB2597409  r13    0x0000558BB26254E0
    [I] r14    0x0000558BB25CBEE8  r15    0x00000000000001BF
    [I] rip    0x00007F8E3605B2C7  eflags 0x0000000000000206
    [I] csgsfs 0x0000000000000033  err    0x0000000000000000
    [I] trapno 0x0000000000000000  oldmsk 0x0000000010017807
    [I] cr2    0x0000000000000000
    
    Reported In:
    Spectrum Scale 5.0.5.1
    

Local fix

Problem summary

  • The assert logAssertFailed: ofP->isInodeValid() fires in
    mnUpdateInode during a stat() or gpfs_statlite() call.
    

Problem conclusion

  • Benefits of the solution:
    The assert no longer occurs.
    
    Work Around:
    None
    
    Problem trigger:
    A file is actively written on one node while another
    node repeatedly calls stat() or gpfs_statlite() on it.
    
    Symptom:
    Abend/Crash
    
    Platforms affected:
    ALL Operating System environments
    
    Functional Area affected:
    All Scale Users
    
    Customer Impact:
    High Importance
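
The access pattern named under "Problem trigger" can be sketched as follows. This is only an illustration of the workload shape, not a confirmed reproducer: it runs both roles as threads on one host and uses plain stat(), whereas the APAR describes the writer and the stat() caller on different nodes of a cluster. The function names (write_loop, stat_loop, run_pattern) are hypothetical, and gpfs_statlite() is not exercised here.

```python
import os
import tempfile
import threading


def write_loop(path, stop, chunk=b"x" * 4096):
    """Writer role: append to the file until asked to stop."""
    with open(path, "ab", buffering=0) as f:
        while not stop.is_set():
            f.write(chunk)


def stat_loop(path, count):
    """Reader role: call stat() repeatedly and collect the sizes seen."""
    return [os.stat(path).st_size for _ in range(count)]


def run_pattern(path, stat_calls=1000):
    """Run the writer in the background while stat() is called in a loop."""
    open(path, "ab").close()  # make sure the file exists before the first stat()
    stop = threading.Event()
    writer = threading.Thread(target=write_loop, args=(path, stop))
    writer.start()
    try:
        return stat_loop(path, stat_calls)
    finally:
        stop.set()
        writer.join()


if __name__ == "__main__":
    # Assumption: on a real cluster, the path would be on the shared file
    # system and the two loops would run on different nodes.
    with tempfile.TemporaryDirectory() as d:
        sizes = run_pattern(os.path.join(d, "testfile"))
        print(len(sizes), "stat() calls, final size seen:", sizes[-1])
```

On a cluster, the same two loops would be started separately, one per node, against the same file path.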
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ29209

  • Reported component name

    SPEC SCALE ADV

  • Reported component ID

    5737F35AP

  • Reported release

    505

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-11-10

  • Closed date

    2021-01-04

  • Last modified date

    2021-01-04

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IJ30138

Fix information

  • Fixed component name

    SPEC SCALE ADV

  • Fixed component ID

    5737F35AP

Applicable component levels

  • Product: IBM Spectrum Scale (STXKQY)
  • Version: 505
  • Platform: Platform Independent (PF025)
  • Business Unit: IBM Infrastructure w/TPS (BU058)
  • Line of Business: Storage (LOB26)

Document Information

Modified date:
06 January 2021