APAR status
Closed as program error.
Error description
GPFS crash due to the following assertion failure:
2023-07-31_15:59:22.722-0500: [X] logAssertFailed: inode.fileSize.getval() == tinodeP->fileSize
2023-07-31_15:59:22.722-0500: [X] return code 3968, reason code 3744, log record tag 0
2023-07-31_15:59:23.900-0500: [X] *** Assert exp(inode.fileSize.getval() == tinodeP->fileSize) in line 5722 of file /project/sprelgpfs516/build/rgpfs516s001d/src/avs/fs/mmfs/ts/fs/metadata.C
2023-07-31_15:59:23.900-0500: [E] *** Traceback:
2023-07-31_15:59:23.900-0500: [E] 2:0x5557D3F5136A logAssertFailed + 0x3AA at ??:0
2023-07-31_15:59:23.900-0500: [E] 3:0x5557D3C883E4 FileMetadata::mergeInode(int) + 0xF74 at ??:0
2023-07-31_15:59:23.900-0500: [E] 4:0x5557D3D23DFA FileMetadata::mnbMergeInode(int, Inode*) + 0x12A at ??:0
2023-07-31_15:59:23.900-0500: [E] 5:0x5557D3D2C791 FileMetadata::MetaNodeBegin(int, Inode*) + 0x291 at ??:0
2023-07-31_15:59:23.900-0500: [E] 6:0x5557D3D1C828 MNodeToken::tryToBecomeMnode(LkObj::LockModeEnum, int, Inode*) + 0x588 at ??:0
2023-07-31_15:59:23.900-0500: [E] 7:0x5557D3D24DA6 InodeLkObj::change_token(CacheObj*, LkObj*, LkObj::LockModeEnum, int, LkObj::LockModeEnum*) + 0x156 at ??:0
2023-07-31_15:59:23.900-0500: [E] 8:0x5557D3A4354F LkObj::change_lock_shark_m(CacheObj*, LkObj::LockModeEnum, LkObj::LockModeEnum, LkObj::LockModeEnum*, int, int) + 0x4EF at ??:0
2023-07-31_15:59:23.900-0500: [E] 9:0x5557D3A440D5 LkObj::lock_shark_m(CacheObj*, LkObj::LockModeEnum, LkObj::LockModeEnum*, int, int) + 0x15 at ??:0
2023-07-31_15:59:23.900-0500: [E] 10:0x5557D3CAE158 LockFile(OpenFile**, StripeGroup*, FileUID, OperationLockMode, LkObj::LockModeEnum, LkObj::LockModeEnum*, LkObj::LockModeEnum*, int, int) + 0x2B8 at ??:0
2023-07-31_15:59:23.900-0500: [E] 11:0x5557D3BFF47B FSOperation::createLockedFile(StripeGroup*, FileUID, OperationLockMode, LkObj::LockModeEnum, OpenFile**, unsigned int*, int, int) + 0x9B at ??:0
2023-07-31_15:59:23.900-0500: [E] 12:0x5557D3B247A9 InodePrefetchInstance::doWork() + 0x769 at ??:0
2023-07-31_15:59:23.900-0500: [E] 13:0x5557D3B24E48 InodePrefetchInstance::WorkerThreadBody(void*) + 0x108 at ??:0
2023-07-31_15:59:23.900-0500: [E] 14:0x5557D3A1F662 Thread::callBody(Thread*) + 0x42 at ??:0
2023-07-31_15:59:23.900-0500: [E] 15:0x5557D3A0C680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0
2023-07-31_15:59:23.900-0500: [E] 16:0x7EFCD480A1CA start_thread + 0xEA at ??:0
2023-07-31_15:59:23.900-0500: [E] 17:0x7EFCD3529E73 __GI___clone + 0x43 at ??:0
Local fix
The GPFS daemon restarts itself after the assertion failure. This is a code defect; install a release that contains the fix.
Problem summary
A race between stat() or gpfs_statlite() and an inode token revoke causes a log assert.
Problem conclusion
This problem is fixed in 5.1.2.14.
To see all Spectrum Scale APARs and their respective fix solutions, refer to: https://public.dhe.ibm.com/storage/spectrumscale/spectrum_scale_apars.html
Benefits of the solution: Closed the race window and fixed this problem.
Work Around: Set the config parameters statliteMaxAttrAge and statMaxAttrAge to 0 to disable stat lite.
Problem trigger: A file is actively written on one node and stat() or gpfs_statlite() is called repeatedly on another node.
Symptom: Abend/Crash
Platforms affected: ALL Operating System environments
Functional Area affected: All Scale Users
Customer Impact: High Importance
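The workaround could be applied roughly as sketched here. This APAR does not give the exact command, so the sketch assumes the two parameters are set through the standard mmchconfig command and that the GPFS commands are on PATH (they are commonly installed under /usr/lpp/mmfs/bin). Whether the new values take effect without a daemon restart is not stated in this APAR; confirm with IBM support before relying on it.

```shell
#!/bin/sh
# Workaround values taken from the APAR text: a value of 0 disables stat lite.
SETTINGS="statliteMaxAttrAge=0,statMaxAttrAge=0"

if command -v mmchconfig >/dev/null 2>&1; then
    # Assumption: these parameters are accepted by mmchconfig like other
    # cluster configuration attributes.
    mmchconfig "$SETTINGS"
    # Show the values now recorded in the cluster configuration.
    mmlsconfig statliteMaxAttrAge
    mmlsconfig statMaxAttrAge
else
    # Not on a GPFS node (or commands not on PATH): only print the command.
    echo "mmchconfig not found; would run: mmchconfig $SETTINGS"
fi
```

Once a release containing the fix is installed, the two parameters can be returned to their previous values.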
Temporary fix
Comments
APAR Information
APAR number
IJ48629
Reported component name
SPEC SCALE STD
Reported component ID
5737F33AP
Reported release
516
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2023-09-19
Closed date
2023-11-02
Last modified date
2023-11-02
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE STD
Fixed component ID
5737F33AP
Applicable component levels
Document Information
Modified date:
03 November 2023