APAR status
Closed as program error.
Error description
The Spectrum Scale daemon crashes and the file system gets unmounted on the node. The daemon crashes because its code hit an assert indicating that a disk address is not the expected one.
Reported in: Spectrum Scale 5.0.4.x
Error message:
[X] logAssertFailed: !"oldDiskAddrFound.compAddr(*oldDiskAddrP)"
[X] return code 0, reason code 0, log record tag 0
[X] *** Assert exp(!"oldDiskAddrFound.compAddr(*oldDiskAddrP)") in line 11122 of file /project/spreltac504/build/rtac504s001a/src/avs/fs/mmfs/ts/fs/metadata.C
[E] *** Traceback:
[E] 2:0x55776C34CD78 logAssertFailed + 0x418 at ??:0
[E] 3:0x55776C0AB24D FileMetadata::updateDataBlockDiskAddr(long long, fsDiskAddr const*, fsDiskAddr, unsigned int, int, indBlockDesc*, unsigned int*, fsDiskAddr*) + 0xF8D at ??:0
[E] 4:0x55776BF1AA35 BufferDesc::commitAssigned(int, unsigned int) + 0x1D5 at ??:0
[E] 5:0x55776C12E844 OpenFile::handleBufferFlush(int, BufferDesc*, long long, long long, ByteRange const&, int*, int*, ByteRange*, long long*, int*, int*, long long*, cxiUioAio_t**) + 0x1904 at ??:0
[E] 6:0x55776C12F415 OpenFile::flushBufferAddrs() + 0xE5 at ??:0
[E] 7:0x55776C12FC3D FileMetadata::flushFileMetadata(int, ByteRange const&, ByteRange*, unsigned int*) + 0x6CD at ??:0
[E] 8:0x55776C132143 FileMetadata::flushFile(int, ByteRange const&, int*, int*, ByteRange*) + 0x9B3 at ??:0
[E] 9:0x55776C132CBB SFSSyncFile(StripeGroup*, long long, unsigned int, int, ByteRange const&, OpenFile*) + 0x37B at ??:0
[E] 10:0x55776C11DA01 HandleMBFSyncFile(MBFSyncFileParms*) + 0xD1 at ??:0
[E] 11:0x55776BE53F83 Mailbox::msgHandlerBody(void*) + 0x363 at ??:0
[E] 12:0x55776BE37703 Thread::callBody(Thread*) + 0x63 at ??:0
[E] 13:0x55776BE24692 Thread::callBodyWrapper(Thread*) + 0xA2 at ??:0
[E] 14:0x7FC7F88F9DD5 start_thread + 0xC5 at ??:0
[E] 15:0x7FC7F79FDEAD __clone + 0x6D at ??:0
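To check whether a collected daemon log shows this APAR's signature, a small script can search for the assert string and list the traceback frames. The following is a minimal sketch, not an IBM-provided tool: the function names `find_assert` and `parse_frames` are hypothetical, and the frame pattern assumes the `[E] N:0xADDR symbol + 0xOFFSET at ...` layout seen in the excerpt above.

```python
import re

# Assert expression copied from the error message in this APAR.
ASSERT_SIGNATURE = 'oldDiskAddrFound.compAddr(*oldDiskAddrP)'

# Assumed frame layout, taken from the excerpt in this APAR:
#   [E] 3:0x55776C0AB24D Symbol(args) + 0xF8D at ??:0
FRAME_RE = re.compile(
    r'\[E\]\s+(\d+):0x([0-9A-Fa-f]+)\s+(.+?)\s+\+\s+0x([0-9A-Fa-f]+)\s+at\s+'
)


def find_assert(log_text):
    """Return True if the log text contains this APAR's assert expression."""
    return ASSERT_SIGNATURE in log_text


def parse_frames(log_text):
    """Return (frame_number, symbol) tuples for each traceback frame found."""
    return [(int(m.group(1)), m.group(3)) for m in FRAME_RE.finditer(log_text)]
```

The pattern is unanchored, so any text that precedes `[E]` on a log line is ignored. A match on the assert string only suggests that the crash has the same signature; comparing the parsed frames with the traceback above (for example, `FileMetadata::updateDataBlockDiskAddr` reached from `SFSSyncFile`) gives a closer check.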
Local fix
Problem summary
The GPFS daemon crashes and the file system gets unmounted. The daemon crashes because its code hit an assert indicating that a disk address is not the expected one.
Problem conclusion
Benefits of the solution: No crash, and the file system is kept active.
Work around: None
Problem trigger: Failed writes and seeks within a file
Symptom: Scale daemon crashes with logAssertFailed: !"oldDiskAddrFound.compAddr(*oldDiskAddrP)", and the file system gets unmounted.
Platforms affected: All
Functional Area affected: Core
Customer Impact: Critical
Temporary fix
Comments
APAR Information
APAR number
IJ26510
Reported component name
SPEC SCALE STD
Reported component ID
5737F33AP
Reported release
505
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-07-24
Closed date
2020-07-24
Last modified date
2020-07-24
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
IJ26956
Fix information
Fixed component name
SPEC SCALE STD
Fixed component ID
5737F33AP
Applicable component levels
Document Information
Modified date:
12 August 2020