APAR status
Closed as program error.
Error description
Error Description: In an AFM cache, the AFM gateway node and the file system manager node might hit a logAssertFailed assert as shown below:
AFM Gateway:
2022-06-30_16:22:16.661-0500: [X] logAssertFailed: Remote ASSERT from node <c0n0>: SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118
2022-06-30_16:22:16.661-0500: [X] return code 0, reason code 0, log record tag 0
2022-06-30_16:22:16.661-0500: [I] Freezing overwrite mode tracing to preserve failure data
2022-06-30_16:22:17.177-0500: [X] *** Assert exp(Remote ASSERT from node <c0n0>: SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118) in line 3467 of file /project/sprelmax513/build/rmax513s001a/src/avs/fs/mmfs/ts/cfgmgr/sgmrpc.C
2022-06-30_16:22:17.177-0500: [E] *** Traceback:
2022-06-30_16:22:17.177-0500: [E] 2:0x559C22780E7A logAssertFailed + 0x3AA at ??:0
2022-06-30_16:22:17.177-0500: [E] 3:0x559C227ED526 ClusterConfiguration::CCHandleAssert(RpcContext*, char*) + 0x166 at ??:0
2022-06-30_16:22:17.177-0500: [E] 4:0x559C227A1268 tscHandleMsg(RpcContext*, MsgDataBuf*) + 0x658 at ??:0
2022-06-30_16:22:17.177-0500: [E] 5:0x559C227D5B81 RcvWorker::RcvMain() + 0x191 at ??:0
2022-06-30_16:22:17.177-0500: [E] 6:0x559C227D5D7D RcvWorker::thread(void*) + 0x3D at ??:0
2022-06-30_16:22:17.177-0500: [E] 7:0x559C22271282 Thread::callBody(Thread*) + 0x42 at ??:0
2022-06-30_16:22:17.177-0500: [E] 8:0x559C2225E2A0 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0
2022-06-30_16:22:17.177-0500: [E] 9:0x7F006431DEA5 start_thread + 0xC5 at ??:0
2022-06-30_16:22:17.177-0500: [E] 10:0x7F006320AB0D __clone + 0x6D at ??:0
mmfsd: /project/sprelmax513/build/rmax513s001a/src/avs/fs/mmfs/ts/cfgmgr/sgmrpc.C:3467: void logAssertFailed(UInt32, const char*, UInt32, Int32, Int32, UInt32, const char*, const char*): Assertion 'Remote ASSERT from node <c0n0>: SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118' failed.
2022-06-30_16:22:17.178-0500: [E] Signal 6 at location 0x7F0063142387 in process 56899, link reg 0xFFFFFFFFFFFFFFFF.
File system manager:
2022-06-30_16:22:05.474-0500: [X] logAssertFailed: SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118
2022-06-30_16:22:05.475-0500: [X] return code 0, reason code 0, log record tag 0
2022-06-30_16:22:07.018-0500: [X] *** Assert exp(SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118) in line 3467 of file /project/sprelmax513/build/rmax513s001a/src/avs/fs/mmfs/ts/cfgmgr/sgmrpc.C
2022-06-30_16:22:07.018-0500: [E] *** Traceback:
2022-06-30_16:22:07.018-0500: [E] 2:0x55B9055B827A logAssertFailed + 0x3AA at ??:0
2022-06-30_16:22:07.018-0500: [E] 3:0x55B905634CE1 RemoteLogAssert + 0x211 at ??:0
2022-06-30_16:22:07.018-0500: [E] 4:0x55B9056A5E20 StripeGroupCfg::SGHandleRPC(RpcContext*, char*) + 0x2060 at ??:0
2022-06-30_16:22:07.018-0500: [E] 5:0x55B9055D8668 tscHandleMsg(RpcContext*, MsgDataBuf*) + 0x658 at ??:0
2022-06-30_16:22:07.018-0500: [E] 6:0x55B90560D1C1 RcvWorker::RcvMain() + 0x191 at ??:0
2022-06-30_16:22:07.018-0500: [E] 7:0x55B90560D3BD RcvWorker::thread(void*) + 0x3D at ??:0
2022-06-30_16:22:07.018-0500: [E] 8:0x55B9050A8532 Thread::callBody(Thread*) + 0x42 at ??:0
2022-06-30_16:22:07.018-0500: [E] 9:0x55B905095550 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0
2022-06-30_16:22:07.018-0500: [E] 10:0x7FEB87D6114A start_thread + 0xEA at ??:0
2022-06-30_16:22:07.018-0500: [E] 11:0x7FEB86B3CDC3 __GI___clone + 0x43 at ??:0
mmfsd: /project/sprelmax513/build/rmax513s001a/src/avs/fs/mmfs/ts/cfgmgr/sgmrpc.C:3467: void logAssertFailed(UInt32, const char*, UInt32, Int32, Int32, UInt32, const char*, const char*): Assertion 'SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118' failed.
2022-06-30_16:22:07.019-0500: [E] Signal 6 at location 0x7FEB86A7737F in process 1340786, link reg 0xFFFFFFFFFFFFFFFF.
Reported in: Spectrum Scale 5.1.3.1 on RHEL7
Known Impact: deadlock/daemon crash
Verification steps:
Recovery action: N/A
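To check whether a node has hit this assert, the daemon log can be searched for the signature quoted above. The following is a minimal sketch, not an IBM tool: the function names find_sgnotquiesced and scan_file are hypothetical, and the regular expression is inferred only from the sample log lines in this APAR, so other releases may format the message differently.

```python
import re

# Signature of the assert as it appears in the sample logs of this APAR.
# Assumption: the field layout (snap A/B, ino N, reason N, code N) is taken
# from those samples only; it is not a documented log format.
ASSERT_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}): \[X\] "
    r"logAssertFailed: (?:Remote ASSERT from node <(?P<node>[^>]+)>: )?"
    r"SGNotQuiesced snap (?P<snap>\d+/\d+) ino (?P<ino>\d+) "
    r"reason (?P<reason>\d+) code (?P<code>\d+)"
)


def find_sgnotquiesced(lines):
    """Return one dict per 'logAssertFailed: ... SGNotQuiesced' log line."""
    hits = []
    for line in lines:
        m = ASSERT_RE.search(line)
        if m is None:
            continue
        hits.append({
            "timestamp": m.group("ts"),
            # node is None when the assert was raised locally (as on the
            # file system manager) rather than forwarded from another node.
            "node": m.group("node"),
            "snap": m.group("snap"),
            "ino": int(m.group("ino")),
            "reason": int(m.group("reason")),
            "code": int(m.group("code")),
        })
    return hits


def scan_file(path):
    """Scan a log file on disk (hypothetical helper; path chosen by caller)."""
    with open(path, "r", errors="replace") as fh:
        return find_sgnotquiesced(fh)
```

Only the first "[X] logAssertFailed:" line of each crash is matched, so the later "Assert exp(...)" and "Assertion ... failed" lines do not produce duplicate hits. A hit with a node value indicates the remote form seen on the AFM gateway; a hit without one indicates the local form seen on the file system manager.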
Local fix
N/A
Problem summary
When an AFM fileset snapshot is created or deleted, either through manual snapshot commands or through AFM DR periodic snapshot operations, user file operations in the AFM fileset could proceed without interlocking with the snapshot operation, which triggers this assert.
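The kind of interlock described above can be illustrated with a generic quiesce gate. This is a conceptual sketch only, under the assumption that file operations and snapshot operations coordinate through a shared gate; the class QuiesceGate and its methods are hypothetical names for illustration and do not represent the Spectrum Scale implementation or the actual fix.

```python
import threading


class QuiesceGate:
    """Illustrative gate: file operations wait while a snapshot is in progress.

    Hypothetical model, not Spectrum Scale code.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._active_ops = 0      # file operations currently in flight
        self._quiesced = False    # True while a snapshot operation holds the gate

    def begin_op(self):
        """Called before a file operation; blocks while the gate is quiesced."""
        with self._cond:
            while self._quiesced:
                self._cond.wait()
            self._active_ops += 1

    def end_op(self):
        """Called after a file operation completes."""
        with self._cond:
            self._active_ops -= 1
            self._cond.notify_all()

    def quiesce(self):
        """Called before a snapshot create/delete; waits for in-flight ops."""
        with self._cond:
            while self._quiesced:          # allow one snapshot operation at a time
                self._cond.wait()
            self._quiesced = True          # stop new file operations from starting
            while self._active_ops > 0:    # drain operations already running
                self._cond.wait()

    def resume(self):
        """Called when the snapshot operation finishes."""
        with self._cond:
            self._quiesced = False
            self._cond.notify_all()
```

In this model, the failure described in the problem summary corresponds to a file operation that skips begin_op and therefore runs while the snapshot operation expects the file system to be quiesced.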
Problem conclusion
This fix avoids the mmfsd daemon process crash.
Temporary fix
Comments
APAR Information
APAR number
IJ40965
Reported component name
SPEC SCALE ADV
Reported component ID
5737F35AP
Reported release
513
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-07-01
Closed date
2022-07-20
Last modified date
2022-09-08
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE ADV
Fixed component ID
5737F35AP
Applicable component levels
Document Information
Modified date:
08 September 2022