IBM Support

IJ40965: AFM: LOGASSERTFAILED: SGNotQuiesced

APAR status

  • Closed as program error.

Error description

  • Error Description:
    In an AFM cache, the AFM gateway node and the file system
    manager node might hit a logAssertFailed assert, as shown below:
    
    AFM Gateway:
    2022-06-30_16:22:16.661-0500: [X] logAssertFailed: Remote ASSERT from node <c0n0>: SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118
    2022-06-30_16:22:16.661-0500: [X] return code 0, reason code 0, log record tag 0
    2022-06-30_16:22:16.661-0500: [I] Freezing overwrite mode tracing to preserve failure data
    2022-06-30_16:22:17.177-0500: [X] *** Assert exp(Remote ASSERT from node <c0n0>: SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118) in line 3467 of file /project/sprelmax513/build/rmax513s001a/src/avs/fs/mmfs/ts/cfgmgr/sgmrpc.C
    2022-06-30_16:22:17.177-0500: [E] *** Traceback:
    2022-06-30_16:22:17.177-0500: [E] 2:0x559C22780E7A logAssertFailed + 0x3AA at ??:0
    2022-06-30_16:22:17.177-0500: [E] 3:0x559C227ED526 ClusterConfiguration::CCHandleAssert(RpcContext*, char*) + 0x166 at ??:0
    2022-06-30_16:22:17.177-0500: [E] 4:0x559C227A1268 tscHandleMsg(RpcContext*, MsgDataBuf*) + 0x658 at ??:0
    2022-06-30_16:22:17.177-0500: [E] 5:0x559C227D5B81 RcvWorker::RcvMain() + 0x191 at ??:0
    2022-06-30_16:22:17.177-0500: [E] 6:0x559C227D5D7D RcvWorker::thread(void*) + 0x3D at ??:0
    2022-06-30_16:22:17.177-0500: [E] 7:0x559C22271282 Thread::callBody(Thread*) + 0x42 at ??:0
    2022-06-30_16:22:17.177-0500: [E] 8:0x559C2225E2A0 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0
    2022-06-30_16:22:17.177-0500: [E] 9:0x7F006431DEA5 start_thread + 0xC5 at ??:0
    2022-06-30_16:22:17.177-0500: [E] 10:0x7F006320AB0D __clone + 0x6D at ??:0
    mmfsd: /project/sprelmax513/build/rmax513s001a/src/avs/fs/mmfs/ts/cfgmgr/sgmrpc.C:3467: void logAssertFailed(UInt32, const char*, UInt32, Int32, Int32, UInt32, const char*, const char*): Assertion 'Remote ASSERT from node <c0n0>: SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118' failed.
    2022-06-30_16:22:17.178-0500: [E] Signal 6 at location 0x7F0063142387 in process 56899, link reg 0xFFFFFFFFFFFFFFFF.
    
    File system manager:
    2022-06-30_16:22:05.474-0500: [X] logAssertFailed: SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118
    2022-06-30_16:22:05.475-0500: [X] return code 0, reason code 0, log record tag 0
    2022-06-30_16:22:07.018-0500: [X] *** Assert exp(SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118) in line 3467 of file /project/sprelmax513/build/rmax513s001a/src/avs/fs/mmfs/ts/cfgmgr/sgmrpc.C
    2022-06-30_16:22:07.018-0500: [E] *** Traceback:
    2022-06-30_16:22:07.018-0500: [E] 2:0x55B9055B827A logAssertFailed + 0x3AA at ??:0
    2022-06-30_16:22:07.018-0500: [E] 3:0x55B905634CE1 RemoteLogAssert + 0x211 at ??:0
    2022-06-30_16:22:07.018-0500: [E] 4:0x55B9056A5E20 StripeGroupCfg::SGHandleRPC(RpcContext*, char*) + 0x2060 at ??:0
    2022-06-30_16:22:07.018-0500: [E] 5:0x55B9055D8668 tscHandleMsg(RpcContext*, MsgDataBuf*) + 0x658 at ??:0
    2022-06-30_16:22:07.018-0500: [E] 6:0x55B90560D1C1 RcvWorker::RcvMain() + 0x191 at ??:0
    2022-06-30_16:22:07.018-0500: [E] 7:0x55B90560D3BD RcvWorker::thread(void*) + 0x3D at ??:0
    2022-06-30_16:22:07.018-0500: [E] 8:0x55B9050A8532 Thread::callBody(Thread*) + 0x42 at ??:0
    2022-06-30_16:22:07.018-0500: [E] 9:0x55B905095550 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0
    2022-06-30_16:22:07.018-0500: [E] 10:0x7FEB87D6114A start_thread + 0xEA at ??:0
    2022-06-30_16:22:07.018-0500: [E] 11:0x7FEB86B3CDC3 __GI___clone + 0x43 at ??:0
    mmfsd: /project/sprelmax513/build/rmax513s001a/src/avs/fs/mmfs/ts/cfgmgr/sgmrpc.C:3467: void logAssertFailed(UInt32, const char*, UInt32, Int32, Int32, UInt32, const char*, const char*): Assertion 'SGNotQuiesced snap 445/0 ino 6535457643 reason 5 code 118' failed.
    2022-06-30_16:22:07.019-0500: [E] Signal 6 at location 0x7FEB86A7737F in process 1340786, link reg 0xFFFFFFFFFFFFFFFF.
    
    
    Reported in:
    Spectrum Scale 5.1.3.1 on RHEL7
    
    Known Impact:
    deadlock/daemon crash
    
    Verification steps:
    
    Recovery action:
    N/A
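
    To check whether a node's log contains the assert signature
    from the error description, a small script can search for it.
    The following is a minimal sketch, not an IBM-provided tool.
    It assumes the log holds one record per line (the excerpt in
    this APAR may be wrapped for display), and the function name
    find_sgnotquiesced and the dictionary keys are hypothetical.

```python
import re
from datetime import datetime

# Matches the "logAssertFailed: ... SGNotQuiesced ..." record quoted in this APAR.
# The optional group covers the "Remote ASSERT from node <...>" prefix that
# appears on the AFM gateway node but not on the file system manager.
ASSERT_RE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}): "
    r"\[X\] logAssertFailed: "
    r"(?:Remote ASSERT from node <(?P<node>[^>]+)>: )?"
    r"SGNotQuiesced snap (?P<snap>\S+) ino (?P<ino>\d+) "
    r"reason (?P<reason>\d+) code (?P<code>\d+)"
)


def find_sgnotquiesced(log_text):
    """Return one dict per SGNotQuiesced assert record found in log_text."""
    hits = []
    for line in log_text.splitlines():
        m = ASSERT_RE.match(line)
        if m is None:
            continue
        hits.append({
            # Timestamp format as shown in the excerpt, e.g. 2022-06-30_16:22:16.661-0500
            "time": datetime.strptime(m.group("ts"), "%Y-%m-%d_%H:%M:%S.%f%z"),
            # None when the assert was raised locally rather than forwarded
            "remote_node": m.group("node"),
            "snap": m.group("snap"),
            "inode": int(m.group("ino")),
            "reason": int(m.group("reason")),
            "code": int(m.group("code")),
        })
    return hits
```

    Running it over the logs of both nodes and comparing the
    "snap" and "inode" fields shows whether the gateway record is
    the remote copy of the file system manager's assert, as it is
    in the excerpt above (snap 445/0, ino 6535457643 on both).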
    

Local fix

  • N/A
    

Problem summary

  • When an AFM fileset snapshot is being created or deleted,
    either through manual snapshot commands or through AFM DR
    periodic snapshot operations, user file operations in the AFM
    fileset could proceed without interlocking with the snapshot
    operation, which triggered this assert.
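
    The kind of interlock the summary refers to can be pictured
    with a toy model. The sketch below is purely illustrative and
    is not GPFS code; this APAR does not describe how the actual
    fix works. The class QuiesceGate and its methods are
    hypothetical names. File operations enter the gate in shared
    mode, and a snapshot operation waits for them to drain and
    blocks new ones until it resumes.

```python
import threading


class QuiesceGate:
    """Toy interlock: many file operations OR one snapshot operation at a time."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active_ops = 0      # file operations currently in flight
        self._quiesced = False    # True while a snapshot operation holds the gate

    def begin_file_op(self):
        with self._cond:
            # New file operations wait while a snapshot is in progress.
            while self._quiesced:
                self._cond.wait()
            self._active_ops += 1

    def end_file_op(self):
        with self._cond:
            self._active_ops -= 1
            self._cond.notify_all()

    def quiesce(self):
        with self._cond:
            # Allow only one snapshot operation at a time.
            while self._quiesced:
                self._cond.wait()
            self._quiesced = True
            # Wait for in-flight file operations to drain.
            while self._active_ops > 0:
                self._cond.wait()

    def resume(self):
        with self._cond:
            self._quiesced = False
            self._cond.notify_all()
```

    In this model, a file operation that skipped begin_file_op()
    would run while the gate is quiesced, which is the class of
    unsynchronized access the summary describes.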
    

Problem conclusion

  • The fix avoids the mmfsd daemon process crash.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IJ40965

  • Reported component name

    SPEC SCALE ADV

  • Reported component ID

    5737F35AP

  • Reported release

    513

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2022-07-01

  • Closed date

    2022-07-20

  • Last modified date

    2022-09-08

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SPEC SCALE ADV

  • Fixed component ID

    5737F35AP

Applicable component levels

  • Business unit: IBM Infrastructure w/TPS (BU058)
  • Product code: STXKQY
  • Platform: Platform Independent (PF025)
  • Version: 513
  • Line of business: Storage (LOB26)

Document Information

Modified date:
08 September 2022