IBM Support

IV29986: CLSTRMGR MAY CORE DUMP AFTER RESTART FROM UNMANAGED

APAR status

  • Closed as program error.

Error description

  • clstrmgr core dumps after restart from unmanaged
    mode (force stop) when a clappmond process is
    unresponsive.
    
    clstrmgr.debug looks like:
    ...
    Mon Jul 23 16:27:24 stabIntervalCb(clAappl1mon): Called
    Mon Jul 23 16:27:24
     Appmon::stabIntervalComplete(clAappl1mon) Called
    Mon Jul 23 16:27:24 stabIntervalComplete(clAappl1mon)
     Called when monitor already running ?
    Mon Jul 23 16:27:24 die: clstrmgr on node 2 is exiting
     with code 4
    Mon Jul 23 16:27:24 die: waitpid failed, waitrc = -1,
     ERRNO = 10
    
    stack trace looks like:
    (dbx) t
    pthread_kill(??, ??) at 0xd05098c0
    _p_raise(??) at 0xd0508d28
    raise.raise(??) at 0xd01373e0
    abort() at 0xd01c5704
    unnamed block in die(int)(code = 4),
     line 772 in "rdutils.C"
    die(int)(code = 4), line 772 in "rdutils.C"
    Appmon::stabIntervalComplete()(this = 0x2015eff8),
     line 238 in "Appmon.C"
    stabIntervalCb(void*)(appmon = 0x2015eff8),
     line 193 in "Appmon.C"
    MitCheck()(), line 272 in "mit.C"
    DoMainLoop()(), line 2084 in "main.C"
    main(argc = 1, argv = 0x2ff22d10),
     line 4044 in "main.C"
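
For readers decoding the last log line: on AIX, errno 10 is ECHILD ("No child processes"), so "waitpid failed, waitrc = -1, ERRNO = 10" means waitpid() found no child to wait for. The following is a minimal, generic sketch of one way that condition arises on a POSIX system: waiting a second time for a child that has already been reaped. It is not taken from the clstrmgr source, and the function name reap_twice is hypothetical.

```python
import errno
import os


def reap_twice():
    """Fork a child, reap it, then try to reap it a second time.

    Returns the errno raised by the second waitpid() call, or None if
    the second call unexpectedly succeeds. (Hypothetical helper, for
    illustration only; unrelated to the PowerHA code.)
    """
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without running any cleanup handlers.
        os._exit(0)

    # Parent: the first wait collects the child's exit status.
    os.waitpid(pid, 0)

    # The child no longer exists as far as the kernel is concerned,
    # so a second wait on the same pid fails with ECHILD.
    try:
        os.waitpid(pid, 0)
    except ChildProcessError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    code = reap_twice()
    print(code == errno.ECHILD, os.strerror(code))
```

The APAR does not state why waitpid() failed inside die(); the sketch only shows what the error code itself means.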
    

Local fix

Problem summary

  • clstrmgr core dumps:
    clstrmgr.debug looks like:
    ...
    Mon Jul 23 16:27:24 stabIntervalCb(clAappl1mon): Called
    Mon Jul 23 16:27:24
     Appmon::stabIntervalComplete(clAappl1mon) Called
    Mon Jul 23 16:27:24 stabIntervalComplete(clAappl1mon)
     Called when monitor already running ?
    Mon Jul 23 16:27:24 die: clstrmgr on node 2 is exiting
     with code 4
    Mon Jul 23 16:27:24 die: waitpid failed, waitrc = -1,
     ERRNO = 10
    
    stack trace looks like:
    (dbx) t
    pthread_kill(??, ??) at 0xd05098c0
    _p_raise(??) at 0xd0508d28
    raise.raise(??) at 0xd01373e0
    abort() at 0xd01c5704
    unnamed block in die(int)(code = 4),
     line 772 in "rdutils.C"
    die(int)(code = 4), line 772 in "rdutils.C"
    Appmon::stabIntervalComplete()(this = 0x2015eff8),
     line 238 in "Appmon.C"
    stabIntervalCb(void*)(appmon = 0x2015eff8),
     line 193 in "Appmon.C"
    MitCheck()(), line 272 in "mit.C"
    DoMainLoop()(), line 2084 in "main.C"
    main(argc = 1, argv = 0x2ff22d10),
     line 4044 in "main.C"
    

Problem conclusion

  • The fix kills the old clappmond process if it was already WARNED TO STOP.
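
The general pattern behind this conclusion (ask a monitor process to stop, then kill it if it does not respond) can be sketched as follows. This is an illustrative sketch only, assuming a POSIX system; it is not the PowerHA implementation, and the names spawn_stubborn_child, stop_with_escalation and grace_seconds are hypothetical.

```python
import signal
import subprocess
import sys


def spawn_stubborn_child():
    """Start a child that ignores SIGTERM, standing in for an
    unresponsive monitor process (hypothetical stand-in)."""
    script = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    child = subprocess.Popen([sys.executable, "-c", script],
                             stdout=subprocess.PIPE)
    # Block until the child has installed its SIGTERM handler.
    child.stdout.readline()
    child.stdout.close()
    return child


def stop_with_escalation(proc, grace_seconds=1.0):
    """Warn the process to stop, then kill it if it does not exit.

    Returns "terminated" if the polite request was enough, or
    "killed" if SIGKILL was needed.
    """
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()   # forced stop (SIGKILL on POSIX)
        proc.wait()   # reap the process so no stale child remains
        return "killed"


if __name__ == "__main__":
    child = spawn_stubborn_child()
    print(stop_with_escalation(child, grace_seconds=0.5))
```

Reaping the killed process before starting a replacement means that only one monitor instance exists at a time, which is the situation the log message "Called when monitor already running ?" complains about.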
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV29986

  • Reported component name

    POWERHA SYSMIR

  • Reported component ID

    5765H3900

  • Reported release

    711

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Submitted date

    2012-10-11

  • Closed date

    2012-10-11

  • Last modified date

    2013-03-21

  • APAR is sysrouted FROM one or more of the following:

    IV25337

  • APAR is sysrouted TO one or more of the following:

    IV38520

Fix information

  • Fixed component name

    POWERHA SYSMIR

  • Fixed component ID

    5765H3900

Applicable component levels

  • R711 PSY U855232

       UP12/12/07 I 1000

PTF to Fileset Mapping

Document Information

Modified date:
19 October 2021