IBM Support

IV46717: ERPD STOPPED AND RESTARTED WHEN THE ERPDMASTER MONITORING TIMED OUT

 

APAR status

  • Closed as program error.

Error description

  • Support Engineer: GBH
    Change Team Eng:  EJ
    
    Environment: TSAMP v3.2.2 xDR
    
    
    Details:
    The monitor command for the erpdmaster resource times out on the
     proxy node:
     Jul 13 22:09:13 node02 GblResRM[4454]:
    :::GBLRESRM_MONITOR_TIMEOUT IBM.Application monitor command
    timed out. Resource name: erpdmaster
    
    These monitor commands have timeouts of 10 and 15 seconds
    respectively, although they only call simple commands such as
    'ps' or 'cp ind user'. A timeout therefore indicates that the
    system was either under heavy load or was not dispatching the
    proxies at that time. A little later, xDR init processing starts
    and finally brings the system back to a HyperSwap-enabled state.
     Jul 13 22:09:51 node02 xdr: cmd: rc=0 for command 'xdr.init -v
    1 -V -c xdrbcluster -n node02 -t 10.101.1.8 -s 2 -p 3:3 -d 2
    --ss 0 --devices "7202:E802/45,7242:E842/45"' sent to GDPS
    
    Setting the resource state to Unknown when a monitor command
    times out is correct, and this behaviour should not be changed.
    If a HyperSwap had been triggered at that time, it would very
    likely have failed because of the system's poor response time.
    An improvement would be to prevent the erpd process from being
    stopped and restarted because of the monitor timeout. The
    timeout of the monitor command sets the resource's operational
    state to Unknown, and this currently causes erpd to be stopped
    and restarted:
     Jul 13 22:08:54 xdrb2 xdr: erpd: Master role of erpd changed
    ==> restarting erpd
     Jul 13 22:08:54 xdrb2 xdr: erpd: Ending. rc=30
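
The timeout behaviour described above can be illustrated with a minimal Python sketch. It is not the GblResRM implementation: the function name monitor_op_state, the state names as strings, and the exit-code mapping are assumptions made for illustration only. It shows how a monitor command that does not finish within its timeout yields an Unknown state rather than Online or Offline.

```python
import subprocess
import sys

# Assumed mapping from monitor exit code to OpState name (illustrative only).
EXIT_CODE_TO_OPSTATE = {1: "Online", 2: "Offline"}


def monitor_op_state(command, timeout_s):
    """Run a monitor command and report "Unknown" if it does not finish in time."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        # The command itself may be trivial; a loaded system can still miss the deadline.
        return "Unknown"
    # Any exit code outside the assumed mapping is also treated as Unknown.
    return EXIT_CODE_TO_OPSTATE.get(result.returncode, "Unknown")


if __name__ == "__main__":
    # Hypothetical stand-in for a monitor command that hangs on a busy system.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(monitor_op_state(slow, timeout_s=0.5))
```

In this sketch the timeout alone decides the result, which matches the point made above: an Unknown state says nothing about whether the monitored process is actually down.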
    

Local fix

  • N/A
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: Tivoli System Automation for Multiplatforms
    * users using GDPS/PPRC Multiplatform Resiliency for System z
    * for management of Linux on z/VM guests (xDR z/VM guest Linux
    * on System z) using a dual-node proxy cluster
    ****************************************************************
    * PROBLEM DESCRIPTION:
    * The erpd process is restarted when a monitor command times out
    * for the resource erpdmaster.
    ****************************************************************
    * RECOMMENDATION:
    ****************************************************************
    

Problem conclusion

  • The xDR proxy erpd code has been modified so that erpd keeps
    running when the OpState of the resource erpdmaster changes to
    Unknown because the monitor command timed out.
    .
    The official fix for this problem is included in fix pack 7 of
    Tivoli System Automation for Multiplatforms 3.2.2
    | 3.2.2-TIV-ITSAMP-FP0007 |
    .
    Additional Search Keywords
    .
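
The change in behaviour described in this conclusion can be sketched as follows. This is a hypothetical model, not the actual erpd code: the class ErpdMasterWatcher, its method should_restart, and the flag tolerate_unknown are names invented for illustration. The assumption is that erpd restarts when it sees the erpdmaster state change, and that the fix makes it ignore a transition to Unknown while remembering the last known state.

```python
class ErpdMasterWatcher:
    """Hypothetical model of how erpd reacts to erpdmaster OpState changes."""

    def __init__(self, initial_state, tolerate_unknown=True):
        self.last_known_state = initial_state
        # True models the fixed behaviour, False the behaviour before the fix.
        self.tolerate_unknown = tolerate_unknown

    def should_restart(self, new_state):
        """Return True if the observed state change should restart erpd."""
        if new_state == "Unknown" and self.tolerate_unknown:
            # A monitor timeout carries no information about the master role,
            # so keep running and keep the last known state unchanged.
            return False
        changed = new_state != self.last_known_state
        self.last_known_state = new_state
        return changed


if __name__ == "__main__":
    fixed = ErpdMasterWatcher("Online")
    for state in ["Unknown", "Online", "Offline"]:
        print(state, fixed.should_restart(state))
```

Keeping the last known state across an Unknown reading means that the return from Unknown to the previous state is not treated as a role change either.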
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV46717

  • Reported component name

    SA MULTIPLATFOR

  • Reported component ID

    5724M0000

  • Reported release

    322

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2013-08-07

  • Closed date

    2013-09-27

  • Last modified date

    2013-09-27

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SA MULTIPLATFOR

  • Fixed component ID

    5724M0000


Document Information

Modified date:
25 August 2023