IBM Support

IC99312: WEBSPHERE MQ MFT AGENT HANGS DURING RECOVERY.

APAR status

  • Closed as program error.

Error description

  • When a WebSphere MQ MFT v7.5.0.3 agent enters recovery, for
    example because it has lost connection to its agent queue
    manager, the agent hangs and does not recover.
    
    A Javacore of the agent's JVM shows one or more TransferSender
    threads and a TriggerRecoveryThread that are blocked:
    
    "TransferSender[c3e2d840c6e3f8c54040404040404040cca934cedced5576]" J9VMThread:0x414D8700, j9thread_t:0x4182F490, java/lang/Thread:0x1B8475C8, state:B, prio=5
    
    Blocked on: com/ibm/wmqfte/statestore/impl/FTEStateStorePersistence@0x1B4AEE58
    Owned by: "TriggerRecoveryThread" (J9VMThread:0x413D4700, java/lang/Thread:0x3A8F0120)
    Java callstack:
    at com/ibm/wmqfte/statestore/impl/FTEStateStorePersistence.getMutableSenderState(FTEStateStorePersistence.java:1722)
    at com/ibm/wmqfte/statestore/impl/FTEStateStoreImpl.senderTransferRecovered(FTEStateStoreImpl.java:2698)
    at com/ibm/wmqfte/transfer/impl/TransferSenderRunnable.transferRecovered(TransferSenderRunnable.java:1456)
    at com/ibm/wmqfte/transfer/impl/TransferSenderRunnable.run(TransferSenderRunnable.java:522)
    at java/lang/Thread.run(Thread.java:781)
    at com/ibm/wmqfte/thread/FTEThread.run(FTEThread.java:70)
    
    
    "TriggerRecoveryThread" J9VMThread:0x413D4700, j9thread_t:0x4182F6EC, java/lang/Thread:0x3A8F0120, state:CW, prio=5
    Waiting on: com/ibm/wmqfte/thread/FTEThread@0x1B8475C8 Owned by:
    
    Java callstack:
    at java/lang/Object.wait(Native Method)
    at java/lang/Object.wait(Object.java:196)
    at java/lang/Thread.join(Thread.java:661)
    (entered lock: com/ibm/wmqfte/thread/FTEThread@0x1B8475C8, entry count: 1)
    at com/ibm/wmqfte/transfer/impl/TransferSenderImpl.stop(TransferSenderImpl.java:258)
    at com/ibm/wmqfte/statestore/impl/FTEStateStorePersistence.immediateStop(FTEStateStorePersistence.java:1331)
    (entered lock: com/ibm/wmqfte/statestore/impl/FTEStateStorePersistence@0x1B4AEE58, entry count: 1)
    at com/ibm/wmqfte/statestore/impl/FTEStateStoreImpl.immediateStop(FTEStateStoreImpl.java:1152)
    (entered lock: com/ibm/wmqfte/statestore/impl/FTEStateStoreImpl@0x1B4AEC00, entry count: 1)
    at com/ibm/wmqfte/agent/Agent$4.run(Agent.java:1313)
    (entered lock: java/lang/Object@0x1ACD60E8, entry count: 1)
    at java/lang/Thread.run(Thread.java:781)
    at com/ibm/wmqfte/thread/FTEThread.run(FTEThread.java:70)
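
The "Blocked on" and "Owned by" relationship recorded in the Javacore can also be read from inside a running JVM through the standard java.lang.management API. The sketch below is illustrative only and is not part of WebSphere MQ MFT: the class name BlockedThreadReport and the method blockedThreads are hypothetical, and the code has to run inside the JVM being inspected, so for an agent a Javacore remains the practical way to collect this data.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;

public class BlockedThreadReport {

    /**
     * Returns one line per thread in state BLOCKED, giving the thread name,
     * the monitor it is trying to enter, and the thread that owns that monitor.
     */
    public static List<String> blockedThreads() {
        List<String> lines = new ArrayList<>();
        // true: include locked monitors; false: skip ownable synchronizers
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(true, false);
        for (ThreadInfo info : infos) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                lines.add(info.getThreadName()
                        + " blocked on " + info.getLockName()
                        + " owned by " + info.getLockOwnerName());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        blockedThreads().forEach(System.out::println);
    }
}
```

A thread that is waiting in Thread.join, as TriggerRecoveryThread is here, is reported as WAITING or TIMED_WAITING and not as BLOCKED, so a report like this would list only the TransferSender side of the hang.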
    

Local fix

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    This issue affects users of WebSphere MQ Managed File Transfer
    v7.5.0.3 with source agents that enter recovery.
    
    
    Platforms affected:
    MultiPlatform
    
    ****************************************************************
    PROBLEM SUMMARY:
    Since APAR IC94369, a deadlock could occur when a WebSphere MQ
    Managed File Transfer source agent entered recovery while a
    transfer itself was being recovered. This was due to a coding
    error in the component that handled multi-threaded updates to
    agent state information. If a TransferSender thread needed to
    access state information before ending, because the state of
    the transfer required updating from "Recovering" to "Running",
    the source agent deadlocked and was unable to recover.
    
    If this situation occurred, the agent process needed to be
    killed and the agent restarted.
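
The locking pattern described above, and visible in the Javacore (one thread holds a monitor while it joins a second thread that needs the same monitor before it can end), can be reproduced in a few lines. The following is a minimal sketch and not the MFT source: stateLock stands in for the FTEStateStorePersistence monitor, the worker for a TransferSender thread, and the calling thread for TriggerRecoveryThread. All names are hypothetical, and a bounded join is used so that the demonstration returns instead of hanging.

```java
import java.util.concurrent.CountDownLatch;

public class JoinUnderLockDemo {

    // Hypothetical stand-in for the shared state-store monitor.
    private static final Object stateLock = new Object();

    /**
     * Joins a worker for up to timeoutMs (must be greater than 0) while holding
     * stateLock. The worker needs stateLock before it can end, so it cannot
     * finish in time. Returns true if the worker ended while the lock was held.
     */
    public static boolean joinWhileHoldingLock(long timeoutMs) {
        CountDownLatch lockHeld = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                lockHeld.await();        // wait until the stopping thread owns the lock
            } catch (InterruptedException e) {
                return;
            }
            synchronized (stateLock) {   // blocks here: the stopping thread owns stateLock
                // a final state update would happen here
            }
        }, "worker");
        worker.setDaemon(true);
        worker.start();

        boolean finishedUnderLock;
        try {
            synchronized (stateLock) {
                lockHeld.countDown();
                worker.join(timeoutMs);  // join() does not release stateLock
                finishedUnderLock = !worker.isAlive();
            }
            worker.join();               // lock released, so the worker can now end
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return finishedUnderLock;
    }

    public static void main(String[] args) {
        System.out.println("finished under lock: " + joinWhileHoldingLock(200));
    }
}
```

With an unbounded join in place of the timed one, neither thread could make progress, which matches the state:B and state:CW pair shown in the Javacore.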
    

Problem conclusion

  • The WebSphere MQ Managed File Transfer code has been updated
    to remove this defect in the code which controls state
    information. As such, TransferSender threads that require access
    to state information prior to ending can obtain access without a
    deadlock situation arising.
    
    
    | MDVREGR 7.5.0-WS-MQ-AixPPC64-FP0003 |
    | MDVREGR 7.5.0-WS-MQ-HpuxIA64-FP0003 |
    | MDVREGR 7.5.0-WS-MQ-HpuxPaRISC64-FP0003 |
    | MDVREGR 7.5.0-WS-MQ-LinuxIA32-FP0003 |
    | MDVREGR 7.5.0-WS-MQ-LinuxPPC64-FP0003 |
    | MDVREGR 7.5.0-WS-MQ-LinuxS390X-FP0003 |
    | MDVREGR 7.5.0-WS-MQ-LinuxX64-FP0003 |
    | MDVREGR 7.5.0-WS-MQ-SolarisSparc64-FP0003 |
    | MDVREGR 7.5.0-WS-MQ-SolarisX64-FP0003 |
    | MDVREGR 7.5.0-WS-MQ-Windows-FP0003 |
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v7.5       7.5.0.4
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
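
The APAR does not describe the code change itself. One common way to avoid this class of deadlock, shown here purely as an assumption and not as IBM's actual fix, is to signal the stop while holding the monitor but to wait for the thread only after the monitor has been released. In this sketch stateLock, the worker, and the method stopThenJoin are all hypothetical names standing in for the state-store monitor, a TransferSender thread, and the stopping logic.

```java
import java.util.concurrent.CountDownLatch;

public class StopThenJoinDemo {

    // Hypothetical stand-in for the shared state-store monitor.
    private static final Object stateLock = new Object();

    /**
     * Signals a worker to stop while holding stateLock, releases the lock,
     * and only then joins the worker for up to timeoutMs (greater than 0).
     * Returns true if the worker ended within the timeout.
     */
    public static boolean stopThenJoin(long timeoutMs) {
        CountDownLatch stopRequested = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                stopRequested.await();   // wait for the stop signal
            } catch (InterruptedException e) {
                return;
            }
            synchronized (stateLock) {   // succeeds once the stopper has left its block
                // a final state update would happen here
            }
        }, "worker");
        worker.setDaemon(true);
        worker.start();

        synchronized (stateLock) {
            stopRequested.countDown();   // signal the stop under the lock
        }
        try {
            worker.join(timeoutMs);      // wait only after the lock is released
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker ended: " + stopThenJoin(5000));
    }
}
```

Because the join happens outside the synchronized block, the worker can take the monitor for its last update and end normally.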
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC99312

  • Reported component name

    WMQ FTE

  • Reported component ID

    5724H7242

  • Reported release

    750

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-02-12

  • Closed date

    2014-03-10

  • Last modified date

    2014-03-10

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WMQ FTE

  • Fixed component ID

    5724H7242

Applicable component levels

  • R750 PSY

       UP

Document Information

Modified date:
23 September 2021