IBM Support

PH34816: SERVER SHUTDOWN HANGS DUE TO DEADLOCKED THREADS IN CONTROL REGION

APAR status

  • Closed as program error.

Error description

  • The control region does not begin shutting down the servant
    regions because requests for current work are still associated
    with deadlocked threads in the control region:
    
    deadlock is for:
    java/lang/Object
    com/ibm/ws390/xmem/proxy/channel/XMemProxyCRInboundConnLink
    
    threads are in:
    com/ibm/ws/tcp/channel/impl/ZAioTCPConnLink.destroyCommon
    (Exception)  source: ZAioTCPConnLink.java:1072
    
    com/ibm/ws390/xmem/proxy/channel/XMemProxyCRInboundReadCallback.
    complete(com.ibm.wsspi.channel.framework.VirtualConnection)
    source: XMemProxyCRInboundReadCallback.java:70
    
    A timing window causes this deadlock. The ReadComplete came
    into the CR at the exact time that the CR was told to drive
    "sendFinalResponse". The two locks (the XMem ConnLink lock and
    the related ZAioTCPConnLink readLock) are obtained in opposite
    order in the two paths, which makes a deadlock possible.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All users of IBM WebSphere Application      *
    *                  Server                                      *
    *                  V9.0                                        *
    ****************************************************************
    * PROBLEM DESCRIPTION: WebSphere Application Server for z/OS   *
    *                      hangs during stop processing.           *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    A Controller would not complete stop processing because two
    ACRW threads were deadlocked. The two code paths involve the
    same two locks but obtain them in opposite order, which can
    lead to a deadlock.
    
    One thread was driving the
    XMemProxyCRInboundConnLink.sendFinalResponse() method. This
    method obtains the XMemProxyCRInboundConnLink lock and may
    then drive close() while holding it. Under close(..) the HTTP
    channel may decide to "destroy" the Connection rather than
    issuing another read (as it would for a persistent
    Connection). The "destroy" path invokes
    ZAioTCPConnLink.destroyCommon(..), which attempts to obtain
    its readLock. That attempt is suspended if a readComplete ACRW
    path is running concurrently and has already obtained the
    readLock (in ZAioTCPChannel.readCompleted()).
    
    The deadlock occurs when the readComplete processing calls
    XMemProxyCRInboundReadCallback.complete(), which attempts to
    obtain the XMemProxyCRInboundConnLink lock while the
    readComplete thread is already holding the ZAioTCPConnLink
    readLock.
    
    The top of the stack for the ACRW thread processing the
    sendFinalResponse method looks like this:
    ZAioTCPConnLink.destroyCommon(Exception)
    source: ZAioTCPConnLink.java:1072
    ZAioTCPConnLink.destroy(Exception)
    source: ZAioTCPConnLink.java:1050
    OutboundConnectorLink.close(com.ibm.wsspi.channel.framework.
    VirtualConnection, Exception)
    source: OutboundConnectorLink.java:50
    HttpInboundLink.close(com.ibm.wsspi.channel.framework.
    VirtualConnection, Exception)
    source: HttpInboundLink.java:899
    InboundApplicationLink.close(com.ibm.wsspi.channel.framework.
    VirtualConnection, Exception)
    source: InboundApplicationLink.java:58
    XMemProxyCRInboundConnLink.close(com.ibm.wsspi.channel.
    framework.VirtualConnection, Exception)
    source: XMemProxyCRInboundConnLink.java:3236
    XMemProxyCRInboundConnLink.sendFinalResponse(int,
    com.ibm.ws390.xmem.proxy.XMemProxyCommMetaData,
    com.ibm.wsspi.buffermgmt.WsByteBuffer, long)
    source: XMemProxyCRInboundConnLink.java:2850
    At this point the sendFinalResponse thread has obtained the
    XMemProxyCRInboundConnLink lock in sendFinalResponse and is
    waiting for the ZAioTCPConnLink.readLock in destroyCommon.
    The stack for the readComplete ACRW thread looks like this:
    XMemProxyCRInboundReadCallback.complete(
    com.ibm.wsspi.channel.framework.VirtualConnection)
    source: XMemProxyCRInboundReadCallback.java:70
    HttpServiceContextImpl.continueRead()
    source: HttpServiceContextImpl.java:4636
    HttpISCBodyReadCallback.complete(
    com.ibm.wsspi.channel.framework.VirtualConnection,
    com.ibm.wsspi.tcp.channel.TCPReadRequestContext)
    source: HttpISCBodyReadCallback.java:87
    ZAioTCPReadRequestContextImpl.readCompleted(long,
    com.ibm.wsspi.channel.framework.VirtualConnection,
    com.ibm.wsspi.buffermgmt.WsByteBuffer, String)
    source: ZAioTCPReadRequestContextImpl.java:683
    ZAioTCPConnLink.readCompleted(long,
    com.ibm.wsspi.buffermgmt.WsByteBuffer, long, String)
    source: ZAioTCPConnLink.java:1248
    ZAioTCPChannel.readCompleted(
    com.ibm.ws.tcp.channel.impl.ZAioTCPConnLink, long,
    long, com.ibm.wsspi.buffermgmt.WsByteBuffer, byte[],
    long, String)  source: ZAioTCPChannel.java:934
    ZAioTCPChannelCPPUtilities.readCompleted(
    com.ibm.ws.tcp.channel.impl.ZAioTCPChannel,
    com.ibm.ws.tcp.channel.impl.ZAioTCPConnLink, long,
    long, com.ibm.wsspi.buffermgmt.WsByteBuffer, byte[],
    long, String)
    source: ZAioTCPChannelCPPUtilities.java:181
    At this point the readComplete thread is holding the
    ZAioTCPConnLink.readLock, obtained in
    ZAioTCPChannel.readCompleted(), and is attempting to obtain
    the XMemProxyCRInboundConnLink lock in the
    XMemProxyCRInboundReadCallback.complete() method.
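
The opposite-order acquisition described above can be illustrated with a small, self-contained Java sketch. It is not WebSphere code: the class, the method names, and the two ReentrantLock objects are hypothetical stand-ins for the XMemProxyCRInboundConnLink lock and the ZAioTCPConnLink readLock. To keep the demonstration from actually hanging, each thread uses a timed tryLock for its second lock, which is an assumption of this sketch; the real code paths wait indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class LockOrderInversionSketch {

    /**
     * Takes `first`, waits until the peer thread also holds its first lock,
     * then tries `second` with a timeout. Returns true if `second` was obtained.
     */
    static boolean holdThenTry(ReentrantLock first, ReentrantLock second,
                               CountDownLatch bothHoldFirst,
                               CountDownLatch bothAttempted) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();      // each thread now holds exactly one lock
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothAttempted.countDown();
            bothAttempted.await();      // keep `first` until the peer has given up too
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            first.unlock();
        }
    }

    /** Runs both paths concurrently; element i is true if path i got its second lock. */
    static boolean[] run() {
        // Hypothetical stand-ins for the two locks named in this APAR.
        ReentrantLock connLinkLock = new ReentrantLock();
        ReentrantLock readLock = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        // Path 1 (sendFinalResponse-like): ConnLink lock first, then readLock.
        Thread sendFinalResponse = new Thread(() ->
            gotSecond[0] = holdThenTry(connLinkLock, readLock, bothHoldFirst, bothAttempted));
        // Path 2 (readComplete-like): readLock first, then ConnLink lock.
        Thread readComplete = new Thread(() ->
            gotSecond[1] = holdThenTry(readLock, connLinkLock, bothHoldFirst, bothAttempted));

        sendFinalResponse.start();
        readComplete.start();
        try {
            sendFinalResponse.join();
            readComplete.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return gotSecond;
    }

    public static void main(String[] args) {
        boolean[] got = run();
        System.out.println("sendFinalResponse path obtained readLock: " + got[0]);
        System.out.println("readComplete path obtained ConnLink lock: " + got[1]);
    }
}
```

Neither thread obtains its second lock, because each is waiting on the lock the other holds. With untimed lock() calls in place of tryLock, both threads would wait forever, which corresponds to the hang seen during stop processing.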
    

Problem conclusion

  • The XMemProxyCRInboundConnLink sendFinalResponse() method was
    modified to release its lock before driving the close(..)
    call. No further synchronization is needed between the
    remaining sendFinalResponse actions and the potential close
    processing.
    
    The fix for this APAR is targeted for inclusion in fix pack
    9.0.5.8. For more information, see 'Recommended Updates for
    WebSphere Application Server':
    https://www.ibm.com/support/pages/node/715553
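
The locking change described in this conclusion might look roughly like the following Java sketch. It is an illustration only, not the shipped fix: the class, its fields, and the method bodies are hypothetical, and only the method names mirror the ones mentioned in this APAR. The point it shows is that once the ConnLink lock is released before close() is driven, no thread ever holds the ConnLink lock while waiting for the readLock, so the two paths can no longer block each other.

```java
import java.util.concurrent.locks.ReentrantLock;

final class DropLockBeforeCloseSketch {
    // Hypothetical stand-ins for the two locks named in this APAR.
    private final ReentrantLock connLinkLock = new ReentrantLock();
    private final ReentrantLock readLock = new ReentrantLock();

    private boolean responseSent;                         // guarded by connLinkLock
    private volatile boolean connLinkLockHeldInDestroy;   // recorded for the demo only

    /** Updates state under the ConnLink lock, then releases it before closing. */
    void sendFinalResponse() {
        connLinkLock.lock();
        try {
            responseSent = true;
        } finally {
            connLinkLock.unlock();   // dropped before close() is driven
        }
        close();
    }

    void close() {
        destroyCommon();
    }

    /** Destroy path: needs the readLock, and is now entered without the ConnLink lock. */
    void destroyCommon() {
        readLock.lock();
        try {
            connLinkLockHeldInDestroy = connLinkLock.isHeldByCurrentThread();
        } finally {
            readLock.unlock();
        }
    }

    /** Read-complete path: readLock first, then the ConnLink lock (order unchanged). */
    void readCompleted() {
        readLock.lock();
        try {
            connLinkLock.lock();
            try {
                // Callback work that reads connection state would go here.
            } finally {
                connLinkLock.unlock();
            }
        } finally {
            readLock.unlock();
        }
    }

    /** Returns true if both paths finished and destroy ran without the ConnLink lock. */
    static boolean runConcurrently() {
        DropLockBeforeCloseSketch link = new DropLockBeforeCloseSketch();
        Thread a = new Thread(link::sendFinalResponse);
        Thread b = new Thread(link::readCompleted);
        a.start();
        b.start();
        try {
            a.join(5000);
            b.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !a.isAlive() && !b.isAlive() && !link.connLinkLockHeldInDestroy;
    }

    public static void main(String[] args) {
        System.out.println("both paths completed: " + runConcurrently());
    }
}
```

The trade-off is that close processing is no longer serialized with the rest of sendFinalResponse; the conclusion above states that this extra synchronization is not needed.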
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH34816

  • Reported component name

    WEBSPHERE FOR Z

  • Reported component ID

    5655I3500

  • Reported release

    900

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-02-24

  • Closed date

    2021-03-23

  • Last modified date

    2021-03-23

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WEBSPHERE FOR Z

  • Fixed component ID

    5655I3500

Applicable component levels

Document Information

Modified date:
24 March 2021