IBM Support

PQ99704: TCP SESSION HANGS - DATA SENT BEYOND WINDOW WHILE IN FRR

A fix is available

APAR status

  • Closed as program error.

Error description

  • TCP sessions with a remote system that carry a high volume of
    outbound data traffic can occasionally hang and make no further
    progress.  If the local application uses a timeout function
    (such as FTP), or the remote application times out and sends a
    RESET, the session terminates after that period of inactivity
    with a corresponding error indication (such as errno
    140=x'8C'=EPIPE).
    
    A packet trace (SYSTCPDA CTRACE) or sniffer trace of the session
    will show the following events:
     - Numerous ACKs from the remote side indicating missing or
       out-of-order packets.
     - Several of these ACKs in a row with the same ACK sequence
       number trigger Fast Retransmit/Recovery (FRR) logic.
     - At the same time, the advertised window from the remote side
       shrinks, eventually becoming less than one MSS in size.
       Potentially, the remote side could even cut its window back
       to where the last acceptable sequence number is less than
       what was calculated from an earlier packet ("left shift" of
       the window).
     - One (or more) of the packets transmitted due to FRR exceeds
       that window.  After this, no further data packets are
       transmitted (including window probes and other
       retransmits).
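
    The trace events listed above can be checked mechanically once a
    packet trace has been decoded.  The following is a minimal sketch
    only.  The Seg record, its field names and the scan() function are
    hypothetical and are not part of any IBM tool; the sketch assumes
    the window values are already scaled to bytes, treats any
    zero-length inbound segment that repeats the previous ACK number
    as a duplicate ACK, and ignores sequence number wraparound.

```python
from dataclasses import dataclass


@dataclass
class Seg:
    """One decoded TCP segment (hypothetical record layout)."""
    direction: str  # "in" = from the remote side, "out" = from the local stack
    seq: int = 0    # first sequence number of the payload (outbound)
    length: int = 0  # payload length in bytes
    ack: int = 0    # ACK sequence number (inbound)
    win: int = 0    # advertised window in bytes (inbound)


def scan(trace, mss, dup_threshold=3):
    """Return (index, finding) pairs for the events described in the APAR."""
    findings = []
    last_ack = None
    dups = 0
    right_edge = None  # last acceptable sequence number + 1

    for i, s in enumerate(trace):
        if s.direction == "in":
            # Count consecutive ACKs that repeat the same ACK number.
            if s.length == 0 and s.ack == last_ack:
                dups += 1
                if dups == dup_threshold:
                    findings.append((i, "fast-retransmit-trigger"))
            else:
                dups = 0
            last_ack = s.ack

            # A right edge lower than an earlier one is a "left shift".
            new_edge = s.ack + s.win
            if right_edge is not None and new_edge < right_edge:
                findings.append((i, "window-left-shift"))
            if s.win < mss:
                findings.append((i, "window-below-mss"))
            right_edge = new_edge

        elif right_edge is not None and s.length > 0:
            # Outbound data that ends past the advertised right edge.
            if s.seq + s.length > right_edge:
                findings.append((i, "data-beyond-window"))

    return findings
```

    A "data-beyond-window" finding that is followed by no further
    outbound data segments matches the hang signature described
    above.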
    

Local fix

  • - Take actions to reduce the number of lost or out-of-order
      packets in the network between the z/OS system and the target
      server.
    - Increase the size of TCP buffers on the target server to avoid
      closing the advertised window while waiting for recovery of
      lost or out-of-order packets.
    - Use Policy Agent to regulate the rate of outbound packets to
      the target server.
    
    Other Symptoms:
    
      Print functions using TCP/IP are quite likely to be affected
      by this problem, since printer slow-downs (paper jam, out of
      paper, ...) make the server more likely to advertise a zero
      window.  These functions include LPR, Network Print Facility
      (NPF), PSF/MVS, VPS (*), and similar utilities.
    
    (*) (R) Levi, Ray and Shoup

    For the hung connection, TCB_WRT_BLOCKED is set and TCB_SND_WND
    is greater than zero.  The TCB_IFR contains x'91' and x'92'
    entries, and there is a write event element on the TCB event
    queue.

    Additional Symptoms: IFR91 IFR92
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of the IBM Communications Server   *
    *                 for z/OS Version 1 Release 2 IP              *
    ****************************************************************
    * PROBLEM DESCRIPTION: A TCP transmission that involves the    *
    *                      retransmission of data when the TCP     *
    *                      window is approaching zero may hang     *
    *                      and the transfer of data may not        *
    *                      complete.                               *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    If a TCP retransmission of data occurs when the TCP window is
    less than a single packet in size, CS for z/OS TCPIP may
    retransmit a full packet that exceeds the window. If the
    receiver of such a packet does not ACK all of the data, CS for
    z/OS TCPIP may not send the next packet. As a result, TCP
    transmission will hang and will not successfully complete.
    The retransmission of data that would exceed the available
    window occurs because of code in EZBTCRD that sends a
    retransmission before updating the available window size.
    +-------------------------------------------------------------+
    + Please check our Communications Server for OS/390 homepages +
    + for common networking tips and fixes.  The URL for these    +
    + homepages can be found in Informational APAR II11334.       +
    +-------------------------------------------------------------+
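
    To make "exceeds the window" concrete, the following sketch
    computes how many bytes of a segment fall past the right edge of
    the advertised window, using 32-bit sequence arithmetic.  It is
    an illustration only: the function names and parameters are
    hypothetical and do not reflect the EZBTCRD implementation.

```python
MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def seq_diff(a, b):
    """Signed distance a - b in 32-bit sequence space."""
    d = (a - b) % MOD
    return d - MOD if d >= MOD // 2 else d


def bytes_beyond_window(seg_seq, seg_len, snd_una, snd_wnd):
    """Number of payload bytes that lie past the advertised window.

    snd_una is the ACK number from the remote side and snd_wnd is the
    window it advertised, so the right edge is snd_una + snd_wnd.
    """
    right_edge = (snd_una + snd_wnd) % MOD
    seg_end = (seg_seq + seg_len) % MOD
    return max(0, seq_diff(seg_end, right_edge))
```

    For example, with an advertised window of 1000 bytes, a
    retransmitted segment of 1460 bytes that starts at the ACK number
    has 460 bytes beyond the window, while an 800-byte segment has
    none.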
    

Problem conclusion

  • EZBTCRD has been amended to update the window size before
    attempting to retransmit data.
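
    The effect of the ordering can be shown with a small model.  This
    is a sketch under stated assumptions, not the EZBTCRD code: the
    SendState class and retransmit_on_ack() are hypothetical, and the
    sketch assumes that the retransmission length is limited to the
    smaller of one MSS and the send window known at that moment.

```python
from dataclasses import dataclass


@dataclass
class SendState:
    """Minimal sender state (hypothetical, for illustration only)."""
    snd_wnd: int  # send window currently recorded by the sender
    mss: int      # maximum segment size


def retransmit_on_ack(state, ack_win, update_first):
    """Process an ACK that triggers a retransmission.

    ack_win is the window advertised in that ACK.  Returns the length
    of the retransmitted segment.
    """
    if update_first:
        # Amended order: record the new window, then size the segment.
        state.snd_wnd = ack_win
        length = min(state.mss, state.snd_wnd)
    else:
        # Original order: the segment is sized against the stale window.
        length = min(state.mss, state.snd_wnd)
        state.snd_wnd = ack_win
    return length
```

    With a recorded window of 4000 bytes, an MSS of 1460 and an ACK
    that advertises 1000 bytes, the original order yields a 1460-byte
    retransmission that exceeds the new window, while the amended
    order yields 1000 bytes.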
    
    * Cross Reference between External and Internal Names
    

Temporary fix

Comments

APAR Information

  • APAR number

    PQ99704

  • Reported component name

    TCP/IP V3 MVS

  • Reported component ID

    5655HAL00

  • Reported release

    120

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2005-01-19

  • Closed date

    2005-01-24

  • Last modified date

    2005-09-06

  • APAR is sysrouted FROM one or more of the following:

    PQ94985

  • APAR is sysrouted TO one or more of the following:

    UQ97093

Modules/Macros

  •    EZBTCRD
    

Fix information

  • Fixed component name

    TCP/IP V3 MVS

  • Fixed component ID

    5655HAL00

Applicable component levels

  • R120 PSY UQ97093

       UP05/03/21 P F503

Document Information

Modified date:
08 January 2021