Topic
  • 3 replies
  • Latest Post - ‏2013-07-12T06:41:56Z by nitz-ibm
SystemAdmin
253 Posts

Pinned topic FTP transfers issues - Dropped packets

‏2013-02-19T20:21:42Z |
FTP transfers to the RD&T are slow due to packets being dropped by
TCPIP on the RD&T system because of a TCP checksum error.
This is on z/OS 1.13.

I have traces from both sides: inbound discards on the RD&T, and a full
trace of the same period on the source side. I matched them up packet
by packet. The packet size changes (it gets bigger), and the
no-fragmentation bit is on. The problem only occurs inbound to the RD&T
through the emulated OSA from the NIC card. Outbound is fine, and
inbound through the tunnel OSA from Linux is fine.

I believe I understand what is happening. It looks like the emulated
OSA has an optimization feature which kicks in when consecutive packets
are for the same TCP connection and the packets are in the correct
sequence number order with no holes. The reason is that TCPIP will
support a packet size up to 8992 for the OSA interface even though we
use a smaller size of 1492. The OSA emulator will merge those packets
into a larger packet before it gives them to TCPIP. This would reduce
emulated z/OS TCPIP processing.

Here is what I see in the trace:
Sent packet 2581:
Source : 10.225.253.7
Destination : 10.233.119.235
Source Port : 36534
Destination Port : 20
ID Number : 3F09
Sequence Number : 1515325806
IP Header Length : 20
TCP Header Length: 32
Data Length : 1440
Flags : Ack

Sent packet 2582:
Source : 10.225.253.7
Destination : 10.233.119.235
Source Port : 36534
Destination Port : 20
ID Number : 3F0A
Sequence Number : 1515327246
IP Header Length : 20
TCP Header Length: 32
Data Length : 1440
Flags : Ack Psh

Received packet 1676:
Source : 10.225.253.7
Destination : 10.233.119.235
Source Port : 36534
Destination Port : 20
ID Number : 3F09
Sequence Number : 1515325806
IP Header Length : 20
TCP Header Length: 32
Data Length : 2880
Flags : Ack Psh

It looks like packets 2581 & 2582 were merged together into packet 1676,
with the TCP header being built from a combination of the two source
TCP headers. Both the IP and TCP checksums have been recalculated, since
they are different from the source. Since the IP checksum did not have
a problem, it must have been recalculated at the right time. Since
TCPIP is complaining about the TCP checksum, either it was recalculated
incorrectly or some TCP header settings were changed after the
recalculation.
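
The merge reading can be checked against the three trace entries with
a little arithmetic. The sketch below is only an illustration: the
Segment class and the merge rule (keep the first packet's ID and
sequence number, add the data lengths, take the union of the flags) are
my assumptions chosen to match what the trace shows, not documented
behavior of the OSA emulator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical record holding the trace fields that matter here."""
    ip_id: int
    seq: int
    data_len: int
    flags: frozenset


def can_merge(first: Segment, second: Segment) -> bool:
    """True when `second` starts exactly where `first` ends (no hole, no overlap)."""
    return (first.seq + first.data_len) % 2**32 == second.seq


def merge(first: Segment, second: Segment) -> Segment:
    """Assumed merge rule: first packet's ID and sequence, summed data, union of flags."""
    if not can_merge(first, second):
        raise ValueError("segments are not contiguous")
    return Segment(
        ip_id=first.ip_id,
        seq=first.seq,
        data_len=first.data_len + second.data_len,
        flags=first.flags | second.flags,
    )


# Values copied from the trace entries above.
sent_2581 = Segment(0x3F09, 1515325806, 1440, frozenset({"Ack"}))
sent_2582 = Segment(0x3F0A, 1515327246, 1440, frozenset({"Ack", "Psh"}))
received_1676 = Segment(0x3F09, 1515325806, 2880, frozenset({"Ack", "Psh"}))

merged = merge(sent_2581, sent_2582)
print(merged == received_1676)
```

Under that rule the second packet's sequence number is exactly the
first's plus 1440, and the merged result matches received packet 1676
field for field.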
Updated on 2013-02-20T21:39:03Z at 2013-02-20T21:39:03Z by SystemAdmin
  • RDzJohn
    273 Posts

    Re: FTP transfers issues - Dropped packets

    ‏2013-02-19T20:49:31Z  
    I have forwarded this problem to our emulator team for them to analyze. I expect someone will be in touch to gather required doc. In the meantime, there are several documented workarounds for performance problems with OSA. These may help you until the root cause of the problem is identified. Refer to pp.39-40 of publication IBM System z Personal Development Tool: Volume 2 Installation and Basic Use, SG24-7722-04 for details.

    RDzJohn
  • SystemAdmin
    253 Posts

    Re: FTP transfers issues - Dropped packets

    ‏2013-02-20T21:39:03Z  
    I reviewed the section of the manual quoted below, and here is the result from our Linux support on the settings.
    Those settings are already in place on the ETL Linux server:
    Pause parameters for eth0:
    Autonegotiate: on
    RX: off
    TX: off

    net.core.rmem_max = 1048576
    The Ctraces have been sent in to this PMR# 77037 122 000.
    IBM System z
    Personal Development Tool
    Volume 2 Installation and Basic Use

    3.3.8 Performance problems
    At the time of writing we were aware of two particular problems that impact OSA performance.
    • If frames larger than expected are used, there may be an excessive number of frames
    dropped (causing a retransmission). This might not be noticed unless careful
    measurements or comparisons are made. We believe this problem is resolved by including
    the sysctl parameter:
    net.core.rmem_max=1048576
    that is now recommended in the first chapter of this document.
    • If advanced Linux kernels are installed there might be a drastic slow-down of OSA
    performance that would be immediately obvious. This is due to Linux attempting to offload
    checksum functions into the adapter, which is not acceptable to the current awsOSA
    implementation. One solution is to use a Linux command:
    ethtool -K eth0 rx off
    Unfortunately, this command must be entered after each Linux boot. The problem was first
    noticed with a build of Linux kernel 2.6.36.2; the ethtool must be at least at level 2.6.33.
    We have not seen this problem with any of the “standard” Linux distributions that we have
    referenced in these documents.

    IBM has not published any performance specifications for OSA. Informal observation
    indicates that ftp throughput may be in the 5-8 megabytes/second range, assuming an
    unconstrained network in a dedicated environment. If your performance is much worse than
    this, the two problems mentioned here might be reviewed.
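
    For anyone wanting to re-check the two settings from section 3.3.8, here is a minimal sketch that parses saved command output. It assumes you have captured the text of "ethtool -k eth0" and "sysctl net.core.rmem_max" yourself; the sample strings in the code are made up for illustration and are not output from any system in this thread, and the helper names are hypothetical. One thing I am not sure was checked: the "Pause parameters" block above looks like the output of "ethtool -a" (flow control), which as far as I know is a different setting from the rx checksum offload that "ethtool -K eth0 rx off" turns off. The same "ethtool -k" listing also shows receive-side features such as generic-receive-offload, which may be worth a look given the merged packets described in the first post, although nothing in this thread establishes that they are involved.

```python
# Illustrative sample text only; replace with output captured on your own system.
SAMPLE_ETHTOOL = """\
Features for eth0:
rx-checksumming: on
tx-checksumming: on
generic-receive-offload: on
"""
SAMPLE_SYSCTL = "net.core.rmem_max = 131071\n"

# Value recommended in the quoted manual section.
EXPECTED_RMEM_MAX = "1048576"


def parse_key_values(text: str, sep: str) -> dict:
    """Turn 'key<sep>value' lines into a dict, skipping lines without a value."""
    result = {}
    for line in text.splitlines():
        key, found, value = line.partition(sep)
        if found and value.strip():
            result[key.strip()] = value.strip()
    return result


def check_settings(ethtool_text: str, sysctl_text: str) -> list:
    """Return a list of findings; an empty list means both settings match the manual."""
    problems = []

    features = parse_key_values(ethtool_text, ":")
    # Values can carry a suffix such as "[fixed]", so keep only the first word.
    rx_state = features.get("rx-checksumming", "unknown").split()[0]
    if rx_state != "off":
        problems.append(f"rx-checksumming is {rx_state}; the manual suggests off")

    sysctl = parse_key_values(sysctl_text, "=")
    rmem_max = sysctl.get("net.core.rmem_max", "unknown")
    if rmem_max != EXPECTED_RMEM_MAX:
        problems.append(
            f"net.core.rmem_max is {rmem_max}; the manual suggests {EXPECTED_RMEM_MAX}"
        )
    return problems


for finding in check_settings(SAMPLE_ETHTOOL, SAMPLE_SYSCTL):
    print(finding)
```

    With the sample text both checks report a finding. Since the manual notes that the ethtool command has to be re-entered after each Linux boot, a check like this would need to be rerun after every restart as well.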
  • nitz-ibm
    1 Post

    Re: FTP transfers issues - Dropped packets

    ‏2013-07-12T06:41:56Z  
    We have also encountered severely reduced throughput to z/OS. Using ftp we achieved transfer rates of about 0.8kB/s (yes, less than one kilobyte per second!). When we put our data on the Linux system first and then used the 10.1.1.2 IP address for z/OS, the throughput varied in the mentioned 6-9MB/s range.

    We eventually discovered that we only had the performance problem when we were using a Gigabit LAN (speed=1000MBit/s), with a LAN cable to the switch, which in turn is cabled to the Windows or Linux PC initiating the ftp. I have since set the speed on my W7 PC down to 100MBit/s full duplex, and that has raised the ftp transfer rate to the expected 6-8MB/s. It has also cured the timeouts we saw when connecting via IP to z/OS through a VPN tunnel into the office network.

    We also saw incorrect checksums, but none of us is a Linux or network/IP specialist, so we didn't know how to determine which side was causing them. We had checked the 'WIN size' setting, which is the expected 1492 both on Windows and in z/OS. I am now thinking of limiting the eth0 adapter speed to 100Mbit/s.
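
    On working out which side produces the bad checksums: one approach is to recompute the TCP checksum from the raw bytes of a captured segment at each capture point and see where it stops verifying. The sketch below implements the standard TCP checksum (one's-complement sum over an IPv4 pseudo-header plus the TCP header and data). The function names and the synthetic test segment are made up for illustration; only the addresses, ports and sequence number are borrowed from the trace in the first post. One caveat, as far as I know: if the sending machine offloads transmit checksums to its adapter, a capture taken on that machine can show incorrect checksums that are filled in correctly before the packet reaches the wire, so a bad checksum in a sender-side capture is not proof of a fault on its own.

```python
import socket
import struct

SRC = "10.225.253.7"
DST = "10.233.119.235"


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum, padding odd-length input with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: str, dst_ip: str, tcp_segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus TCP header and data."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 6, len(tcp_segment))  # zero, protocol 6 (TCP), TCP length
    )
    return ~ones_complement_sum(pseudo_header + tcp_segment) & 0xFFFF


def checksum_is_valid(src_ip: str, dst_ip: str, tcp_segment: bytes) -> bool:
    """With the checksum field left in place, a good segment yields 0."""
    return tcp_checksum(src_ip, dst_ip, tcp_segment) == 0


def build_segment(payload: bytes) -> bytes:
    """Synthetic 20-byte TCP header plus payload, with a correct checksum filled in."""
    def header(checksum: int) -> bytes:
        return struct.pack(
            "!HHIIBBHHH",
            36534, 20,       # source and destination ports
            1515325806, 0,   # sequence and acknowledgement numbers
            5 << 4, 0x18,    # data offset of 5 words, flags Ack+Psh
            65535, checksum, 0,
        )
    checksum = tcp_checksum(SRC, DST, header(0) + payload)
    return header(checksum) + payload


good = build_segment(b"FTP data")
# Appending data without recomputing the checksum, as a faulty merge might do.
stale = good + b"more data"

print(checksum_is_valid(SRC, DST, good))
print(checksum_is_valid(SRC, DST, stale))
```

    Because the pseudo-header includes the TCP length, a segment that grows after its checksum was computed fails verification even if the header fields are otherwise untouched, which fits the kind of error described in the first post.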

    Also, I have a problem with the above mentioned 'advanced Linux kernel' explanation. We haven't changed anything from the way we got the RDT PC, so I am not sure this problem only occurs on 'advanced kernels'.

    Barbara