Topic
  • 12 replies
  • Latest Post - 2014-07-24T20:12:31Z by dschroed
dschroed
7 Posts

Pinned topic Slow inbound file transfer

2014-07-10T21:12:19Z

I installed Fedora 20 on a PowerVM 750. I wanted to evaluate whether Power is a good home for any of our x86-based RHEL systems. Everything seems to work well except inbound file transfers: the speed starts around 3.5 Mb/sec and rapidly falls to around 100 Kb/sec. Outbound transfers are fast. The Linux LPAR has two virtual Ethernet adapters defined in an EtherChannel (bond0), with one NIC active and the other passive.

DEVICE=bond0
IPADDR=10.XX.XXX.XX
NETMASK=255.255.255.0
ONBOOT=yes
GATEWAY=10.XX.XXX.1
BOOTPROTO=static
USERCTL=no
BONDING_OPTS="mode=1 miimon=100 speed 1000 duplex full"
 

DEVICE=eth0
HWADDR=5E:56:FC:73:86:02
MASTER=bond0
SLAVE=yes
ONBOOT=yes
USERCTL=no
BOOTPROTO=none


DEVICE=eth1
HWADDR=5E:56:FC:73:86:03
MASTER=bond0
SLAVE=yes
ONBOOT=yes
USERCTL=no
BOOTPROTO=none
 

bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST>  mtu 1500
        inet 10.XX.XXX.XX  netmask 255.255.255.0  broadcast 10.XX.XXX.255
        inet6 fe80::5c56:fcff:fe73:8602  prefixlen 64  scopeid 0x20<link>
        ether 5e:56:fc:73:86:02  txqueuelen 0  (Ethernet)
        RX packets 18248  bytes 14252908 (13.5 MiB)
        RX errors 0  dropped 2158  overruns 0  frame 0
        TX packets 12594  bytes 1583220 (1.5 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
        inet6 fe80::5c56:fcff:fe73:8602  prefixlen 64  scopeid 0x20<link>
        ether 5e:56:fc:73:86:02  txqueuelen 1000  (Ethernet)
        RX packets 15921  bytes 14065745 (13.4 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12587  bytes 1583610 (1.5 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 19 

eth1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST>  mtu 1500
        inet6 fe80::5c56:fcff:fe73:8602  prefixlen 64  scopeid 0x20<link>
        ether 5e:56:fc:73:86:02  txqueuelen 1000  (Ethernet)
        RX packets 2327  bytes 187163 (182.7 KiB)
        RX errors 0  dropped 2158  overruns 0  frame 0
        TX packets 16  bytes 1296 (1.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 20 

I did find a similar bug at https://bugzilla.redhat.com/show_bug.cgi?id=855640 but I am not using KVM, QEMU, or VIRTIO.

Any ideas about what is causing the in-bound slow down?

Updated on 2014-07-17T17:37:20Z by dschroed
  • dschroed
    7 Posts

    Re: Slow inbound file transfer

    2014-07-17T17:37:30Z

    I decided to install RHEL 7.0 for IBM Power to see if I could reproduce the problem (I could). The slowdowns and stalls of inbound file transfers only appear to originate from an AIX LPAR. I used netperf between AIX and Linux: outbound from Linux, transfer rates were around 800 Mb/sec; netperf from AIX to Linux was around 1.9 Mb/sec.
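    (For anyone reproducing the measurements, a sketch of the kind of netperf runs involved; the hostnames are placeholders and netserver must already be running on the target side.)

    # netperf -H <aix_host> -t TCP_STREAM -l 30      (run on the Linux LPAR: measures Linux -> AIX)
    # netperf -H <linux_host> -t TCP_STREAM -l 30    (run on the AIX LPAR: measures AIX -> Linux)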

    The same slowdowns also occur from the VIO Server to its Linux client, so there's no network switch in the way.

    I find that the inbound transfer from a Linux server to the Linux LPAR is around 4 Mb/sec. Not good, but not as bad as the 100 Kb/sec I was getting for an scp from an AIX LPAR to the Linux LPAR.

    I tried adjusting the MTU and certain TCP buffer sizes, but there was no relief.
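    (The sort of knobs I tried looked roughly like this; the values are only examples, and a larger MTU requires jumbo-frame support along the whole virtual and physical path.)

    # ip link set dev bond0 mtu 9000
    # sysctl -w net.core.rmem_max=16777216
    # sysctl -w net.core.wmem_max=16777216
    # sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    # sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"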

    # ethtool -i eth0
    driver: ibmveth
    version: 1.04
    (This is the same driver on FEDORA 20)

    The slowdown problem appears exactly the same with RHEL 7.0 as it was with Fedora 20. It is hard to believe that someone else has not encountered the same problem.

     

  • PowerLinuxTeam
    9 Posts

    Re: Slow inbound file transfer

    2014-07-17T21:16:07Z
    • dschroed
    • 2014-07-17T17:37:30Z


    David, thank you for posting a follow-up.  We missed the first post.  I'll start checking and trolling around for an answer.

    Bill

  • dschroed
    7 Posts

    Re: Slow inbound file transfer

    2014-07-17T22:34:23Z


    You are welcome. More information: so far the slowdown only occurs from AIX to Linux. From my Windows host to Linux, transfer speeds are around 10 Mb/sec, which is about what I expect on my Windows network. Going from Linux to AIX, the transfer rate is 35 Mb/sec, which is the same as AIX-to-AIX transfers.

    It does not matter whether I initiate the scp from the Linux host to receive from AIX or from the AIX host to transmit to Linux. The speed consistently drops from around 2 Mb/sec to about 200 Kb/sec within 30 seconds. It is taking almost an hour to transfer a 500 MB file.

  • Bill_Buros
    167 Posts

    Re: Slow inbound file transfer

    2014-07-21T16:48:28Z
    • dschroed
    • 2014-07-17T22:34:23Z


    Is AIX using the LargeSend and LargeReceive extensions?    That's the common gotcha we see when dealing with AIX LPARs.  

    AIX implemented a non-standard protocol there which isn't available in the open-source space.
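    (A quick way to check, with device names as examples only: on the AIX LPAR the LARGESEND flag should show up in the interface flags when it is enabled, and on the VIOS the SEA attributes can be listed directly.)

    # ifconfig en1                                            (on the AIX LPAR)
    # lsattr -El ent11 | grep -E 'largesend|large_receive'    (on the VIOS, for the SEA device)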

  • dschroed
    7 Posts

    Re: Slow inbound file transfer

    2014-07-21T19:09:19Z


    The AIX LPAR has the network adapter defined this way (no largesend on the AIX virtual adapter):

    en1: flags=1e084863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
            inet 10.xx.xx.11 netmask 0xffffff00 broadcast 10.xx.xx.255
             tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
     

    The VIO Servers have largesend and large_receive set for the internal VLANs per recommendations from IBM. So, Linux clients using virtual Ethernet adapters are incompatible with VIO Servers defined with largesend and large_receive turned on?

    I also uncovered another interesting problem. The Linux client is on a machine that has two internal VLANs, call them VLAN 1 and VLAN 2, and two VIO Servers, each with one of the VLANs in a Shared Ethernet Adapter. Each client has an EtherChannel with both VLANs in an active/backup role. We set up half the clients to use VLAN 1 and the other half to use VLAN 2.

    If the Linux client is on VLAN 1 (VIO Server 1), it cannot ping or otherwise reach any LPAR that is on VLAN 2 (VIO Server 2). I tried pinging by IP address. To reach across VLANs, the route has to go out to the switch and back in through the other VIO Server. I have no problem with AIX LPARs reaching across to a separate VLAN, but I can't ping AIX to Linux or Linux to AIX if the clients are on separate VLANs.

    I want to add that the transfer speed between AIX and Linux is fast as long as they are both on the same internal VLAN. In that situation, the communication does not go through the Shared Ethernet Adapter. I tried it with largesend turned on and then turned off, and the speed was the same.

    Updated on 2014-07-21T19:31:27Z by dschroed
  • Bill_Buros
    167 Posts

    Re: Slow inbound file transfer

    2014-07-22T16:47:32Z
    • dschroed
    • 2014-07-21T19:09:19Z


    Working to get some experts involved..  

  • AaronBolding
    1 Post

    Re: Slow inbound file transfer

    2014-07-22T17:47:06Z
    • dschroed
    • 2014-07-21T19:09:19Z


    I apologize in advance for any formatting mangling that goes on in this post.  I'm not that familiar with this forum software.

    The VIO Servers have largesend and large_receive set for the internal VLANs per recommendations from IBM. So, Linux clients using virtual Ethernet adapters are incompatible with VIO Servers defined with largesend and large_receive turned on?

    In general that is supposed to be true.  More precisely, largesend and large_receive are documented to be only compatible with AIX.  However, in some limited testing, it appears largesend does not cause a problem for Linux LPARs.  large_receive does cause precisely the problem you're describing and disabling it should get things fixed.  Disabling it will also result in somewhat higher CPU consumption in the VIOS and may limit the total throughput available to AIX LPARs.

    If the Linux client is on VLAN 1 (VIO Server 1), it cannot ping or otherwise reach any LPAR that is on VLAN 2 (VIO Server 2). I tried pinging by IP address. To reach across VLANs, the route has to go out to the switch and back in through the other VIO Server. I have no problem with AIX LPARs reaching across to a separate VLAN, but I can't ping AIX to Linux or Linux to AIX if the clients are on separate VLANs.

    You are correct that there is no routing between VLANs within the hypervisor switch or within the SEA.  It sounds like you have a bad routing table in Linux.  Maybe you can post netstat -rn from each OS and we can see the difference.

    I want to add that the transfer speed between AIX and Linux is fast as long as they are both on the same internal VLAN. In that situation, the communication does not go through the Shared Ethernet Adapter. I tried it with largesend turned on and then turned off, and the speed was the same.

    That's the expected result based on the limited testing that has been performed.  Largesend could cause AIX to generate packets that would be a problem for Linux, but in practice it appears not to.
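    (If it comes to that, the change on the SEA is usually along these lines, run as padmin on each VIOS; ent11 here is only a placeholder for whatever your SEA device is, and the SEA is briefly disrupted while it is changed.)

    $ chdev -dev ent11 -attr large_receive=no
    $ chdev -dev ent11 -attr largesend=0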

  • dschroed
    7 Posts

    Re: Slow inbound file transfer

    2014-07-22T18:52:12Z


    Thank you very much for the explanation concerning large_receive and the traffic slowdown. I will see what I can do to disable it on the VIO Servers. I am not much concerned about increasing CPU usage in the VIOS, but I am concerned about limiting throughput to the AIX LPARs. We also have a network file system whose I/O times this could increase.

     

    # netstat -nr
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
    0.0.0.0         10.64.122.1     0.0.0.0         UG        0 0          0 bond0
    0.0.0.0         10.64.122.1     0.0.0.0         UG        0 0          0 eth0
    10.71.64.69     10.64.122.1     255.255.255.255 UGH       0 0          0 eth0
    10.64.122.0     0.0.0.0         255.255.255.0   U         0 0          0 bond0
    10.64.122.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
    169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 bond0
    169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 bond1
    192.168.66.0    0.0.0.0         255.255.255.0   U         0 0          0 bond1

    # cat ifcfg-bond0
    DEVICE=bond0
    IPADDR=10.64.122.34
    NETMASK=255.255.255.0
    ONBOOT=yes
    GATEWAY=10.64.122.1
    BOOTPROTO=static
    USERCTL=no
    BONDING_OPTS="mode=1 miimon=100 speed 1000 duplex full"

    # cat ifcfg-eth0
    DEVICE=eth0
    HWADDR=5E:56:FC:73:86:02
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    USERCTL=no
    BOOTPROTO=none

    # cat ifcfg-eth1
    DEVICE=eth1
    HWADDR=5E:56:FC:73:86:03
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    USERCTL=no
    BOOTPROTO=none

    Routing Table from the AIX LPAR (en4 is the public network):

    # netstat -nr
    Routing tables
    Destination        Gateway           Flags   Refs     Use  If   Exp  Groups

    Route Tree for Protocol Family 2 (Internet):
    default            10.64.122.1       UG        8 135927325 en4      -      -
    10.64.122.0        10.64.122.32      UHSb      0         0 en4      -      -
    10.64.122/24       10.64.122.32      U         2 235325526 en4      -      -
    10.64.122.32       127.0.0.1         UGHS      1        14 lo0      -      -
    10.64.122.255      10.64.122.32      UHSb      0         4 en4      -      -
    127/8              127.0.0.1         U        12   6024080 lo0      -      -
    192.168.66.0       192.168.66.131    UHSb      0         0 en8      -      -   =>
    192.168.66/24      192.168.66.131    U         2 147100716 en8      -      -
    192.168.66.131     127.0.0.1         UGHS      0         2 lo0      -      -
    192.168.66.255     192.168.66.131    UHSb      0         4 en8      -      -

    Route Tree for Protocol Family 24 (Internet v6):
    ::1%1              ::1%1             UH        0   2250829 lo0      -      -

  • KavithaBaratakke
    2 Posts

    Re: Slow inbound file transfer

    2014-07-22T20:23:25Z
    • dschroed
    • 2014-07-22T18:52:12Z


    David,

    Thanks for your detailed posts and explanation. To summarize: AIX and Linux in the same VLAN do not exhibit any problem, but putting them in separate VLANs does.

    This of course means the traffic from one VLAN to the other is getting routed from one VIOS through the external switch to the other VIOS, and large_receive could definitely be the cause of the issue, as Aaron rightly pointed out.

    I would definitely disable largesend and large_receive on the VIOS.

    I suspect that with the current setting you'd see a large number of packet drops on your Linux LPAR due to incorrect checksums; netstat -s should show these stats. TCP checksums won't match up with large_receive: since the Linux LPAR doesn't support LR, the packets coalesced at the VIOS carry a checksum field that doesn't match what the Linux stack is expecting. AIX, on the other hand, is designed to handle the LR changes in the VIOS.
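    (On the Linux side, something along these lines should show it; the exact counter names vary by distribution and kernel.)

    # netstat -s | grep -iE 'bad segments|csum|checksum'
    # ip -s link show bond0        (watch the RX "dropped" counter)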

    Largesend is less of a problem: if the client LPAR does not support largesend, we simply don't coalesce and send large packets through the VIOS, so the largesend algorithm never really gets used.

    I would recommend turning off large_receive on both VIOSes and checking if that solves anything for you. 

    Updated on 2014-07-22T20:26:03Z by KavithaBaratakke
  • dschroed
    7 Posts

    Re: Slow inbound file transfer

    2014-07-22T21:53:24Z


    I removed both largesend and large_receive from the VIOS shared ethernet adapter, and there was no improvement:

    #  lsattr -El ent11
    accounting    disabled Enable per-client accounting of network statistics                 True
    ctl_chan               Control Channel adapter for SEA failover                           True
    gvrp          no       Enable GARP VLAN Registration Protocol (GVRP)                      True
    ha_mode       disabled High Availability Mode                                             True
    hash_algo     0        Hash algorithm used to select a SEA thread                         True
    jumbo_frames  no       Enable Gigabit Ethernet Jumbo Frames                               True
    large_receive no       Enable receive TCP segment aggregation                             True
    largesend     0        Enable Hardware Transmit TCP Resegmentation                        True
    lldpsvc       no       Enable IEEE 802.1qbg services                                      True
    netaddr       0        Address to ping                                                    True
    nthreads      7        Number of SEA threads in Thread mode                               True
    pvid          3        PVID to use for the SEA device                                     True
    pvid_adapter  ent9     Default virtual adapter to use for non-VLAN-tagged packets         True
    qos_mode      disabled N/A                                                                True
    queue_size    8192     Queue size for a SEA thread                                        True
    real_adapter  ent10    Physical adapter associated with the SEA                           True
    thread        1        Thread mode enabled (1) or disabled (0)                            True
    virt_adapters ent9     List of virtual adapters associated with the SEA (comma separated) True

    I also did a 'down detach' before I executed the chdev command.
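    (Roughly, the change on the VIOS was of this form; ent11 is the SEA shown above, en11 is only an example for an IP interface on the SEA, and the exact flags may differ per setup.)

    # ifconfig en11 down detach                              (only if an IP interface is configured on the SEA)
    # chdev -l ent11 -a largesend=0 -a large_receive=no
    # lsattr -El ent11 | grep -E 'largesend|large_receive'   (verify)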

  • KavithaBaratakke
    2 Posts

    Re: Slow inbound file transfer

    2014-07-22T22:21:38Z
    • dschroed
    • 2014-07-22T21:53:24Z


    David,

    Please check that the underlying adapters (under the SEA) on both VIOSes do not have large receive turned on either. You might have to delete and recreate the SEA to accomplish this. Also, can you cut and paste your netstat -s statistics from your AIX and Linux servers?
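    (For example, using the real_adapter from your lsattr output; repeat on each VIOS and for each physical adapter under an SEA.)

    # lsattr -El ent10 | grep -i large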

  • dschroed
    7 Posts

    Re: Slow inbound file transfer

    2014-07-24T20:12:31Z


    On the VIO Server, I removed largesend and large_receive from all SEAs, and the same problem persisted.

    The NetApp filer is also an NFS-mounted file share, reached through a separate set of PCI adapters that have largesend and large_receive enabled on their SEA, and I have no problems transferring data (read/write) to and from it. So something else must be the problem.

    There is one major difference between the NetApp network and the public network: the problem network is defined on Integrated Virtual Ethernet, a.k.a. Host Ethernet Adapters (HEA). I tried various combinations of flipping largesend, large_receive, and flow_ctrl off and on, and this is the bottom line: large_receive on the physical HEA is causing the slowdown. The large_receive attribute is not found on the PCI physical Ethernet adapter. For the HEA, the default is large_receive=yes. Since I turned large_receive off on the HEA, the network transfer speeds are holding steady at around 35 Mb/sec.
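    (For reference, the check and the change on the HEA physical port were along these lines; the device name is only an example, and the change may need the port taken down first, or chdev -P plus a reboot.)

    # lsattr -El ent0 | grep -E 'large_receive|flow_ctrl'
    # chdev -l ent0 -a large_receive=no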

    On the Shared Ethernet Adapter, largesend and large_receive, whether off or on, did not impact transfer times. On the HEA, largesend, whether off or on, did not impact transfer times. It was the HEA large_receive causing the problem. The issue is now resolved.