
How to get maximum throughput from a 100 Gb Ethernet adapter in a non-virtualized environment?

White Papers


Abstract

IBM POWER8 and POWER9 servers now support 100 Gb Ethernet adapters. With the higher network speed, achieving peak performance requires tuning of the adapters. This document describes 100 Gb Ethernet adapter tuning to maximize throughput in a non-virtualized environment.

Content

Environment
hostA and hostB, each with a 100 Gb Ethernet adapter (Feature Code: EC3M), are connected via 100 Gb switch ports. Both hosts are in the same subnet and are not running any network-intensive applications. The network is clean, so there are no packet drops, retransmissions, or duplicate ACKs.
♦ Configuration
• hostA
model: 8286-42A (POWER8)
system firmware: FW860.42    
type: Shared                                 
mode: Capped  
smt: 4
lcpu: 16
memory: 8192MB                             
psize: 12                                        
ent_capacity: 4.00
vpm_fold_policy: 4
aix level: 6.1 TL09 SP12
Ethernet Adapter: 2-port 100GbE RoCE QSFP28 PCIe 3.0 x16 Adapter (b31513101410f704)
Feature Code: EC3M
Device Driver name: mlxcentdd
ROM Level: 001200162500
Hardware Location Code: U78C9.001.WZS0472-P1-C5-T1
• hostB

model: 8286-42A (POWER8)
system firmware: FW840.23
type: Shared
mode: Uncapped
smt: 4
lcpu: 32
memory: 32768MB
psize: 22
ent_capacity: 8.00
vpm_fold_policy: 4
aix level: 7.2 TL03 SP03
Ethernet Adapter: 2-port 100GbE RoCE QSFP28 PCIe 3.0 x16 Adapter (b31513101410f704)
Feature Code: EC3M
Device Driver name: mlxcentdd
ROM Level: 001200162500
Hardware Location Code: U78C9.001.WZS05FJ-P1-C5-T1
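The driver attribute values referenced in the test cases below can be displayed with the lsattr command. A minimal example, assuming the 100 GbE adapter appears as device ent0 (use lsdev -Cc adapter to find the actual device name on your system):
     lsattr -El ent0
     -E display effective (current) attribute values
     -l logical device name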
♦ Test
iperf is run between hostA as the client and hostB as the server. iperf does not read from or write to disk, so it measures true network bandwidth.
hostA: iperf -c <hostB> -P 24 -t 60 -l 64k -w <TCP window size>
hostB: iperf -s -w <TCP window size>
     -c run in client mode
     -s run in server mode
     -P number of parallel connections to run
     -t time in seconds to transmit for
     -l length of buffer to read or write
     -w TCP window size
Note: It is important to run multiple parallel connections to measure the bandwidth.
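For example, a concrete invocation for the 256 KByte TCP window size case would look like the following (start the server on hostB first, then run the client on hostA; <hostB> is hostB's hostname or IP address):
hostB: iperf -s -w 256k
hostA: iperf -c <hostB> -P 24 -t 60 -l 64k -w 256k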
The tests are run for the following 3 cases using TCP window sizes of 256 KBytes, 512 KBytes, and 1 MBytes on both ends. In all 3 cases, the tests are run 5 times, and the bandwidth shown in the tables below is the average of the 5 runs.
(1) default value of driver attributes
(2) change the value of driver attributes queues_rx, queues_tx, rx_max_pkts and tx_send_cnt
(3) same as 2 and also change jumbo_frames
♦ Results
The following tables show the bandwidth achieved by the iperf tests.
• Case 1:  default value of driver attributes
TCP window size    256 KBytes          512 KBytes          1 MBytes
Bandwidth          16.78 Gbits/sec     14.72 Gbits/sec     13.80 Gbits/sec
• Case 2: change the value of driver attributes queues_rx, queues_tx, rx_max_pkts and tx_send_cnt
queues_rx=20 (default 8)
Number of receive queues used by the network adapter for receiving network traffic.
queues_tx=12 (default 2)
Number of transmit queues used by the network adapter for transmitting network traffic.
rx_max_pkts=2048 (default 1024)
Receive queue maximum packet count
tx_send_cnt=16 (default 8)
Number of transmit packets chained for adapter processing
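On AIX, these attributes can be changed with the chdev command on the adapter device. A minimal sketch, assuming the adapter is device ent0 (if the device is in use, add the -P flag and reboot for the change to take effect):
     chdev -l ent0 -a queues_rx=20 -a queues_tx=12 -a rx_max_pkts=2048 -a tx_send_cnt=16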
TCP window size    256 KBytes          512 KBytes          1 MBytes
Bandwidth          35.52 Gbits/sec     29.04 Gbits/sec     21.81 Gbits/sec
• Case 3: same as case 2, plus jumbo_frames enabled on both hosts and on the switch ports
jumbo_frames=yes (default no)
Sets the MTU size to 9000 to allow packet sizes of up to 9000 bytes.
Note: Jumbo frames must be enabled on all network devices in the path in order to use jumbo frames.
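As a sketch, jumbo frames can be enabled on the adapter in the same way, again assuming device ent0 and that the corresponding switch ports are also configured for jumbo frames:
     chdev -l ent0 -a jumbo_frames=yes
Depending on the configuration, the MTU on the corresponding interface (for example en0) may also need to be set to 9000, for example with chdev -l en0 -a mtu=9000.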
TCP window size    256 KBytes          512 KBytes          1 MBytes
Bandwidth          92.11 Gbits/sec     94.92 Gbits/sec     92.46 Gbits/sec
♦ Summary
(1) To get maximum bandwidth, tune the following driver attributes:
queues_rx
queues_tx
rx_max_pkts
tx_send_cnt
(2) Without jumbo_frames, the bandwidth increases only approximately 2 times over the default. With jumbo_frames, the bandwidth increases approximately 6 times. Enabling jumbo_frames is required to achieve bandwidth close to line speed.
(3) Without jumbo_frames, a 256 KByte TCP window size gives the maximum bandwidth. With jumbo_frames, a 512 KByte TCP window size gives the maximum bandwidth.
(4) Increasing the TCP window size beyond 512 KBytes does not increase the bandwidth.
Author: Darshan Patel
Operating System: AIX and VIOS
Hardware: Power
Feedback:
aix_feedback@wwpdl.vnet.ibm.com

[{"Business Unit":{"code":"BU054","label":"Systems w\/TPS"},"Product":{"code":"SWG10","label":"AIX"},"ARM Category":[{"code":"a8m0z000000cw0qAAA","label":"Performance->Network"}],"ARM Case Number":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Version(s)","Line of Business":{"code":"LOB08","label":"Cognitive Systems"}}]

Document Information

Modified date:
15 May 2020

UID

ibm16208263