Network study – 1 Gb Ethernet versus 10 Gb Ethernet

This is a study of network traffic between the WebSphere® Application Server and the workload-generating clients. Network traffic is measured during workload runs, and the issue of network saturation is discussed.

Network considerations

Analysis of the first data from the workload submission rate 600 runs, shown in Comparing 64-bit WebSphere versus 31-bit WebSphere, revealed that response times increased dramatically, from sub-second to approaching 5 seconds. Because CPU utilization was below 75% and little or no swapping occurred, there was no apparent contention for processor or memory resources. Although the configuration of WebSphere and DB2® was designed to take advantage of the available memory, the near absence of swapping to disk indicates that constrained memory did not cause the poor response times.

Examination of the sar and netstat command data from the submission rate 500 runs showed good response times, so the question became whether a workload submission rate of 600 generated network traffic that exceeded the bandwidth of the 1 Gb OSA-Express2 Ethernet card. Earlier benchmark measurements had already indicated reduced throughput at a workload submission rate of 600.

Methodology

Several key measurements are taken to establish network traffic rates at different data points. Network traffic in bits per second is derived from these two fields in the sar command report:
rxkB/s
Represents the number of kilobytes received (read) per second.
txkB/s
Represents the number of kilobytes transmitted per second.
To use these values to obtain a total number of network bits per second, perform this calculation:
  1. Add the rxkB/s and txkB/s values together.
  2. Multiply this sum by 1024. This produces the total number of bytes.
  3. Multiply the total number of bytes by eight. This produces the total number of network bits per second.
The total number of network bits per second is a good overall measurement of network traffic.
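The three steps above can be sketched as a small helper function. The sample values passed in are hypothetical and are not taken from the measurement runs:

```python
def total_network_bits_per_second(rx_kB_s, tx_kB_s):
    """Convert sar rxkB/s and txkB/s values into total network bits per second."""
    total_kB = rx_kB_s + tx_kB_s    # step 1: add received and transmitted kilobytes
    total_bytes = total_kB * 1024   # step 2: kilobytes to bytes
    return total_bytes * 8          # step 3: bytes to bits

# Hypothetical sar values (kB/s), for illustration only:
print(total_network_bits_per_second(50_000.0, 45_000.0))
```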

Two measurements of network congestion, consistent with an I/O-bound workload, were obtained. The first measurement, segments retransmitted, came from the netstat command report. A netstat snapshot is taken before and after the steady-state phase of the workload, because the ramp-up (warmup) phase of the benchmark also accumulates network counters, and that data should not be included in the measurements.

To calculate the number of segments retransmitted during the steady-state phase, subtract the number of segments retransmitted in the first netstat command report (which includes the ramp-up data) from the number displayed in the second report.
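A minimal sketch of this subtraction, assuming the cumulative counter is scraped from the text of two "netstat -s" reports. The snapshot strings and counter values below are hypothetical, chosen only to illustrate the before/after arithmetic:

```python
import re

def segments_retransmitted(netstat_s_output):
    """Extract the cumulative 'segments retransmitted' counter from
    the text of a 'netstat -s' report; return 0 if it is absent."""
    match = re.search(r"(\d+) segments retransmi", netstat_s_output)
    return int(match.group(1)) if match else 0

# Hypothetical snapshots taken before and after the steady-state phase:
before = "Tcp:\n    1500 segments retransmitted\n"
after = "Tcp:\n    160328 segments retransmitted\n"

# Steady-state retransmits = final counter minus ramp-up counter:
print(segments_retransmitted(after) - segments_retransmitted(before))  # 158828
```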

The second measurement, txdrop/s, is the number of transmitted packets dropped per second because of resource constraints. It is found in the sar command report.

Table 1 summarizes these results. Figure 1 and Figure 2 show the results graphically.
Table 1. Network study: Data read and written per second, segment retransmits, and packets dropped

31-bit/64-bit  Workload          Network card    Megabits read or      Segments       txdrop/s
WebSphere      submission rate   link speed      written per second    retransmitted
31-bit         500               1 Gb Ethernet   746                   N/A            0.46
64-bit         500               1 Gb Ethernet   749                   N/A            0.29
31-bit         600               1 Gb Ethernet   770                   N/A            241
64-bit         600               1 Gb Ethernet   783                   158,828        249
31-bit         600               10 Gb Ethernet  901                   4232           0.75
64-bit         600               10 Gb Ethernet  902                   198            0.07
Figure 1. Network study: Utilization of the 1 Gb OSA card for traffic from the WebSphere Application Server to the clients
Graph of network utilization for the 1 Gb OSA card
Figure 2. Network study: Utilization of the 10 Gb OSA card for traffic from the WebSphere Application Server to the clients
Graph of network utilization for the 10 Gb OSA card

Observations

With the 1 Gb Ethernet card, the number of packets dropped per second was high (greater than 240) for both 64-bit and 31-bit WebSphere. Related to these high drop rates is a high number of segment retransmits, which further increases the load on the network card. With the 10 Gb Ethernet card, the total throughput increased by 12%, and the number of packets increased by 14%.

Conclusions

The poor response times at the workload submission rate of 600 can be attributed to network I/O traffic reaching the limit of the 1 Gb Ethernet card on the WebSphere system under test. The results also show that, for this workload, the practical limit for good response times on a 1 Gb Ethernet link might be approximately 780 Mb per second. Because this is below the nominal line speed, throughput itself is not the major limiting factor.

Another factor is the number of packets, where a rate of 80 000 packets or more can be considered close to the upper limit. The average packet size observed here is very small (less than 100 bytes). In such cases, the maximum throughput is not reached because the high number of I/O requests per second is the limiting factor. This load might also reach limits on other resources, such as the network switch.