Examples of the network throughput you can see from an individual dedicated instance and aggregate throughput when multiple instances are running a network-intensive workload on a dedicated host.
IBM Cloud Virtual Servers can be configured as dedicated instances that run on dedicated hosts, ensuring physical isolation for your workloads. This allows you to avoid noisy-neighbor effects between instances sharing the same host, and it gives you a degree of control over the network throughput available to your instances.
Results from a study that provide examples of network throughput
In this blog post, we share results from a study that provide examples of the network throughput you might expect from an individual dedicated instance, as well as the aggregate throughput when multiple instances run a network-intensive workload on a dedicated host. Workload types with medium-to-heavy network bandwidth requirements include file downloads and uploads, music and video streaming, and video and audio web conferencing.
A common misconception regarding instance network throughput capabilities is that the available bandwidth for an instance is capped at 1 Gbps. This is understandable, given the network port speed choices that are available.
However, it is important to note that the bandwidth is capped only in the case where the 100 Mbps speed is selected. If the 1 Gbps speed is chosen, the maximum possible instance throughput is theoretically limited only by the physical bandwidth available to the dedicated host. Dedicated hosts are configured with bonded 2 x 10 Gbps Ethernet links; therefore, the aggregate throughput capacity is 20 Gbps.
Here are key characteristics of the environment that we used for our performance evaluation:
Instance Types: 1 vCPU x 2 GB RAM (1×2), 4 vCPUs x 8 GB RAM (4×8), 32 vCPUs x 128 GB RAM (32×128)
Operating System: CentOS 7
Network: Private Network
MTU Sizes: 1500-byte, 9000-byte
Benchmark: iperf3, 1 and 8 TCP parallel connections
Dedicated instance network throughput
For our initial scenario, we ran a set of network data streaming tests using the iperf3 benchmark application between a pair of instances of the same configuration, with each instance on a different host. We also varied the number of TCP connections and the MTU size across those tests.
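For reference, the basic shape of such a test looks like the following. This is a sketch, not the exact invocation from the study; the server address and the 60-second duration are illustrative placeholders:

```shell
# On the receiving instance: start iperf3 in server mode.
iperf3 -s

# On the sending instance: stream TCP data to the server for 60 seconds.
# -P sets the number of parallel TCP connections (1 or 8 in our tests).
iperf3 -c 203.0.113.10 -t 60 -P 8
```

The client reports per-stream and aggregate throughput at the end of the run.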
In general, the results were in line with what we expected. The larger MTU size and increased number of TCP connections yielded higher throughput. Also, we saw noticeable improvement for instances configured with more vCPUs for the case where we used the 9000-byte MTU and multiple TCP connections.
The drop in performance with multiple TCP connections was not unexpected for the 1-vCPU instances: the additional connections increase contention for the single vCPU, especially in the 1500-byte MTU case, since a higher rate of packets must be processed.
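To illustrate why the MTU matters so much for per-packet processing cost, a quick back-of-the-envelope calculation shows the packet rates needed to sustain 10 Gbps at each MTU size (ignoring protocol header overhead, so the figures are approximate):

```shell
# Approximate packet rate required to sustain 10 Gbps at each MTU size.
# Header overhead is ignored; the numbers are illustrative only.
for mtu in 1500 9000; do
  awk -v mtu="$mtu" 'BEGIN {
    bits_per_sec = 10 * 1000 * 1000 * 1000   # 10 Gbps link rate
    pkts = bits_per_sec / (mtu * 8)          # packets per second
    printf "MTU %5d: ~%d packets/sec\n", mtu, pkts
  }'
done
```

At the 1500-byte MTU, roughly six times as many packets must be handled per second as at 9000 bytes, which is why the single vCPU becomes the bottleneck sooner.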
We did notice an anomaly in the 9000-byte MTU, 1 TCP connection case, in that the 32×128 instance result was slightly lower than the 4×8 instance result. We wouldn’t expect that more vCPUs would yield a benefit for the 1 TCP connection case, but we wouldn’t expect a lower result either.
By analyzing per-physical-CPU thread utilization, we concluded that the lower throughput was likely due to poor CPU affinity: the work of handling the single stream of network packets ended up being scheduled across many different physical CPU threads during the run.
In the end, we didn’t consider the performance difference (of approximately 7%) to be significant for this workload.
Aggregate network throughput on a dedicated host
For our second scenario, we ran the same set of network data streaming tests. In this case, however, we executed them concurrently across all three pairs of instances to see what aggregate network throughput was achievable for the entire dedicated host.
Similar to the initial scenario, we observed the anomaly with the 4×8 instance achieving higher throughput than the 32×128 instance in the 9000-byte MTU, 1 TCP connection case. Overall, the results were as expected, with the larger MTU size and increased number of TCP connections yielding higher throughput.
Summary of study results
Since we’ve demonstrated that the larger MTU size yields higher throughput, it is worth pointing out that jumbo frame (9000-byte MTU) support is enabled on all of the dedicated hosts in all of our data centers. However, to let an instance use the larger size, you must configure it yourself at the instance operating system level. On CentOS 7, for example, to set the MTU to 9000 bytes on interface eth0, you would run:
# ip link set eth0 mtu 9000
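Note that the `ip link` command takes effect immediately but does not survive a reboot. One way to make the setting persistent on CentOS 7 is to add an MTU line to the interface's configuration file (assuming the interface is eth0; adjust the filename for your interface):

```shell
# Append the MTU setting to the interface config so it persists
# across reboots, then restart networking for it to take effect.
echo "MTU=9000" >> /etc/sysconfig/network-scripts/ifcfg-eth0
systemctl restart network
```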
In summary, based on the results of our measurements, we were able to show the following:
A single instance on a dedicated host can achieve a network throughput >10 Gbps.
Multiple instances running a network-intensive workload concurrently can achieve an aggregate network throughput near the 20 Gbps physical bandwidth limit of the dedicated host.
Finally, we would like to mention that while dedicated instances are not offered on the IBM Cloud Virtual Private Cloud (VPC) infrastructure, virtual servers for VPC will also be able to achieve network throughput >10 Gbps. Depending on the profile selected, a VPC virtual server instance will be capable of utilizing up to 16 Gbps of network bandwidth.