How to choose between an Infiniband (IB) and a RoCE Ethernet RDMA-capable network type?

When deploying a Db2® pureScale® cluster, whether as a brand-new installation or as an upgrade of an existing cluster to a next-generation hardware stack, a RoCE Ethernet network is almost always the best choice.

Important: Starting from version 11.5.5, support for Infiniband (IB) adapters as the high-speed communication network between members and CFs in Db2 pureScale on all supported platforms is deprecated and will be removed in a future release. Use Remote Direct Memory Access over Converged Ethernet (RoCE) network as the replacement.

Ethernet is ubiquitous, and many Ethernet switches are also RoCE capable. This not only increases availability but also often decreases the overall cost, in both hardware and personnel.

In terms of raw bandwidth, QDR Infiniband appears to have a 4x advantage (40 Gb/s for QDR IB versus 10 Gb/s for RoCE Ethernet), but this benefit does not translate to similar throughput gains in customer environments. Performance tests comparing the two network types in a controlled lab environment show a fairly small difference of 5% to 15%. This is due to two factors:
  1. Db2 pureScale uses a mix of message sizes. Small messages are used when latency is of the utmost importance, and their transfer time depends far more on latency than on bandwidth.
  2. Most workloads do not run at 100% network utilization.
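
The first factor can be illustrated with a rough back-of-the-envelope model, sketched here in Python. It treats the time to deliver one message as a fixed latency plus the time the message spends on the wire. The 40 Gb/s and 10 Gb/s rates come from the comparison above; the 5-microsecond latency, the message sizes, and all function names are assumptions chosen for illustration only and are not measured Db2 pureScale values.

```python
# Illustrative model only: per-message time = fixed latency + time on the wire.
# ASSUMED_LATENCY_US is a hypothetical placeholder, not a measured Db2 pureScale value.
ASSUMED_LATENCY_US = 5.0

QDR_IB_GBPS = 40.0   # raw rate quoted in the text for QDR IB
ROCE_GBPS = 10.0     # raw rate quoted in the text for RoCE Ethernet


def transfer_time_us(message_bytes, bandwidth_gbps, latency_us=ASSUMED_LATENCY_US):
    """Return the modeled time in microseconds to deliver one message."""
    bits = message_bytes * 8
    bits_per_us = bandwidth_gbps * 1_000  # 1 Gb/s = 1,000 bits per microsecond
    return latency_us + bits / bits_per_us


def ib_advantage(message_bytes):
    """Return how many times faster the 40 Gb/s link is than the 10 Gb/s link in this model."""
    return transfer_time_us(message_bytes, ROCE_GBPS) / transfer_time_us(message_bytes, QDR_IB_GBPS)


if __name__ == "__main__":
    # Hypothetical message sizes: small, medium, and large.
    for size in (256, 4096, 1024 * 1024):
        print(f"{size:>8} bytes: {ib_advantage(size):.2f}x")
```

Under these assumptions, a small message gains almost nothing from the extra bandwidth, while a very large message approaches, but never reaches, the full 4x. Real results depend on adapter, switch, and workload characteristics that this simple model ignores.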

As of Db2 version 11.5, QDR remains the highest-speed IB adapter supported on AIX. On the RoCE front, 40 Gb and 100 Gb adapters have recently been added to the support matrix. This has closed the small performance gap between the two networks to the point where RoCE meets the performance requirements of most, if not all, OLTP customers.

As a result, from both the cost and performance perspectives, RoCE Ethernet is the recommended network type for even the most demanding production clusters.