RDMA over Converged Ethernet

Red Hat Enterprise Linux 8.6 LPAR mode z/VM guest KVM guest

Linux® on IBM® Z supports RDMA over Converged Ethernet (RoCE) in the form of RoCE Express features.

Red Hat® Enterprise Linux supports RoCE features as shown in Table 1 and Table 2. Note that the mapping of ports to function keys depends on the adapter hardware.

Table 1. Support for RDMA over Converged Ethernet features as of IBM z15

Feature                                       IBM z16                IBM z15™

RoCE Express3                                 10 Gigabit Ethernet,   Not supported
                                              25 Gigabit Ethernet

RoCE Express2                                 10 Gigabit Ethernet,   10 Gigabit Ethernet,
  Two adapter ports, different function IDs   25 Gigabit Ethernet    25 Gigabit Ethernet

RoCE Express                                  Not supported          10 Gigabit Ethernet
  Two adapter ports, same function ID

Table 2. Support for RDMA over Converged Ethernet features as of IBM z13

Feature                                       z14 and z14 ZR1        z13® and z13s®

RoCE Express3                                 Not supported          Not supported

RoCE Express2                                 10 Gigabit Ethernet,   Not supported
  Two adapter ports, different function IDs   25 Gigabit Ethernet

RoCE Express                                  10 Gigabit Ethernet    10 Gigabit Ethernet
  Two adapter ports, same function ID

The RoCE support requires PCI Express support.

You can use a PCI function as a base device for MacVTap or OpenVSwitch, similarly to an OSA adapter. See Using an HSCI interface as a base device for MacVTap or OpenVSwitch.

Using a RoCE device for SMC-R

SMC-R requires RoCE devices that are associated with network devices of TCP networks through a PNET ID, for example through statements in the IOCDS.

The following figure illustrates how a RoCE device and an Ethernet device are associated by a matching PNET ID. The communication peer has a similarly associated pair of a RoCE device and an Ethernet device. With this setup, the TCP connection can switch over to an SMC-R connection over the SMC protocol. The communication peer can be, but need not be, on the same CPC.
Figure 1. A matching PNET ID associates a RoCE device and an Ethernet device
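
The association rule can be sketched in a few lines of Python. This is an illustrative model only, not how the kernel or smc-tools perform the lookup. The `Device` record, the device names, and the PNET ID values such as `NET1` are hypothetical, and the sketch assumes that the PNET IDs are already available as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical record for a device and its PNET ID."""
    name: str
    kind: str      # "ethernet" or "roce"
    pnet_id: str   # empty string means that no PNET ID is assigned


def roce_devices_for(eth: Device, devices: list[Device]) -> list[Device]:
    """Return the RoCE devices that share a non-empty PNET ID with eth."""
    if not eth.pnet_id:
        return []
    return [d for d in devices
            if d.kind == "roce" and d.pnet_id == eth.pnet_id]


def smcr_eligible(eth: Device, devices: list[Device]) -> bool:
    """In this model, SMC-R needs at least one RoCE device with a matching PNET ID."""
    return bool(roce_devices_for(eth, devices))


# Example inventory; all names and PNET IDs are made up for illustration.
inventory = [
    Device("eth0", "ethernet", "NET1"),
    Device("roce0", "roce", "NET1"),
    Device("eth1", "ethernet", "NET2"),
    Device("roce1", "roce", ""),
]
```

In this model, `eth0` is eligible because `roce0` carries the same PNET ID, while `eth1` is not because no RoCE device is assigned `NET2`. The same condition must hold on the communication peer.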

Using SMC-R link groups

Once established, SMC connections do not fall back to regular TCP communications should an SMC-R link fail. To protect against link failure, SMC-R creates link groups for you. Link groups use multiple RoCE devices with the same PNET ID. A similar association of an Ethernet device with multiple RoCE devices on the communication peer then results in multiple, independent SMC-R links within a link group.

Figure 2. Multiple SMC-R links protect against link failure

The SMC-R connection survives failures of individual RoCE devices if at least one device remains operational on each side.
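
The survival rule can be expressed as a small Python check. This is a minimal sketch of the rule as stated above, not of the SMC-R failover logic itself. The device names and the dictionaries that map each device to an "operational" flag are hypothetical.

```python
def connection_survives(local_up: dict[str, bool], peer_up: dict[str, bool]) -> bool:
    """Return True if at least one RoCE device is operational on each side."""
    return any(local_up.values()) and any(peer_up.values())


# Hypothetical link group: two RoCE devices per side, same PNET ID assumed.
local = {"roce0": False, "roce1": True}   # roce0 has failed
peer = {"roce2": True, "roce3": True}

print(connection_survives(local, peer))                              # True
print(connection_survives({"roce0": False, "roce1": False}, peer))   # False
```

The first call models a single device failure that the link group absorbs. The second call models the loss of every local RoCE device, after which no SMC-R link remains.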

Use the smcr command to explore SMC-R links, link groups, and devices (see smcr - Display information about SMC-R link groups, links and devices).

Note: SMC-R does not work with multiple SMC-R links if the links are used in a bonding setup.