Technical Blog Post
Abstract
NVMe-oF (Non-Volatile Memory Express over Fabrics) is a storage network technology that lets you take full advantage of IBM FlashCore Modules (FCM) from an Ethernet or InfiniBand network. IBM FlashSystem supports end-to-end NVMe-oF for Fibre Channel solutions as well as RDMA-based Ethernet solutions.
Body

By Jack Tedjai, IBM Technology Expert Labs
NVMe-oF is designed to extend the performance of NVMe technology across the network using Remote Direct Memory Access (RDMA). RDMA provides direct access from the memory of one computer into that of another without involving either computer's operating system.

Figure 1: NVMe over Fabrics (NVMe-oF)

Figure 2: NVMe protocol Fabrics (NVMe-oF)
RDMA is provided by either iWARP or RoCE. iWARP runs RDMA services over the Transmission Control Protocol (TCP), so it can use the existing Ethernet setup and does not require a large hardware investment. RoCE (RDMA over Converged Ethernet) does not need the TCP layer and therefore provides lower latency.
IBM FlashSystem supports end-to-end NVMe-oF for Fibre Channel solutions as well as RoCE (UDP) or iWARP (TCP).
RoCE uses UDP and needs lossless Ethernet, which is provided by the Data Center Bridging (DCB) protocol suite.
Within DCB, the Data Center Bridging Exchange protocol (DCBX) is the discovery and capability exchange protocol that discovers peers and negotiates the Data Center Bridging configuration. DCBX uses LLDP as the underlying protocol to exchange parameters with the peer; thus, RoCE needs more configuration and DCB-capable switches.
iWARP runs over standard TCP. TCP handles packet loss and retransmission, so iWARP can be used with any Ethernet switch.
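The trade-off between the two RDMA transports can be captured in a small lookup. The following Python sketch is illustrative only: the RdmaTransport class and the needs_dcb_switch helper are hypothetical names, and the values simply restate the points made in this post rather than anything read from a device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RdmaTransport:
    """Properties of an RDMA transport as described in this post."""
    name: str
    layer4: str           # protocol that carries the RDMA traffic
    needs_lossless: bool  # True if DCB/PFC must be configured on the switches


# Assumption: these entries restate the text above; nothing is queried live.
TRANSPORTS = {
    "roce": RdmaTransport("RoCE", "UDP", needs_lossless=True),
    "iwarp": RdmaTransport("iWARP", "TCP", needs_lossless=False),
}


def needs_dcb_switch(transport: str) -> bool:
    """Return True if the chosen transport requires DCB-capable switches."""
    return TRANSPORTS[transport.lower()].needs_lossless


if __name__ == "__main__":
    for key, t in TRANSPORTS.items():
        print(f"{t.name}: runs over {t.layer4}, "
              f"DCB switch required: {needs_dcb_switch(key)}")
```

A lookup like this can be handy in a pre-deployment checklist script, where the answer decides whether switch-side flow control has to be planned before the hosts are cabled.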
iSER (iSCSI Extensions for RDMA) carries iSCSI traffic over an RDMA transport. The requirements for running iSER are:
- Applications that can use the SCSI and iSCSI layers (in the client example detailed in Figure 3 and following, this is VMware ESXi 7.x)
- A network capable of carrying the traffic, for example, 25GbE SR SFP28 over multimode optical fiber (MMF), such as 50 µm OM4 multimode fiber
- Adapter cards that support RDMA (Ethernet or InfiniBand), for example, Mellanox ConnectX-4 Lx 25GbE dual-port (RoCE) or Chelsio T6 2x25 Gbps adapter (iWARP)
- RDMA over Converged Ethernet switches with IEEE 802.1Qbb Priority-based Flow Control (PFC), for example, Dell switch S5048-F
- A target that supports iSER clustering (RoCE or iWARP), for example, all IBM FlashSystem models with FlashCore Modules
In this client case, the following layers were used (see Figure 3):
- Front-end iSCSI on the existing 100 GbE iSCSI network (shared Dell network, dedicated VLAN)
- Back-end iSER network for 100 GbE iSER clustering (shared Dell network, dedicated VLAN)
- iSER interconnect between the data centers over dark fiber (>10 GbE)
- iSER over RoCE for the new VMware ESXi host deployment

Figure 3: Client global design

Figure 4: iSER clustering (iWARP) IP address setup from the FlashSystem Service Assistant

Figure 5: Overall view of all network cards

Figure 6: iSER clustering Ethernet Connectivity between both FlashSystems
If both FlashSystem systems can see each other, start creating the HyperSwap cluster by adding the nodes of the second system to the existing standard cluster with the following commands:
#lscontrolenclosurecandidate
#addcontrolenclosure -iogrp 1 -sernum 78Y00XX
Finally, create the IP quorum for the HyperSwap cluster:

Figure 7: IP quorum overview
VMware design and VMware MPIO setup

Figure 8: VMware ESXi 7.x multipath design
Note: iSER does not support NIC teaming. When configuring port binding, use only one RDMA adapter per vSwitch.
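The one-adapter-per-vSwitch rule from the note above is easy to check before port binding is configured. The following Python sketch assumes you have already written down which RDMA uplinks belong to which vSwitch; the function name, the vSwitch names, and the vmnic names are all hypothetical, and nothing is queried from a live ESXi host.

```python
from typing import Dict, List


def check_port_binding(vswitch_uplinks: Dict[str, List[str]]) -> List[str]:
    """Return the names of vSwitches that do not have exactly one RDMA uplink."""
    return [name for name, uplinks in vswitch_uplinks.items()
            if len(uplinks) != 1]


if __name__ == "__main__":
    # Hypothetical layout: vSwitch name -> RDMA-capable uplinks
    layout = {
        "vSwitch-iSER-A": ["vmnic4"],
        "vSwitch-iSER-B": ["vmnic5"],
    }
    bad = check_port_binding(layout)
    print("Port binding layout OK" if not bad else f"Fix these vSwitches: {bad}")
```

With this layout, multipathing is provided by two independent paths (one per vSwitch) rather than by teaming two adapters in one vSwitch.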
VMware ESXi Storage adapters settings:
From the “Configuration” page, click the “Storage adapters” page. Select the device under “Mellanox iSCSI over RDMA (iSER) Adapter” and click “Properties” to add the network configuration settings.

Figure 9: ESXi Storage adapters settings
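Note: Double-check that the MTU size setting matches end to end: on the ESXi VMkernel port and vSwitch, on the switch ports, and on the FlashSystem Ethernet ports.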
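An MTU mismatch anywhere on the path tends to show up as stalled or slow I/O rather than a clear error, so it is worth checking the numbers explicitly. The following Python sketch shows the arithmetic behind a don't-fragment ping test and a simple comparison of recorded MTU values. The function names and the hop names in the example are hypothetical; the header sizes assume IPv4 without options.

```python
from typing import Dict

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def mtu_mismatches(path_mtus: Dict[str, int], expected: int) -> Dict[str, int]:
    """Return every hop whose recorded MTU differs from the expected value."""
    return {hop: mtu for hop, mtu in path_mtus.items() if mtu != expected}


if __name__ == "__main__":
    # Hypothetical values written down from each device on the path
    path = {"esxi-vmk1": 9000, "dell-switch-port": 9000, "flashsystem-port": 1500}
    print("Ping payload for MTU 9000:", max_ping_payload(9000))
    print("Mismatched hops:", mtu_mismatches(path, expected=9000))
```

On ESXi, the computed payload size can be passed to vmkping together with the don't-fragment option (for example, vmkping -d -s 8972 followed by the target IP address) to confirm that jumbo frames pass end to end.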

Figure 10: MTU settings on the Storwize

Figure 11: VLAN support

Figure 12: VMware ESXi 7.x host attachment overview from the FlashSystem

Figure 13: FlashSystem performance test run overview on FlashCore Modules (FCM)
Conclusion
In closing, NVMe-oF offers better performance and efficiency, especially for small I/O:
- Lower latency
- Higher IOPS
- Lower CPU utilization
- Lower power consumption
IBM FlashSystem supports end-to-end NVMe-oF solutions.
Looking for Support?
IBM Technology Expert Labs offers infrastructure services to help organizations build hybrid cloud and enterprise IT. Our storage consultants can help you secure your enterprise with physical and software-defined storage solutions for on-premises, cloud, converged, and virtualized environments.
Contact IBM Technology Expert Labs today to learn more.