Ceph NVMe-oF gateway

Storage administrators can install and configure an NVMe over Fabrics (NVMe-oF) gateway for an IBM Storage Ceph cluster. With the Ceph NVMe-oF gateway, you can effectively run a fully integrated block storage infrastructure with all features and benefits of a conventional Storage Area Network (SAN).

Note: The NVMe-oF gateway supports VMware vSphere APIs (VAAI), which include support for vMotion, compare and write, unmap, and write zero.

Block-level access to a Ceph storage cluster was previously limited to QEMU and librbd. Block-level access to the Ceph storage cluster can now also use the NVMe-oF standard to provide data storage. In IBM Storage Ceph 7.1, significant enhancements were made to the Ceph NVMe-oF gateway, specifically to the CLI commands and the addition of High Availability.

The NVMe-oF gateway integrates IBM Storage Ceph with the NVMe over TCP (NVMe/TCP) protocol to provide an NVMe/TCP target that exports RADOS Block Device (RBD) images. The NVMe/TCP protocol allows clients, which are known as initiators, to send NVMe-oF commands to storage devices, which are known as targets, over an Internet Protocol network. Initiators can be Linux clients, VMware clients, or both. For VMware clients, the NVMe/TCP volumes are shown as VMFS datastores, and for Linux clients, the NVMe/TCP volumes are shown as block devices.
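As a sketch of how a Linux initiator sees the exported RBD images, the following uses the standard nvme-cli tool; the gateway address and subsystem NQN are example values, not defaults:

```shell
# Load the NVMe/TCP transport module (available in recent Linux kernels).
modprobe nvme-tcp

# Connect to an example gateway listener; address, port, and subsystem
# NQN are placeholders for values defined on your gateway.
nvme connect -t tcp -a 10.0.0.10 -s 4420 \
    -n nqn.2016-06.io.spdk:cnode1

# Each namespace in the subsystem now appears as a block device,
# for example /dev/nvme1n1.
nvme list
```

The resulting block devices can then be partitioned, formatted, and mounted like any local NVMe device.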

Figure 1. Ceph NVMe-oF gateway

For more information about the NVMe over Fabrics (NVMe-oF) protocol, see NVMe over Fabrics.

High Availability with NVMe-oF gateway

High Availability (HA) provides I/O and control path redundancies for the host initiators. High Availability is also sometimes referred to as failover and failback support. The redundancy that HA creates is critical to protect against one or more gateway failures. With HA, the host can continue I/O, with at most a temporary increase in latency, until the failed gateways are back and functioning correctly.

NVMe-oF gateways are virtually grouped into gateway groups, and the HA domain sits within the gateway group. An NVMe-oF gateway group supports up to four gateways. Each NVMe-oF gateway in the gateway group can be used as a path to any of the subsystems or namespaces that are defined in that gateway group. HA is effective with two or more gateways in a gateway group.

High Availability is enabled by default. To use High Availability, a minimum of two gateways and listeners must be defined. For more information, see Deploying the NVMe-oF gateway.
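A minimal sketch of defining listeners on two gateways follows, using the nvmeof-cli container image; the container image tag, gateway addresses, gateway host names, and subsystem NQN are example values that depend on your deployment:

```shell
# Convenience alias for the gateway CLI; 10.0.0.10:5500 is an example
# gRPC endpoint of one running gateway.
alias gwcli='podman run -it quay.io/ceph/nvmeof-cli:latest \
    --server-address 10.0.0.10 --server-port 5500'

# Add a listener on each of two gateways for the same subsystem, so
# that initiators have two redundant paths into the gateway group.
gwcli listener add --subsystem nqn.2016-06.io.spdk:cnode1 \
    --host-name gw1 --traddr 10.0.0.10 --trsvcid 4420
gwcli listener add --subsystem nqn.2016-06.io.spdk:cnode1 \
    --host-name gw2 --traddr 10.0.0.11 --trsvcid 4420
```

With both listeners defined, an initiator that connects to all discovered paths can fail over between the two gateways.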

It is important to create redundancy between the host and the gateways. To create a fully redundant network connectivity, be sure that the host has two Ethernet ports that are connected to the gateways over a network with redundancy (for example, two network switches).

The HA feature uses the Active/Standby approach for each namespace. Using Active/Standby means that at any point in time, only one of the NVMe-oF gateways serves I/O from the host to a specific namespace. To make use of all NVMe-oF gateways, each namespace is assigned to a different load-balancing group. The number of load-balancing groups is equal to the number of NVMe-oF gateways in the gateway group.
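The load-balancing group assignment can be sketched as follows with the nvmeof-cli container; the pool, image names, NQN, and endpoint are example values, and flag names may vary by release:

```shell
# Example gateway CLI endpoint; adjust to one of your gateways.
alias gwcli='podman run -it quay.io/ceph/nvmeof-cli:latest \
    --server-address 10.0.0.10 --server-port 5500'

# Spread namespaces across load-balancing groups so that different
# gateways serve the active path for different namespaces.
gwcli namespace add --subsystem nqn.2016-06.io.spdk:cnode1 \
    --rbd-pool rbd --rbd-image image1 --load-balancing-group 1
gwcli namespace add --subsystem nqn.2016-06.io.spdk:cnode1 \
    --rbd-pool rbd --rbd-image image2 --load-balancing-group 2
```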

With HA, if an NVMe-oF gateway fails, the initiator continues trying to connect. How long it retries depends on the reconnect timeout that is configured on the initiator. For more information about defining the reconnect time for the initiator and general configuration instructions, see Configuring the NVMe-oF gateway initiator.
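On a Linux initiator, the retry behavior is controlled through nvme-cli connection options; the address below is an example, and the timeout values are illustrative rather than recommended defaults:

```shell
# Connect to all subsystems reported by the discovery service and
# retry a lost controller every 10 seconds, for up to 1800 seconds,
# before giving up on that path.
nvme connect-all -t tcp -a 10.0.0.10 -s 8009 \
    --reconnect-delay=10 --ctrl-loss-tmo=1800
```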

Scaling-out with NVMe-oF gateway

The NVMe-oF gateway supports scale-out, with the following limits:
  • One NVMe-oF gateway group.
  • Up to four NVMe-oF gateways in a gateway group.
  • Up to 16 NVMe-oF subsystems within a Ceph cluster.
  • Up to 32 hosts per NVMe-oF subsystem.
  • Up to 400 namespaces per Ceph cluster.

NVMe Discovery

The IBM Storage Ceph NVMe-oF gateway also supports NVMe Discovery. Each NVMe-oF gateway that runs in the Ceph cluster also runs a Discovery Controller. The Discovery Controller reports all IP addresses of each of the gateways in the group that are defined with listeners.
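From an initiator, the Discovery Controller can be queried with nvme-cli; the gateway address is an example, and 8009 is the conventional NVMe-oF discovery service port:

```shell
# Ask any one gateway's Discovery Controller for the listener
# addresses of all gateways in the group.
nvme discover -t tcp -a 10.0.0.10 -s 8009
```

Because every gateway in the group reports the same set of listener addresses, the initiator can discover all redundant paths by contacting any single gateway.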

For configuration information, see Configuring the NVMe-oF gateway initiator.