Suboptimal performance due to VERBS RDMA being inactive
IBM Storage Scale for Linux® supports InfiniBand Remote Direct Memory Access (RDMA) by using the Verbs API for data transfer between an NSD client and the NSD server. If InfiniBand VERBS RDMA is enabled on the IBM Storage Scale cluster and file system performance drops, verify whether the NSD client nodes are using VERBS RDMA to communicate with the NSD server nodes. If the nodes are not using RDMA, communication falls back to the GPFS node's TCP/IP interface, which can degrade performance.
- For information about tuning RDMA parameters, see RDMA tuning.
- In IBM Storage Scale 5.0.4 and later, the GPFS daemon startup service waits for a specified time period for the RDMA ports on a node to become active. You can adjust the length of the timeout period and choose the action that the startup service takes if the timeout expires. For more information, see the descriptions of the verbsPortsWaitTimeout attribute and the verbsRdmaFailBackTCPIfNotAvailable attribute in the help topic mmchconfig command.
Problem identification
Issue the mmlsconfig | grep verbsRdma command to verify whether VERBS RDMA is enabled on the IBM Storage Scale cluster.
verbsRdma enable
If VERBS RDMA is enabled, check whether the status of VERBS RDMA on a node is Started by running the mmfsadm test verbs status command.
VERBS RDMA status: started
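The two checks above can be scripted. The following is a minimal Python sketch, not an IBM-provided tool. It assumes that the command output has the form shown in the samples above, and the helper names verbs_rdma_enabled and verbs_rdma_started are hypothetical. It parses captured output text rather than running the commands.

```python
def verbs_rdma_enabled(mmlsconfig_output: str) -> bool:
    """Return True if the output contains the line 'verbsRdma enable'."""
    for line in mmlsconfig_output.splitlines():
        fields = line.split()
        # Match the attribute name exactly, because grep verbsRdma also
        # matches other attributes whose names begin with verbsRdma.
        if len(fields) >= 2 and fields[0] == "verbsRdma":
            return fields[1] == "enable"
    return False


def verbs_rdma_started(status_output: str) -> bool:
    """Return True if the status line reports 'started'."""
    for line in status_output.splitlines():
        if "VERBS RDMA status:" in line:
            return line.split("status:", 1)[1].strip().lower() == "started"
    return False


# Sample text copied from the output shown above.
print(verbs_rdma_enabled("verbsRdma enable"))            # True
print(verbs_rdma_started("VERBS RDMA status: started"))  # True
```

Both helpers return False when the expected line is absent, so missing output is treated the same as a disabled or stopped state.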
The following sample output, in the format that the mmlsnsd command produces, shows the disks in the gpfs1b file system and the NSD servers that are configured as the primary and secondary servers for these disks.
File system  Disk name  NSD servers
---------------------------------------------------------------------------
gpfs1b       DMD_NSD01  c25m3n07-ib,c25m3n08-ib
gpfs1b       DMD_NSD02  c25m3n08-ib,c25m3n07-ib
gpfs1b       DMD_NSD03  c25m3n07-ib,c25m3n08-ib
gpfs1b       DMD_NSD04  c25m3n08-ib,c25m3n07-ib
gpfs1b       DMD_NSD05  c25m3n07-ib,c25m3n08-ib
gpfs1b       DMD_NSD06  c25m3n08-ib,c25m3n07-ib
gpfs1b       DMD_NSD07  c25m3n07-ib,c25m3n08-ib
gpfs1b       DMD_NSD08  c25m3n08-ib,c25m3n07-ib
gpfs1b       DMD_NSD09  c25m3n07-ib,c25m3n08-ib
gpfs1b       DMD_NSD10  c25m3n08-ib,c25m3n07-ib
Issue the mmfsadm test verbs conn command to verify whether the NSD client node is communicating over VERBS RDMA with all the NSD servers. In the following sample output, the NSD client node has an active VERBS RDMA connection to only one of the two NSD servers (c25m3n07-ib). The other server, c25m3n08-ib, does not appear in the list.
RDMA Connections between nodes:
destination  idx  cook  sta  cli  peak  cli RD  cli WR  cli RD KB  cli WR KB  srv  wait  serv RD  serv WR  serv RD KB  serv WR KB  vrecv  vsend  vrecv KB  vsend KB
-----------  ---  ----  ---  ---  ----  ------  ------  ---------  ---------  ---  ----  -------  -------  ----------  ----------  -----  -----  --------  --------
c25m3n07-ib  1    2     RTS  0    24    198     16395   12369      34360606   0    0     0        0        0           0           0      0      0         0
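The comparison between the configured NSD servers and the active RDMA connections can be automated. The following Python sketch is illustrative only. It assumes that the disk listing and the connection table have the layouts shown above, that the sta column holds the connection state, and that RTS marks an established connection. All function names are hypothetical.

```python
def nsd_servers(nsd_listing: str) -> set:
    """Collect every server name from the 'NSD servers' column."""
    servers = set()
    for line in nsd_listing.splitlines():
        fields = line.split()
        # Data rows have exactly three whitespace-separated fields:
        # file system, disk name, comma-separated server list.
        if len(fields) == 3:
            servers.update(fields[2].split(","))
    return servers


def rdma_destinations(conn_output: str) -> set:
    """Collect destinations whose state column (assumed 4th) is RTS."""
    dests = set()
    for line in conn_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[3] == "RTS":
            dests.add(fields[0])
    return dests


def servers_without_rdma(nsd_listing: str, conn_output: str) -> list:
    """Return NSD servers that have no RTS connection from this client."""
    return sorted(nsd_servers(nsd_listing) - rdma_destinations(conn_output))


# Abbreviated sample data copied from the output shown above
# (the column header rows of the connection table are omitted).
NSD_LISTING = """File system Disk name NSD servers
gpfs1b DMD_NSD01 c25m3n07-ib,c25m3n08-ib
gpfs1b DMD_NSD02 c25m3n08-ib,c25m3n07-ib
"""
CONN_OUTPUT = """RDMA Connections between nodes:
c25m3n07-ib 1 2 RTS 0 24 198 16395 12369 34360606 0 0 0 0 0 0 0 0 0 0
"""

print(servers_without_rdma(NSD_LISTING, CONN_OUTPUT))  # ['c25m3n08-ib']
```

An empty result means that every configured NSD server has an RDMA connection from the client. Any name in the result is a server that the client is probably reaching over TCP/IP.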
Problem resolution
Resolve any low-level InfiniBand RDMA issues, such as loose InfiniBand cables or InfiniBand fabric problems. When the low-level RDMA issues are resolved, issue system commands such as ibstat or ibv_devinfo to verify that the InfiniBand port state is active. The following sample output is from the ibstat command. In the sample output, the state of Port 1 is Active, while the state of Port 2 is Down.
CA 'mlx5_0'
CA type: MT4113
Number of ports: 2
Firmware version: 10.100.6440
Hardware version: 0
Node GUID: 0xe41d2d03001fa210
System image GUID: 0xe41d2d03001fa210
Port 1:
State: Active
Physical state: LinkUp
Rate: 56
Base lid: 29
LMC: 0
SM lid: 1
Capability mask: 0x26516848
Port GUID: 0xe41d2d03001fa210
Link layer: InfiniBand
Port 2:
State: Down
Physical state: Disabled
Rate: 10
Base lid: 65535
LMC: 0
SM lid: 0
Capability mask: 0x26516848
Port GUID: 0xe41d2d03001fa218
Link layer: InfiniBand
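When many nodes must be checked, the port states can be extracted from the ibstat output programmatically. The following Python sketch assumes the line layout shown in the sample above. The function names port_states and inactive_ports are hypothetical and are not part of any InfiniBand tool.

```python
def port_states(ibstat_output: str) -> dict:
    """Map (adapter name, port number) to the value of the port's State line."""
    states = {}
    adapter = port = None
    for raw in ibstat_output.splitlines():
        line = raw.strip()
        if line.startswith("CA '"):
            # Example: CA 'mlx5_0'
            adapter = line.split("'")[1]
        elif line.startswith("Port ") and line.endswith(":"):
            # Example: "Port 1:" (a "Port GUID: ..." line does not end with ':')
            port = int(line[len("Port "):-1])
        elif line.startswith("State:") and port is not None:
            states[(adapter, port)] = line.split(":", 1)[1].strip()
    return states


def inactive_ports(ibstat_output: str) -> list:
    """Return the (adapter, port) pairs whose state is not Active."""
    return sorted(key for key, state in port_states(ibstat_output).items()
                  if state != "Active")


# Abbreviated sample copied from the ibstat output shown above.
SAMPLE_IBSTAT = """CA 'mlx5_0'
CA type: MT4113
Number of ports: 2
Port 1:
State: Active
Physical state: LinkUp
Port GUID: 0xe41d2d03001fa210
Port 2:
State: Down
Physical state: Disabled
Port GUID: 0xe41d2d03001fa218
"""

print(inactive_ports(SAMPLE_IBSTAT))  # [('mlx5_0', 2)]
```

A port that is listed here is not necessarily a fault. In the sample, Port 2 is physically disabled, so only ports that are expected to carry RDMA traffic need attention.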
Restart GPFS on the node, and then run the mmfsadm test verbs status command to check whether the VERBS RDMA status on the node is started.
In the following sample output, the NSD client (c25m3n03-ib) and the two NSD servers all show VERBS RDMA status as started.
c25m3n03-ib: VERBS RDMA status: started
c25m3n07-ib: VERBS RDMA status: started
c25m3n08-ib: VERBS RDMA status: started
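When the status is collected from several nodes at once, a short script can flag any node that is not in the started state. The following Python sketch assumes node-prefixed lines of the form shown in the sample above. The function name nodes_not_started is hypothetical.

```python
def nodes_not_started(status_output: str) -> list:
    """Return the nodes whose VERBS RDMA status is anything other than 'started'."""
    marker = "VERBS RDMA status:"
    failing = []
    for line in status_output.splitlines():
        if marker not in line:
            continue
        prefix, status = line.split(marker, 1)
        # The node name precedes the marker and ends with a colon.
        node = prefix.strip().rstrip(":")
        if status.strip().lower() != "started":
            failing.append(node)
    return failing


# Sample text copied from the output shown above.
SAMPLE_STATUS = """c25m3n03-ib: VERBS RDMA status: started
c25m3n07-ib: VERBS RDMA status: started
c25m3n08-ib: VERBS RDMA status: started
"""

print(nodes_not_started(SAMPLE_STATUS))  # []
```

An empty list means that the client and both NSD servers report VERBS RDMA as started, which matches the sample.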
Run a large I/O workload on the NSD client, and then issue the mmfsadm test verbs conn command to verify that the NSD client node is communicating over VERBS RDMA with all the NSD servers.
In the following sample output, the NSD client node has VERBS RDMA communication active on all the active NSD servers.
RDMA Connections between nodes:
destination  idx  cook  sta  cli  peak  cli RD  cli WR  cli RD KB  cli WR KB  srv  wait  serv RD  serv WR  serv RD KB  serv WR KB  vrecv  vsend  vrecv KB  vsend KB
-----------  ---  ----  ---  ---  ----  ------  ------  ---------  ---------  ---  ----  -------  -------  ----------  ----------  -----  -----  --------  --------
c25m3n08-ib  0    3     RTS  0    13    8193    8205    17179930   17181212   0    0     0        0        0           0           0      0      0         0
c25m3n07-ib  1    2     RTS  0    14    8192    8206    17179869   17182162   0    0     0        0        0           0           0      0      0         0