Suboptimal performance due to networking issues caused by faulty system components

The system might face networking issues, such as significant network packet drops or packet errors, because of faulty system components such as NICs, drivers, cables, and network switch ports. These issues can affect the stability and quality of the GPFS communication between the nodes, which degrades system performance.

Problem identification and verification

If IBM Storage Scale is configured over TCP/IP network interfaces like 10GigE or 40GigE, you can use the netstat -in and ifconfig <GPFS_iface> commands to confirm whether any significant TX/RX packet errors or drops are happening.

In the following example, 152326889 TX packets are dropped on the ib0 network interface:

# netstat -in
Kernel Interface table
Iface      MTU        RX-OK RX-ERR RX-DRP RX-OVR        TX-OK TX-ERR    TX-DRP TX-OVR Flg
ib0      65520 157606763073      0      0      0 165453186948      0 152326889      0 BMRU
# ifconfig ib0
ib0       Link encap:InfiniBand  HWaddr 80:00:00:49:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:192.168.1.100  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::f652:1403:10:bb72/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:157606763073 errors:0 dropped:0 overruns:0 frame:0
          TX packets:165453186948 errors:0 dropped:152326889 overruns:0 carrier:0
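
The check described above can also be scripted. The following is a minimal sketch, not part of IBM Storage Scale: check_iface_counters is a hypothetical helper name, and the sketch assumes that the netstat -in output contains a header row that begins with Iface and uses the column names RX-ERR, RX-DRP, TX-ERR, and TX-DRP, as in the example. It locates the columns by name because some netstat versions add a Met column.

```shell
#!/bin/sh
# check_iface_counters (hypothetical helper): read `netstat -in` output on
# stdin and print every interface whose RX/TX error or drop counter is
# non-zero. Returns 1 if any such counter is found, 0 otherwise.
check_iface_counters() {
    awk '
        BEGIN { n = split("RX-ERR RX-DRP TX-ERR TX-DRP", names, " ") }

        # Header row: remember which column holds each counter, because the
        # layout differs between netstat versions (some add a "Met" column).
        $1 == "Iface" {
            for (i = 1; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }

        # Data rows: report each watched counter that is greater than zero.
        have_header && NF > 0 {
            for (k = 1; k <= n; k++) {
                c = col[names[k]]
                if (c > 0 && ($c + 0) > 0) {
                    printf "%s %s=%s\n", $1, names[k], $c
                    bad = 1
                }
            }
        }

        END { exit (bad ? 1 : 0) }
    '
}
```

A typical invocation is netstat -in | check_iface_counters. These counters are cumulative, so a non-zero value might reflect an old, already resolved problem; compare two samples taken some time apart to see whether the counters are still increasing.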

Problem resolution and verification

Resolve the low-level networking issue, such as a bad NIC cable or an improper driver setting. If possible, shut down GPFS on the node that has the networking issue until the low-level problem is resolved, so that GPFS operations on other nodes are not impacted. Issue the netstat -in command to verify that the networking issue is resolved, and then issue the mmstartup command to start GPFS on the node again. Monitor the network interface to ensure that it is operating optimally.

In the following example, the ib0 network interface shows no packet errors or drops:

# netstat -in
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
ib0       65520   0 313534358      0      0      0 301875166      0    0      0 BMRU
# ifconfig ib0
ib0       Link encap:InfiniBand  HWaddr 80:00:00:03:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:10.168.3.17  Bcast:10.168.255.255  Mask:255.255.0.0
          inet6 addr: fe80::211:7500:78:a42a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:313534450 errors:0 dropped:0 overruns:0 frame:0
          TX packets:301875212 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:241364128830 (224.7 GiB)  TX bytes:197540627923 (183.9 GiB)
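
To monitor the interface after GPFS is restarted, it can help to look at how much a counter grows over an interval instead of at its absolute value. The following is a rough sketch under stated assumptions, not an IBM-provided tool: counter_delta is a hypothetical helper, and the sketch assumes a Linux node that exposes per-interface counters as files under /sys/class/net/<iface>/statistics (for example, tx_dropped).

```shell
#!/bin/sh
# counter_delta (hypothetical helper): print how much one counter grew
# between two samples.
#   $1 = directory that holds the counter files
#        (assumed: /sys/class/net/<iface>/statistics on Linux)
#   $2 = counter file name, for example tx_dropped
#   $3 = number of seconds to wait between the two samples
counter_delta() {
    before=$(cat "$1/$2") || return 1
    sleep "$3"
    after=$(cat "$1/$2") || return 1
    echo $((after - before))
}
```

For example, counter_delta /sys/class/net/ib0/statistics tx_dropped 60 prints the number of TX packets that were dropped on ib0 during the last minute. A result of 0 over repeated samples suggests that the interface is no longer dropping packets.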