Network events

The following table lists the events that are created for the Network component.

Table 1. Events for the network component
Event | Event Type | Severity | Call Home | Details
bond_degraded STATE_CHANGE WARNING no Message: Some secondaries of the network bond {0} went down.
Description: Some of the network bond parts are malfunctioning.
Cause: Some secondaries of the network bond are not functioning properly.
User Action: Check the bonding configuration, the network configuration, and cabling of the malfunctioning secondaries of the network bond.
bond_down STATE_CHANGE ERROR no Message: All secondaries of the network bond {0} are down.
Description: All secondaries of a network bond are down.
Cause: All secondaries of this network bond went down.
User Action: Check the bonding configuration, the network configuration, and cabling of all secondaries of the network bond.
bond_nic_recognized STATE_CHANGE INFO no Message: Bond NIC {id} was recognized. Children {0}.
Description: The specified network bond NIC was correctly recognized for usage by IBM Storage Scale.
Cause: N/A
User Action: N/A
bond_up STATE_CHANGE INFO no Message: All secondaries of the network bond {0} are working as expected.
Description: This network bond is functioning properly.
Cause: N/A
User Action: N/A
expected_file_missing INFO WARNING no Message: The expected configuration or program file {0} was not found.
Description: An expected configuration or program file was not found.
Cause: An expected configuration or program file was not found.
User Action: Check whether the file exists. If necessary, install the required packages.
ib_rdma_disabled STATE_CHANGE INFO no Message: InfiniBand in RDMA mode is disabled.
Description: InfiniBand in RDMA mode is not enabled for IBM Storage Scale.
Cause: N/A
User Action: N/A
ib_rdma_enabled STATE_CHANGE INFO no Message: InfiniBand in RDMA mode is enabled.
Description: InfiniBand in RDMA mode is enabled for IBM Storage Scale.
Cause: N/A
User Action: N/A
ib_rdma_ext_port_speed_low TIP TIP no Message: InfiniBand RDMA NIC {id} uses a smaller extended port speed than supported.
Description: The currently active extended link speed is less than the supported value.
Cause: The currently active extended link speed is less than the supported value.
User Action: Check the settings of the specified InfiniBand RDMA NIC (ibportstate).
ib_rdma_ext_port_speed_ok TIP INFO no Message: InfiniBand RDMA NIC {id} uses maximum supported port speed.
Description: The currently active extended link speed is equal to the supported extended speed.
Cause: N/A
User Action: N/A
ib_rdma_fatal_state STATE_CHANGE ERROR no Message: VERBS RDMA entered error state.
Description: An RDMA device used by IBM Storage Scale entered an error state and can no longer be used.
Cause: An RDMA device used by IBM Storage Scale reported an error state.
User Action: Check /var/adm/ras/mmfs.log.latest for the root cause hints. Reboot the node.
ib_rdma_libs_found STATE_CHANGE INFO no Message: All checked library files can be found.
Description: All checked library files (librdmacm and libibverbs) can be found with expected path names.
Cause: N/A
User Action: N/A
ib_rdma_libs_wrong_path STATE_CHANGE ERROR no Message: The library files cannot be found.
Description: At least one of the library files (librdmacm and libibverbs) cannot be found with an expected pathname.
Cause: Either the libraries are missing or their path names are wrongly set.
User Action: Check whether the libraries 'librdmacm' and 'libibverbs' are installed. Also, check whether they can be found by the names that are referenced in the output of the mmfsadm test verbs config command.
ib_rdma_link_down STATE_CHANGE ERROR no Message: InfiniBand RDMA NIC {id} is down.
Description: The physical link of the specified InfiniBand RDMA NIC is down.
Cause: Physical state of the specified InfiniBand RDMA NIC is not 'LinkUp' according to ibstat.
User Action: Check the cabling of the specified InfiniBand RDMA NIC.
ib_rdma_link_up STATE_CHANGE INFO no Message: InfiniBand RDMA NIC {id} is up.
Description: The physical link of the specified InfiniBand RDMA NIC is up.
Cause: N/A
User Action: N/A
ib_rdma_nic_down STATE_CHANGE ERROR no Message: NIC {id} is down according to ibstat.
Description: The specified InfiniBand RDMA NIC is down.
Cause: The specified InfiniBand RDMA NIC is down according to ibstat.
User Action: Enable the specified InfiniBand RDMA NIC.
ib_rdma_nic_found INFO_ADD_ENTITY INFO no Message: InfiniBand RDMA NIC {id} was found.
Description: A new InfiniBand RDMA NIC was found.
Cause: A new relevant InfiniBand RDMA NIC is listed by ibstat.
User Action: N/A
ib_rdma_nic_recognized STATE_CHANGE INFO no Message: InfiniBand RDMA NIC {id} was recognized.
Description: The specified InfiniBand RDMA NIC was correctly recognized for usage by IBM Storage Scale.
Cause: N/A
User Action: N/A
ib_rdma_nic_unrecognized STATE_CHANGE ERROR no Message: InfiniBand RDMA NIC {id} was not recognized.
Description: The specified InfiniBand RDMA NIC was not correctly recognized for usage by IBM Storage Scale.
Cause: The specified InfiniBand RDMA NIC is not reported in the mmfsadm dump verbs command.
User Action: Check '/var/adm/ras/mmfs.log.latest' for VERBS RDMA error messages.
ib_rdma_nic_up STATE_CHANGE INFO no Message: NIC {id} is up according to ibstat.
Description: The specified InfiniBand RDMA NIC is up.
Cause: N/A
User Action: N/A
ib_rdma_nic_vanished INFO_DELETE_ENTITY INFO no Message: InfiniBand RDMA NIC {id} vanished.
Description: The specified InfiniBand RDMA NIC cannot be detected anymore.
Cause: One of the previously monitored InfiniBand RDMA NICs is not listed by ibstat anymore.
User Action: N/A
ib_rdma_port_speed_low STATE_CHANGE WARNING no Message: InfiniBand RDMA NIC {id} uses a smaller port speed than enabled.
Description: The currently active link speed is less than the enabled maximum link speed.
Cause: The currently active link speed is less than the enabled maximum link speed.
User Action: Check the settings of the specified InfiniBand RDMA NIC (ibportstate).
ib_rdma_port_speed_ok STATE_CHANGE INFO no Message: InfiniBand RDMA NIC {id} uses maximum enabled port speed.
Description: The currently active link speed is equal to the enabled maximum link speed.
Cause: N/A
User Action: N/A
ib_rdma_port_speed_optimal TIP INFO no Message: InfiniBand RDMA NIC {id} uses maximum supported port speed.
Description: The currently enabled link speed is equal to the supported maximum link speed.
Cause: The currently enabled link speed is equal to the supported maximum link speed.
User Action: N/A
ib_rdma_port_speed_suboptimal TIP TIP no Message: InfiniBand RDMA NIC {id} uses a smaller port speed than supported.
Description: The currently enabled link speed is less than the supported maximum link speed.
Cause: The currently enabled link speed is less than the supported maximum link speed.
User Action: Check the settings of the specified InfiniBand RDMA NIC (ibportstate).
ib_rdma_port_width_low STATE_CHANGE WARNING no Message: InfiniBand RDMA NIC {id} uses a smaller port width than enabled.
Description: The currently active link width is less than the enabled maximum link width.
Cause: The currently active link width is less than the enabled maximum link width.
User Action: Check the settings of the specified InfiniBand RDMA NIC (ibportstate).
ib_rdma_port_width_ok STATE_CHANGE INFO no Message: InfiniBand RDMA NIC {id} uses maximum enabled port width.
Description: The currently active link width is equal to the enabled maximum link width.
Cause: N/A
User Action: N/A
ib_rdma_port_width_optimal TIP INFO no Message: InfiniBand RDMA NIC {id} uses maximum supported port width.
Description: The currently enabled link width is equal to the supported maximum link width.
Cause: The currently enabled link width is equal to the supported maximum link width.
User Action: N/A
ib_rdma_port_width_suboptimal TIP TIP no Message: InfiniBand RDMA NIC {id} uses a smaller port width than supported.
Description: The currently enabled link width is less than the supported maximum link width.
Cause: The currently enabled link width is less than the supported maximum link width.
User Action: Check the settings of the specified InfiniBand RDMA NIC (ibportstate).
ib_rdma_ports_ok STATE_CHANGE INFO no Message: verbsPorts is correctly set for InfiniBand RDMA.
Description: The verbsPorts setting has a correct value.
Cause: N/A
User Action: N/A
ib_rdma_ports_undefined STATE_CHANGE ERROR no Message: No NICs and ports are set up for InfiniBand RDMA.
Description: No NICs and ports are set up for InfiniBand RDMA.
Cause: The user did not configure verbsPorts by using the mmchconfig command.
User Action: Set up the NICs and ports to use with the verbsPorts setting in the mmchconfig command.
ib_rdma_ports_wrong STATE_CHANGE ERROR no Message: verbsPorts is incorrectly set for InfiniBand RDMA.
Description: The verbsPorts setting has an incorrect value.
Cause: The user incorrectly configured verbsPorts by using the mmchconfig command.
User Action: Check the format of the verbsPorts setting in the output of the mmlsconfig command.
ib_rdma_state_ok STATE_CHANGE INFO no Message: VERBS RDMA was started.
Description: No RDMA device reports an IBM Storage Scale internal error state.
Cause: N/A
User Action: N/A
ib_rdma_verbs_failed STATE_CHANGE ERROR no Message: VERBS RDMA was not started.
Description: IBM Storage Scale cannot start VERBS RDMA.
Cause: The InfiniBand RDMA-related libraries are improperly installed or configured.
User Action: Check '/var/adm/ras/mmfs.log.latest' for the root cause hints. Also, check whether all relevant InfiniBand libraries are installed and correctly configured.
ib_rdma_verbs_started STATE_CHANGE INFO no Message: VERBS RDMA was started.
Description: IBM Storage Scale started VERBS RDMA.
Cause: N/A
User Action: N/A
many_tx_errors STATE_CHANGE ERROR FTDC upload Message: NIC {0} had many TX errors since the last monitoring cycle.
Description: The network adapter had many TX errors since the last monitoring cycle.
Cause: The '/proc/net/dev' file lists significantly more TX errors for this adapter than in the previous monitoring cycle.
User Action: Check the network cabling and network infrastructure.
network_connectivity_down STATE_CHANGE ERROR no Message: NIC {0} cannot connect to the gateway.
Description: This network adapter cannot connect to the gateway.
Cause: The gateway does not respond to the connection-checking packets that were sent.
User Action: Check the network configuration of the network adapter, path to the gateway, and gateway itself.
network_connectivity_up STATE_CHANGE INFO no Message: NIC {0} can connect to the gateway.
Description: This network adapter can connect to the gateway.
Cause: N/A
User Action: N/A
network_down STATE_CHANGE ERROR no Message: NIC {0} is down.
Description: This network adapter is down.
Cause: This network adapter is disabled.
User Action: Enable this network adapter.
network_found INFO_ADD_ENTITY INFO no Message: NIC {0} was found.
Description: A new network adapter was found.
Cause: A new NIC, which is relevant for the IBM Storage Scale monitoring, is listed by ip a.
User Action: N/A
network_ips_down STATE_CHANGE ERROR no Message: No relevant NICs detected.
Description: No relevant network adapters detected.
Cause: No network adapters have the IBM Storage Scale-relevant IPs.
User Action: Find out why the IBM Storage Scale-relevant IPs were not assigned to any NICs.
network_ips_partially_down STATE_CHANGE ERROR no Message: Some relevant IPs are not served by found NICs: {0}.
Description: Some relevant IPs are not served by network adapters.
Cause: At least one IBM Storage Scale-relevant IP is not assigned to a network adapter.
User Action: Find out why the specified IBM Storage Scale-relevant IPs were not assigned to any NICs.
network_ips_up STATE_CHANGE INFO no Message: Relevant IPs are served by found NICs.
Description: Relevant IPs are served by network adapters.
Cause: N/A
User Action: N/A
network_link_down STATE_CHANGE ERROR no Message: Physical link of the NIC {0} is down.
Description: The physical link of this adapter is down.
Cause: The 'LOWER_UP' flag is not set for this NIC in the output of ip a.
User Action: Check the network cabling and network infrastructure.
network_link_up STATE_CHANGE INFO no Message: Physical link of the NIC {0} is up.
Description: The physical link of this adapter is up.
Cause: N/A
User Action: N/A
network_up STATE_CHANGE INFO no Message: NIC {0} is up.
Description: This network adapter is up.
Cause: N/A
User Action: N/A
network_vanished INFO_DELETE_ENTITY INFO no Message: NIC {0} vanished.
Description: One of the network adapters cannot be detected anymore.
Cause: One of the previously monitored NICs is not listed by ip a anymore.
User Action: N/A
nic_firmware_not_available STATE_CHANGE WARNING no Message: The expected firmware level of adapter {id} is not available.
Description: The expected firmware level is not available.
Cause: /usr/lpp/mmfs/bin/tslshcafirmware -Y does not return any expected firmware level for this adapter.
User Action: Run /usr/lpp/mmfs/bin/tslshcafirmware -Y and check whether it works as expected and returns a firmware level in the expectedFirmware field. This command uses /usr/lpp/mmfs/updates/latest/firmware/hca/FirmwareInfo.hca, which is provided with the ECE packages. Check whether this file is available and accessible.
nic_firmware_ok STATE_CHANGE INFO no Message: The adapter {id} has the expected firmware level {0}.
Description: The adapter firmware level is as expected.
Cause: N/A
User Action: N/A
nic_firmware_unexpected STATE_CHANGE WARNING no Message: The adapter {id} has firmware level {0} and not the expected firmware level {1}.
Description: The adapter firmware level is not as expected.
Cause: N/A
User Action: N/A
no_tx_errors STATE_CHANGE INFO no Message: NIC {0} had no or a tiny number of TX errors.
Description: The NIC had no or an insignificant number of TX errors.
Cause: N/A
User Action: N/A
rdma_roce_cma_tos TIP TIP no Message: NIC {id} The CMA type of service class is not set to the recommended value.
Description: The CMA type of service class is not set to the recommended value.
Cause: The CMA type of service class is not set to the recommended value.
User Action: Check the settings of the specified InfiniBand RDMA NIC by using the cma_roce_tos command, and check the system health monitor configuration file (mmsysmonitor.conf).
rdma_roce_cma_tos_ok STATE_CHANGE INFO no Message: NIC {id} The CMA type of service class is set to the recommended value.
Description: The CMA type of service class is set to the recommended value.
Cause: N/A
User Action: N/A
rdma_roce_mtu_low TIP TIP no Message: NIC {id} The actual MTU size is less than the maximum MTU size.
Description: The actual MTU size is less than the maximum MTU size.
Cause: The actual MTU size is less than the maximum MTU size.
User Action: Check the MTU settings of the specified NIC by using the 'ibv_devinfo' command.
rdma_roce_mtu_ok STATE_CHANGE INFO no Message: NIC {id} The actual MTU size is OK.
Description: The actual MTU size is set to the maximum MTU size.
Cause: N/A
User Action: N/A
rdma_roce_pfc_prio_buffer_bad STATE_CHANGE WARNING no Message: NIC {id} The PFC buffer priority class is not set to the recommended value.
Description: The PFC buffer priority class is not set to the recommended value, which might lead to a significant decrease in performance.
Cause: The PFC buffer priority class is not set to the recommended value.
User Action: Check the settings of the specified InfiniBand RDMA NIC by using the mlnx_qos command, and check the system health monitor configuration file (mmsysmonitor.conf).
rdma_roce_pfc_prio_buffer_ok STATE_CHANGE INFO no Message: NIC {id} The PFC buffer priority class is set to the recommended value.
Description: The PFC buffer priority class is set to the recommended value.
Cause: N/A
User Action: N/A
rdma_roce_pfc_prio_enabled_bad STATE_CHANGE WARNING no Message: NIC {id} The enabled PFC priority class is not set to the recommended value.
Description: The enabled PFC priority class is not set to the recommended value, which might lead to a significant decrease in performance.
Cause: The enabled PFC priority class is not set to the recommended value.
User Action: Check the settings of the specified NIC (mlnx_qos) and the system health monitor configuration file (mmsysmonitor.conf).
rdma_roce_pfc_prio_enabled_ok STATE_CHANGE INFO no Message: NIC {id} The enabled PFC priority class is set to the recommended value.
Description: The enabled PFC priority class is set to the recommended value.
Cause: N/A
User Action: N/A
rdma_roce_qos_prio_trust STATE_CHANGE WARNING no Message: NIC {id} The RoCE QoS value for trust is not set to 'dscp'.
Description: The RoCE QoS setting for trust is not set to 'dscp', which might lead to a significant decrease in performance.
Cause: The RoCE QoS setting for trust is not set to 'dscp'.
User Action: Check the settings of the specified RoCE NIC by using the 'mlnx_qos' command.
rdma_roce_qos_prio_trust_dscp STATE_CHANGE INFO no Message: NIC {id} The RoCE QoS setting for trust is set to 'dscp'.
Description: The RoCE QoS setting for trust is set to 'dscp'.
Cause: N/A
User Action: N/A
rdma_roce_tclass TIP TIP no Message: NIC {id} The traffic class is not set to the recommended value.
Description: The traffic class is not set to the recommended value.
Cause: The traffic class is not set to the recommended value.
User Action: Check the settings of the specified InfiniBand RDMA NIC in the /sys/class/infiniband/<interface>/tc/1/traffic_class file, and check the system health monitor configuration file (mmsysmonitor.conf).
rdma_roce_tclass_ok STATE_CHANGE INFO no Message: NIC {id} The traffic class is set to the recommended value.
Description: The traffic class is set to the recommended value.
Cause: N/A
User Action: N/A
smc_active_sockets_error STATE_CHANGE WARNING no Message: Some SMC connections are using the TCP mode.
Description: Some SMC connections between GPFS daemons are using the TCP mode.
Cause: The smcss command detected that some SMC connections use the TCP mode.
User Action: Run the smcss command to check for the error codes. Run the tssmcdnodeverify and mmnetverify smcd commands for SMC-D diagnostics.
smc_active_sockets_ok STATE_CHANGE INFO no Message: All connections are using the SMC mode.
Description: All connections between GPFS daemons are using the SMC mode.
Cause: N/A
User Action: N/A
smc_d_many_rx_buffer_errors STATE_CHANGE WARNING no Message: SMC-D had many RX buffer full errors.
Description: SMC-D had a large number of RX buffer full errors.
Cause: The smcd stat command detected a large number of RX buffer full errors.
User Action: Check SMC-D RX buffer sizes by using the smcd stat command. Run the mmchconfig command to increase the socketRcvBufferSize value.
smc_d_many_tcp_fallback STATE_CHANGE WARNING no Message: SMC-D had many TCP fallbacks.
Description: SMC-D had a large number of TCP fallbacks.
Cause: The smcd stat command detected a large number of TCP fallbacks.
User Action: Run the tssmcdnodeverify and mmnetverify smcd commands for SMC-D diagnostics. Restart the problem nodes reported by the mmnetverify smcd command.
smc_d_many_tx_buffer_errors STATE_CHANGE WARNING no Message: SMC-D had many TX buffer full errors.
Description: SMC-D had a large number of TX buffer full errors.
Cause: The smcd stat command detected a large number of TX buffer full errors.
User Action: Check SMC-D TX buffer sizes by using the smcd stat command. Run the mmchconfig command to increase the socketSndBufferSize value.
smc_d_no_rx_buffer_errors STATE_CHANGE INFO no Message: SMC-D had no or a tiny number of RX buffer full errors.
Description: SMC-D had no or an insignificant number of RX buffer full errors.
Cause: N/A
User Action: N/A
smc_d_no_tcp_fallback STATE_CHANGE INFO no Message: SMC-D had no or a tiny number of TCP fallbacks.
Description: SMC-D had no or an insignificant number of TCP fallbacks.
Cause: N/A
User Action: N/A
smc_d_no_tx_buffer_errors STATE_CHANGE INFO no Message: SMC-D had no or a tiny number of TX buffer full errors.
Description: SMC-D had no or an insignificant number of TX buffer full errors.
Cause: N/A
User Action: N/A
smc_d_v1 TIP TIP no Message: SMC-D Version 1, the recommended version is 2.
Description: SMC-D v1 is detected, but the recommended version is 2.
Cause: The smcd info command detected SMC-D version 1.
User Action: Upgrade the OS to RHEL 8.9, RHEL 9.3, SLES 15 SP5, or a later version.
smc_d_v2 TIP INFO no Message: SMC-D Version 2.
Description: Shared Memory Communication Direct (SMC-D) v2 is available.
Cause: N/A
User Action: N/A
smc_disabled STATE_CHANGE INFO no Message: SMC is disabled.
Description: Shared Memory Communication (SMC) is disabled.
Cause: N/A
User Action: N/A
smc_enabled STATE_CHANGE INFO no Message: SMC is enabled.
Description: Shared Memory Communication (SMC) is enabled.
Cause: N/A
User Action: N/A
smc_no_active_sockets STATE_CHANGE INFO no Message: No active SMC connections found.
Description: The node has no active SMC connections to other nodes at the moment.
Cause: N/A
User Action: N/A
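
Two of the causes in the table refer to standard Linux interfaces: the many_tx_errors event compares TX error counters from /proc/net/dev between monitoring cycles, and the network_link_down event checks for the 'LOWER_UP' flag in the output of ip a. The following Python sketch shows how such checks could be reproduced manually when you investigate these events. It is not the implementation that the system health monitor uses. The function names, the sample data, and the threshold of 100 errors are assumptions made for illustration only.

```python
# Illustrative sketch only; not the IBM Storage Scale monitoring code.
# Function names, sample data, and the threshold are hypothetical.

SAMPLE_BEFORE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:  1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 50000     400    0    0    0     0          0         0    42000     380    3    0    0     0       0          0
"""

SAMPLE_AFTER = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:  2000      20    0    0    0     0          0         0     2000      20    0    0    0     0       0          0
  eth0: 90000     900    0    0    0     0          0         0    88000     900  250    0    0     0       0          0
"""


def tx_errors(proc_net_dev_text):
    """Return {interface: tx_errs} parsed from /proc/net/dev content."""
    counters = {}
    for line in proc_net_dev_text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, stats = line.split(":", 1)
        fields = stats.split()
        # Eight receive columns come first; the transmit columns follow as
        # bytes, packets, errs, ... so the TX error count is at index 10.
        counters[name.strip()] = int(fields[10])
    return counters


def nics_with_many_tx_errors(previous, current, threshold=100):
    """List NICs whose TX error counter grew by more than the threshold."""
    return sorted(
        nic for nic, errs in current.items()
        if errs - previous.get(nic, errs) > threshold
    )


def link_is_up(ip_link_line):
    """Check for the LOWER_UP flag in one interface line of ip output."""
    _, _, rest = ip_link_line.partition("<")
    flags = rest.partition(">")[0].split(",")
    return "LOWER_UP" in flags


if __name__ == "__main__":
    before = tx_errors(SAMPLE_BEFORE)
    after = tx_errors(SAMPLE_AFTER)
    print(nics_with_many_tx_errors(before, after))
    print(link_is_up("2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500"))
```

On a live node, the two samples could be replaced by the content of /proc/net/dev read at two points in time, and the line that is passed to link_is_up could be taken from the ip a output for the NIC that is named in the event message.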