Performance metrics for switches
Monitor the performance metrics that are collected for switches, switch ports, and trunks.
Performance metrics for switches are divided into the following categories:
Attention: To improve the troubleshooting experience for switches, two new switch port metrics, Total Physical Port Error Rate and Total Logical Port Error Rate, are available starting with the 2Q22 update of IBM Storage Insights. With these metrics, you can quickly identify a specific switch port that has a high physical or logical error rate. The two new metrics are listed in Table 1.
Metric | Description |
---|---|
Bandwidth Percentage (Send) | The percentage of the port bandwidth that is used for send operations. This value is an indicator of port bandwidth usage that is based on the speed of the port. |
Bandwidth Percentage (Receive) | The percentage of the port bandwidth that is used for receive operations. This value is an indicator of port bandwidth usage that is based on the speed of the port. |
Bandwidth Percentage (Overall) | The percentage of the port bandwidth that is used for send and receive operations. This value is an indicator of port bandwidth usage that is based on the speed of the port. |
Data Rate (Send) | The average rate at which data is sent by the port. A send operation is a read operation that is processed, or a write operation that is initiated by the port. The rate is measured in MiB per second. |
Data Rate (Receive) | The average rate at which data is received by the port. A receive operation is a write operation that is processed, or a read operation that is initiated by the port. The rate is measured in MiB per second. |
Data Rate (Total) | The average rate at which data is transferred through the port. The rate is measured in MiB per second and includes both send and receive operations. |
Total Physical Port Error Rate (cnt/s) | The sum of all the physical error rates, such as Error Frames, CRC Errors, Short Frames, and Link Failures, that are detected on the switch port. |
Total Logical Port Error Rate (cnt/s) | The sum of all the logical error rates, such as F-BSY Frames, F-RJT Frames, Discarded Frames, and Encoding Disparity, that are detected on the switch port. |
Total Port Error Rate | The average number of times per second that an error was detected on the port. This rate is a summation of all the other error rates for the port. |
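
The two total error rates are described as sums of component rates, which can be sketched as follows. This is a minimal illustration, not the product's implementation: the dictionary keys, port names, and sample values are hypothetical, the component lists contain only the examples named in the table, and including F-RJT frames among the logical components is an assumption.

```python
# Hypothetical component names, taken from the examples in the table.
# The product may sum additional components that are not listed here.
PHYSICAL_COMPONENTS = (
    "error_frame_rate",
    "crc_error_rate",
    "short_frame_rate",
    "link_failure_rate",
)
LOGICAL_COMPONENTS = (
    "f_bsy_frame_rate",
    "f_rjt_frame_rate",  # assumption: F-RJT frames count as logical errors
    "discarded_frame_rate",
    "encoding_disparity_rate",
)


def total_rate(port_rates, components):
    """Sum the named component rates (cnt/s); missing components count as 0."""
    return sum(port_rates.get(name, 0.0) for name in components)


def noisiest_port(ports, components):
    """Return the name of the port with the highest summed rate."""
    return max(ports, key=lambda name: total_rate(ports[name], components))


# Made-up sample data for two ports, in counts per second.
ports = {
    "port_1": {
        "error_frame_rate": 0.5,
        "crc_error_rate": 0.25,
        "f_bsy_frame_rate": 2.0,
    },
    "port_2": {
        "short_frame_rate": 0.125,
        "link_failure_rate": 0.125,
        "discarded_frame_rate": 4.0,
        "encoding_disparity_rate": 0.5,
    },
}

worst_physical = noisiest_port(ports, PHYSICAL_COMPONENTS)
worst_logical = noisiest_port(ports, LOGICAL_COMPONENTS)
```

Ranking ports by each total separately helps distinguish a port with a likely cabling or SFP problem (physical) from one that is affected by fabric-level delivery problems (logical).
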
Metric | Description |
---|---|
Port Frame Rate (Send) | The average number of frames per second that are sent by the port. |
Port Frame Rate (Receive) | The average number of frames per second that are received by the port. |
Port Frame Rate (Total) | The average number of frames per second that are transferred. This value includes frames that are sent and received by the port. |
Metric | Description |
---|---|
Bad EOF CRC Error Rate1 | The average number of frames per second that are received with a cyclic redundancy check (CRC) error and a bad end-of-frame (EOF) delimiter. |
CRC Error Rate | The average number of frames per second that are received in which a cyclic redundancy check (CRC) error is detected. |
Discarded Class 3 Frame Rate | The average number of class 3 frames per second that are discarded. |
Error Frame Rate1 | The average number of error frames per second that are received. An error frame is a frame that violates the Fibre Channel Protocol. |
F-BSY Frame Rate2 | The average number of F-BSY frames per second that are generated. An F-BSY frame is issued by the fabric to indicate that a frame cannot be delivered because the fabric or destination N_port is busy. |
F-RJT Frame Rate2 | The average number of F-RJT frames per second that are generated. An F-RJT frame is issued by the fabric to indicate that delivery of a frame was denied. |
Long Frame Rate | The average number of frames that are received per second that are longer than 2140 octets. This number excludes start-of-frame bytes and end-of-frame bytes. The 2140-octet limit is calculated based on the assumption that a frame has 24 header bytes, 4 CRC bytes, and 2112 data bytes. |
Short Frame Rate2 | The average number of frames that are received per second that are shorter than 28 octets. This number excludes start-of-frame bytes and end-of-frame bytes. The 28-octet limit is calculated based on the assumption that a frame has 24 header bytes and 4 CRC bytes. |
Notes:
|
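
The long-frame and short-frame limits in the table follow directly from the stated frame layout. The sketch below reproduces that arithmetic and classifies a frame by its length. The function and constant names are hypothetical and are not part of any product API.

```python
# Frame layout assumed by the table (start-of-frame and end-of-frame
# bytes are excluded from all lengths).
HEADER_BYTES = 24
CRC_BYTES = 4
MAX_DATA_BYTES = 2112

SHORT_FRAME_LIMIT = HEADER_BYTES + CRC_BYTES                  # 28 octets
LONG_FRAME_LIMIT = HEADER_BYTES + CRC_BYTES + MAX_DATA_BYTES  # 2140 octets


def classify_frame_length(octets):
    """Classify a frame length as 'short', 'long', or 'ok'.

    A frame is short if it is shorter than 28 octets and long if it is
    longer than 2140 octets; both limits themselves are valid lengths.
    """
    if octets < SHORT_FRAME_LIMIT:
        return "short"
    if octets > LONG_FRAME_LIMIT:
        return "long"
    return "ok"
```
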
Metric | Description |
---|---|
Class 3 Receive Timeout Frame Rate1 | The average number of class 3 frames per second that were discarded after reception because of a timeout condition. The timeout condition occurs while a transmitting port waits for buffer credit from a port at the other end of the fibre. When you troubleshoot a SAN, use this metric to help identify port conditions that might slow the performance of the resources to which those ports are connected. |
Class 3 Send Timeout Frame Rate1 | The average number of class 3 frames per second that were discarded before transmission because of a timeout condition. The timeout condition occurs while the switch or port waits for buffer credit from the receiving port at the other end of the fibre. When you troubleshoot a SAN, use this metric to help identify port conditions that might slow the performance of the resources to which those ports are connected. |
Credit Recovery Link Reset Rate | The estimated average number of link resets per second that a switch or port completed to recover buffer credits. This estimate attempts to disregard link resets that were caused by link initialization. When you troubleshoot a SAN, use this metric to help identify port conditions that might slow the performance of the resources to which those ports are connected. |
Discarded Frame Rate1 | The average number of frames per second that are discarded because host buffers are unavailable for the port. |
Link Reset Received Rate | The average number of times per second that the port changes from an active (AC) state to a Link Recovery (LR2) state. |
Link Reset Transmitted Rate | The average number of times per second that the port changes from an active (AC) state to a Link Recovery (LR1) state. |
Port Congestion Index | The estimated degree to which frame transmission was delayed due to a lack of buffer credits. This value is generally 0 - 100. The value 0 means there was no congestion. The value can exceed 100 if the buffer credit exhaustion persisted for an extended amount of time. When you troubleshoot a SAN, use this metric to help identify port conditions that might slow the performance of the resources to which those ports are connected. |
Zero Buffer Credit Percentage | The amount of time, as a percentage, that the port was not able to send frames between ports because of insufficient buffer-to-buffer credit. The time is measured from the last time that metadata was collected. In Fibre Channel technology, buffer-to-buffer credit is used to control the flow of frames between ports. |
Zero Buffer Credit Rate | The average number of Zero Buffer Credit conditions per second that occurred. A Zero Buffer Credit condition occurs when a port is unable to send frames because of a lack of buffer credit since the last node reset. When you troubleshoot a SAN, use this metric to help identify port conditions that might slow the performance of the resources to which those ports are connected. |
Notes:
|
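
As a rough illustration of how a time-based percentage such as Zero Buffer Credit Percentage can be read, the sketch below divides the time spent without buffer credit by the length of the collection interval. The input names are hypothetical, and how the product actually measures the time without credit is not described here.

```python
def zero_buffer_credit_percentage(zero_credit_seconds, interval_seconds):
    """Return the share of the interval, in percent, spent at zero credit.

    zero_credit_seconds: hypothetical input, the time within the interval
        during which the port could not send because it had no
        buffer-to-buffer credit.
    interval_seconds: the time since metadata was last collected.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return 100.0 * zero_credit_seconds / interval_seconds


# Example: 15 seconds without credit in a 300-second interval.
example_percentage = zero_buffer_credit_percentage(15, 300)
```
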
Metric | Description |
---|---|
Encoding Disparity | The average number of disparity errors per second that are received. |
Invalid Link Transmission Rate | The average number of times per second that an invalid transmission word was detected by the port while the link did not experience any signal or synchronization loss. |
Invalid Word Transmission Rate | The average number of bit errors per second that are detected. |
Link Failure Rate | The average number of miscellaneous Fibre Channel link errors per second for ports. Link errors might occur when an unexpected Not Operational Sequence (NOS) is received or a link state machine failure is detected. |
Loss of Signal Rate | The average number of times per second that the port lost communication with its partner port. These types of errors usually indicate physical link problems that are caused by faulty SFP modules or cables, or by faulty connections at the switch or patch panel. However, in some cases, this error can also occur when the maximum link distance between ports is exceeded for the type of connecting cable and light source. |
Loss of Sync Rate | The average number of times per second that the port lost synchronization with its partner port. These types of errors usually indicate physical link problems that are caused by faulty SFP modules or cables, or by faulty connections at the switch or patch panel. However, in some cases, this error can also occur because of mismatched port speeds between the partner ports when auto-negotiation of link speed is disabled. |
Primitive Sequence Protocol Error Rate | The average number of primitive sequence protocol errors per second that are detected. This error occurs when there is a link failure for a port. |
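
Most metrics in these tables are averages per second. One common way to derive such a rate, shown here as a sketch, is to take two samples of a cumulative error counter and divide the difference by the elapsed time. This is an assumption about how the rates could be computed, not a description of the product's collection logic. The function name and the counter-reset handling are hypothetical.

```python
def average_rate(previous_count, current_count, elapsed_seconds):
    """Return the average number of events per second between two samples.

    Assumes that the counter is cumulative. If the current value is lower
    than the previous one, the counter is assumed to have been reset to 0
    between the samples, so only the current value is counted.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = current_count - previous_count
    if delta < 0:
        delta = current_count  # assumption: counter restarted from 0
    return delta / elapsed_seconds


# Example: a loss-of-sync counter that rises from 100 to 160 in 60 seconds.
loss_of_sync_rate = average_rate(100, 160, 60)
```
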
Metric | Description |
---|---|
Link Quality Percentage | The percentage is based on whether the port is an expansion port (E_port) or a fabric port (F_port), and on the numbers and types of errors that are detected by the port. |
Port Frame Size (Overall) | The average frame transfer size. This value is measured in KiB and includes frames that are sent and frames that are received by the port. |
Port Frame Size (Receive) | The average size of a frame, in KiB, that is received by the port. |
Port Frame Size (Send) | The average size of a frame, in KiB, that is sent through the port. |
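
The frame size metrics relate to the data rate and frame rate metrics on this page: an average frame size can be estimated by dividing a data rate by the matching frame rate. The sketch below assumes that relationship and converts MiB per second to KiB. The function name is hypothetical, and the product might compute the value differently.

```python
KIB_PER_MIB = 1024


def average_frame_size_kib(data_rate_mib_per_s, frame_rate_per_s):
    """Estimate the average frame size in KiB.

    data_rate_mib_per_s: a data rate in MiB per second (send, receive,
        or total).
    frame_rate_per_s: the matching frame rate in frames per second.
    Returns 0.0 when no frames were transferred.
    """
    if frame_rate_per_s == 0:
        return 0.0
    return data_rate_mib_per_s * KIB_PER_MIB / frame_rate_per_s


# Example: 100 MiB/s carried by 51,200 frames per second.
example_frame_size = average_frame_size_kib(100, 51200)
```

An average close to 2 KiB suggests mostly full-size frames, while a much smaller value suggests that the traffic consists mainly of small frames.
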