IBM Support

FCP_ERR4 "dropped or damaged frame" errors on VIOS for NPIV client's virtual WWPN

Question & Answer


Question

What is the cause of the following FCP_ERR4 error on VIOS?
LABEL:           FCP_ERR4
IDENTIFIER:      4B436A3D
Date/Time:       Fri Jan 13 14:09:38 2023
Sequence Number: 44064
Machine Id:      00C102D14B00
Node Id:         <VIOS>
Class:           H
Type:            TEMP
WPAR:            Global
Resource Name:   fscsi1
Resource Class:  driver
Resource Type:   emfscsi
Location:        U78D2.001.WZS0XY9-P1-C7-T2

Description
LINK ERROR
    Recommended Actions
    PERFORM PROBLEM DETERMINATION PROCEDURES
Detail Data
SENSE DATA
0000 0020 0000 0327 0000 0000 0203 0101 1000 0010 9BDA 0DA6 0000 0000 0045 B100 ...
The "summ" diagnostics tool, run from the oem_setup_env shell, can be used to decode errors in the VIOS errlog.  In this case, the error is decoded as follows:
Jan 13 14:09:38 fscsi1     T FCP_ERR4            NPIV cmd READ(10) from CXXXXXXXXXXX000A receiving from (45XX40 500507680C126D3B) failed; lr_code (1D) poss. dropped or damaged frame  45XX01 wwpn CXXXXXXXXXXX000A
where CXXXXXXXXXXX000A is the virtual FC adapter's WWPN and 45XX40 is the remote N_Port ID.
The sample error above is from VIOS version 3.1.3.21.
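
When many decoded lines have to be compared, it can help to pull the fields out programmatically. The following is a minimal Python sketch, not part of the summ tool. The pattern is derived only from the single sample line above, so other summ versions or error types may need a different layout. The field names are illustrative, and treating 500507680C126D3B as the remote port's WWPN is an assumption.

```python
import re

# Pattern derived from the one sample line in this document (assumption:
# other summ versions may format the decoded text differently).
SUMM_FCP_ERR4 = re.compile(
    r"(?P<resource>fscsi\d+)\s+\S+\s+(?P<label>FCP_ERR4)\s+"
    r"NPIV cmd (?P<command>\S+) from (?P<virtual_wwpn>\S+) "
    r"receiving from \((?P<remote_nport_id>\S+) (?P<remote_wwpn>\S+)\) failed; "
    r"lr_code \((?P<lr_code>[0-9A-Fa-f]+)\)"
)


def parse_summ_line(line):
    """Return the decoded fields as a dict, or None if the line does not match."""
    match = SUMM_FCP_ERR4.search(line)
    return match.groupdict() if match else None


sample = (
    "Jan 13 14:09:38 fscsi1     T FCP_ERR4            NPIV cmd READ(10) from "
    "CXXXXXXXXXXX000A receiving from (45XX40 500507680C126D3B) failed; "
    "lr_code (1D) poss. dropped or damaged frame  45XX01 wwpn CXXXXXXXXXXX000A"
)
fields = parse_summ_line(sample)
print(fields)
```

Lines that do not match the pattern return None, so unrelated errlog entries can be skipped safely.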

Cause

The error is commonly due to an issue with the fibre cable, the switch port, or the GBIC/SFP.
The error means that some component on the physical fscsi# path is dropping or damaging fibre channel frames as they come back from the storage to the VIOS and on to the NPIV client LPAR.
In other words, the error indicates a faulty link or component on the physical path between the VIOS and the storage.  When frames are dropped or damaged on the way back from the storage to the host, performance degrades on the client LPAR(s) using this physical path, because the LPAR is forced to cancel any I/O that fails with this type of error and then redrive it on a different path.
In this example, all FCP_ERR4 errors were against the same physical port, fscsi1, for multiple N_Port IDs.  The fact that the physical adapter logs the dropped/damaged frames is an indication that the faulty component is external to the VIOS.  If it were an issue with the physical FC adapter port, other client LPARs mapped to the same physical adapter port would also be impacted.
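
To confirm that the errors cluster on one physical port, the detailed error log text can be tallied by resource name. The following is a minimal Python sketch that assumes the input is text in the same layout as the entry shown in the Question section (for example, saved output of errpt -a). It relies only on the "LABEL:" and "Resource Name:" fields visible in that sample. The function name and the OTHER_LABEL entry in the sample data are hypothetical.

```python
from collections import Counter


def count_by_resource(errpt_text, label="FCP_ERR4"):
    """Count detailed error-log entries with the given label per Resource Name."""
    counts = Counter()
    current_label = None
    for raw in errpt_text.splitlines():
        line = raw.strip()
        if line.startswith("LABEL:"):
            # A LABEL line starts a new entry; remember which error it is.
            current_label = line.split(":", 1)[1].strip()
        elif line.startswith("Resource Name:") and current_label == label:
            counts[line.split(":", 1)[1].strip()] += 1
    return counts


# Abbreviated, hypothetical entries; real entries carry many more fields.
sample = """\
LABEL:           FCP_ERR4
Resource Name:   fscsi1
LABEL:           FCP_ERR4
Resource Name:   fscsi1
LABEL:           OTHER_LABEL
Resource Name:   fscsi0
"""
counts = count_by_resource(sample)
print(dict(counts))
```

If every FCP_ERR4 entry lands on one fscsi# device, as in this example, that device's physical path is the place to start.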

Answer

Work with your SAN administrator to isolate the physical component at fault.
Things to consider:
  1. Check the fibre cable between the physical port (fcs1 in this example) and the switch port.  In cases where the error is logged for multiple client LPARs/virtual adapters, it can be an indication of a faulty fibre cable between the physical fcs# port and the switch.  Try replacing it. 
  2. If errors continue after that, check the switch port.  Try moving the cable between the fcs# port and the switch to a different switch port, or replace the GBIC/SFP in the switch port to which fcs# is connected.
  3. When VIOS/AIX logs these particular errors, switch logs are frequently silent on the issue.  If the frames get dropped/damaged due to a marginal or faulty SFP, the switch is often unaware of that.
    If the errors persist after items 1 and 2 have been ruled out, contact your local Hardware Support Representative to check the FC adapter port.
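
Item 1 hinges on whether several virtual adapters report the error through the same physical port. The following is a rough Python sketch of that check, assuming decoded lines in the same layout as the sample summ line in the Question section. The second virtual WWPN, the fscsi0 line, and the threshold of two adapters are hypothetical values chosen for illustration.

```python
import re
from collections import defaultdict

# Captures the physical port and the virtual WWPN that issued the command.
# Layout is assumed from the single decoded sample line in this document.
LINE = re.compile(r"(fscsi\d+)\s.*?FCP_ERR4.*?NPIV cmd \S+ from (\S+)")


def virtual_wwpns_by_port(lines):
    """Map each physical port to the set of virtual WWPNs logging FCP_ERR4."""
    ports = defaultdict(set)
    for line in lines:
        match = LINE.search(line)
        if match:
            ports[match.group(1)].add(match.group(2))
    return ports


def shared_path_suspects(lines, threshold=2):
    """Ports where at least `threshold` virtual adapters saw the error."""
    by_port = virtual_wwpns_by_port(lines)
    return sorted(p for p, wwpns in by_port.items() if len(wwpns) >= threshold)


template = (
    "Jan 13 14:09:38 {port}     T FCP_ERR4            NPIV cmd READ(10) from "
    "{wwpn} receiving from (45XX40 500507680C126D3B) failed; lr_code (1D)"
)
lines = [
    template.format(port="fscsi1", wwpn="CXXXXXXXXXXX000A"),
    template.format(port="fscsi1", wwpn="CXXXXXXXXXXX000C"),  # hypothetical
    template.format(port="fscsi0", wwpn="CXXXXXXXXXXX000A"),  # hypothetical
]
suspects = shared_path_suspects(lines)
print(suspects)
```

A port that appears in the result is a candidate for the cable, switch port, and SFP checks in items 1 and 2. It is a starting point, not a diagnosis.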

Document Information

Modified date:
28 October 2024

UID

ibm16855405