A fix is available
APAR status
Closed as program error.
Error description
A previously stable VSWITCH network began to exhibit poor performance, connection failures, and/or timeouts after an OSA microcode update. In one case, system performance was impacted and one processor was found to be looping through entry point HCPVQSEQ. Network traces revealed that a significant number of malformed packets were being generated and processed by the OSA Express, which appears to be a contributing factor to the problem. Malformed packets require special processing by the OSA Express and by the host device driver (CP VSWITCH logic in this case). An error in this processing leads to (1) the loss of additional valid packets when the malformed packet is discarded, and (2) a loop in HCPVQS if the unexpected buffer state is detected at the wrong point in processing. The loop in HCPVQS will disrupt connectivity through the affected OSA RDEV connection and lead to an ABEND MCW002 or ABEND MPC008.
Local fix
Apply PTF
Problem summary
****************************************************************
* USERS AFFECTED: Systems using VSWITCH in a physical network  *
*                 where malformed packets may be generated     *
*                 could notice loss of valid packets and       *
*                 (in rare cases) a system hang with a loop    *
*                 in HCPVQSEQ.                                 *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
****************************************************************
* RECOMMENDATION: APPLY PTF                                    *
****************************************************************
The OSA Express detected a malformed packet and marked the associated buffer segment with an error state (part of the protocol for that interface). A flaw in the OSA firmware resulted in a change to other buffer state indicators that was outside the defined protocol. The VSWITCH device driver routine that processes inbound data (HCPVQSEQ) could not process that buffer state, but it also could not advance to the next buffer segment. That left HCPVQS in an infinite loop. This can lead to a hang, ABEND MCW002, or ABEND MPC008. Note that a firmware update has corrected the OSA Express flaw, but the VSWITCH driver must still allow for a protocol error introduced by some future firmware upgrade and make provisions to avoid a CP failure.
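The failure mode described above can be illustrated with a small sketch. This is not CP code: the state codes, the function name, and the step cap are all hypothetical, chosen only to show how a scan that neither handles nor skips an unrecognized buffer state stops making progress.

```python
# Hypothetical buffer state codes, for illustration only; these are
# not the real OSA Express / VSWITCH buffer state values.
EMPTY, PRIMED, ERROR = 0, 1, 2

def scan_inbound_flawed(states, max_steps=1000):
    """Scan buffer states the way a driver without a fallback might.

    Returns (delivered, stuck). 'stuck' is True when the scan stopped
    advancing; the step cap exists only so this sketch terminates,
    whereas the real loop had no such guard.
    """
    i = 0
    delivered = 0
    steps = 0
    while i < len(states):
        steps += 1
        if steps > max_steps:
            return delivered, True      # would have looped forever
        state = states[i]
        if state == PRIMED:
            delivered += 1              # valid inbound data
            i += 1
        elif state == ERROR:
            i += 1                      # malformed packet, discard it
        elif state == EMPTY:
            break                       # nothing more to process
        # Any other state is neither processed nor skipped, so the
        # index never advances and the loop spins on the same buffer.
    return delivered, False
```

With a state outside the expected set in the middle of the queue, the scan delivers the buffers before it and then spins on the unrecognized one.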
Problem conclusion
HCPVQS is updated to handle buffer state tests using a strategy that will detect an invalid state (by testing all valid states). This includes logic to count a "bad" buffer as an error packet in the NICBK error count. When a protocol error (such as an unsupported buffer state) is encountered, HCPVQS will force a reset (CSCH) of the adapter.

HELP HCP2832E is modified to include this section:

  Connection device for VSWITCH SYSTEM switchname is not active
  OSA Express interface error detected.

  Explanation: The VSWITCH device driver in CP encountered an unexpected or undefined status associated with the OSA Express queue buffer. This makes it impossible to continue using the interface without resetting it to a known state.
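The strategy described here (test every valid state explicitly, treat anything else as a protocol error, count bad buffers, and reset the adapter) might be sketched as follows. This is a minimal illustration under stated assumptions, not the HCPVQS implementation: the state codes, the NicCounters class (a stand-in for the NICBK error count), and the reset_adapter callback (a stand-in for the CSCH-driven reset) are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical buffer state codes, for illustration only.
EMPTY, PRIMED, ERROR = 0, 1, 2

@dataclass
class NicCounters:
    """Stand-in for the NICBK error count (assumed shape)."""
    error_packets: int = 0

def scan_inbound(states, counters, reset_adapter):
    """Scan buffer states, testing each valid state explicitly.

    Returns (delivered, reset_forced). Any state that matches none of
    the valid states is treated as a protocol error: the scan stops
    and reset_adapter() is called instead of retrying the buffer.
    """
    delivered = 0
    for state in states:
        if state == PRIMED:
            delivered += 1                  # valid inbound data
        elif state == ERROR:
            counters.error_packets += 1     # "bad" buffer counted
        elif state == EMPTY:
            break                           # nothing more to process
        else:
            # Fell through every valid-state test: protocol error.
            # Whether this case also bumps the error count is not
            # stated in the source, so it is left uncounted here.
            reset_adapter()
            return delivered, True
    return delivered, False
```

Because every branch either advances to the next buffer or leaves the scan, an unrecognized state can no longer hold the loop on a single buffer.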
Temporary fix
FOR RELEASE VM/ESA CP/ESA R640 :
PREREQ: VM66160
CO-REQ: NONE
IF-REQ: NONE
FOR RELEASE VM/ESA CP/ESA R710 :
PREREQ: VM66219 VM66283
CO-REQ: NONE
IF-REQ: NONE
Comments
APAR Information
APAR number
VM66280
Reported component name
VM CP
Reported component ID
568411202
Reported release
710
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2019-04-22
Closed date
2019-09-27
Last modified date
2021-02-17
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UM35538 UM35539 UM35540
Modules/Macros
HCPMES HCPMESA HCPMESB HCPMXRBK HCPVQS HCP2832E
Fix information
Fixed component name
VM CP
Fixed component ID
568411202
Applicable component levels
RA64 PSY UM35812 UP21/02/17 I 1000
R640 PSY UM35539 UP19/09/30 P 2001
R710 PSY UM35540 UP19/09/30 P 2001
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
27 February 2021