A fix is available
APAR status
Closed as program error.
Error description
The system hangs because the unresponsive processor handler is prevented from generating ABENDMCW002 by a Vary Proc Lock hang.
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All users of z/VM.                           *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
****************************************************************
* RECOMMENDATION: APPLY PTF                                    *
****************************************************************

z/VM OPERATOR console shows repeated MSHCPMPG9152E messages for the same unresponsive processor and the system stays up but becomes unusable because it is hung. If Monitor was running then it probably stopped generating sample records around the time of the first MSHCPMPG9152E message.

When the unresponsive processor detection front-end (HCPMPGUP) finds an unresponsive processor, it stacks a call to the unresponsive processor recovery (HCPMPCPR) to determine whether the processor can be restarted to recover from the error, or to generate ABENDMCW002 and restart the system. That process requires the Vary Processor Lock (HCPRCCVA). A lock hang on that lock will prevent the recovery action from being performed and the system will eventually hang.

There are cases where the unresponsive processor may prevent a task that holds HCPRCCVA from completing. In that case HCPRCCVA is never released and the unresponsive processor recovery is not able to run and generate ABENDMCW002 and restart the system. When the unresponsive processor recovery is blocked because of an HCPRCCVA hang, the system hangs and becomes unusable. The customer impact of the outage is greater because the system is essentially unavailable while hung even though it is running.

The problem of the HCPRCCVA hang preventing the unresponsive processor detection function from running is not the cause of the outage. The cause of the outage is the function that caused a processor to become unresponsive. However, the hang on the HCPRCCVA lock prevents the unresponsive processor recovery from running and also prevents the generation of ABENDMCW002.
Without a dump of the system it is not possible to determine the cause of the unresponsive processor. This APAR addresses system availability and FFDC (first-failure data capture) aspects of the problem.
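The failure mode described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not CP code: the class VaryLock, the function recovery_without_fix, the tick count, and the returned strings are all hypothetical names chosen for the example. It assumes that recovery simply keeps waiting for the lock, which is how the summary describes the behavior before the fix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VaryLock:
    """Toy stand-in for the Vary Processor Lock (HCPRCCVA)."""
    holder: Optional[str] = None

    def try_acquire(self, task: str) -> bool:
        # The lock can only be taken when nobody holds it.
        if self.holder is None:
            self.holder = task
            return True
        return False


def recovery_without_fix(lock: VaryLock, holder_can_finish: bool,
                         max_ticks: int = 5) -> str:
    """Hypothetical model of recovery that just waits for the lock.

    holder_can_finish=False models a lock holder that is itself stuck
    behind the unresponsive processor, so the lock is never released.
    """
    for _ in range(max_ticks):
        if holder_can_finish:
            lock.holder = None  # the holder completes and releases the lock
        if lock.try_acquire("recovery"):
            return "recovery ran"  # restart the processor or ABENDMCW002
    return "hung"  # recovery never runs, so no abend and no dump


stuck = recovery_without_fix(VaryLock(holder="vary_task"), holder_can_finish=False)
normal = recovery_without_fix(VaryLock(holder="vary_task"), holder_can_finish=True)
print(stuck, "/", normal)
```

In the stuck case the model never reaches the recovery step, which corresponds to a system that stays up but is unusable and produces no dump.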
Problem conclusion
The APAR fix adds the capability to detect a Vary Proc Lock (HCPRCCVA) hang in unresponsive processor recovery prior to where it attempts to acquire HCPRCCVA. If it detects a lock hang on HCPRCCVA then an ABENDMPC008 dump is generated. This satisfies the FFDC concern by generating a dump as close to the point of failure as is reasonable. The dump allows the cause of the unresponsive processor to be diagnosed. The fix also improves availability by detecting a permanent error closer to the point it occurs and forcing an abend dump and re-IPL rather than allowing the system to remain up in an unusable state.

Changed parts:
- HCPRCC ASSEMBLE
- HCPMPC ASSEMBLE
- HCPLCK ASSEMBLE
- HCPSGP ASSEMBLE
- HCPMTC ASSEMBLE
- HCPCCF ASSEMBLE

SRL changes:

GC24-6270-01 CP Messages and Codes - z/VM Version 7 Release 1
- Page 107 - add the MPC008 abend information.
- This is Chapter 2. System Codes - CP Abend Codes

GC24-6177-12 CP Messages and Codes - z/VM Version 6 Release 4
- Page 86 - add the MPC008 abend information.
- This is Chapter 2. System Codes - CP Abend Codes - Abend Codes A - M

MPC008
Explanation: This module is distributed as object code only; therefore, no source program materials are available.
User response: See z/VM: Diagnosis Guide for information on gathering the documentation you need to assist IBM in diagnosing the problem; then contact your IBM Support Center personnel.
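The ordering the fix introduces (check for a lock hang first, only then try to take the lock) might be sketched as follows. This is a hedged illustration in Python, not the CP implementation: the APAR does not say how a hang on HCPRCCVA is recognized, so the hold-time threshold, HANG_THRESHOLD_SECONDS, VaryLockState, and both function names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical value; the APAR does not document the real detection criterion.
HANG_THRESHOLD_SECONDS = 60.0


@dataclass
class VaryLockState:
    """Toy snapshot of the Vary Processor Lock (HCPRCCVA)."""
    holder: Optional[str] = None
    held_for_seconds: float = 0.0


def lock_appears_hung(state: VaryLockState,
                      threshold: float = HANG_THRESHOLD_SECONDS) -> bool:
    # Assumption: a lock held at least this long is treated as hung.
    return state.holder is not None and state.held_for_seconds >= threshold


def unresponsive_processor_recovery(state: VaryLockState) -> str:
    """Check for a lock hang before attempting to acquire the lock."""
    if lock_appears_hung(state):
        # Force a dump and re-IPL instead of waiting forever.
        return "ABENDMPC008"
    if state.holder is not None:
        return "lock busy, retry later"
    state.holder = "recovery"
    return "lock acquired, proceed with recovery"


print(unresponsive_processor_recovery(VaryLockState("vary_task", 300.0)))
print(unresponsive_processor_recovery(VaryLockState("vary_task", 1.0)))
print(unresponsive_processor_recovery(VaryLockState()))
```

The point of the sketch is the order of the checks: a hung lock leads to an abend and dump near the point of failure, instead of a recovery task that waits indefinitely.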
Temporary fix
FOR RELEASE VM/ESA CP/ESA R640 :
PREREQ: VM65988 VM66105
CO-REQ: NONE
IF-REQ: NONE

FOR RELEASE VM/ESA CP/ESA R710 :
PREREQ: NONE
CO-REQ: NONE
IF-REQ: NONE
Comments
APAR Information
APAR number
VM65776
Reported component name
VM CP
Reported component ID
568411202
Reported release
640
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2019-04-16
Closed date
2019-06-13
Last modified date
2020-12-16
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UM35376 UM35377
Modules/Macros
HCPCCF HCPLCK HCPMPC HCPMTC HCPRCC HCPSGP
Fix information
Fixed component name
VM CP
Fixed component ID
568411202
Applicable component levels
Fix is available
Document Information
Modified date:
12 January 2021