Notification
Risk classification: IBM Z HIPER Alerts
Document number: 231101
Machine type: 8561 / 8562
Models affected: All
Abstract: HIPER MCLs released to address potential system outage(s) on Machine Types 8561 and 8562 (Update to previous IBM Z Machine Alert 231018).
Description
Based on additional evaluation and insight, IBM has designated D41C Bundles S80 & S81 as HIPER MCL releases. There are no other changes to what was previously documented in IBM Z Machine Alert 231018.
For convenience, IBM Z Machine Alert 231018 is reproduced below in its entirety:
IBM has identified two issues that may result in a complete system outage for Machine Types 8561 and 8562. With these exposures in mind, IBM strongly recommends installing the latest firmware release for D41C, which is Bundle S81.
The first exposure is in a Channel Subsystem (CSS) Recovery routine that generates an incorrect memory address when updating the Hardware System Area (HSA). A code routine could later encounter the changed memory, causing unpredictable results. The impact could be anything from software application problems to a complete system outage. While this problem is extremely rare, the CSS Recovery routine is invoked when errors occur on various channel types.
The second exposure is in the Power and Cooling Subsystem and impacts both radiator-cooled and water-cooled 8561 machine types. In a rare case, a pump fail-over to the redundant pump may be delayed, eventually leading to an over-temperature condition on Single Chip Modules (SCM). The Power subsystem will power off the affected CPC drawer to protect the hardware from physical damage, resulting in an unplanned system outage.
Note: Machine Type 8562 is not exposed to the second issue as it is air-cooled.
Fix information:
z15 Driver 41 SE-I390ML EC Stream P46601 MCL177 (Bundle S80) - HIPER MCL
z15 Driver 41 SE-POWERC EC Stream P46610 MCL 034 (Bundle S81) - HIPER MCL
/servers/resourcelink/lib03020.nsf/pages/machinealert231018?OpenDocument
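The exposure and fix details above can be summarized as a small lookup. The following is only an illustrative sketch: the names `EXPOSURES`, `FIXES`, and `applicable_fixes` are hypothetical and do not correspond to any IBM tool or API. The data reflects only what this alert states.

```python
# Illustrative sketch only: all names here are hypothetical, not an IBM API.
CSS_RECOVERY = "CSS Recovery routine / HSA update"
PUMP_FAILOVER = "Power and Cooling pump fail-over"

# Exposure by machine type, as stated in this alert.
EXPOSURES = {
    "8561": [CSS_RECOVERY, PUMP_FAILOVER],
    "8562": [CSS_RECOVERY],  # air-cooled, so not exposed to the pump issue
}

# HIPER MCL addressing each exposure, as listed under "Fix information".
FIXES = {
    CSS_RECOVERY: "z15 Driver 41 SE-I390ML EC Stream P46601 MCL177 (Bundle S80)",
    PUMP_FAILOVER: "z15 Driver 41 SE-POWERC EC Stream P46610 MCL 034 (Bundle S81)",
}


def applicable_fixes(machine_type: str) -> dict[str, str]:
    """Return a mapping of exposure to fix for a machine type in this alert."""
    if machine_type not in EXPOSURES:
        raise ValueError(f"Machine type {machine_type!r} is not covered by this alert")
    return {exposure: FIXES[exposure] for exposure in EXPOSURES[machine_type]}


if __name__ == "__main__":
    for mt in ("8561", "8562"):
        print(mt)
        for exposure, fix in applicable_fixes(mt).items():
            print(f"  {exposure}: {fix}")
```

The lookup covers both machine types, but the alert's recommendation is the same for each: install the latest release (Driver 41C / Bundle S81).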
Recommended Action
IBM strongly recommends installing the latest release of code (Driver 41C / Bundle S81) to prevent either of these impacting events. The individual fixes mentioned above will prevent any further occurrences of the two exposures. Both exposures are base z15 hardware / firmware issues, not introduced by any previously released fixes.
Unfortunately, if the HSA exposure has already occurred due to a previous CSS recovery event, this code installation will not clear the unintentionally changed HSA memory. Included with the SE-I390ML MCL in D41C / Bundle S80 is an HSA interrogation routine that will execute once during MCL activation. It is designed to identify the known areas of HSA corruption. If HSA corruption is detected, a new hardware problem will be created and a Hardware Message will be posted during the MCL installation with the following:
- Problem Statement: A hardware failure was detected in the central processor complex (CPC) or in the Channel Subsystem (CSS).
If this message is encountered, IBM Service and/or the customer should REQUEST SERVICE when viewing the hardware message; this creates a case so that support can engage. IBM Support will need to review logs to determine further actions.
If HSA corruption is detected during the MCL installation, it is very likely that a Power-On-Reset / CPC activation will be required to eliminate any exposure to the HSA problem that was described above. This will be confirmed after IBM support has analyzed the problem.
Important note: Customers with machines still running D41C / Bundle S70A or an earlier release should review the previously published z15 Machine Alert, Document number 230317, because extra time will need to be scheduled for the MCL Bundle upgrade.
Please contact your next level of support with any questions or concerns.
Reference ID
231101
Date first published
01 November 2023
Document Information
Modified date:
19 February 2025
UID
97435E9826E0E1E885258A5A0057E5EE