IBM Support

Partitions with IASP and TR10 or TR4 may stop responding

News


Abstract

Partitions that have an IASP and have V7R4 Technology Refresh 10 (TR10) or V7R5 Technology Refresh 4 (TR4) applied are exposed to a critical issue.
An internal counter can go negative, which causes the system to stop writing changed IASP pages to disk. This leads to pool starvation and critically slow system-wide performance, and an IPL is required to recover.

Content

Summary
Known Issue DT397753
Apply 7.4 MJ02267, then vary off and vary on any IASP.
Apply 7.5 MJ02268, then vary off and vary on any IASP.
The problem is corrected by MJ02267/MJ02268 or the superseding fixes.
Note: Superseding fixes MJ02304/MJ02305 have a TR requisite of TR11/TR5.
Details
If V7R4 TR10 or V7R5 TR4 is applied, use the following commands to verify whether the partition is exposed to the issue:
  1. Use WRKDEVD *ASP to list the ASP device descriptions on the system
  2. Use DSPASPSTS to review the current state of the ASP devices on the system
If TR10/TR4 is applied but the fixes are not installed
Apply 7.4 MJ02267 and vary off/on any IASP*.
Apply 7.5 MJ02268 and vary off/on any IASP*.
*If you cannot vary off/on one of the IASPs, a recommendation is provided in the Related Information section of this document.
 
If TR10/TR4 is applied and the fix is already installed
If the special instructions for the fix were followed (any IASP was varied off/on after the fix was applied), nothing needs to be done.
Use DSPASPSTS to confirm that the IASP was varied off after the fix was applied.
 
If TR10/TR4 is not installed
You are not exposed to the problem.
If you have already ordered TR10/TR4 but have not applied it, first ensure that the fixes are installed, then IPL to install the TR. The IPL also activates the fix.
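
The three cases above can be read as a small decision table. The following Python sketch is only an illustration of that logic, not an IBM-provided tool: the names `PartitionState` and `recommended_action` are hypothetical, the flags must be determined manually (for example with DSPASPSTS), and the sketch assumes the partition has at least one IASP.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    """Hypothetical summary of a partition, filled in by hand."""
    tr_applied: bool                     # V7R4 TR10 or V7R5 TR4 is applied
    fix_installed: bool                  # MJ02267/MJ02268 or a superseding fix is installed
    iasp_varied_after_fix: bool = False  # IASPs were varied off/on after the fix was applied
    tr_ordered: bool = False             # TR10/TR4 ordered but not yet applied


def recommended_action(state: PartitionState) -> str:
    """Map a partition state to the action described in the Details section."""
    if not state.tr_applied:
        if state.tr_ordered and not state.fix_installed:
            return "Not exposed yet: install the fix first, then IPL to install the TR."
        if state.tr_ordered:
            return "Not exposed yet: IPL to install the TR, which activates the fix."
        return "Not exposed."
    if not state.fix_installed:
        return "Exposed: apply the fix, then vary off/on any IASP."
    if not state.iasp_varied_after_fix:
        # Inferred from the special instructions: the fix needs the vary off/on.
        return "Exposed: vary off/on any IASP to complete the fix."
    return "No action needed."


if __name__ == "__main__":
    # Example: TR is applied, the fix has not been installed yet.
    print(recommended_action(PartitionState(tr_applied=True, fix_installed=False)))
```

The last exposed case (fix installed, IASP not yet varied off/on) is an inference from the special instructions rather than a case the document spells out.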
 
Potential Symptoms and Recovery
After the problem starts, you may see one or more of the following symptoms:
o an increase in fault rates
o jobs waiting in pool over-commitment
o SMPO0* tasks using CPU
If jobs are waiting in pool over-commitment, secondary symptoms could be:
o jobs hang or do not end
o jobs wait on a lock
 
Using CLRPOOL on the pool with the most faulting may improve performance.
     If all jobs in a subsystem are hanging, try using CLRPOOL on that subsystem. However, if the system is approaching a complete lockup because
     QHST is locked, this may not work, and the problem has progressed too far to recover without an IPL.
To force changed pages for an IASP to be written to disk, use the following command, where ASPDEV is the name of the IASP device:
CHGASPACT ASPDEV(ASPDEV) OPTION(*FRCWRT)
IPL the system to recover. Consider collecting a Main Storage Dump if you need evidence that you encountered this known issue.


Document Information

Modified date:
28 October 2024

UID

ibm17173542