A fix is available
APAR status
Closed as program error.
Error description
The Change Team found that this storage shortage is due to a buildup of ITRQP control blocks, which are created for a page chain when that chain changes while a client channel is scanning it to check for messages on the queue. When the browse completes, the current browse/get operation should free the ITRQPs; however, the operation never completes because the page chain continues to change while it is being searched. This results in the accumulation of ITRQPs on the free chain, which eventually leads to storage exhaustion (seen as the abend listed under Additional symptoms).

The root of the problem is the length of the page chain: it contains thousands of empty pages that are scanned each time, and the time taken to scan them increases the chance of the page chain changing during the scan.

The scavenger should remove any empty pages from the page chain. For an indexed queue, this is done when an application informs the scavenger that a page may be empty. Dump review shows that this process is occurring, but every so often a page is missed, leading to an accumulation of empty pages on the page chain over time.

The problem occurs when the last remaining message is got from a page by an application that uses MQGMO_MARK_SKIP_BACKOUT and the application backs out. During backout processing, the IPSE that informs the scavenger that the page may be empty is deleted (by design). However, because of mark skip backout processing, the message is never returned to the queue to be got again. This means that no new IPSE is created for that page; that is, the scavenger is never informed that the page is empty and does not deallocate it.

Under normal circumstances this is not noticeable, because the partial scavenger also scans the page chain removing empty pages, and so would locate and process these missed pages. In this case that does not occur because there is a valid message in the head page: the partial scavenger only processes empty pages at the head of the page chain.

Additional symptoms/keywords: 5C6 00E2000B ABEND5C6 ABENDS5C6 S5C6 S05C6 SP229 KEY7 leak
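The way empty pages can be missed, as described above, can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the queue manager's implementation: the `Page` class, the `scavenger_notified` flag (standing in for a pending IPSE), and every function name are hypothetical. The `notify_on_skip` switch is likewise hypothetical and models the alternative in which the scavenger is still informed when a backout is skipped.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for one page on a queue's page chain."""
    messages: int                      # messages still stored on this page
    scavenger_notified: bool = False   # models a pending IPSE for this page


def get_last_message(page, mark_skip_backout, backs_out, notify_on_skip=False):
    """Get the only remaining message on `page`, then commit or back out."""
    page.messages -= 1
    page.scavenger_notified = True          # "this page may now be empty"
    if backs_out:
        page.scavenger_notified = False     # notification is deleted during backout
        if not mark_skip_backout:
            page.messages += 1              # message returns and can be got again
        elif notify_on_skip:
            page.scavenger_notified = True  # hypothetical: inform the scavenger anyway


def full_scavenge(chain):
    """Deallocate only the empty pages the scavenger was told about."""
    return [p for p in chain if p.messages > 0 or not p.scavenger_notified]


def partial_scavenge(chain):
    """Deallocate empty pages at the head only, stopping at the first page in use."""
    first_used = next((i for i, p in enumerate(chain) if p.messages > 0), len(chain))
    return chain[first_used:]


def simulate(n_pages, mark_skip_backout, notify_on_skip=False):
    """Return the number of empty pages left on the chain after scavenging."""
    head = Page(messages=1)                 # a valid message that stays in the head page
    rest = [Page(messages=1) for _ in range(n_pages)]
    for page in rest:
        get_last_message(page, mark_skip_backout, True, notify_on_skip)
        if page.messages > 0:               # message was returned: get it again and commit
            get_last_message(page, mark_skip_backout, False, notify_on_skip)
    chain = partial_scavenge(full_scavenge([head] + rest))
    return sum(1 for p in chain if p.messages == 0)


if __name__ == "__main__":
    print(simulate(1000, mark_skip_backout=False))
    print(simulate(1000, mark_skip_backout=True))
    print(simulate(1000, mark_skip_backout=True, notify_on_skip=True))
```

In this model, an ordinary backout leaves no empty pages behind, because the returned message is eventually got again and a new notification is raised. With mark skip backout, all 1000 emptied pages stay on the chain, and the head-only partial scavenge cannot reach them while the head page holds a message. Setting `notify_on_skip=True` clears them.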
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All users of WebSphere MQ for z/OS Version 8 *
*                 Release 0 Modification 0.                    *
****************************************************************
* PROBLEM DESCRIPTION: Abend 5C6-00E2000B and abnormal queue   *
*                      manager termination due to a short on   *
*                      storage (SOS) condition due to a build  *
*                      up of ITRQP control blocks.             *
*                                                              *
*                      In some cases, empty pages are not      *
*                      freed, leading to unexpected pageset    *
*                      expansion or MQRC 2192                  *
*                      (MQRC_STORAGE_MEDIUM_FULL /             *
*                      MQRC_PAGESET_FULL) errors.              *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
While an application browsing or getting messages from a queue searches for the next available message, ITRQP control blocks are created by other tasks performing operations that could require the application to restart its search. If the application determines that it needs to restart the search (for example, because an ITRQP indicates that an uncommitted message is now available earlier on the queue, or because the page chain has been changed by the scavenger), it moves any unneeded ITRQPs to a free chain to be freed later, when the browse/get operation completes.

In rare cases, the browse/get operation can be required to restart the search multiple times, causing the number of ITRQPs on the free chain to grow. As the number of ITRQPs on the free chain increases, the time to process them increases, which increases the likelihood of needing to restart the search again. This compounds the problem: the browse/get operation never completes, so the ITRQPs on the free chain are never freed, and storage is eventually exhausted.

In the reported case, the browse/get search of the page chain was taking a long time because of a large number of empty pages on the page chain. In the time taken to scan through the empty pages, the chain would be changed by the scavenger, requiring the search to restart and so triggering the ITRQP buildup. The buildup of empty pages was due to mark skip backout (MQGMO_MARK_SKIP_BACKOUT) processing, in this case from the CICS bridge, failing to inform the scavenger when skipping the backout of the last message on a page.
Problem conclusion
CSQIATRQ is changed to free unneeded ITRQPs immediately rather than accumulating them on the free chain, preventing the buildup of ITRQPs and the resulting storage shortage. CSQIDLM1 is changed to inform the scavenger when skipping the backout of a message on an indexed queue, ensuring that the page is deallocated if it is now empty and preventing the buildup of empty pages on the page chain.
000Y CSQIDLM1 CSQIMGEF CSQIMGE3
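The restart feedback loop described in the problem summary, and the effect of freeing ITRQPs immediately, can be sketched with a deterministic toy model. Everything here is an assumption made for illustration rather than a description of the product code: the `browse` function, the tick-based timing, one ITRQP parked per restart, and all parameter names and default values are hypothetical.

```python
def browse(chain_length, free_immediately,
           change_interval=100, start_offset=50,
           cost_per_itrqp=5, max_restarts=1000):
    """Simulate one browse/get; return (completed, restarts, itrqps_held).

    Assumptions of this toy model: scanning one page costs one tick, the
    scavenger changes the page chain every `change_interval` ticks, and each
    restart leaves exactly one unneeded ITRQP behind.
    """
    held = 0                                  # ITRQPs parked on the free chain
    restarts = 0
    time_to_next_change = change_interval - start_offset
    while restarts < max_restarts:
        # One pass scans every page and also processes the parked ITRQPs.
        pass_time = chain_length + cost_per_itrqp * held
        if pass_time < time_to_next_change:
            return True, restarts, 0          # operation completes; free chain released
        restarts += 1                         # chain changed mid-scan: restart the search
        if not free_immediately:
            held += 1                         # deferred freeing: ITRQP stays parked
        time_to_next_change = change_interval
    return False, restarts, held              # still restarting when the model gave up


if __name__ == "__main__":
    print(browse(10, free_immediately=False))   # short chain, deferred freeing
    print(browse(98, free_immediately=False))   # long chain, deferred freeing
    print(browse(98, free_immediately=True))    # long chain, immediate freeing
```

With these made-up parameters, a short chain completes without restarting. A chain of 98 pages with deferred freeing never completes, because each parked ITRQP lengthens the next pass, and the number held grows with every restart. With immediate freeing the same chain completes after a single restart and holds nothing. Removing empty pages shortens the chain itself, which corresponds to lowering `chain_length` in this model.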
Temporary fix
*********
* HIPER *
*********
Comments
APAR Information
APAR number
PI63036
Reported component name
WMQ Z/OS 8
Reported component ID
5655W9700
Reported release
000
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2016-05-25
Closed date
2016-06-20
Last modified date
2016-10-10
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UI38791 PI68957
Modules/Macros
CSQIDLM1 CSQIMGEF CSQIMGE3
Fix information
Fixed component name
WMQ Z/OS 8
Fixed component ID
5655W9700
Applicable component levels
R000 PSY UI38791
UP16/07/28 P F607
Document Information
Modified date:
10 October 2016