A fix is available
APAR status
Closed as program error.
Error description
When the overflow structure fills during overflow threshold processing, objects may be moved back to the primary structure but remain marked as "in overflow". A CQS queue name can be added to the "in overflow" control list entry multiple times. When a queue name is removed from overflow and is in the overflow control list entry multiple times, other unrelated queue names can be left in the overflow hash table after they are removed from overflow. This can leave transactions or LTERMs in a state where some messages go into the overflow structure and cannot be retrieved.
Local fix
There is no local fix for this problem.
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* All users of IMS V15 CQS for shared message queues or shared *
* EMH queues                                                   *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* CQS overflow processing may leave                            *
* queue names in an inconsistent state                         *
* with respect to being in overflow or                         *
* not when overflow structure fills                            *
* during overflow threshold processing.                        *
****************************************************************
* RECOMMENDATION:                                              *
* INSTALL CORRECTIVE SERVICE FOR APAR/PTF                      *
****************************************************************

Due to several errors in CQS overflow processing, CQS can lose track of which structure -- primary or overflow -- a queue of objects is stored in. This can cause objects to be inaccessible to a CQS client. Two CQSs can have different views of the location of a queue, allowing one CQS to be able to access the queue of objects while the other one cannot.

The trigger to these issues is the overflow structure becoming full during overflow threshold processing (when CQS moves queues of objects from the primary to the overflow structure). This is more likely for very small structures as might be found in a test environment.

Specifically:

1. When CQS selects queue names to move to the overflow structure, it counts the number of objects on each queue to make sure the number of objects moved will fit into the overflow structure. A logic error in the code that counts the objects can result in an undercount of the number of objects on a given list header when the IXLLIST READ_LIST call is processed synchronously. This can cause the overflow structure to become full during the overflow threshold process.

2. Module CQSOFL10 moves the objects from the primary structure to the overflow structure during overflow threshold processing. There is one CQSOFL10 thread for each of the 11 sets of CQS queue types in the structure. If the overflow structure becomes full during the move of a queue, CQSOFL10 will "back out" that move and move all of the objects for that queue name back to the primary structure. However, only the last-processed AWE's return code (success or failure) is passed back to the controlling code that should remove the backed-out queue name from being marked "in overflow". If the queue name's AWE was not the last one processed, the knowledge that it was backed out is lost, and it remains in overflow, even though the objects are back in the primary structure. Further, future overflow threshold processing can again select the problematic queue name to move to overflow, resulting in it being listed twice (or more) in the overflow control list entry in the structure.

3. Module CQSOFL50 processes queue names that are removed from overflow. Near label QNUR1200, CQSOFL50 issues a CQSTBL FIND call to find the queue name in the queue name remove list to be removed from overflow. If this find fails, a trace entry is made. The call to make the trace entry sets R3 to TTHE_LEV_ERROR to select level ERROR (1) for the trace entry. But R3 is the BCT register for the loop removing multiple queue names from overflow. This has the effect of terminating the loop early and leaving any later queue names in the overflow hash table while still removing them from the "in overflow" control list entry. The CQS that removed the queue names will still think that the queue is in overflow. A CQS that starts after this point will think that the queue name is not in overflow.

4. Message CQS0248I is issued once per minute during overflow threshold processing as a progress message. The message includes a count of the number of objects moved so far, which is kept in the primary structure block. When objects are being moved back to the primary structure due to the overflow structure filling, the code that issues this message in CQSOFL10 has R10 pointing to the overflow structure block, not the primary structure block. This causes an extra CQS0248I message to be issued with an incorrect count value.
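Items 2 and 3 of the problem summary can be illustrated with a small model. The sketch below is not CQS code: the function names, the return code value 8, and the list-based stand-ins for the queue name tables are all assumptions made for illustration. It only shows why reporting the last return code hides an earlier failure, and why reusing a countdown register inside the loop body ends the loop early.

```python
# Illustrative model only; every name and value here is hypothetical,
# not taken from CQS internals.
RC_OK = 0
RC_STRUCTURE_FULL = 8  # assumed value, chosen only for the example


def last_rc(move_results):
    """Item 2, before the fix: only the last processed move's RC survives."""
    rc = RC_OK
    for _qname, move_rc in move_results:
        rc = move_rc
    return rc


def max_rc(move_results):
    """Item 2, after the fix: keep the highest RC seen for any queue name."""
    rc = RC_OK
    for _qname, move_rc in move_results:
        rc = max(rc, move_rc)
    return rc


def remove_queue_names(names, remove_list, shared_counter):
    """Item 3: a BCT-style countdown loop over queue names.

    When shared_counter is True, a failed lookup overwrites the loop
    counter with the trace level (1), as if the trace call and the loop
    shared one register. When False, the counter is left alone, as if
    the loop used its own register.
    """
    processed = []
    count = len(names)
    index = 0
    while count > 0:
        name = names[index]
        if name in remove_list:
            processed.append(name)
        elif shared_counter:
            count = 1  # trace level ERROR (1) lands in the loop counter
        index += 1
        count -= 1  # BCT: decrement, continue while non-zero
    return processed
```

With a failed move followed by a successful one, `last_rc` reports success while `max_rc` reports the failure. In the loop model, a failed lookup on the second of three names stops processing before the third name when the counter is shared.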
Problem conclusion
The issues noted above are fixed:

1. CQSOFL00 is changed to load R5 to point to the IXLLIST answer area at the start of the READ_LIST routine, to ensure it is set for both the synchronous and asynchronous paths.

2. CQSOFL10 is changed to set QTPRSN in the queue type table entry to the maximum RC received for any queue name's move, rather than the RC of the last queue name moved.

3. CQSOFL50 is changed to use R4 for the queue name remove loop BCT register (rather than R3).

4. The code to issue MSGCQS0248I in CQSOFL10 is changed to always point to the primary structure block.

Additionally, the following changes are made in overflow processing to improve future diagnostics:

CQSFM020, the CQSSTRUC block dump formatter, is changed to print the number of EMCs, structure size, structure max size, and overflow percentage in the structure statistics summary.

A new dump formatter option - CQSTBL - is created to format the hash table element information for CQSTBL-format tables. CQSTBL is a new option on the CQS low-level dump formatter panel. The option is implemented in reserved dump formatter module CQSFM110. CQSOBND0 is updated to understand the new formatter option. Table CQSAABL0 is updated to add "CQSTBL" as an option.

New MSGCQS0249E is issued if an error (such as structure full) occurs when moving an object between one structure and another during overflow move processing.

Four new trace entries are defined for the CQS OFLW trace table:

TROFQSLB X'15' - Overflow queue name selection begin. This entry is written by the overflow master CQS when it begins looking at queues on the primary structure as candidates for overflow processing. It contains information about entry and element space in the overflow structure, along with target and maximums that will be used in the overflow selection process. Written by CQSOFL00.

TROFQSLD X'16' - Queue name selected for overflow. This entry is written by the overflow master CQS for each queue name selected to be in overflow. Written by CQSOFL00.

TRO1MBBG X'37' - CQSOFL10 move back begin.
TRO1MBEN X'38' - CQSOFL10 move back end.
These entries are written by CQSOFL10 when it begins and ends moving queues back from the overflow structure to the primary structure. Prior to this change, both move-to and move-from overflow begin/end entries used trace codes X'34' and X'39'. Having separate codes makes it easier to tell the move direction when using the OFLW trace.

The content of certain existing CQS OFLW trace entries is updated to improve diagnostics:

TRO5TBLE X'A1' - CQSTBL request error trace entry is updated to include the queue name associated with the failed CQSTBL request when available.

TRO1MV10 X'3D' - CQSOFL10 CQSMOV10 error trace entry is updated to include an indicator of which trace point in CQSOFL10 made the entry, the CQSMOV10 return code, the IXLLIST return and reason codes, and the count of objects moved for the queue name up to the error (if available).

TRO1MVEN X'39', TRO1MBEN X'38' - Move to / move back end: Add list number being processed to be consistent with the corresponding move begin trace entries. Add MOVE_OBJ internal routine RC in byte 2 (B2).

The OFLW trace formatter decode table in CQSFTRC0 is updated to include the new trace entries, several missing entries, and queue name EBCDIC printing for those entries that have queue names but were missing the indicators to have them printed.

A new "Printable character validation" TRT table is added to the BPE translate table module, BPETRTB0. This table is used by the new CQS0249E message generation code to decide whether to print a CQS queue name as printable EBCDIC or as hex digits. It is also used by the new CQSFM110 CQSTBL dump format module. A similar table was coded in module CQSOFL50; CQSOFL50 is changed to remove this table and use the BPE-provided one.

BPEOSRV0, the BPE dump formatter services module, is updated to include the BPE translate table module BPETRTB0 in the load module (BPEOS210) and to use these tables instead of internally coded tables for dump formatter modules.

MSGCQS0249E description:
------------------------
In IMS V15 manual Messages and Codes, Volume 2: Non-DFS Messages (GC27-6790), in Chapter 4. CQS messages, add new message CQS0249E:

CQS0249E OVERFLOW MOVE TO STRUCTURE structure FAILED FOR QNAME qname RC=rc

Explanation: CQS overflow processing was not able to move an object between structures during overflow threshold processing. This message is issued to provide diagnostic information about a move-across-structure failure.

In the message:

structure: The name of the structure to which the object was being moved.

qname: The queue name of the object being moved, in one of two formats:
  tt-ccccccccccccccc
  tt-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
where:
- tt is the queue type, in hex
- ccccccccccccccc is the queue name in printable EBCDIC (when all 15 characters are printable)
- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx is the queue name in hex (when at least one character is not printable EBCDIC)

The meaning of the queue type is defined by the client of CQS.

For an IMS shared queue structure, tt can be one of the following:
01 - Transaction ready queue
02 - Transaction staging queue
03 - Transaction suspend queue
04 - Transaction serial queue
05 - LTERM ready queue
06 - LTERM staging queue
07 - APPC ready queue
08 - Remote ready queue
09 - OTMA ready queue

For an IMS shared expedited message handler (EMH) structure, tt can be one of the following:
01 - Program ready queue
05 - LTERM ready queue

rc: The return code from CQSMOV10 indicating the failure reason. See CQSMOV10 return codes in section "CQS service return codes" in manual Messages and Codes, Volume 4: IMS Component Codes.

System Action: CQS processes the move failure based on the return code from CQSMOV10. For example, if the failure was because the overflow structure became full, CQS removes the queue name from overflow and moves objects back to the primary structure. If the failure was due to a structure failure, CQS initiates rebuild processing.

System programmer response: This message is issued for diagnostic purposes; no action is specifically required. However, depending on the reason for the move failure, other CQS messages issued may indicate that actions are needed. For example, if the overflow structure filled, then it may need to be made larger.

Modules: CQSOFL10
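The two qname formats in CQS0249E can be reproduced with a short sketch, which may help when writing tooling that parses the message. Several things here are assumptions rather than facts from this APAR: that the queue name is passed as 16 bytes with the queue type in the first byte, that code page 037 is an acceptable stand-in for the EBCDIC code page in use, and that "printable" means the decoded character is printable ASCII. The real decision is made by the TRT table in BPETRTB0, whose contents are not described here. The function name `format_qname` is hypothetical.

```python
def format_qname(qname: bytes) -> str:
    """Render a queue name as tt-cccc... or tt-xxxx... (illustrative only).

    Assumes a 16-byte name: 1 byte of queue type followed by 15 bytes
    of EBCDIC name. Both the layout and the code page are assumptions.
    """
    if len(qname) != 16:
        raise ValueError("expected a 16-byte queue name")
    qtype, name = qname[0], qname[1:]

    # cp037 maps every byte value to some character, so decode cannot fail.
    text = name.decode("cp037")

    # Stand-in for the BPETRTB0 printable-character table: treat a
    # character as printable if it is printable ASCII after decoding.
    if all(ch.isascii() and ch.isprintable() for ch in text):
        return f"{qtype:02X}-{text}"

    # At least one byte is not printable: show all 15 bytes as hex.
    return f"{qtype:02X}-{name.hex().upper()}"
```

A blank-padded name such as `TRANA` with queue type 01 comes out in the character form, while a name containing binary zeros comes out as 30 hex digits.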
Temporary fix
Comments
APAR Information
APAR number
PH23717
Reported component name
IMS V15
Reported component ID
5635A0600
Reported release
500
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-03-26
Closed date
2020-07-20
Last modified date
2020-08-03
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
PH27014 UI70667
Modules/Macros
CQSFM020 CQSFTRC0 CQSOFL50 BPETRTB0 CQSOFL10 CQSM1ENU CQSTROFL CQSFM110 CQSOBND0 CQSAABL0 CQSOFL00 BPEOSRV0 CQS$TST0
| GC27679000 |
Fix information
Fixed component name
IMS V15
Fixed component ID
5635A0600
Applicable component levels
R500 PSY UI70667
UP20/07/22 P F007 ¢
Document Information
Modified date:
22 December 2023