IBM Support

PH23717: QUEUE NAMES MAY NOT BE REMOVED FROM THE SHARED QUEUES OVERFLOW HASH TABLE AFTER AN ERROR CONDITION.

A fix is available

APAR status

  • Closed as program error.

Error description

  • When the overflow structure fills during overflow threshold
    processing, objects may be moved back to the primary
    structure but remain marked as "in overflow".  A CQS queue
    name can be added to the "in overflow" control list entry
    multiple times.  When a queue name is removed from overflow
    and is in the overflow control list entry multiple times,
    other unrelated queue names can be left in the overflow
    hash table after they are removed from overflow.  This can
    leave transactions or LTERMs in a state where some messages
    go into the overflow structure and cannot be retrieved.
    

Local fix

  • There is no local fix for this problem.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All users of IMS V15 CQS for shared message queues or shared *
    * EMH queues                                                   *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * CQS overflow processing may leave                            *
    * queue names in an inconsistent state                         *
    * with respect to being in overflow or                         *
    * not when overflow structure fills                            *
    * during overflow threshold processing.                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * INSTALL CORRECTIVE SERVICE FOR APAR/PTF                      *
    ****************************************************************
    Due to several errors in CQS overflow processing, CQS can lose
    track of which structure -- primary or overflow -- a queue of
    objects is stored in.  This can cause objects to be inaccessible
    to a CQS client.  Two CQSs can have different views of the
    location of a queue, allowing one CQS to be able to access the
    queue of objects while the other one cannot.  The trigger for
    these issues is the overflow structure becoming full during
    overflow threshold processing (when CQS moves queues of objects
    from the primary to the overflow structure).  This is more
    likely for very small structures as might be found in a test
    environment.
    
    Specifically:
    
      1. When CQS selects queue names to move to the overflow
         structure, it counts the number of objects on each queue
         to make sure the number of objects moved will fit into the
         overflow structure.  A logic error in the code that counts
         the objects can result in an undercount of the number of
         objects on a given list header when the IXLLIST READ_LIST
         call is processed synchronously.  This can cause the
         overflow structure to become full during the overflow
         threshold process.
    
      2. Module CQSOFL10 moves the objects from the primary
         structure to the overflow structure during overflow
         threshold processing.  There is one CQSOFL10 thread for
         each of the 11 sets of CQS queue types in the structure.
         If the overflow structure becomes full during the move of
         a queue, CQSOFL10 will "back out" that move and move all
         of the objects for that queue name back to the primary
         structure.  However, only the last-processed AWE's return
         code (success or failure) is passed back to the controlling
         code that should remove the backed-out queue name from
         being marked "in overflow".   If the queue name's AWE was
         not the last one processed, the knowledge that it was
         backed out is lost, and it remains marked "in overflow" even
         though the objects are back in the primary structure.  Further,
         future overflow threshold processing can again select the
         problematic queue name to move to overflow, resulting in
         it being listed twice (or more) in the overflow control
         list entry in the structure.
    
      3. Module CQSOFL50 processes queue names that are removed from
         overflow.  Near label QNUR1200, CQSOFL50 issues a CQSTBL
         FIND call to find the queue name in the queue name remove
         list to be removed from overflow.  If this find fails, a
         trace entry is made.  The call to make the trace entry sets
         R3 to TTHE_LEV_ERROR to select level ERROR (1) for the
         trace entry.  But R3 is the BCT register for the loop
         removing multiple queue names from overflow.  This has the
         effect of terminating the loop early and leaving any later
         queue names in the overflow hash table while still removing
         them from the "in overflow" control list entry.  The CQS
         that removed the queue names will still think that the
         queue is in overflow.  A CQS that starts after this point
         will think that the queue name is not in overflow.
    
      4. Message CQS0248I is issued once per minute during overflow
         threshold processing as a progress message.  The message
         includes a count of the number of objects moved so far,
         which is kept in the primary structure block.  When objects
         are being moved back to the primary structure due to the
         overflow structure filling, the code that issues this
         message in CQSOFL10 has R10 pointing to the overflow
         structure block, not the primary structure block.  This
         causes an extra CQS0248I message to be issued with an
         incorrect count value.
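
The register conflict described in item 3 can be illustrated with a small simulation. This is only a sketch in Python, not the CQSOFL50 code: the function name, the list and set arguments, and the register dictionary are hypothetical, and a set membership test stands in for the CQSTBL FIND call. BCT is modeled as "decrement the register and leave the loop when it reaches zero".

```python
TTHE_LEV_ERROR = 1  # trace level ERROR (1), as stated in item 3


def remove_from_overflow(remove_list, hash_table, loop_reg="R4"):
    """Simulate the queue name remove loop (hypothetical model).

    loop_reg names the register used as the BCT loop counter:
    "R3" models the original code, "R4" the corrected code.
    """
    if not remove_list:
        return hash_table
    regs = {"R3": 0, "R4": 0}
    regs[loop_reg] = len(remove_list)      # loop count for BCT
    index = 0
    while True:
        qname = remove_list[index]
        if qname in hash_table:            # stands in for CQSTBL FIND
            hash_table.discard(qname)
        else:
            regs["R3"] = TTHE_LEV_ERROR    # trace call loads the level into R3
        index += 1
        regs[loop_reg] -= 1                # BCT: decrement the register...
        if regs[loop_reg] == 0:            # ...and fall through at zero
            break
    return hash_table


names = ["QA", "QB", "QC", "QD"]           # QB is not in the hash table
print(remove_from_overflow(names, {"QA", "QC", "QD"}, loop_reg="R3"))
print(remove_from_overflow(names, {"QA", "QC", "QD"}, loop_reg="R4"))
```

With R3 as the counter, the failed find for QB overwrites the count with 1, the next BCT reaches zero, and QC and QD stay in the hash table. With R4 as the counter, the trace call no longer disturbs the loop and all names are removed.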
    

Problem conclusion

  • The issues noted above are fixed:
    
      1. CQSOFL00 is changed to load R5 to point to the IXLLIST
         answer area at the start of the READ_LIST routine, to
         ensure it is set for both the synchronous and asynchronous
         paths.
    
      2. CQSOFL10 is changed to set QTPRSN in the queue type table
         entry to the maximum RC received for any queue name's move,
         rather than the RC of the last queue name moved.
    
      3. CQSOFL50 is changed to use R4 for the queue name remove
         loop BCT register (rather than R3).
    
      4. The code to issue MSGCQS0248I in CQSOFL10 is changed to
         always point to the primary structure block.
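
The change in item 2 amounts to reporting the highest return code seen rather than the most recent one. The sketch below shows the difference in Python. The function names and the return code values (0 for success, 8 for a backed-out move) are assumptions made for illustration; the actual CQSMOV10 return codes and the layout of QTPRSN are not given here.

```python
def last_rc(move_results):
    """Original behavior: only the last-processed RC survives."""
    rc = 0
    for _qname, move_rc in move_results:
        rc = move_rc
    return rc


def max_rc(move_results):
    """Corrected behavior: keep the maximum RC seen for any queue name."""
    return max((move_rc for _qname, move_rc in move_results), default=0)


# Hypothetical run: the move for QB was backed out, QC then succeeded.
results = [("QA", 0), ("QB", 8), ("QC", 0)]
print(last_rc(results))  # the backout of QB is hidden by QC's success
print(max_rc(results))   # the backout of QB is still visible
```

Taking the maximum assumes that a higher return code is at least as significant as a lower one, which is what the "maximum RC" wording suggests.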
    
    Additionally, the following changes are made in overflow
    processing to improve future diagnostics:
    
    CQSFM020, the CQSSTRUC block dump formatter, is changed to print
    the number of EMCs, structure size, structure max size, and
    overflow percentage in the structure statistics summary.
    
    A new dump formatter option - CQSTBL - is created to format the
    hash table element information for CQSTBL-format tables.  CQSTBL
    is a new option on the CQS low-level dump formatter panel.  The
    option is implemented in reserved dump formatter module
    CQSFM110.  CQSOBND0 is updated to understand the new formatter
    option.  Table CQSAABL0 is updated to add "CQSTBL" as an option.
    
    New MSGCQS0249E is issued if an error (such as structure full)
    occurs when moving an object between one structure and another
    during overflow move processing.
    
    Four new trace entries are defined for the CQS OFLW trace table:
    
      TROFQSLB X'15' - Overflow queue name selection begin.  This
      entry is written by the overflow master CQS when it begins
      looking at queues on the primary structure as candidates for
      overflow processing.  It contains information about entry and
      element space in the overflow structure, along with target and
      maximums that will be used in the overflow selection process.
      Written by CQSOFL00.
    
      TROFQSLD X'16' - Queue name selected for overflow.  This entry
      is written by the overflow master CQS for each queue name
      selected to be in overflow.  Written by CQSOFL00.
    
      TRO1MBBG X'37' - CQSOFL10 move back begin.
      TRO1MBEN X'38' - CQSOFL10 move back end.
    
        These entries are written by CQSOFL10 when it begins and
        ends moving queues back from the overflow structure to the
        primary structure.  Prior to this change, both move-to and
        move-from overflow begin/end entries used trace codes X'34'
        and X'39'.  Having separate codes makes it easier to tell
        the move direction when using the OFLW trace.
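
For anyone scanning raw OFLW trace data, the four new codes can be kept in a small lookup table. This is a hypothetical helper in Python, not part of the CQSFTRC0 formatter; only the codes, names, and descriptions are taken from the list above.

```python
# The four new OFLW trace entries listed above, keyed by trace code.
NEW_OFLW_TRACE_ENTRIES = {
    0x15: ("TROFQSLB", "Overflow queue name selection begin"),
    0x16: ("TROFQSLD", "Queue name selected for overflow"),
    0x37: ("TRO1MBBG", "CQSOFL10 move back begin"),
    0x38: ("TRO1MBEN", "CQSOFL10 move back end"),
}


def describe(code):
    """Return a one-line description for a trace code."""
    name, text = NEW_OFLW_TRACE_ENTRIES.get(
        code, ("unknown", "not one of the four new entries")
    )
    return f"X'{code:02X}' {name} - {text}"


print(describe(0x37))
```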
    
    The content of certain existing CQS OFLW trace entries is
    updated to improve diagnostics:
    
      TRO5TBLE X'A1' CQSTBL request error trace entry is updated to
      include the queue name associated with the failed CQSTBL
      request when available.
    
      TRO1MV10 X'3D' CQSOFL10 CQSMOV10 error trace entry is updated
      to include an indicator of which trace point in CQSOFL10 made
      the entry, the CQSMOV10 return code, the IXLLIST return and
      reason codes, and the count of objects moved for the queue
      name up to the error (if available).
    
      TRO1MVEN X'39'
      TRO1MBEN X'38'
    
        Move to / move back end:  Add list number being processed
        to be consistent with the corresponding move begin trace
        entries.  Add MOVE_OBJ internal routine RC in byte 2 (B2).
    
    The OFLW trace formatter decode table in CQSFTRC0 is updated to
    include the new trace entries, several missing entries, and
    queue name EBCDIC printing for those entries that have queue
    names but were missing the indicators to have them printed.
    
    A new "Printable character validation" TRT table is added to the
    BPE translate table module, BPETRTB0.  This table is used by the
    new CQS0249E message generation code to decide to print a CQS
    queue name as either printable EBCDIC, or as hex digits.  It is
    also used by the new CQSFM110 CQSTBL dump format module.  A
    similar table was coded in module CQSOFL50; CQSOFL50 is changed
    to remove this table and use the BPE-provided one.
    
    BPEOSRV0, the BPE dump formatter services module, is updated to
    include the BPE translate table module BPETRTB0 in the load
    module (BPEOS210) and uses these tables instead of internally
    coded tables for dump formatter modules.
    
    MSGCQS0249E description:
    ------------------------
    
    In IMS V15 manual Messages and Codes, Volume 2: Non-DFS
    Messages (GC27-6790), in Chapter 4. CQS messages, add
    new message CQS0249E:
    
    CQS0249E OVERFLOW MOVE TO STRUCTURE structure FAILED
             FOR QNAME qname RC=rc
    
    Explanation: CQS overflow processing was not able to move
    an object between structures during overflow threshold
    processing.  This message is issued to provide diagnostic
    information about a move-across-structure failure.
    
    In the message:
    
      structure: The name of the structure to which the object
                 was being moved.
    
      qname:     The queue name of the object being moved, in
                 one of two formats:
    
                   tt-ccccccccccccccc
                   tt-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    
                 where:
    
                   - tt is the queue type, in hex
                   - ccccccccccccccc is the queue name in
                     printable EBCDIC (when all 15 characters
                     are printable)
                   - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx is the
                     queue name in hex (when at least one
                     character is not printable EBCDIC)
    
                 The meaning of the queue type is defined by
                 the client of CQS.  For an IMS shared queue
                 structure, tt can be one of the following:
    
                   01 - Transaction ready queue
                   02 - Transaction staging queue
                   03 - Transaction suspend queue
                   04 - Transaction serial queue
                   05 - LTERM ready queue
                   06 - LTERM staging queue
                   07 - APPC ready queue
                   08 - Remote ready queue
                   09 - OTMA ready queue
    
                 For an IMS shared expedited message handler
                 (EMH) structure, tt can be one of the
                 following:
    
                   01 - Program ready queue
                   05 - LTERM ready queue
    
      rc:        The return code from CQSMOV10 indicating the
                 failure reason.  See CQSMOV10 return codes in
                 section "CQS service return codes" in manual
                 Messages and Codes, Volume 4: IMS Component
                 Codes.
    
    System Action: CQS processes the move failure based on the
    return code from CQSMOV10.  For example, if the failure
    was because the overflow structure became full, CQS
    removes the queue name from overflow and moves objects
    back to the primary structure.  If the failure was due to
    a structure failure, CQS initiates rebuild processing.
    
    System programmer response: This message is issued for
    diagnostic purposes; no action is specifically required.
    However, depending on the reason for the move failure,
    other CQS messages may indicate that action is
    needed.  For example, if the overflow structure filled,
    then it may need to be made larger.
    
    Modules: CQSOFL10
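
The two qname formats of CQS0249E described above can be sketched as follows. This is an illustration in Python, not the CQS message code: the function names are hypothetical, EBCDIC code page 037 is assumed, and "printable" is approximated as any byte that decodes to a character in the ASCII range X'20' to X'7E'. The actual validation table in BPETRTB0 may accept a different set of characters.

```python
def is_printable_ebcdic(name: bytes) -> bool:
    """Approximate printable check; code page 037 is an assumption."""
    text = name.decode("cp037")
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


def format_qname(qtype: int, name: bytes) -> str:
    """Format a queue type and 15-byte queue name as in CQS0249E."""
    if len(name) != 15:
        raise ValueError("queue name must be 15 bytes")
    if is_printable_ebcdic(name):
        # tt-ccccccccccccccc
        return f"{qtype:02X}-{name.decode('cp037')}"
    # tt-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    return f"{qtype:02X}-{name.hex().upper()}"


# Hypothetical transaction ready queue name, blank padded to 15 bytes.
print(format_qname(0x01, "TRANA".ljust(15).encode("cp037")))
# A name containing non-printable bytes is shown as 30 hex digits.
print(format_qname(0x05, bytes(15)))
```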
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH23717

  • Reported component name

    IMS V15

  • Reported component ID

    5635A0600

  • Reported release

    500

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-03-26

  • Closed date

    2020-07-20

  • Last modified date

    2020-08-03

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    PH27014 UI70667

Modules/Macros

  • CQSFM020 CQSFTRC0 CQSOFL50 BPETRTB0 CQSOFL10 CQSM1ENU CQSTROFL
    CQSFM110 CQSOBND0 CQSAABL0 CQSOFL00 BPEOSRV0 CQS$TST0
    

Publications Referenced
GC27679000    

Fix information

  • Fixed component name

    IMS V15

  • Fixed component ID

    5635A0600

Applicable component levels

  • R500 PSY UI70667

       UP20/07/22 P F007

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
22 December 2023