IBM Support

PM47095: INTERNAL ROLB DUE TO INIT STATUS GROUPB + DEADLOCK NOT HANDLED CORRECTLY IN SHARED QUEUES ENVIRONMENT

A fix is available

APAR status

  • Closed as program error.

Error description

  • The following scenario occurred:
    Unprotected OTMA CM1 transaction arrives on FE IMS.
    Transaction runs on BE IMS.
    Application uses INIT STATUS GROUPB, and gets a deadlock on
    a Full Function DB PCB DL/I call.
    DFSCPY00 sets PSTFUNCT=FUNCIRLB and issues DFSTMS0 FUNC=BACKOUT.
    DFSTMS00 makes a DFSRRSI FUNC=DETERMINE_SYNCPT_COORD call to
    determine if other RMs have interest.
    This is a cascaded flow so two IMSs ( FE and BE ) have interest.
    IMS needs to make an ATRRUFS1 call here to determine if the
    two interests are in fact just IMS FE and IMS BE. But current
    DFSRRSI0 logic here only does this if PSTFUNCT=FUNCROLB.
    So this is skipped and we now proceed as if we are doing a
    ROLB in a protected ( distributed syncpoint ) path, as
    described in PK37614. That flow should abend U0711-34, but
    does not because it also checks for PSTFUNCT=FUNCROLB.
    DFSTMS00 then calls DFSFXC30 with PSTSYNCF=SFABORT.
    ABORTCON in the ROLB path will now clear PSTFLAG1,PST1OTMA.
    PSTFUNCT is now set to FUNCROLB and DFSFXC40 is called.
    DFSFXC40 will now take the PK37614 path, but the input
    message will be queued back to the FE via DFSAPPCQ and not
    DFSOTMAQ, although it has an OTMA prefix.
    The FE IMS receives this message and discards it after writing
    an X'67D007' diagnostic log record, which is fortunate since the
    original PST on BE IMS is still in control. The application
    gets a 'BC' status code and then ISRTs some error logging
    transactions via express PCBs. These messages end up with
    invalid UOWEs and OTMA prefixes ( from input message )
    due to the prior processing flow.
    The root of the problem here is that the internal ROLB ( IRLB )
    needs to be treated by DETERMINE_SYNCPT_COORD in the same way as
    a true ROLB, with ATRRUFS1 called in a cascaded flow to
    correctly determine the RMs with interests. If this had been
    done, none of the downstream symptoms would have occurred.
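
The misrouting of the input message can be illustrated with a small model. This is only a sketch under stated assumptions: the bit value chosen for PST1OTMA, the Pst class, and the function names are hypothetical stand-ins, and the real module logic is considerably more involved. It shows how clearing the OTMA bit before the requeue decision causes DFSAPPCQ to be chosen for a message that arrived through OTMA.

```python
from dataclasses import dataclass

# Hypothetical bit value; the real PSTFLAG1 layout is not given in this APAR.
PST1OTMA = 0x01


@dataclass
class Pst:
    flag1: int  # stand-in for flag byte PSTFLAG1


def abort_rolb_path(pst, clear_otma_bit):
    """Model of the ROLB abort path, which cleared PST1OTMA in the failing flow."""
    if clear_otma_bit:
        pst.flag1 &= ~PST1OTMA & 0xFF


def requeue_routine(pst):
    """Model of the choice of routine for queuing the input back to the FE.

    Simplification: anything not flagged as OTMA is treated as APPC.
    """
    return "DFSOTMAQ" if pst.flag1 & PST1OTMA else "DFSAPPCQ"


def route(clear_otma_bit):
    pst = Pst(flag1=PST1OTMA)  # the input message arrived through OTMA
    abort_rolb_path(pst, clear_otma_bit)
    return requeue_routine(pst)


print(route(clear_otma_bit=True))   # failing flow: bit cleared too early
print(route(clear_otma_bit=False))  # bit left intact
```

In this model the first call selects DFSAPPCQ and the second selects DFSOTMAQ, which mirrors the symptom described above.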
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All IMS V10 APPC/OTMA users in a shared      *
    *                 queues environment with AOS=Y                *
    ****************************************************************
    * PROBLEM DESCRIPTION: Input message is incorrectly discarded  *
    *                      following improper rollback processing  *
    *                      on the backend IMS after the            *
    *                      application issues INIT STATUS GROUPB   *
    *                      and suffers a deadlock condition        *
    ****************************************************************
    * RECOMMENDATION: INSTALL CORRECTIVE SERVICE FOR APAR/PTF      *
    ****************************************************************
    Client is using OTMA synchronous transaction support in a shared
    queues environment with AOS=Y specified.  An unprotected OTMA
    CM1 transaction arrives on the frontend IMS system and gets
    scheduled on the backend IMS system.  The application running on
    the backend IMS issues the INIT STATUS GROUPB DLI call.  As the
    application is running, it eventually encounters a deadlock on a
    full function DB PCB DLI call.
    
    During deadlock processing, DFSCPY00 is invoked to handle the
    pseudo-abend U0777.  As part of this deadlock processing,
    DFSCPY00 sets PSTFUNCT=FUNCIRLB in order to initiate an internal
    rollback ( IRLB, internal ROLB ) before returning to the
    application with BC status code.  DFSCPY00 issues DFSTMS0
    FUNC=BACKOUT to perform the rollback processing.  DFSTMS00
    issues DFSRRSI FUNC=DETERMINE_SYNCPT_COORD to determine if other
    RMs (Resource Managers) have interest in order to determine
    whether or not RRS should be engaged to control the backout
    processing as the syncpoint controller.  In this application
    environment, IMS should remain the syncpoint controller because
    the input transaction is not protected and the application did
    not issue an outbound protected transaction.
    
    Two IMSs ( FE and BE ) have interest because this is a cascaded
    flow.  IMS needs to make the ATRRUFS1 call during DFSRRSI
    FUNC=DETERMINE_SYNCPT_COORD processing to determine that the two
    interests are in fact just IMS FE and IMS BE and that IMS should
    therefore remain the syncpoint controller.  However, current
    DFSRRSI0 logic only issues ATRRUFS1 if PSTFUNCT=FUNCROLB.  Thus,
    in this flow, ATRRUFS1 is skipped because PSTFUNCT=FUNCIRLB.  IMS
    then incorrectly proceeds as if it were processing a ROLB call
    from an application that had issued an outbound protected
    transaction, and so assumes that RMs other than IMS have
    interest.
    
    DFSTMS00 then calls DFSFXC30 with PSTSYNFC=SFABORT. The ROLB
    path in DFSFXC30 turns off the PST1OTMA bit in flag byte
    PSTFLAG1 prematurely for an application that has issued INIT
    STATUS GROUPB.  DFSFXC30 then calls DFSFXC40 to perform rollback
    processing for the IMS TM component.
    
    Because IMS believes there are non-IMS RMs with interest,
    DFSFXC40 incorrectly invokes the logic to queue the input
    message back to the FE so the FE can requeue the input message
    for reschedule.  DFSFXC40 also queues the input message back to
    the FE using DFSAPPCQ instead of DFSOTMAQ: the DFSFXC30 logic
    described above had prematurely turned off the PST1OTMA bit,
    which causes DFSFXC40 to assume incorrectly that the input
    message came from APPC.
    
    The FE IMS receives the message and discards it after writing an
    X'67D007' diagnostic log record.  Eventually, the application
    receives the BC status code and then ISRTs some error logging
    transactions using express PCBs.  These messages end up with
    invalid UOWEs and OTMA prefixes ( from input message ) due to
    the prior incorrect processing flow described.
    
    The root of the problem is that the internal ROLB ( IRLB ) needs
    to be treated by DETERMINE_SYNCPT_COORD the same way as a true
    ROLB, with ATRRUFS1 called in a cascaded flow to correctly
    determine the RMs with interests.
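
The faulty determination described in this summary can be sketched as a small decision model. Everything here is an illustrative assumption rather than IMS code: the PstFunct enum, the list of RM names, and the function names are hypothetical, and the outcome is reduced to which party acts as syncpoint controller.

```python
from enum import Enum


class PstFunct(Enum):
    FUNCROLB = "ROLB issued by the application"
    FUNCIRLB = "internal ROLB (IRLB)"


def non_ims_interest(interested_rms, side_info_checked):
    """Return True when the flow is treated as having non-IMS RMs with interest.

    interested_rms: hypothetical RM names, e.g. ["IMS-FE", "IMS-BE"].
    side_info_checked: models whether the ATRRUFS1 call was made.
    """
    if len(interested_rms) <= 1:
        return False
    if not side_info_checked:
        # Without the call, several interests are assumed to include a non-IMS RM.
        return True
    return any(not rm.startswith("IMS") for rm in interested_rms)


def determine_syncpt_coord_before_fix(pstfunct, interested_rms):
    """Model of the original logic: the check is made only for FUNCROLB."""
    side_info_checked = pstfunct is PstFunct.FUNCROLB
    if non_ims_interest(interested_rms, side_info_checked):
        return "RRS"
    return "IMS"


cascaded = ["IMS-FE", "IMS-BE"]
print(determine_syncpt_coord_before_fix(PstFunct.FUNCROLB, cascaded))
print(determine_syncpt_coord_before_fix(PstFunct.FUNCIRLB, cascaded))
```

With only the two IMS systems interested, the model keeps IMS as controller for FUNCROLB but hands control to RRS for FUNCIRLB, which is the incorrect outcome this APAR addresses.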
    

Problem conclusion

  • GEN:
    KEYWORDS:
     SYSPLEXSQ
    
    *** END IMS KEYWORDS ***
    DFSRRSI0:
    Logic has been added to the DETERMINE_SYNCPT_COORD function in
    the cascaded transaction flow to check if PSTFUNCT=FUNCIRLB. If
    so, IMS will now invoke ATRRUSF1 to determine if there truly are
    non-IMS RMs with interest.
    
    Also, logic has been added to END_CONTEXT to skip the ATREND
    call if a cascaded transaction is being processed.  A cascaded
    transaction can reach this logic because the application issued
    INIT STATUS GROUPB: internal ROLB processing for the deadlock
    re-queues the input message, which causes DFSFXC40 to turn on
    bits LCR3MSGQ and LCR3ECTX.  The RRS ATREND call is invalid from
    the child UR of a cascaded transaction.
    
    
    DFSTMS00:
    Logic has been added throughout backout processing for the
    cascaded transaction with outbound protected flow to check for
    PSTFUNCT=FUNCIRLB in addition to FUNCROLB and to perform the
    correct action.  This includes issuing abend U711 with a new
    reason code of X'35' ( U0711 ABENDU0711 ABENDU711 RSN35 RSN35x
    711-35 U711-35 U0711-35 ), which properly terminates the
    application after backout processing for the deadlock ( U777
    ABENDU777 ABENDU0777 STATUSBC ) and after the input message has
    been successfully requeued by the FE IMS for reschedule.
    
    
    DFSFXC30:
    The logic has been removed that was prematurely turning off bits
    PSTLU62 and PST1OTMA in flag byte PSTFLAG1 for an application
    that has issued INIT STATUS GROUPB before invoking DFSFXC40.
    
    Logic has been modified to ensure DFSRRSI
    FUNC=SET_SIDE_INFORMATION is invoked for a cascaded transaction
    undergoing internal ROLB processing due to INIT STATUS GROUPB.
    Because the input message is re-queued in this deadlock
    scenario, the RRS set side information call (ATRSUSI2) must be
    issued for the cascaded child UR undergoing the internal ROLB
    deadlock process to inform RRS that this child UR is complete.
    Then, when the next child UR that processes the requeued input
    message completes, the parent UR on the frontend IMS system can
    successfully complete commit processing with RRS.
    
    
    DFSRRSIB:
    New equate IMS_CASCADE_IRLB_777_OUTB_PROT (=X'35') has been
    added for this new U711 reason code.
    
    
    
    **************** PUBS CHANGE ***********************
    
    Book: GC18971407 - Messages and Codes, Volume 3: IMS Abend Codes
    
    Under the U0711 description, under section "DFSTMS00:", add the
    following new reason code:
    
    X'35'
    An application processing a cascaded transaction on a back-end
    IMS system after issuing INIT STATUS GROUPB and after issuing an
    APPC outbound protected conversation has undergone internal
    rollback processing due to a deadlock condition that would have
    resulted in an FD status code or BC status code returned to the
    application.  The application is terminated abnormally to avoid
    data integrity errors and the original transaction is
    rescheduled.
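
The corrected behaviour described in this conclusion can be summarised as a decision sketch. It is an assumption-laden model, not the shipped logic: the function name, the RM names, and the string results are hypothetical, and only the equate name and its X'35' value come from the text above. The "U0711-34" result stands for the existing ROLB handling mentioned in the error description.

```python
# Equate name and value as given in the text above for DFSRRSIB.
IMS_CASCADE_IRLB_777_OUTB_PROT = 0x35


def backout_outcome(pstfunct, interested_rms):
    """Model of backout handling for a cascaded transaction after the fix.

    pstfunct: "FUNCROLB" or "FUNCIRLB".
    interested_rms: hypothetical RM names, e.g. ["IMS-FE", "IMS-BE"].
    """
    if pstfunct not in ("FUNCROLB", "FUNCIRLB"):
        raise ValueError(f"unexpected PSTFUNCT value: {pstfunct}")
    # After the fix, both function codes check who really has interest.
    if all(rm.startswith("IMS") for rm in interested_rms):
        return "IMS"  # IMS remains the syncpoint controller
    if pstfunct == "FUNCIRLB":
        return f"U0711-{IMS_CASCADE_IRLB_777_OUTB_PROT:02X}"
    return "U0711-34"


print(backout_outcome("FUNCIRLB", ["IMS-FE", "IMS-BE"]))
print(backout_outcome("FUNCIRLB", ["IMS-FE", "IMS-BE", "APPC-partner"]))
```

In this model an internal ROLB with only the FE and BE IMS systems interested leaves IMS in control, while an internal ROLB with an additional non-IMS RM yields the new U0711-35 abend.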
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PM47095

  • Reported component name

    IMS V10

  • Reported component ID

    5635A0100

  • Reported release

    010

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2011-09-02

  • Closed date

    2012-06-01

  • Last modified date

    2012-07-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    PM60761 UK79278

Modules/Macros

  • DFSFXC30 DFSRRSIB DFSRRSI0 DFSTMS00
    

Publications Referenced
GC18971407    

Fix information

  • Fixed component name

    IMS V10

  • Fixed component ID

    5635A0100

Applicable component levels

  • R010 PSY UK79278

       UP12/06/06 P F206

Document Information

Modified date:
02 July 2012