IBM Support

PK45967: MPP HANGS ON BE IMS PROCESSING SYNCHRONOUS OTMA|APPC MESSAGE WHEN INPUT IS UNPROTECTED AND ESS PHASE I FAILS.

APAR status

  • Closed as fixed if next.

Error description

  • An MPP hangs in a Shared Queues environment when it processes an
    unprotected synchronous APPC or OTMA transaction on the back-end
    (BE) IMS, the transaction has updated an ESS (external subsystem,
    such as DB2), and ESS Phase I fails.
    The wait is at label TMSFF036 in DFSTMS00.
    Application completes and syncpoint begins.
      LUM Send completes.
      PST waits at TMSFF036 to be posted by FE IMS Commit.
      FE IMS Commits.
      RRS drives Only_Agent exit on BE IMS.
      Exit (PC routine) saves exit number in LCRE, POSTs the PST,
      responds RC=LATER, and returns to RRS.
      PST wakes up and does DFSRRSI PERFORM_SYNCPOINT
      This calls DFSRRSI0 under DEP TCB.
      DFSRRSI0 sets LCRERRS3,LCR3EXIT and uses LCURXITN to determine
      path. It's ONLY_AGENT.
      RRS1300->RRS1360->RRS1363->DFSTMS0 FUNC=COMMIT
      This drives DFSTMS00 recursively here, from the top.
      TMSFF110 -> DFSFXC30 gets called for Phase I and calls
      DFSFESP0 because the transaction had ESS work (DB2).
      DFSFESP0 calls DB2 for Phase I and DB2 has a problem.
      DFSFESP0 sets PSTABTRM for U3055, issues DFS554
      and returns RC=8 to DFSFXC30.
    
      DFSFXC30 -> ABORTSP1 -> ABTIQPOK -> ABORTSP -> RETCODE (because
      LCRERRS3,LCR3EXIT set) -> back to DFSTMS00 with bad RC.
      DFSTMS00 -> TMSFFX10 (LCRERRS3,LCR3EXIT) -> DFSRRSI
      END_CONTEXT DFSRRSI RETAIN_INTEREST -> TMS10000 (trace) -> exit
      with bad RC.
      DFSRRSI0 -> RRS1366 -> RRS1390 (R15=ATRX_BACKOUT) ->
      DFSRRSI FUNC=POST_DEFERRED_UR calls DFSRRSI0 again:
      -> RRS1000 -> ATRPDUE with LCURRETC as return code (BACKOUT)
      RRS1390 -> DFSTMS00
      DFSTMS00 -> loop back to TMSFF036 to see if complete
      (LCRERRS2,LCR2PH2C).
      Not set, so SCPWAIT again - Hang.
      RRS will not call us again for BACKOUT since this was
      ONLY_AGENT processing.
    The syncpoint design does not appear to handle ONLY_AGENT
    processing correctly when a Phase I failure occurs in any
    of the IMS-managed RMs (such as ESS).
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All IMS V8 APPC/OTMA users in a shared       *
    *                 queues environment with AOS=Y.               *
    ****************************************************************
    * PROBLEM DESCRIPTION: MPP hangs in Shared Queues environment  *
    *                      when processing unprotected synchronous *
    *                      APPC or OTMA transaction on back end    *
    *                      IMS.                                    *
    *                                                              *
    *                      The wait is at label TMSFF036 in        *
    *                      DFSTMS00.                               *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    In an AOS=Y environment, a non-protected cascaded transaction is
    established by the Front End (FE) IMS system and routed to the
    Back End (BE) IMS system for processing.  In this instance, IMS
    is the only Resource Manager (RM) expressing interest in this
    transaction but RRS is the syncpoint coordinator.
    
    On the BE IMS system, the application program processing this
    cascaded transaction does some IMS and DB2 work (using the IMS
    ESAF) and is ready to commit. The associated PST waits in IMS
    syncpoint (label TMSFF036) until posted by the RRS Commit
    request initiated by the FE IMS.
    
    As IMS is the sole RM participant, RRS drives IMS's Only Agent
    exit on the BE IMS system. A LATER reply is given to RRS and
    the waiting PST on the BE side is posted to perform its
    syncpoint.
    
    However, during Phase 1 Commit processing, DB2 is unable to
    commit due to a problem, resulting in an ABENDU3055 condition.
    This Phase 1 failure causes IMS to perform syncpoint ABORT so
    that the unit of recovery will be backed out.  At the conclusion
    of Abort processing, IMS issues a DFSRRSI FUNC=POST_DEFERRED_UR
    call to RRS, with a BACKOUT return code, advising that its prior
    Commit request was instead backed out. Control is then returned
    to DFSTMS00 (label TMSFF036), which checks to see if Phase 2 of
    syncpoint has been done (LCR2PH2C) and SCP waits if not.
    
    Unfortunately, during Abort processing from the Phase 1 error,
    the LCR2PH2C bit was never set. Hence, the PST remains in an
    endless SCP wait, causing the reported hang condition.
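
    The wait condition described above can be illustrated with a
    small toy model. This is only a sketch: it is not IMS code, and
    every name in it (SyncpointState, commit_path, run_syncpoint,
    abort_marks_complete) is hypothetical. A threading.Event stands
    in for the LCR2PH2C bit, and a timed wait stands in for the SCP
    wait so that the example terminates. The abort_marks_complete
    option exists only to show the contrast; it is an assumption made
    for illustration and does not describe the fix delivered by any
    APAR.

```python
import threading


class SyncpointState:
    """Toy per-unit-of-recovery state (hypothetical, not an IMS control block)."""

    def __init__(self):
        # Stands in for the "Phase 2 complete" bit (LCR2PH2C in the APAR text).
        self.phase2_complete = threading.Event()
        self.outcome = None


def commit_path(state, phase1_ok, abort_marks_complete):
    """Work done after the waiting task is posted by the Only Agent exit."""
    if phase1_ok:
        # Phase 1 succeeded, so Phase 2 runs and marks the syncpoint complete.
        state.outcome = "COMMIT"
        state.phase2_complete.set()
        return
    # Phase 1 failed (for example, the external subsystem could not prepare),
    # so the unit of recovery is backed out instead.
    state.outcome = "BACKOUT"
    if abort_marks_complete:
        # Illustrative assumption only: the abort path also marks completion.
        state.phase2_complete.set()


def run_syncpoint(phase1_ok, abort_marks_complete=False, timeout=0.1):
    """Return (outcome, finished); finished is False when the wait never ends."""
    state = SyncpointState()
    commit_path(state, phase1_ok, abort_marks_complete)
    # Models the loop back to the wait label: wait until the flag is set.
    # A real wait has no timeout; the timeout here only lets the demo end.
    finished = state.phase2_complete.wait(timeout)
    return state.outcome, finished


if __name__ == "__main__":
    print(run_syncpoint(phase1_ok=True))
    print(run_syncpoint(phase1_ok=False))
    print(run_syncpoint(phase1_ok=False, abort_marks_complete=True))
```

    In this model, the second call is the reported case: the work is
    backed out, but the completion flag is never set, so an untimed
    wait would never return. No further exit is driven to post the
    waiter because the coordinator has already made its single
    Only Agent call.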
    

Problem conclusion

Temporary fix

Comments

  • This IMS V8 APAR is being closed as FIN.
    
    However, the solution to the problem reported by this APAR will
    be addressed in the V9 release of the IMS product, via APAR
    PK53872.
    

APAR Information

  • APAR number

    PK45967

  • Reported component name

    IMS V8

  • Reported component ID

    5655C5600

  • Reported release

    800

  • Status

    CLOSED FIN

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2007-05-25

  • Closed date

    2008-02-07

  • Last modified date

    2008-02-07

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    PK53872

Fix information

  • Fixed component name

    IMS V8

  • Fixed component ID

    5655C5600

Applicable component levels

  • R800 PSN

       UP

Document Information

Modified date:
07 February 2008