A fix is available
APAR status
Closed as program error.
Error description
The following scenario occurred:

An unprotected OTMA CM1 transaction arrives on the FE IMS and runs on the BE IMS. The application uses INIT STATUS GROUPB and gets a deadlock on a Full Function DB PCB DL/I call.

DFSCPY00 sets PSTFUNCT=FUNCIRLB and issues DFSTMS0,FUNC=BACKOUT. DFSTMS00 makes a DFSRRSI FUNC=DETERMINE_SYNCPT_COORD call to determine if other RMs have interest. This is a cascaded flow, so two IMSs ( FE and BE ) have interest. IMS needs to make an ATRRUFS1 call here to determine whether the two interests are in fact just IMS FE and IMS BE, but current DFSRRSI0 logic does this only if PSTFUNCT=FUNCROLB. The call is therefore skipped, and processing proceeds as if a ROLB were being done in a protected ( distributed syncpoint ) path, as described in PK37614. That flow should abend U0711-34, but does not because it also checks for PSTFUNCT=FUNCROLB.

DFSTMS00 then calls DFSFXC30 with PSTSYNCF=SFABORT. ABORTCON in the ROLB path now clears PSTFLAG1,PST1OTMA. PSTFUNCT is then set to FUNCROLB and DFSFXC40 is called. DFSFXC40 takes the PK37614 path, but the input message is queued back to the FE via DFSAPPCQ and not DFSOTMAQ, although it has an OTMA prefix. The FE IMS receives this message and discards it after cutting a x'67D007' diagnostic log record, which is fortunate since the original PST on the BE IMS is still in control.

The application gets a 'BC' status code and then ISRTs some error logging transactions via express PCBs. These messages end up with invalid UOWEs and OTMA prefixes ( from the input message ) due to the prior processing flow.

The root of the problem is that the internal IRLB needs to be treated by DETERMINE_SYNCPT_COORD in the same way as a true ROLB, with ATRRUFS1 called in a cascaded flow to correctly determine the RMs with interests. If this had been done, none of the downstream symptoms would have occurred.
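The coordinator decision described above can be illustrated with a small Python model. This is only a sketch of the reasoning, not IMS code: the function names, the `fixed` switch, and the representation of interested RMs as `(name, is_ims)` pairs are all assumptions made for illustration. The only facts taken from the APAR are that the interest query was gated on PSTFUNCT=FUNCROLB and that FUNCIRLB needs the same treatment in a cascaded flow.

```python
from enum import Enum


class PstFunct(Enum):
    FUNCROLB = "ROLB"   # ROLB issued by the application
    FUNCIRLB = "IRLB"   # internal ROLB driven by deadlock processing


def queries_interests(pstfunct, cascaded, fixed):
    """Hypothetical model: is the RRS interest query issued at all?"""
    if not cascaded:
        return False
    accepted = {PstFunct.FUNCROLB}
    if fixed:
        # Behavior described by the fix: IRLB is treated like a true ROLB.
        accepted.add(PstFunct.FUNCIRLB)
    return pstfunct in accepted


def ims_remains_coordinator(pstfunct, interested_rms, cascaded, fixed):
    """interested_rms is a list of (name, is_ims) pairs (an assumed shape)."""
    if len(interested_rms) <= 1:
        return True  # modeling assumption: a single interest needs no query
    if queries_interests(pstfunct, cascaded, fixed):
        # The query reveals whether every interest is really an IMS system.
        return all(is_ims for _name, is_ims in interested_rms)
    # Without the query, multiple interests are taken to mean an outside RM.
    return False


rms = [("IMS FE", True), ("IMS BE", True)]
before = ims_remains_coordinator(PstFunct.FUNCIRLB, rms, cascaded=True, fixed=False)
after = ims_remains_coordinator(PstFunct.FUNCIRLB, rms, cascaded=True, fixed=True)
```

In this model `before` is False (IMS wrongly gives up the coordinator role) and `after` is True, which mirrors the failing and corrected flows for the FE/BE pair.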
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All IMS V10 APPC/OTMA users in a shared      *
*                 queues environment with AOS=Y                *
****************************************************************
* PROBLEM DESCRIPTION: Input message is incorrectly discarded  *
*                      following improper rollback processing  *
*                      on the backend IMS after the            *
*                      application issues INIT STATUS GROUPB   *
*                      and suffers a deadlock condition        *
****************************************************************
* RECOMMENDATION: INSTALL CORRECTIVE SERVICE FOR APAR/PTF      *
****************************************************************

Client is using OTMA synchronous transaction support in a shared queues environment with AOS=Y specified. An unprotected OTMA CM1 transaction arrives on the frontend IMS system and gets scheduled on the backend IMS system. The application running on the backend IMS issues the INIT STATUS GROUPB DLI call. As the application is running, it eventually encounters a deadlock on a full function DB PCB DLI call.

During deadlock processing, DFSCPY00 is invoked to handle the pseudo-abend U0777. As part of this deadlock processing, DFSCPY00 sets PSTFUNCT=FUNCIRLB in order to initiate an internal rollback ( IRLB, internal ROLB ) before returning to the application with BC status code. DFSCPY00 issues DFSTMS0 FUNC=BACKOUT to perform the rollback processing.

DFSTMS00 issues DFSRRSI FUNC=DETERMINE_SYNCPT_COORD to determine if other RMs (Resource Managers) have interest in order to determine whether or not RRS should be engaged to control the backout processing as the syncpoint controller. In this application environment, IMS should remain the syncpoint controller because the input transaction is not protected and the application did not issue an outbound protected transaction. Two IMSs ( FE and BE ) have interest because this is a cascaded flow.
IMS needs to make the ATRRUFS1 call during DFSRRSI FUNC=DETERMINE_SYNCPT_COORD processing to determine that the two interests are in fact just IMS FE and IMS BE and that, therefore, IMS should remain the syncpoint controller. However, current DFSRRSI0 logic only issues ATRRUFS1 if PSTFUNCT=FUNCROLB. Thus, in this flow, ATRRUFS1 is skipped because PSTFUNCT=FUNCIRLB. We now incorrectly proceed as if we are processing a ROLB call from the application and the application has issued an outbound protected transaction, so we think there are interested RMs other than IMS.

DFSTMS00 then calls DFSFXC30 with PSTSYNFC=SFABORT. The ROLB path in DFSFXC30 turns off the PST1OTMA bit in flag byte PSTFLAG1 prematurely for an application that has issued INIT STATUS GROUPB. DFSFXC30 then calls DFSFXC40 to perform rollback processing for the IMS TM component. Because we think there are non-IMS RMs with interest, DFSFXC40 incorrectly invokes the logic to queue the input message back to the FE so the FE can requeue the input message for reschedule. DFSFXC40 also incorrectly queues the input message back to the FE using DFSAPPCQ instead of DFSOTMAQ, because the DFSFXC30 logic previously mentioned had prematurely turned off the PST1OTMA bit, causing the DFSFXC40 logic to incorrectly assume the input message came from APPC. The FE IMS receives the message and discards it after writing a X'67D007' diagnostic log record.

Eventually, the application receives the BC status code and then ISRTs some error logging transactions using express PCBs. These messages end up with invalid UOWEs and OTMA prefixes ( from the input message ) due to the prior incorrect processing flow described.

The root of the problem is that the internal ROLB ( IRLB ) needs to be treated by DETERMINE_SYNCPT_COORD the same way as a true ROLB, where ATRRUFS1 is called in a cascaded flow to correctly determine the RMs with interests.
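The misrouting through DFSAPPCQ can be pictured with a short Python sketch. It is a simplified model under stated assumptions, not the DFSFXC30 or DFSFXC40 logic itself: the bit values assigned to PST1OTMA and PSTLU62 are invented for illustration, the function names and the `fixed` switch are hypothetical, and only an application that has issued INIT STATUS GROUPB is modeled.

```python
# Hypothetical bit values; the real layout of PSTFLAG1 is not given in this APAR.
PST1OTMA = 0x01   # input message came from OTMA
PSTLU62 = 0x02    # input message came from APPC (LU 6.2)


def rolb_path_flags(pstflag1, fixed):
    """Model of the ROLB path for an INIT STATUS GROUPB application.

    Before the fix the origin bits are cleared prior to the requeue step;
    with the fix they are left alone.
    """
    if fixed:
        return pstflag1
    return pstflag1 & ~(PSTLU62 | PST1OTMA) & 0xFF


def requeue_service(pstflag1):
    """Model of the requeue choice: OTMA bit on selects DFSOTMAQ, else DFSAPPCQ."""
    return "DFSOTMAQ" if pstflag1 & PST1OTMA else "DFSAPPCQ"


otma_input = PST1OTMA
broken_route = requeue_service(rolb_path_flags(otma_input, fixed=False))
fixed_route = requeue_service(rolb_path_flags(otma_input, fixed=True))
```

Under these assumptions `broken_route` is "DFSAPPCQ" and `fixed_route` is "DFSOTMAQ", matching the description that an OTMA message was requeued as if it had come from APPC.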
Problem conclusion
GEN:
KEYWORDS: SYSPLEXSQ
*** END IMS KEYWORDS ***

DFSRRSI0: Logic has been added to the DETERMINE_SYNCPT_COORD function in the cascaded transaction flow to check if PSTFUNCT=FUNCIRLB. If so, IMS will now invoke ATRRUSF1 to determine if there truly are non-IMS RMs with interest. Also, logic has been added to END_CONTEXT to skip the ATREND call if a cascaded transaction is being processed. We can get into this logic with a cascaded transaction due to the internal ROLB processing for the deadlock re-queuing the input message, which causes bits LCR3MSGQ and LCR3ECTX to get turned on by DFSFXC40 due to the application issuing INIT STATUS GROUPB. The RRS ATREND call is invalid from the child UR of a cascaded transaction.

DFSTMS00: Logic has been added throughout backout processing for the cascaded transaction with outbound protected flow to correctly check for PSTFUNCT=FUNCIRLB in addition to FUNCROLB and perform the correct action, such as issuing abend U711 with a new reason code of X'35' ( U0711 ABENDU0711 ABENDU711 RSN35 RSN35x 711-35 U711-35 U0711-35 ) in order to properly terminate the application after backout processing for the deadlock ( U777 ABENDU777 ABENDU0777 STATUSBC ) and after the input message has been successfully requeued by the FE IMS for reschedule.

DFSFXC30: The logic has been removed that was prematurely turning off bits PSTLU62 and PST1OTMA in flag byte PSTFLAG1 for an application that has issued INIT STATUS GROUPB before invoking DFSFXC40. Logic has been modified to ensure DFSRRSI FUNC=SET_SIDE_INFORMATION is invoked for a cascaded transaction undergoing internal ROLB processing due to INIT STATUS GROUPB. Because the input message is re-queued in this deadlock scenario, the RRS set side information call (ATRSUSI2) must be issued for the cascaded child UR undergoing the internal ROLB deadlock process in order to inform RRS that this child UR is complete, so that when the next child UR that processes the requeued input message completes, the parent UR on the frontend IMS system will be able to successfully complete commit processing with RRS.

DFSRRSIB: New equate IMS_CASCADE_IRLB_777_OUTB_PROT (=X'35') has been added for this new U711 reason code.

**************** PUBS CHANGE ***********************
Book: GC18971407 - Messages and Codes, Volume 3: IMS Abend Codes

Under the U0711 description, under section "DFSTMS00:", add the following new reason code:

X'35' An application processing a cascaded transaction on a back-end IMS system after issuing INIT STATUS GROUPB and after issuing an APPC outbound protected conversation has undergone internal rollback processing due to a deadlock condition that would have resulted in an FD status code or BC status code returned to the application. The application is terminated abnormally to avoid data integrity errors and the original transaction is rescheduled.
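The DFSTMS00 change can be summarized as a lookup from rollback type to abend reason code. The Python sketch below is illustrative only. The X'35' value and the equate name IMS_CASCADE_IRLB_777_OUTB_PROT come from this APAR, and X'34' is taken from the U0711-34 mentioned in the error description; the constant name `U0711_RSN_ROLB_PROTECTED`, the function names, and the use of strings for PSTFUNCT are assumptions, and the real backout logic checks more conditions than are shown here.

```python
IMS_CASCADE_IRLB_777_OUTB_PROT = 0x35   # new equate added to DFSRRSIB by this APAR
U0711_RSN_ROLB_PROTECTED = 0x34         # hypothetical name for the existing U0711-34 case


def backout_abend_reason(pstfunct, cascaded, outbound_protected):
    """Return the modeled U0711 reason code, or None if no abend is modeled."""
    if not (cascaded and outbound_protected):
        return None
    if pstfunct == "FUNCROLB":
        return U0711_RSN_ROLB_PROTECTED
    if pstfunct == "FUNCIRLB":
        # New with the fix: internal ROLB after a deadlock is checked as well.
        return IMS_CASCADE_IRLB_777_OUTB_PROT
    return None


def format_abend(reason):
    """Render a reason code in the U0711-nn style used in this document."""
    return f"U0711-{reason:02X}"


irlb_abend = format_abend(backout_abend_reason("FUNCIRLB", True, True))
```

Here `irlb_abend` evaluates to "U0711-35", the form in which the new reason code would appear in diagnostics.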
Temporary fix
*********
* HIPER *
*********
Comments
APAR Information
APAR number
PM47095
Reported component name
IMS V10
Reported component ID
5635A0100
Reported release
010
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2011-09-02
Closed date
2012-06-01
Last modified date
2012-07-02
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
PM60761 UK79278
Modules/Macros
DFSFXC30 DFSRRSIB DFSRRSI0 DFSTMS00
| GC18971407 |
Fix information
Fixed component name
IMS V10
Fixed component ID
5635A0100
Applicable component levels
R010 PSY UK79278
UP12/06/06 P F206
Document Information
Modified date:
02 July 2012