IBM Support

PI97598: IEC031I D37-04 ON SECONDARY OLDS IN FDBR REGION OR DURING ERE PROCESSING 18/05/25 PTF PECHANGE

A fix is available

APAR status

  • Closed as program error.

Error description

  • After IMS terminates abnormally, IMS must perform log recovery
    against the latest OLDS: it adds two log records (type 06 and
    type 48) and then closes the OLDS properly. This occurs either
    in the FDBR region or in the subsequent ERE processing.
    
    During this processing, IMS received a D37 error on the secondary
    OLDS: IEC031I D37-04,IFG0554P,imsctl,,DFSOLSxx...
    IMS successfully added the two records to the primary OLDS, but
    failed to add them to the secondary OLDS.
    
    In response to the D37 error, IMS incorrectly marked the
    primary OLDS in error and generated archive JCL against the
    secondary OLDS. Because the secondary OLDS was missing the two
    log records, this also caused a log sequence number (LSN)
    problem in the SLDS.
    
    If single OLDS logging is being used, then after the D37 the
    /ERE fails with ABENDU005-1C.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All IMS V15 users with PI75575/UI54239 applied.              *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * While closing the log as part of FDBR processing, XRF        *
    * takeover, or emergency restart, IMS can either abend or      *
    * record incorrect information about an OLDS in DBRC.          *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * INSTALL CORRECTIVE SERVICE FOR APAR/PTF                      *
    ****************************************************************
    While closing the log as part of FDBR processing, XRF takeover,
    or emergency restart processing, the IMS Logger tries to write a
    type06 and type48 record to the OLDS being closed. An incorrect
    argument to the BSAM POINT macro was causing this WRITE to be
    attempted past the end of the OLDS, causing the WRITE to get an
    ABENDD37. What happens after the ABENDD37 depends on whether
    single or dual OLDS logging is in place. With single OLDS
    logging, ABENDU0005 RSN0000001C occurs after msgDFS0738X and
    msgDFS0738I are produced.
    
    If dual logging is being used, the ABENDD37 occurs on the
    secondary OLDS, but msgDFS0738I is produced, indicating a CLOSE
    error on the _primary_ OLDS, and restart appears to complete
    successfully. However, because the error was incorrectly
    attributed to the primary, the secondary OLDS is the one used
    by archive for the PRILOG record in DBRC. Because the secondary
    OLDS is the one that actually got the error, the LSN range for
    that log is incorrect: it lacks the last 2 log records, which
    are the ones whose WRITE got the ABENDD37.
    
    Also, if the ABENDD37 happens during FDBR processing, an ensuing
    emergency restart of the failed IMS can take ABEND0C9 in
    DFSFDLY1 because RLWECNT is 0. The root cause of the
    ABEND0C9 is the ABENDD37, which is addressed by this APAR.
    

Problem conclusion

  • This APAR addresses 3 problems in this area.
    
    1. The cause of the ABENDD37 is a bad POINT argument in code
    added by PI75575. That code addresses an 'out of sync' condition
    with zHyperWrite in which, in a small timing window, an update
    could reach the remote copy of the data set but not the primary
    copy. Upon restarting, IMS must re-write the last 255 blocks of
    the OLDS back in place so that they can be replicated to the new
    secondary, which was the old primary that might not have
    received the WRITEs initially. Note that the terms 'primary' and
    'secondary' in this context refer to copies of the same data set
    in a replication environment, not to primary and secondary
    OLDS, which are different data sets.
    
    For the ABENDD37 to occur, IMS must be trying to write past the
    end of the OLDS. IMS tries to write past the end of the data set
    if the number of blocks read by restart times 2, minus 255,
    exceeds the capacity of the OLDS. Note that restart only reads a
    subset of the blocks on the OLDS.
    
    2. Upon getting the ABENDD37 on the secondary OLDS, it appeared
    that a CLOSE error happened on the primary OLDS. No CLOSE error
    actually occurred; it only appeared so because a flag was on
    indicating that the abend exit was driven after the CLOSE. The
    flag was residual: it was turned on by the WRITE to the
    secondary that got the ABENDD37, but it was not checked after
    that WRITE. As a result, the next operation for an OLDS (in this
    case closing the primary OLDS) made it appear that the CLOSE
    operation resulted in the abend exit being driven.
    
    The WRITE to the secondary that got the ABENDD37 still resulted
    in a good POST code from BSAM, but this APAR adds code to now
    check whether or not the abend exit was driven for the WRITE.
    
    3. While re-creating the reported problem, it was also
    discovered that addressing the 'out of sync' condition for
    zHyperWrite processing was only happening on the secondary OLDS
    and not the primary OLDS in a dual OLDS logging environment.
    This APAR now causes both OLDS to go through the code that
    addresses the 'out of sync' condition.
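
    The write-past-end condition stated in item 1 can be checked
    with simple arithmetic. The sketch below is illustrative only:
    the function and parameter names are hypothetical, the OLDS
    capacity is assumed to be expressed in blocks, and the code
    encodes the condition as worded in this APAR, not the actual
    POINT argument computation inside IMS.

```python
REWRITE_BLOCKS = 255  # blocks re-written in place at restart, per the APAR text


def write_goes_past_end(blocks_read_by_restart: int, olds_capacity_blocks: int) -> bool:
    """Condition from item 1: (blocks read by restart * 2) - 255 > OLDS capacity."""
    return blocks_read_by_restart * 2 - REWRITE_BLOCKS > olds_capacity_blocks


# Hypothetical OLDS with room for 10,000 blocks (an assumed figure).
print(write_goes_past_end(5_000, 10_000))  # 9,745 does not exceed 10,000
print(write_goes_past_end(5_200, 10_000))  # 10,145 exceeds 10,000
```

    Under these assumed figures, exposure begins once restart has
    read slightly more than half of the OLDS capacity in blocks.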
    
    Note that the log recovery utility (DFSULTR0) also has code to
    address the 'out of sync' condition for zHyperWrite (as it also
    closes the log), but does not have these errors.
    
    Also note that during restart, it's possible that IMS might not
    be able to determine whether or not the abending IMS was using
    zHyperWrite for the OLDS, so an IMS does not have to have
    ZHYPERWRITE=(OLDS=YES) specified to be exposed to this.
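
    The residual-flag behavior described in item 2 can be
    illustrated with a toy model. This is a sketch under stated
    assumptions, not IMS code: the class, method, and flag names are
    hypothetical, and resetting the flag once it has been reported
    is an assumption made for the illustration.

```python
class Logger:
    """Toy model of the stale-flag problem; none of these names are IMS internals."""

    def __init__(self, check_after_write: bool):
        self.check_after_write = check_after_write
        self.abend_exit_driven = False  # shared flag set by the abend exit
        self.errors = []                # (operation, data set) pairs reported

    def write(self, olds: str, hits_d37: bool) -> None:
        if hits_d37:
            # The abend exit runs, but the WRITE still posts a good completion code.
            self.abend_exit_driven = True
        if self.check_after_write and self.abend_exit_driven:
            self.errors.append(("WRITE", olds))
            self.abend_exit_driven = False  # assumed: flag is consumed once reported

    def close(self, olds: str) -> None:
        # CLOSE checks the same flag, so a stale value is blamed on this data set.
        if self.abend_exit_driven:
            self.errors.append(("CLOSE", olds))
            self.abend_exit_driven = False


def run(check_after_write: bool):
    logger = Logger(check_after_write)
    logger.write("primary", hits_d37=False)
    logger.write("secondary", hits_d37=True)
    logger.close("primary")
    logger.close("secondary")
    return logger.errors


print(run(check_after_write=False))  # [('CLOSE', 'primary')]
print(run(check_after_write=True))   # [('WRITE', 'secondary')]
```

    Without the check after the WRITE, the error is attributed to
    the CLOSE of the primary OLDS; with the check, it is attributed
    to the WRITE to the secondary OLDS, which is where it occurred.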
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PI97598

  • Reported component name

    IMS V15

  • Reported component ID

    5635A0600

  • Reported release

    500

  • Status

    CLOSED PER

  • PE

    YesPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-05-04

  • Closed date

    2018-06-15

  • Last modified date

    2018-07-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UI56597

Modules/Macros

  • DFSFDLU0 DFSFDLY0
    

Fix information

  • Fixed component name

    IMS V15

  • Fixed component ID

    5635A0600

Applicable component levels

  • R500 PSY UI56597

       UP18/06/20 P F806

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
04 January 2024