IBM Support

PM53092: DSP0002I UNABLE TO OPEN RECON DURING BACKUP.RECON COMMAND.

A fix is available

APAR status

  • Closed as program error.

Error description

  • An IMS batch job ran the BACKUP.RECON command, which
    triggered RECON quiesce processing in an IMSplex environment.
    
    Right after quiesce processing completed with message DSP1133I,
    an Image Copy job failed to open the RECON with message DSP0002I:
      DSP0002I  UNABLE TO OPEN RECON1   DATA SET
      DSP0002I  VSAM RETURN CODE=08  ERROR CODE=000
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All V11 IMS DBRC & BPE PRA users will be     *
    *                 affected.                                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: RECON open failure RC=08 RSN=000 for    *
    *                      one of the DB Image Copy jobs that led  *
    *                      to the ABENDU3312 failure for that      *
    *                      job. This occurred when this job was    *
    *                      started while a BACKUP.RECON job was    *
    *                      being run.                              *
    ****************************************************************
    * RECOMMENDATION: INSTALL CORRECTIVE SERVICE FOR APAR/PTF      *
    ****************************************************************
    The initial problem encountered by the customer was a RECON
    open failure with RC=08 RSN=000 for one of the DB Image Copy
    (IC) jobs, which caused the job to fail with ABENDU3312.
    The problem occurs when the IC job starts while a
    BACKUP.RECON job is running.
    
    During initialization, DBRC waits for a not-quiesced
    notification before proceeding with RECON open processing.
    In this case, the RECONs were quiesced. While waiting, the new
    DBRC receives an end-quiesce notification and processes it.
    Part of end-quiesce processing opens any closed RECONs. Once
    that completes, initialization continues and tries to OPEN the
    RECONs again, which results in a failure.
    
    While fixing this, we ran a stress test to emulate the customer
    processing where a large number of IC jobs are starting while
    a RECON quiesce is being issued. For the test, BACKUP.RECON was
    done as well as enabling and disabling parallel RECON access.
    This uncovered other QUIESCE related issues.
    
    Case 1: Get a NOT QUIESCED response even when quiesced.
    This problem occurs when SCI messages/notifications are
    received or processed in an order other than the order in
    which the events occurred.  This can happen because ordering
    is not guaranteed, especially when multiple LPARs are
    involved. In this case,
    JOB1 and JOB2 are running when JOB1 sends a quiesce request.
    At the same time JOB3 starts and waits for a not quiesced
    notification. JOB2 gets the notification that JOB3 exists before
    the quiesced notification and sends JOB3 a not quiesced
    message allowing JOB3 to continue its RECON OPEN processing.
    JOB2 then processes the quiesce request from JOB1 and
    acknowledges it. JOB1 gets the quiesce acknowledgment before
    the 'new DBRC' notification for JOB3. So JOB1 considers the
    quiesce acknowledged by all and proceeds as if it owns the
    RECONs while JOB3 still has them opened.
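
The Case 1 ordering problem can be replayed with a small model. This is a
sketch only, not DBRC code: the event names and both functions are
hypothetical, and it assumes that a DBRC which already knows about the
quiesce simply withholds the not-quiesced reply.

```python
def job2_replies(events):
    """Replies JOB2 sends, given the order in which it processes events."""
    quiesced = False
    replies = []
    for event in events:
        if event == "quiesce_from_JOB1":
            quiesced = True
            replies.append(("JOB1", "quiesce_ack"))
        elif event == "new_dbrc_JOB3" and not quiesced:
            # Assumption: once quiesced, JOB2 withholds this reply.
            replies.append(("JOB3", "not_quiesced"))
    return replies


def job1_thinks_it_owns_recons(events):
    """JOB1 proceeds as soon as every DBRC it currently knows has acked."""
    known, acked = {"JOB2"}, set()
    for event in events:
        if event == "new_dbrc_JOB3":
            known.add("JOB3")
        elif event == "ack_from_JOB2":
            acked.add("JOB2")
        if acked >= known:
            return True
    return False


# Out-of-order delivery: JOB3 is released and JOB1 still believes it owns
# the RECONs, which is the failing combination described above.
print(job2_replies(["new_dbrc_JOB3", "quiesce_from_JOB1"]))
print(job1_thinks_it_owns_recons(["ack_from_JOB2", "new_dbrc_JOB3"]))
```

When both jobs see the events in the order they occurred, JOB2 sends no
not-quiesced reply and JOB1 keeps waiting for JOB3.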
    
    Case 2: Delete of other DBRC results in implicit not-quiesced
    response to a new DBRC even when RECONs are still quiesced.
    If a new DBRC receives a notification that another DBRC is gone,
    it will be treated as an implicit not quiesced notification. If
    there are no DBRCs that were started before the quiesce, the
    assumption is that any DBRC started during the quiesce would
    also be waiting on a not-quiesced. If everyone started prior
    to this is gone, no quiesce could be in progress. The problem
    is that the logic in DSPRLI00 sets all pre-existing DBRCs
    as 'STARTED DURING QUIESCE'.  So, if a quiesce is in progress
    when JOB3 starts, and JOB3 gets a notification that a DBRC is
    gone, it will be treated as an implicit not quiesced whether
    it was the quiescer who failed or any other DBRC.
    
    Case 3: ABENDU2480 occurs when there are more logical closes
    than opens. When a RECON loss notification is received, DBRC
    does a logical open and then a logical close.  If the logical
    open fails for some reason but the logical close is still
    called, the result is ABENDU2480 because there are more
    logical closes than opens.
    

Problem conclusion

  • GEN:
    KEYWORDS:
    
    *** END IMS KEYWORDS ***
    
    For the initial problem, code is added to DSPDSS10 to set an
    indication that DBRC is in the middle of SCI registration
    during physical open. Group services logic is changed to no
    longer invoke DSPRCLOS QUIESCE or DSPROPEN ENDQUI
    if this indication is set, as they are unnecessary.
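
The double open and the effect of the new indication can be modelled
roughly as follows. This is an illustrative sketch, not the DSPDSS10
logic: the class, method and exception names are hypothetical, and the
"in open" flag stands in for the indication described above.

```python
class ReconAlreadyOpen(Exception):
    """Stands in for the DSP0002I open failure (hypothetical name)."""


class ToyDbrc:
    def __init__(self, guard_enabled):
        self.guard_enabled = guard_enabled
        self.recon_open = False
        self.in_open = False  # models the "in open" indication

    def physical_open(self):
        if self.recon_open:
            raise ReconAlreadyOpen("RECON is already open")
        self.recon_open = True

    def on_end_quiesce(self):
        # Group services handler: reopen closed RECONs unless the guard
        # says initialization will do the open itself.
        if self.guard_enabled and self.in_open:
            return
        if not self.recon_open:
            self.physical_open()

    def initialize(self, notifications):
        self.in_open = True
        for handler in notifications:  # arrive while waiting for not-quiesced
            handler(self)
        self.physical_open()
        self.in_open = False


def run(guard_enabled):
    dbrc = ToyDbrc(guard_enabled)
    try:
        dbrc.initialize([ToyDbrc.on_end_quiesce])
    except ReconAlreadyOpen:
        return "open failed"
    return "open ok"
```

In this model, `run(False)` reproduces the second open failing, and
`run(True)` completes because the end-quiesce open is skipped.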
    
    Case 1: This problem requires a method for serializing start-up
    with quiesce. During quiesce processing, a SYSTEMS level enqueue
    with major name DSPURI03 and minor name equal to the plexname
    and group ID (PLEXGRP in note below) will be done to serialize
    the process.
    
    When getting a quiesce, DBRC will obtain an exclusive ENQ on
    DSPURI03.PLEXGRP after getting acknowledgements from all
    DBRCs that it knows about. Also, just prior to registering with
    SCI, a new DBRC will get a shared ENQ on DSPURI03.PLEXGRP.
    This will be released after physical OPEN processing completes.
    This ensures new DBRCs will wait prior to opening RECONs if
    a quiesce is being done. It also ensures (for all intents/
    purposes) that a new DBRC will be known to SCI everywhere by
    the time the quiescer gets an exclusive lock.
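
The shared/exclusive serialization can be pictured with an in-process
analogue. The sketch below is not a z/OS ENQ: `SharedExclusiveLock` and
`demo` are hypothetical names, the lock lives in one Python process, and
it does not model how waiting requests are ordered.

```python
import threading


class SharedExclusiveLock:
    """Toy stand-in for a shared/exclusive ENQ on DSPURI03.PLEXGRP."""

    def __init__(self):
        self._cond = threading.Condition()
        self._shared_holders = 0
        self._exclusive_held = False

    def acquire_shared(self):
        with self._cond:
            while self._exclusive_held:
                self._cond.wait()
            self._shared_holders += 1

    def release_shared(self):
        with self._cond:
            self._shared_holders -= 1
            self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._exclusive_held or self._shared_holders:
                self._cond.wait()
            self._exclusive_held = True

    def release_exclusive(self):
        with self._cond:
            self._exclusive_held = False
            self._cond.notify_all()


def demo():
    enq = SharedExclusiveLock()
    events = []
    enq.acquire_shared()  # new DBRC, just before registering with SCI

    def quiescer():
        enq.acquire_exclusive()  # waits for the shared holder
        events.append("quiescer owns RECONs")
        enq.release_exclusive()

    worker = threading.Thread(target=quiescer)
    worker.start()
    events.append("new DBRC opened RECONs")
    enq.release_shared()  # physical OPEN processing is complete
    worker.join()
    return events
```

Because the quiescer cannot obtain the exclusive lock until the shared
holder releases it, `demo()` always records the new DBRC's open first.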
    
    To allow for quiesce race, the quiescer issues a quiesce and
    waits for responses from all DBRCs. This will handle any race
    condition as it does today.  Note that new DBRCs will not be
    allowed to do a quiesce if they hold the shared enqueue. Once
    everyone has responded, the quiesce winner will request an
    exclusive ENQ. The request waits until any new jobs the
    quiescer did not know about complete open processing and
    release the shared ENQ. Once the ENQ is obtained, a query will
    be done to see if there are any DBRCs that it did not know
    about; if so, these will be sent a quiesce notification and
    the quiescer will wait for their acknowledgements.
    
    Because there is a timing issue that could lead to a deadlock
    if the exclusive ENQ just waits, DBRC needs to be allowed to
    process other SCI notifications while waiting for the exclusive
    ENQ.  A separate TCB will be attached to do the EX ENQ and
    eventual DEQ.  This code will request an exclusive ENQ. Once it
    has it, it will enqueue a new brlsb (brlbf2EN) on DSPRLN_RQ
    which will get dequeued by DSPRLN00 and processed by DSPRLX10.
    In a BPE environment a new AWE would be created and enqueued to
    DSPBGS00 for processing. The TCB code then will wait on its ECB
    before proceeding.
    
    When the new brlsb (or AWE) is processed, a CSLSCQRY will be
    done to check for new DBRCs (e.g. the ones that were holding
    the shared ENQ that we did not know about).  If any exist, the
    quiescer will wait.  It should get a notification that the DBRC
    started (ready) and will then wait for a response. Once all
    have responded, the ENQ TCB's ECB will get posted to DEQ the
    global resource, and RLX10 will WAIT on that TCB's term ECB to
    ensure it is done before proceeding.
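
The reason for the separate TCB can be shown with a thread-based
analogue. This is a simplified sketch under stated assumptions: a Python
thread stands in for the attached TCB, a `queue.Queue` for the request
queue, and a `threading.Event` for the ECB. The names `enq_subtask`,
`main_loop` and the `"GOT_ENQ"` item are hypothetical, and the CSLSCQRY
step is omitted, so the release is posted as soon as the item is seen.

```python
import queue
import threading


def enq_subtask(enq, work_queue, release_event):
    enq.acquire()              # may block; the main loop keeps running
    work_queue.put("GOT_ENQ")  # tell the main loop the lock is held
    release_event.wait()       # analogous to waiting on an ECB
    enq.release()              # the eventual DEQ


def main_loop(pending_notifications):
    enq = threading.Lock()
    work_queue = queue.Queue()
    release_event = threading.Event()
    for item in pending_notifications:
        work_queue.put(item)

    worker = threading.Thread(
        target=enq_subtask, args=(enq, work_queue, release_event)
    )
    worker.start()

    processed = []
    while True:
        item = work_queue.get()  # other notifications are still served
        processed.append(item)
        if item == "GOT_ENQ":
            release_event.set()  # post the subtask so it can release
            worker.join()        # wait for the subtask to finish
            break
    return processed
```

The main loop never blocks on the lock itself, so notifications queued
ahead of the `"GOT_ENQ"` item are processed before the release is posted.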
    
    Case 2:
    To fix this issue, existing DBRCs will no longer be considered
    as 'started during quiesce', but will be considered as started
    before a quiesce. To account for the situation where all new
    DBRCs started after a job failed while quiescing, any DBRC that
    determines that none_started_before the quiesce remain treats
    this as an implicit 'not_quiesced' and will broadcast
    not_quiesced to the other DBRCs. Note that only one DBRC (other
    than the one that failed) needs to see none_started_before the
    quiesce. The others will think that all of them started before
    the quiesce and will continue to wait, so the one that detects
    the condition must do the broadcast.
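
The Case 2 decision reduces to a predicate over the DBRCs a new job
knows about. The following is a minimal sketch with hypothetical names
(`KnownDbrc`, `implicit_not_quiesced`, `on_dbrc_gone`); it models only
the flag handling described above, not the broadcast itself.

```python
from dataclasses import dataclass


@dataclass
class KnownDbrc:
    name: str
    started_during_quiesce: bool


def implicit_not_quiesced(remaining):
    """True when no remaining known DBRC started before the quiesce."""
    return not any(not d.started_during_quiesce for d in remaining)


def on_dbrc_gone(known, gone_name):
    """Drop the departed DBRC and re-evaluate the implicit condition."""
    remaining = [d for d in known if d.name != gone_name]
    return remaining, implicit_not_quiesced(remaining)


# Old behaviour: pre-existing DBRCs were flagged as started during quiesce,
# so losing JOB2 released JOB3 even though JOB1 was still quiescing.
old_view = [KnownDbrc("JOB1", True), KnownDbrc("JOB2", True)]
# Fixed behaviour: pre-existing DBRCs count as started before the quiesce.
new_view = [KnownDbrc("JOB1", False), KnownDbrc("JOB2", False)]

print(on_dbrc_gone(old_view, "JOB2")[1])
print(on_dbrc_gone(new_view, "JOB2")[1])
```

With the fixed flags, the condition only becomes true once every DBRC
that started before the quiesce, including the quiescer, is gone.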
    
    Case 3:
    To prevent more logical closes than opens, DSPRCLOS will not
    be called if DSPROPEN fails when processing RECON loss
    notifications.
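
The open/close imbalance and the guard can be sketched with a simple
counter. The names below are hypothetical, and a `RuntimeError` stands in
for ABENDU2480; the real DSPROPEN and DSPRCLOS interfaces are not shown
in this APAR.

```python
class LogicalOpenCounter:
    def __init__(self):
        self.opens = 0

    def logical_open(self, succeed=True):
        if succeed:
            self.opens += 1
        return succeed

    def logical_close(self):
        if self.opens == 0:
            # Stand-in for ABENDU2480.
            raise RuntimeError("more logical closes than opens")
        self.opens -= 1


def handle_recon_loss(counter, open_succeeds, guarded):
    """Open then close; with the guard, skip the close if the open failed."""
    opened = counter.logical_open(open_succeeds)
    if opened or not guarded:
        counter.logical_close()
    return counter.opens
```

With `guarded=True`, a failed open leaves the count at zero instead of
driving the unmatched close.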
    
    ----------------------------------------------------------------
    
    THE FOLLOWING MODULES WERE CHANGED:
    
    DFSBRLSB
      - Add BRLBF2GE for grp services related enqueue
    
    DSPBCODE
      - Add AWGS_GOTENQ (message that we got exclusive enqueue)
            AWGS_RELENQ (request to release shared ENQ)
    
    DSPBDS00
      - Fix the problem with more closes than opens
    
    DSPBGS00
      - Add code to track DBRCs started during quiesce and handle
        implicit 'not quiesced'
      - if Quiesce acknowledgement message has no DSN, ignore DSN
        list.
      - do not invoke DSS if inOpen
      - In AWGS_QUACK and AWGS_DBRCDOWN processing add logic to
        request an exclusive ENQ if all known DBRCs have responded
        to a quiesce request.  If the EXclusive ENQ is already held
        when we detect all have responded, then release the ENQ.
      - add new AWGS_GOTENQ support.
      - Add new AWGS_RELENQ support
      - If we get a quiesce request while waiting for not_quiesced,
        release the shared enqueue if we hold it.
      - If all DBRCs we knew about when started go down and we are
        waiting on not_quiesced, treat as implicit end quiesce or
        not quiesced and POST Group services INIT waiter.
      - Add routine, found_new_dbrc, to issue CSLSCQRY to determine
        if any new DBRCs exist.
      - Add routine, GetENQForQuiesce, to attach a TCB, DSPRLNQ0 to
        request an exclusive ENQ.
      - Add routine  ReleaseENQ to DEQ the shared ENQ or POST
        DSPRLNQ0 to DEQ the exclusive ENQ
      - Add code in subtract_a_dbrc to detect if any DBRC started
        before the quiesce.
      - Add routine, Reset_flags, to reset indication that DBRC was
        started during a quiesce.
    
    DSPBGS10
      - Set DSPDGML_started_during_quiesce based on whether
        DSPRLN_quiesced is set or not. This prevents hang during
        quiesce.
    
    DSPBGS20
      - Get Shared ENQ on new resource before registering with SCI
      - Add routine, GetENQForQuiesce, to get shared ENQ.
    
    DSPBGS50
      - Add support to broadcast a not_quiesced message
    
    DSPBIN20
      - Add code to set entry point for subtask DSPRLNQ0
    
    DSPCRTR0
      - Add code to set entry point for subtask DSPRLNQ0
    
    DSPDCLMD
      - Add DSPRLNQ0 and DSPETXR0 to DSPCINT0
    
    DSPDSS01
      - do not free deferred irc if quiesce race in BPE
      - make sure propagate quiesce reach when deferred end quiesce
    
    DSPDSS10
      - indicate in open
      - prevent quiesce if still own shared ENQ on new resource
      - invoke DSPGRPSV to DEQ shared ENQ
    
    DSPEF040
      - add support for missing brlbf2 functions (check for more)
    
    DSPGDB
      - add GDBRLIinOpen
      - Remove gdb_quiend_state - see if can manage in DSPRLN only!
    
    DSPGRPSV
      - add REL_DEQ function
    
    DSPPRAB
      - fix formatting of pras flags
    
    DSPRLI00
      - get shared ENQ on new resource
      - fix dspdgml_started_during_quiesce indication (should be
        off)
      - Add routine, GetENQForQuiesce, to get shared ENQ.
    
    DSPRLN
      - add several fields to coordinate quiesce processing
    
    DSPRLN00
      - added support to broadcast not_quiesce
      - when quiesce/quiclose, get global ENQ if no other DBRCs and
        wait for response before continuing
      - Add routine, GetENQForQuiesce, to attach a TCB, DSPRLNQ0 to
        request an exclusive ENQ.
    
    DSPRLXB0
      - Removed the issuance of MSGDSP1126I (moved to DSPRLX10)
      - Removed DSP1128 to prevent potential ABENDS138
    
    DSPRLX10
      - Added code to issue DSP1126 message for brlbf2qn.
      - Added code to issue DSP1128 message.
      - Do not call dsprclos if dspropen fails
      - do not call DSPRCLOS QUIESCE or DSPROPEN ENDQUI if
        GDBRLIInOpen is set.
      - if Quiesce Acknowledgement (brlbf2qk) message has no dsns,
        ignore dsn list.
      - If we get a quiesce request while waiting for not_quiesced,
        release the shared enqueue if we hold it.
      - In brlbf2qk or brlbf2dd processing, add logic to request an
        exclusive ENQ if all known DBRCs have responded to a quiesce
        request.  If the EXclusive ENQ is already held when we
        detect all have responded, then release the ENQ.
      - add new brlbf2ge (got enqueue) support.  Check if found a new
        DBRC.  If not, we own the RECONs and process as in
          brlbf2qk or brlbf2dd when own RECONs.
      - If all DBRCs we knew about when started go down and we are
        waiting on not_quiesced, treat as implicit end quiesce and
        broadcast not quiesced to other DBRCs.
      - Add routine, found_new_dbrc, to issue CSLSCQRY to determine
        if any new DBRCs exist.
      - Add routine, GetENQForQuiesce, to attach a TCB, DSPRLNQ0 to
        request an exclusive ENQ.
      - Add routine  ReleaseENQ to DEQ the shared ENQ or POST
        DSPRLNQ0 to DEQ the exclusive ENQ
    
    DSPTRAC1
      - add support for missing brlbf2 functions
    
    
    THE FOLLOWING MODULES WERE RECOMPILED:
    
    DSPCINT0 - Recompile for new parts DSPETXR0/DSPRLNQ0
    DSPEF00F - Recompile for DSPRLN change
    DSPEF01F - Recompile for DSPRLN change
    DSPEF03F - Recompile for DSPRLN change
    DSPEF0AF - Recompile for DSPGDB change
    DSPLOADR - Recompile for new modules DSPRLNQ0 & DSPETXR0
    DSPRLXI0 - Recompile to pick up BRLBF2SS
    
    THE FOLLOWING ARE THE NEW PARTS:
    
    DSPETXR0
      - Propagate ABENDs originating in subtasks as U4095.
        ETXR routines are specified when a task is attached.
        They are entered when the task terminates.  If an
        IMS task uses this routine as an ETXR routine when
        attaching subtasks, then any ABENDs originating in
        the subtasks will result in a user 4095 ABEND of
        the attaching task.
    
    DSPRLNQ0
      - This module is run as a DBRC subtask to request an
        exclusive SYSTEMS ENQ on a new resource for quiesce
        processing serialization.
    
        Once obtained, it will enqueue a brlsb or an AWE depending
        on the environment to let group services know the ENQ was
        obtained. Finally, it will wait until posted to do the DEQ.
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PM53092

  • Reported component name

    IMS V11

  • Reported component ID

    5635A0200

  • Reported release

    100

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2011-11-29

  • Closed date

    2012-03-27

  • Last modified date

    2012-04-03

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    PM59495 UK77421

Modules/Macros

  • DFSBRLSB DSPBDS00 DSPBGS00 DSPBGS10 DSPBGS20
    DSPBGS50 DSPBIN20 DSPCINT0 DSPCRTR0 DSPDSS01 DSPDSS10 DSPEF0AF
    DSPEF00F DSPEF01F DSPEF03F DSPEF040 DSPETXR0 DSPLOADR DSPRLI00
    DSPRLNQ0 DSPRLN00 DSPRLXB0 DSPRLXI0 DSPRLX10 DSPTRAC1 DSPURI00
    HMK1100J
    

Fix information

  • Fixed component name

    IMS V11

  • Fixed component ID

    5635A0200

Applicable component levels

  • R100 PSY UK77421

       UP12/03/29 P F203

Document Information

Modified date:
03 April 2012