IBM Support

PK77695: A DEADLOCK BETWEEN TWO BATCH JOBS WHEN THE RECON2 BECOMING FULL

A fix is available

APAR status

  • Closed as program error.

Error description

  • Customer reported a RECON deadlock between two batch jobs. The
    first batch job was running with RECON2 and RECON3 allocated.
    When RECON2 became full, the job tried to allocate RECON1 so
    that RECON3 could be copied to it.
    However, RECON1 was not available because a second batch job
    had been submitted with DBRC=Y. The second batch job had
    allocated RECON1 and RECON2 and needed to allocate RECON3 to
    determine the RECON configuration. RECON3 was still in use by
    the first batch job, so a deadlock occurred.
    A display of the GRS information shows the following deadlock:
    D GRS,C
    S=SYSTEMS DSPURI01 RECONP3.RECON3
    SYSNAME   JOBNAME   ASID   TCBADDR  EXC/SHR   STATUS
    SYV       RAP122    01DA   006EBD08 EXCLUSIVE OWN
    SYV       LIF121    01BF   006EBD08 EXCLUSIVE WAIT
    S=SYSTEM DSPURI02 RECONP1.RECON1
    SYSNAME   JOBNAME   ASID   TCBADDR  EXC/SHR   STATUS
    SYV       LIF121    01BF   006EBD08 EXCLUSIVE OWN
    SYV       RAP122    01DA   006EBD08 EXCLUSIVE WAIT
    .
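
The contention display above can be read as a wait-for graph: each job in WAIT status depends on the job that owns the same resource. The following is a minimal Python sketch of that reading, not an IBM tool. The row tuples are transcribed by hand from the display, and the helper names `build_wait_for` and `find_cycle` are hypothetical. It assumes each job waits on at most one resource.

```python
from typing import Dict, List, Optional, Tuple

# Rows transcribed by hand from the D GRS,C display: (resource, job, status).
GRS_ROWS: List[Tuple[str, str, str]] = [
    ("DSPURI01 RECONP3.RECON3", "RAP122", "OWN"),
    ("DSPURI01 RECONP3.RECON3", "LIF121", "WAIT"),
    ("DSPURI02 RECONP1.RECON1", "LIF121", "OWN"),
    ("DSPURI02 RECONP1.RECON1", "RAP122", "WAIT"),
]


def build_wait_for(rows: List[Tuple[str, str, str]]) -> Dict[str, str]:
    """Map each waiting job to the job that owns the resource it waits for."""
    owners = {res: job for res, job, status in rows if status == "OWN"}
    return {job: owners[res] for res, job, status in rows
            if status == "WAIT" and res in owners}


def find_cycle(wait_for: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow wait-for edges from start; return the path if it loops back."""
    path = [start]
    current = start
    while current in wait_for:
        current = wait_for[current]
        if current == start:
            return path + [start]
        if current in path:
            return None  # a cycle exists, but it does not include start
        path.append(current)
    return None


wait_for = build_wait_for(GRS_ROWS)
cycle = find_cycle(wait_for, "RAP122")
print(" -> ".join(cycle) if cycle else "no deadlock")
# prints: RAP122 -> LIF121 -> RAP122
```

A cycle that returns to the starting job means neither job can proceed until one of them is cancelled.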
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of IMS Version 10 Release 1.       *
    ****************************************************************
    * PROBLEM DESCRIPTION: DBRC hang occurs during RECON           *
    *                      reconfiguration due to deadlock on      *
    *                      ENQUEUE of DSPURI01 and DSPURI02        *
    *                      QNAMEs.                                 *
    ****************************************************************
    * RECOMMENDATION: INSTALL CORRECTIVE SERVICE FOR APAR/PTF      *
    ****************************************************************
      The DBRC hang occurs because there is a deadlock involving the
    ENQ on the DSPURI02 QNAME and the RESERVE/ENQUEUE on the
    DSPURI01 QNAME.
    .
      The RECON configuration prior to the deadlock was COPY1=RECON2,
    COPY2=RECON3, and SPARE=RECON1.  JOB1 is running and, while
    processing a request, gets an ENQ on QNAME=DSPURI02,
    RNAME=recon2dsn and a RESERVE on QNAME=DSPURI01 for RNAMEs
    recon2dsn and recon3dsn.
    .
      When JOB2 starts, its initialization processing follows the DD
    order.  It issues an ENQ on DSPURI02.recon1dsn and gets it.  It
    then issues a RESERVE on DSPURI01.recon1dsn (and gets it) but
    waits when it tries to get the RESERVE on DSPURI01.recon2dsn,
    which JOB1 holds.
    .
      When JOB1 encounters an error on RECON2, it dequeues the
    DSPURI02.recon2dsn and DSPURI01.recon2dsn resources.  It then
    tries to reserve the SPARE, RECON1.  Because it no longer
    holds a DSPURI02 enqueue, it attempts to get an ENQ on
    DSPURI02.recon1dsn, but JOB2 holds this.  JOB2, in the meantime,
    gets the DSPURI01.recon2dsn reserve and then attempts to get
    the DSPURI01.recon3dsn reserve, which JOB1 holds.  This results
    in a deadlock.
    .
      Similar variations that lead to deadlocks can occur if RECON2
    is the spare.
    .
      The problem is related to the DSPURI02 enqueue.  Once a DBRC
    request has obtained it in order to reserve the RECONs during
    logical open, the request should never try to get DSPURI02
    again, even if it has since dequeued it.
    .
      DSPURI02 is just a gate that batch jobs must pass before
    issuing RESERVEs.  Once the gate has been passed, there is no
    need to pass it again as long as a RESERVE is held on at least
    one resource.
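
The sequence described in this summary can be replayed with a toy lock table to confirm that it ends with each job waiting on the other. This is only an illustrative Python sketch: the `LockTable` class is hypothetical, it treats every ENQ and RESERVE as a plain exclusive lock with FIFO waiters, and it ignores GRS scope and the difference between ENQ and RESERVE.

```python
from collections import deque


class LockTable:
    """Exclusive-only lock table: one owner per resource, FIFO waiters."""

    def __init__(self):
        self.owner = {}    # resource name -> owning job
        self.waiters = {}  # resource name -> deque of waiting jobs

    def request(self, job, resource):
        """Grant the resource if free, otherwise queue the job."""
        if resource not in self.owner:
            self.owner[resource] = job
            return True
        self.waiters.setdefault(resource, deque()).append(job)
        return False

    def release(self, job, resource):
        """Release the resource and hand it to the next waiter, if any."""
        assert self.owner.get(resource) == job
        queue = self.waiters.get(resource)
        if queue:
            self.owner[resource] = queue.popleft()
        else:
            del self.owner[resource]

    def blocked(self):
        """Return {waiting job: owning job} for every queued request."""
        return {job: self.owner[res]
                for res, queue in self.waiters.items() for job in queue}


table = LockTable()

# JOB1 processes a request with COPY1=RECON2, COPY2=RECON3.
table.request("JOB1", "DSPURI02.recon2dsn")
table.request("JOB1", "DSPURI01.recon2dsn")
table.request("JOB1", "DSPURI01.recon3dsn")

# JOB2 initializes in DD order and stalls on recon2dsn.
table.request("JOB2", "DSPURI02.recon1dsn")
table.request("JOB2", "DSPURI01.recon1dsn")
table.request("JOB2", "DSPURI01.recon2dsn")  # queued behind JOB1

# JOB1 hits an error on RECON2, dequeues it, and goes for the spare.
table.release("JOB1", "DSPURI02.recon2dsn")
table.release("JOB1", "DSPURI01.recon2dsn")  # JOB2 now owns recon2dsn
table.request("JOB1", "DSPURI02.recon1dsn")  # queued behind JOB2

# JOB2 continues and asks for recon3dsn, which JOB1 still holds.
table.request("JOB2", "DSPURI01.recon3dsn")  # queued behind JOB1

print(table.blocked())  # {'JOB1': 'JOB2', 'JOB2': 'JOB1'}
```

The final state has JOB1 waiting on JOB2 and JOB2 waiting on JOB1, which matches the deadlock described above.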
    

Problem conclusion

  • AIDS: RIDS/DBRC RIDS/SER DBRC SER
     GEN:
    KEYWORDS:
    
    *** END IMS KEYWORDS ***
      DBRC will no longer attempt to get an ENQUEUE on the DSPURI02
    QNAME if it has already obtained it in the request.
    .
      DSPRSV00 is changed to add an extra condition when getting the
    DSPURI02 enqueue.  It will now only get it if the reserve was
    for ALL RECONs.  This is only done during first logical open,
    true open or true close.
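
The changed condition can be pictured as a small decision rule. The Python sketch below is an interpretation of the text above, not the DSPRSV00 logic itself. Both function names are hypothetical, and the pre-fix rule is inferred from the problem summary (a request re-obtains DSPURI02 whenever it does not currently hold it).

```python
def needs_gate_before_fix(holds_gate: bool) -> bool:
    """Assumed pre-fix rule: ENQ on DSPURI02 whenever it is not held."""
    return not holds_gate


def needs_gate_after_fix(reserve_is_for_all_recons: bool) -> bool:
    """Rule described for the fix: ENQ on DSPURI02 only for ALL RECONs."""
    return reserve_is_for_all_recons


# First logical open: the gate is not yet held and ALL RECONs are reserved.
first_open = (needs_gate_before_fix(holds_gate=False),
              needs_gate_after_fix(reserve_is_for_all_recons=True))

# After the RECON2 error: the gate was dequeued and only the spare is reserved.
spare_reserve = (needs_gate_before_fix(holds_gate=False),
                 needs_gate_after_fix(reserve_is_for_all_recons=False))

print(first_open)     # (True, True)
print(spare_reserve)  # (True, False)
```

Under these assumptions, the second case is the one that changes: the request reserving only the spare no longer asks for DSPURI02, so it cannot block behind a job that holds it.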
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PK77695

  • Reported component name

    IMS V10

  • Reported component ID

    5635A0100

  • Reported release

    010

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2008-12-17

  • Closed date

    2009-01-09

  • Last modified date

    2009-06-01

  • APAR is sysrouted FROM one or more of the following:

    PK77569

  • APAR is sysrouted TO one or more of the following:

    UK43027

Modules/Macros

  • DSPRSV00
    

Fix information

  • Fixed component name

    IMS V10

  • Fixed component ID

    5635A0100

Applicable component levels

  • R010 PSY UK43027

       UP09/01/16 P F901

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
01 June 2009