IBM Support

IC77640: STANDBY SHUTDOWN AFTER LOG RETRIEVE ATTEMPT FAILURE

APAR status

  • Closed as program error.

Error description

  • When a storage manager is set up on a HADR standby, database
    activation fails if the standby cannot retrieve a log file
    required for recovery. This behavior is expected on a standard
    database, but not on a HADR standby. The standby should instead
    move from local catchup to remote catchup in order to fetch all
    the log files.
    When the problem is hit, the return code from the userexit
    program is usually 4 or 8.
    
    
    Here are the related db2diag.log entries:
    2011-05-21-15.22.48.976692-300 I227718A364        LEVEL: Warning
    PID     : 26476788             TID  : 5142        PROC : db2sysc 0
    INSTANCE: db2inst1               NODE : 000
    EDUID   : 5142                 EDUNAME: db2logmgr (SAMPLE) 0
    FUNCTION: DB2 UDB, data protection services, sqlpgRetrieveLogFile, probe:4130
    MESSAGE : Started retrieve for log file S0249205.LOG.
    
    2011-05-21-15.27.09.530887-300 E229599A511        LEVEL: Error
    PID     : 26476788             TID  : 5142        PROC : db2sysc 0
    INSTANCE: db2inst1               NODE : 000
    EDUID   : 5142                 EDUNAME: db2logmgr (SAMPLE) 0
    FUNCTION: DB2 UDB, data protection services, sqlpgUserexitLogAdminMsg, probe:1180
    MESSAGE : ADM1835E  The user exit program returned an error when retrieving log
              file "S0249205.LOG" to "/db2/SAMPLE/log_dir/NODE0000/" for database
              "SAMPLE".  The error code was "8".
    
    2011-05-21-15.27.09.544174-300 E230111A431        LEVEL: Warning
    PID     : 26476788             TID  : 5142        PROC : db2sysc 0
    INSTANCE: db2inst1               NODE : 000
    EDUID   : 5142                 EDUNAME: db2logmgr (SAMPLE) 0
    FUNCTION: DB2 UDB, data protection services, sqlpgRetrieveLogFile, probe:4165
    MESSAGE : ADM1847W  Failed to retrieve log file "S0249205.LOG" on chain "23" to
              "/db2/SAMPLE/log_dir/NODE0000/".
    
    2011-05-21-15.27.10.055514-300 I230543A469        LEVEL: Error
    PID     : 26476788             TID  : 5913        PROC : db2sysc 0
    INSTANCE: db2inst1               NODE : 000
    EDUID   : 5913                 EDUNAME: db2lfr (SAMPLE) 0
    FUNCTION: DB2 UDB, recovery manager, sqlplfrOpenExtentRetrieve, probe:225
    MESSAGE : Received error from db2logmgr on retrieve of log 249205, rc:
    DATA #1 : Hexdump, 4 bytes
    0x0700000019FFD890 : 0000 0008                                  ....
    
    2011-05-21-15.27.10.057751-300 I231013A478        LEVEL: Error
    PID     : 26476788             TID  : 16193       PROC : db2sysc 0
    INSTANCE: db2inst1               NODE : 000         DB   : SAMPLE
    APPHDL  : 0-8                  APPID: *LOCAL.DB2.110521202116
    EDUID   : 16193                EDUNAME: db2redom (SAMPLE) 0
    FUNCTION: DB2 UDB, recovery manager, sqlpPRecReadLog, probe:1275
    RETCODE : ZRC=0x82100016=-2112880618=SQLPLFR_RC_RETRIEVE_FAILED
              "Log could not be retrieved"
    
    
    2011-05-21-15.27.10.437046-300 E233417A922        LEVEL: Critical
    PID     : 26476788             TID  : 4370        PROC : db2sysc 0
    INSTANCE: db2inst1               NODE : 000         DB   : SAMPLE
    APPHDL  : 0-8                  APPID: *LOCAL.DB2.110521202116
    EDUID   : 4370                 EDUNAME: db2agent (SAMPLE) 0
    FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::MarkDBBad, probe:10
    MESSAGE : ADM14001C  An unexpected and critical error has occurred:
              "DBMarkedBad". The instance may have been shutdown as a result.
              "Automatic" FODC (First Occurrence Data Capture) has been invoked and
              diagnostic information has been recorded in directory
              "/db2/SAMPLE/db2dump/FODC_DBMarkedBad_2011-05-21-15.27.10.432900/".
              Please look in this directory for detailed evidence about what
              happened and contact IBM support if necessary to diagnose the
              problem.
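
The ADM1835E entry above carries the two details that matter for this APAR: the log file that could not be retrieved and the userexit return code. As a rough illustration only, the following Python sketch pulls those fields out of db2diag.log text. The function name, the dictionary keys, and the `matches_apar_pattern` flag are hypothetical helpers invented for this example and are not part of DB2; the only facts taken from the APAR are the message wording and the note that the return code is usually 4 or 8.

```python
import re

# Pattern for the ADM1835E message text. \s+ is used between words because
# db2diag.log wraps long messages across several lines.
ADM1835E = re.compile(
    r'ADM1835E\s+The\s+user\s+exit\s+program\s+returned\s+an\s+error\s+when\s+'
    r'retrieving\s+log\s+file\s+"(?P<log>[^"]+)"\s+to\s+"(?P<path>[^"]+)"\s+'
    r'for\s+database\s+"(?P<db>[^"]+)"\.\s+'
    r'The\s+error\s+code\s+was\s+"(?P<rc>\d+)"\.'
)

# Per the APAR text, the userexit return code is usually 4 or 8 when the
# problem is hit. Other values are not ruled out by the APAR.
TYPICAL_RETURN_CODES = {4, 8}


def find_userexit_retrieve_failures(diag_text):
    """Return one dict per ADM1835E message found in db2diag.log text."""
    failures = []
    for match in ADM1835E.finditer(diag_text):
        return_code = int(match["rc"])
        failures.append({
            "log_file": match["log"],
            "target_path": match["path"],
            "database": match["db"],
            "return_code": return_code,
            # Hypothetical convenience flag, not a DB2 concept.
            "matches_apar_pattern": return_code in TYPICAL_RETURN_CODES,
        })
    return failures


# Message text copied from the db2diag.log excerpt in this APAR.
SAMPLE = '''MESSAGE : ADM1835E  The user exit program returned an error when
retrieving log
          file "S0249205.LOG" to "/db2/SAMPLE/log_dir/NODE0000/"
for database
          "SAMPLE".  The error code was "8".
'''

failures = find_userexit_retrieve_failures(SAMPLE)
print(failures)
```

A match on its own does not prove that this APAR is the cause. It only narrows the search to the retrieve failure that precedes the DBMarkedBad entry.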
    

Local fix

  • 1. Move the userexit program (the file should be
    sqllib/bin/db2uext2) to another path, or rename it.
    2. Start the standby. Because the userexit program is not
    available, a message similar to the following is logged:
    
    2011-07-05-20.49.52.653962-420 E123698E549         LEVEL: Error
    PID     : 12496                TID  : 47739391961408 PROC : db2sysc
    INSTANCE: sfbao                NODE : 000
    EDUID   : 60                   EDUNAME: db2logmgr (HADRDB)
    FUNCTION: DB2 UDB, data protection services, sqlpgUserexitLogAdminMsg, probe:1170
    MESSAGE : ADM1834E  DB2 was unable to find the user exit program when
              retrieving log file "S0003169.LOG" to
              "/u/sfbao/sfbao/NODE0000/SQL00001/SQLOGDIR/" for database "HADRDB".
              The error code was "24".
    
    However, the log manager should return to lfr immediately after
    it detects this error, and the standby will eventually start up,
    i.e., move into remote catchup state.
    
    3. Once HADR reaches peer state, move or rename the userexit
    program back.
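
Steps 1 and 3 of the local fix amount to renaming one file and later renaming it back. The Python sketch below shows one way to script that so the original name is not lost. The function names and the `.disabled` suffix are hypothetical choices made for this example. Step 2 (starting the standby and waiting for peer state) is deliberately left to the operator, and the commands named in the comments are an assumption about how that might be done, since the APAR does not name them. The demonstration runs against a temporary directory, not a real sqllib/bin.

```python
import tempfile
from pathlib import Path

SUFFIX = ".disabled"  # hypothetical suffix chosen for this sketch


def hide_userexit(userexit: Path) -> Path:
    """Step 1: rename the userexit program so DB2 cannot find it."""
    hidden = userexit.with_name(userexit.name + SUFFIX)
    if hidden.exists():
        raise FileExistsError(f"{hidden} already exists; refusing to overwrite")
    userexit.rename(hidden)
    return hidden


def restore_userexit(hidden: Path) -> Path:
    """Step 3: rename the userexit program back to its original name."""
    if not hidden.name.endswith(SUFFIX):
        raise ValueError(f"{hidden} was not renamed by hide_userexit")
    original = hidden.with_name(hidden.name[: -len(SUFFIX)])
    hidden.rename(original)
    return original


# Demonstration against a throwaway directory standing in for sqllib/bin.
with tempfile.TemporaryDirectory() as tmp:
    exit_program = Path(tmp) / "db2uext2"
    exit_program.write_text("#!/bin/sh\n")

    hidden = hide_userexit(exit_program)
    hidden_ok = hidden.exists() and not exit_program.exists()

    # Step 2 happens here, outside this script: start the standby and wait
    # until HADR reports peer state. Assumption: an operator might use
    # "db2 start hadr on database <dbname> as standby" and check the state
    # with "db2pd -db <dbname> -hadr".

    restored = restore_userexit(hidden)
    restored_ok = restored == exit_program and exit_program.exists()
```

Renaming in place keeps the file in the same directory, so restoring it does not depend on remembering a second path. Moving it elsewhere, as the local fix also allows, works equally well.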
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * ALL                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to 9.7 FP6                                           *
    ****************************************************************
    

Problem conclusion

  • Problem first fixed on DB2 Version 9.7 Fix Pack 6
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC77640

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    970

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2011-07-20

  • Closed date

    2012-12-05

  • Last modified date

    2012-12-05

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • R970 PSN

       UP

Document Information

Modified date:
05 December 2012