Restart events that might occur in Db2 pureScale environments

This topic describes the types of restart events that might occur in a Db2 pureScale environment and shows the diagnostic log messages that identify each one. Do not confuse the Restart Light diagnostic message with the messages for other types of restart.

The following types of restart events might occur in Db2 pureScale environments.
  1. Member Local Restart
    2009-11-08-18.56.29.767906-300 I151976017E340       LEVEL: Event
    PID : 26319                    TID : 46989525792064 KTID : 26319
    PROC : db2rocm 1
    INSTANCE: inst1                NODE : 001
    HOSTNAME: hostA
    FUNCTION: Db2, high avail services, sqlhaStartPartition, probe:1636
    DATA #1 : <preformatted>
    Successful start of member
    
    2009-11-08-18.56.29.769311-300 I151976358E373       LEVEL: Event
    PID : 26319                    TID : 46989525792064 KTID : 26319
    PROC : db2rocm 1
    INSTANCE: inst1                NODE : 001
    HOSTNAME: hostA
    FUNCTION: Db2, high avail services, db2rocm_main, probe:1381
    DATA #1 : String, 30 bytes
    db2rocm 1 Db2 inst1 1 START
    DATA #2 : String, 7 bytes
    SUCCESS 
    

    Note the string Successful start of member from the function sqlhaStartPartition, which indicates that a Local Restart has been initiated.

  2. Member Restart Light
    2009-08-27-23.37.52.416270-240 I6733A457            LEVEL: Event
    PID     : 1093874              TID  : 1             KTID : 2461779
    PROC    : db2star2
    INSTANCE:                      NODE : 001
    HOSTNAME: hostC
    EDUID   : 1
    FUNCTION: Db2, base sys utilities, DB2StartMain, probe:3368
    MESSAGE : Idle process taken over by member
    DATA #1 : Database Partition Number, PD_TYPE_NODE, 2 bytes
    996
    DATA #2 : Database Partition Number, PD_TYPE_NODE, 2 bytes
    1

    Note the message Idle process taken over by member, which indicates that a recovery idle process was activated to perform a Restart Light in which member 1 failed over to hostC.

  3. Member crash recovery
    2009-11-09-13.55.08.330120-300 I338293E831           LEVEL: Info
    PID     : 24616                TID  : 47881260099904 KTID : 24731
    PROC    : db2sysc 0
    INSTANCE:                      NODE : 000          DB   : 
    APPHDL  : 0-52                 APPID: *N0.DB2.091109185417
    EDUID   : 24                   EDUNAME: db2agent (     ) 0
    FUNCTION: Db2, data protection services, sqlpgint, probe:430
    DATA #1 : <preformatted>
    Crash recovery decision:
    Is this member consistent? No
    Is the database marked consistent in the GLFH? No
    Is the database restore pending? No
    Is the database rollforward pending? No
    Were the CF structures valid on startup? No
    Are we performing an offline restore? No
    Are we performing group crash recovery? No
    Are we performing member crash recovery? Yes
    Are we initializing the LFS for the group? No
    
    ..............
    
    2009-11-09-13.55.09.002292-300 I50967394E495        LEVEL: Info
    PID     : 24616                TID : 47881260099904 KTID : 24731
    PROC    : db2sysc 0
    INSTANCE:                      NODE : 000           DB : 
    APPHDL  : 0-52                 APPID: *N0.DB2.091109185417
    AUTHID  :  
    EDUID   : 24                   EDUNAME: db2agent (     ) 0
    FUNCTION: Db2, recovery manager, sqlpresr, probe:3170
    DATA #1 : <preformatted>
    Crash recovery completed. Next LSN is 00000000006DB3D1

    The string Are we performing member crash recovery? Yes shows that a member crash recovery event occurred.

  4. Group Restart
  5. Group Crash Recovery
    2009-11-09-22.24.41.330120-300 I338293E831           LEVEL: Info
    PID     : 10900                TID  : 46949294139712 KTID : 18929
    PROC    : db2sysc 2
    INSTANCE:                      NODE : 002          DB   : 
    APPHDL  : 2-52                 APPID: *N2.DB2.091110032438
    EDUID   : 24                   EDUNAME: db2agnti (        ) 2
    FUNCTION: Db2, data protection services, sqlpgint, probe:430
    DATA #1 : <preformatted>
    Crash recovery decision:
    Is this member consistent? No
    Is the database marked consistent in the GLFH? No
    Is the database restore pending? No
    Is the database rollforward pending? No
    Were the CF structures valid on startup? No
    Are we performing an offline restore? No
    Are we performing group crash recovery? Yes
    Are we performing member crash recovery? No
    Are we initializing the LFS for the group? Yes
    
    2009-11-09-22.24.41.540562-300 E365262E436           LEVEL: Info
    PID     : 10900                TID  : 46949294139712 KTID : 18929
    PROC    : db2sysc 2
    INSTANCE:                      NODE : 002          DB   : 
    APPHDL  : 2-52                 APPID: *N2.DB2.091110032438
    EDUID   : 24                   EDUNAME: db2agnti (        ) 2
    FUNCTION: Db2, base sys utilities, sqledint, probe:3559
    MESSAGE : Crash Recovery is needed.
    
    2009-11-09-22.24.53.177348-300 E431539E458           LEVEL: Info
    PID     : 10900                TID  : 46949294139712 KTID : 18929
    PROC    : db2sysc 2
    INSTANCE:                      NODE : 002          DB   : 
    APPHDL  : 2-52                 APPID: *N2.DB2.091110032438
    EDUID   : 24                   EDUNAME: db2agnti (        ) 2
    FUNCTION: Db2, recovery manager, sqlpresr, probe:210
    MESSAGE : ADM1527I  Group crash recovery has been initiated.
    
    ..............
    
    2009-11-09-22.25.14.432113-300 E552559E467           LEVEL: Info
    PID     : 10900                TID  : 46949294139712 KTID : 18929
    PROC    : db2sysc 2
    INSTANCE:                      NODE : 002          DB   : 
    APPHDL  : 2-52                 APPID: *N2.DB2.091110032438
    EDUID   : 24                   EDUNAME: db2agnti (        ) 2
    FUNCTION: Db2, recovery manager, sqlpresr, probe:3110
    MESSAGE : ADM1528I  Group crash recovery has completed successfully.

    The following lines from the message log indicate that a Group Crash Recovery event occurred:
      Are we performing group crash recovery? Yes
      Group crash recovery has been initiated.
      Group crash recovery has completed successfully.
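
The identifying strings called out for each event type can be searched for programmatically. The following Python code is a minimal sketch, not an IBM-supplied tool: it assumes the diagnostic log is available as plain text, that each record begins with a timestamp in the form shown in the samples, and that the marker strings appear exactly as in the samples. The names RESTART_MARKERS, split_records, classify_record, and find_restart_events are hypothetical and exist only for this illustration.

```python
import re

# Marker strings copied from the sample records in this topic. The mapping
# is an illustrative assumption, not an official Db2 interface.
RESTART_MARKERS = [
    ("Member Local Restart", "Successful start of member"),
    ("Member Restart Light", "Idle process taken over by member"),
    ("Member crash recovery", "Are we performing member crash recovery? Yes"),
    ("Group Crash Recovery", "Are we performing group crash recovery? Yes"),
]

# Assumed record header: a timestamp such as 2009-11-08-18.56.29.767906-300
# at the start of a line (optionally indented), followed by whitespace.
RECORD_START = re.compile(
    r"^[ \t]*\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d+\s",
    re.MULTILINE,
)


def split_records(text):
    """Split log text into records, one per timestamp header line."""
    starts = [m.start() for m in RECORD_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


def classify_record(record):
    """Return the restart event type for one record, or None."""
    for event, marker in RESTART_MARKERS:
        if marker in record:
            return event
    return None


def find_restart_events(text):
    """Return (timestamp, event type) pairs for records that match a marker."""
    events = []
    for record in split_records(text):
        event = classify_record(record)
        if event is not None:
            timestamp = record.split()[0]  # first token of the header line
            events.append((timestamp, event))
    return events


if __name__ == "__main__":
    import sys

    # Usage: python find_restarts.py <path to a copy of the diagnostic log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for timestamp, event in find_restart_events(f.read()):
            print(timestamp, event)
```

Because the crash recovery decision record lists both the member and the group question, the sketch matches on the full question together with its Yes answer, so a group crash recovery record is not misreported as a member crash recovery.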