IBM Support

75 ways to demystify DB2: #19: Techtip : Slow crash recovery in DB2 due to replaying load pending operations

Technical Blog Post


Abstract

75 ways to demystify DB2: #19: Techtip : Slow crash recovery in DB2 due to replaying load pending operations

Body

During crash recovery, if many load operations have to be replayed, then it might take a long time to complete. The load operation will keep the pending operations in transaction logs no matter if it is recoverable load or non-recoverable. These pending operations will be kept in transaction log just before the table status is changed back from "load in progress" state to "normal" state. If the crash recovery starting point  - min(lowtranlsn, minbufflsn) is before load pending operations then db2 will have to replay these pending operations during crash recovery.

By design, during the crash recovery, db2 will purge all the bufferpool pages for each load pending operation. Thus, if there exists a large amount of load operations and large bufferpools before crash recovery, it may take a long time just waiting for bufferpool pages to be purged for many times.
 

Symptom:


db2pd -recovery -repeat 5 ** every 5 seconds

The "completedWork" column shows very slow byte processing, for example - around 7mb/ 5 sec that is 1.4mb/ sec below :

Database Partition 0 -- Database JSCTDM -- Active -- Up 0 days 00:07:30 -- Date 2014-05-13-23.31.26.438283

Recovery:
Recovery Status     0x00000C01
Current Log         S0000009.LOG
Current LSN         0168699ECB8B
Job Type            CRASH RECOVERY     
Job ID              1        
Job Start Time      (1399994638) Tue May 13 23:23:58 2014
Job Description     Crash Recovery     
Invoker Type        User    
Total Phases        2        
Current Phase       1        

Progress:
Address            PhaseNum   Description          StartTime           CompletedWork         TotalWork          
0x078000000025EBA8 1          Forward              Tue May 13 23:23:58 352084823 bytes     3977694480 bytes
0x078000000025ED30 2          Backward             NotStarted          0 bytes              3977694480 bytes

Database Partition 0 -- Database JSCTDM -- Active -- Up 0 days 00:07:35 -- Date 2014-05-13-23.31.31.510896

Recovery:
Recovery Status     0x00000C01
Current Log         S0000009.LOG
Current LSN         016869A87BA9
Job Type            CRASH RECOVERY     
Job ID              1        
Job Start Time      (1399994638) Tue May 13 23:23:58 2014
Job Description     Crash Recovery     
Invoker Type        User    
Total Phases        2        
Current Phase       1        

Progress:
Address            PhaseNum   Description          StartTime           CompletedWork          TotalWork          
0x078000000025EBA8 1          Forward              Tue May 13 23:23:58 352719733 bytes     3977694480 bytes           
0x078000000025ED30 2          Backward             NotStarted          0 bytes              3977694480 bytes           

Database Partition 0 -- Database JSCTDM -- Active -- Up 0 days 00:07:40 -- Date 2014-05-13-23.31.36.549922

Recovery:
Recovery Status     0x00000C01
Current Log         S0000009.LOG
Current LSN         016869B036CF
Job Type            CRASH RECOVERY     
Job ID              1        
Job Start Time      (1399994638) Tue May 13 23:23:58 2014
Job Description     Crash Recovery     
Invoker Type        User    
Total Phases        2        
Current Phase       1        

Progress:
Address            PhaseNum   Description          StartTime           CompletedWork        TotalWork          
0x078000000025EBA8 1          Forward              Tue May 13 23:23:58 353226395 bytes   3977694480 bytes           
0x078000000025ED30 2          Backward             NotStarted          0 bytes            3977694480 bytes   


A large bufferpool may be configured as follows, around 3GB :

Use following command to check it :

db2pd -bufferpools:

Num Bufferpools           6

Address            Id   Name               PageSz     PA-NumPgs  BA-NumPgs  BlkSize    NumTbsp              PgsLeft              CurrentSz            PostAlter            SuspndTSCt Automatic
 
0x070000004010EF80 2    BP32               32768      116508     0          0          6                    0                    116508               116508               0          False   

 

Trouble shooting:

Take a perf count trace by  db2trc on -perfcount -t

 

From perf count trace, db2trc.perffmt:

3195    (704 sec, 629469064 nanosec)     sqloWaitEDUWaitPost
2311    (207 sec, 507078242 nanosec)     sqlprFindQueue
74      (195 sec, 883402 nanosec)        sqloFDSETPoll
23      (140 sec, 409332 nanosec)        sqlorqueInternal
23      (140 sec, 674910 nanosec)        sqlorque2
4589    (24 sec, 161701958 nanosec)      sqlbPurgeObject
1180    (23 sec, 150892169 nanosec)      sqlu_process_pending_operation
4455    (23 sec, 690234808 nanosec)      sqluCheckObjectForRedo
4549    (23 sec, 934657142 nanosec)      sqlbPurgeAllPages
17632   (20 sec, 64221562 nanosec)       sqlpshrLogrecBuffGet
17527   (18 sec, 884269802 nanosec)      sqlpRecDbRedo
4       (18 sec, 521853496 nanosec)      sqlptppl
4       (18 sec, 521859255 nanosec)      sqlpRFWppl
4711    (18 sec, 548715007 nanosec)      sqlprProcDPSrec
1156    (6 sec, 153082066 nanosec)       sqluProcessPage0Update
4       (6 sec, 175060451 nanosec)       sqlu_change_obj_values
3257    (3 sec, 311695435 nanosec)       sqlorest

We can see the redo worker functions are waiting in sqlprFindQueue which means it waits for redo master to put the replaying log records in the working queue.

The redo master is purging the dirty pages from bufferpool when replaying the load pending operations.

Take several rounds of stack traces by db2pd -stacks -repeat 10 5 


There are two kinds of stack patterns for redom:

1>

sqlu_process_pending_operation
 |--sqlu_change_obj_values
      |--sqluCheckObjectForRedo

 0x0900000007EFCCB8 @71@sqlbPurgeAllPages__FP16SQLB_OBJECT_DESCiP12SQLB_GLOBALS + 0x16F0
  0x090000000622675C sqlbPurgeObject__FP16SQLB_OBJECT_DESCiP12SQLB_GLOBALS + 0xE4
  0x09000000062279D8 sqluCheckObjectForRedo__FP8sqeAgentP8SQLP_LSNP11SQLB_OBJECTT2PiT5PbT5iN29 + 0x2AC
  0x09000000078E7554 @147@sqlu_change_obj_values__FP8sqeAgentP18SQLU_OBJ_POOL_LISTP8SQLP_LSNiT4P15SQLD_RECOV_INFO + 0x42C
  0x09000000078E40BC sqlu_process_pending_operation__FP8sqeAgentiPcP8SQLP_LSNP12SQLP_PENLISTPUiP15SQLD_RECOV_INFO + 0x378
  0x09000000050CD76C sqlptppl__FP8sqeAgent + 0x468
  0x090000000534100C sqlpRFWppl__FP8sqeAgentP9SQLP_DBCBP11SQLP_TENTRYP10SQLP_FRAPPP10SQLP_FRSCBP9SQLP_LSN8Ui + 0x78
  0x0900000005353400 sqlpRecDbRedo__FP8sqeAgentP8SQLP_ACBP9SQLP_DBCBP10SQLP_FRAPPP11SQLP_TENTRYP16SQLPR_LOGREC_DISP10REDO_INPUT + 0x
 
 

2>

sqlu_process_pending_operation
 |--sqluCheckObjectForRedo

 
 0x0900000007EFB81C @71@sqlbPurgeAllPages__FP16SQLB_OBJECT_DESCiP12SQLB_GLOBALS + 0x254
  0x090000000622675C sqlbPurgeObject__FP16SQLB_OBJECT_DESCiP12SQLB_GLOBALS + 0xE4
  0x09000000062279D8 sqluCheckObjectForRedo__FP8sqeAgentP8SQLP_LSNP11SQLB_OBJECTT2PiT5PbT5iN29 + 0x2AC
  0x09000000078E4DB4 sqlu_process_pending_operation__FP8sqeAgentiPcP8SQLP_LSNP12SQLP_PENLISTPUiP15SQLD_RECOV_INFO + 0x1070
  0x09000000050CD76C sqlptppl__FP8sqeAgent + 0x468    

 

Resolving the issue:

Before crash recovery (activate db), set DB2_OVERRIDE_BPF in db2 registry variables to override default bufferpool to a smaller value.
E.g db2set DB2_OVERRIDE_BPF = 1,2000;2,1000
- which set bufferpool id 1 to 2000 pages and bufferpool id 2 to 1000 pages.
Then trigger crash recovery by db2 activate db <dbname>
After crash recovery is complete, unset DB2_OVERRIDE_BPF  by db2set DB2_OVERRIDE_BPF=
deactivate database, restart instance and re-connect to the database.

[{"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

UID

ibm11141174