DB2 Version 10.1 for Linux, UNIX, and Windows

Memory considerations for restart light

A limited amount of memory is reserved for recovery purposes when DB2® is started, to facilitate the recovery of members in restart light mode. Because this restart light memory is preallocated, it is immediately ready for use during recovery, which improves recovery performance.

The amount of memory that can be reserved for restart light recovery purposes on a given host is limited by the rstrt_light_mem database manager configuration parameter. The default value of rstrt_light_mem is AUTOMATIC, which means that DB2 calculates a fixed upper bound for the amount of memory to be preallocated and reserved for restart light recovery purposes, and sets the value when DB2 is started. DB2 calculates the value based on the settings of the instance_memory and numdb configuration parameters and on the number of members on the host. The automatically calculated value ranges between 1 and 10 percent of the instance memory limit and is included in the total amount of instance memory. However, because the amount of reserved restart light memory can affect the performance of a resident member, you can adjust the restart light memory configuration to suit your specific workloads.
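
As a rough illustration of the 1 to 10 percent range, the following Python sketch computes the lower and upper bounds for a given instance memory limit. It does not reproduce the calculation that DB2 performs, which also depends on numdb and on the number of members on the host and is not described in this topic. The function name is hypothetical, and the sample limit is the Member 20 memory limit from the db2pd example in this topic.

```python
def restart_light_bounds_kb(instance_memory_limit_kb):
    """Return the (low, high) bounds, in KB, of the 1 to 10 percent range.

    Assumption: simple integer percentages of the instance memory limit.
    The value that DB2 actually chooses inside this range is not modeled.
    """
    low = instance_memory_limit_kb // 100   # 1 percent, rounded down
    high = instance_memory_limit_kb // 10   # 10 percent, rounded down
    return low, high


low_kb, high_kb = restart_light_bounds_kb(25750080)
print(f"AUTOMATIC value falls between {low_kb} KB and {high_kb} KB")
```

For this limit, the upper bound equals the Restart Light Memory limit reported by db2pd in the example, which suggests that the host in that example was configured at the top of the range.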

Displaying the reserved restart light memory

To display information about the total amount of memory allocated on a DB2 host, use the db2pd command with the -totalmem option. This information includes the amount of reserved restart light memory that is preallocated on the current DB2 host being accessed. To retrieve information for all hosts in a cluster, run db2pd on separate hosts in parallel. In the following example, db2pd is run on Host B, which has member 20.
 db2pd -totalmem

                        Controller   Memory         Current       HWM           Cached
                        Automatic    Limit          Usage         Usage         Memory
 Member 20              Yes          25750080 KB    9031201 KB    9391744 KB    480064 KB
 Restart Light Memory   Yes           2575008 KB      64182 KB      69265 KB      5250 KB

 Total current usage: 9095383 KB
 Total cached memory:  485314 KB
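
If you collect this output from several hosts, it can be convenient to parse it in a script. The following Python sketch parses the two data rows from the example above and recomputes the totals. The regular expression is an assumption based only on the layout shown here (including the assumption that the Controller Automatic column contains Yes or No); it is not a documented output format, and the function name is hypothetical.

```python
import re

# The two data rows from the db2pd -totalmem example in this topic.
SAMPLE = """\
 Member 20                 Yes           25750080 KB   9031201 KB     9391744 KB    480064 KB
  Restart Light Memory   Yes         2575008 KB     64182 KB       69265 KB       5250 KB
"""

# Assumed row layout: name, automatic flag, then four sizes with an optional KB suffix.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s+(?:Yes|No)\s+"
    r"(?P<limit>\d+)(?: KB)?\s+(?P<current>\d+)(?: KB)?\s+"
    r"(?P<hwm>\d+)(?: KB)?\s+(?P<cached>\d+)(?: KB)?\s*$"
)


def parse_totalmem(text):
    """Return one dict per data row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            row = {"name": match.group("name")}
            for key in ("limit", "current", "hwm", "cached"):
                row[key] = int(match.group(key))
            rows.append(row)
    return rows


rows = parse_totalmem(SAMPLE)
total_current = sum(row["current"] for row in rows)
total_cached = sum(row["cached"] for row in rows)
print(f"Total current usage: {total_current} KB")
print(f"Total cached memory: {total_cached} KB")
```

The recomputed totals agree with the "Total current usage" and "Total cached memory" lines that db2pd prints, which shows that those totals are the sums of the member row and the Restart Light Memory row.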

Recovery of hidden buffer pools

For member crash recovery, a reduced memory model is used for the buffer pools. Because the buffer pools typically consume the largest share of memory in the database shared memory set, allocating large buffer pools is very time consuming. The reduced memory model improves recovery performance because small recovery hidden buffer pools are allocated instead of the large, expensive user-defined buffer pools. As with the existing hidden buffer pools, there are four recovery hidden buffer pools, one for each page size: 4K, 8K, 16K, and 32K. The difference is in their size: the existing hidden buffer pools are always 16 pages, whereas the recovery hidden buffer pools have a minimum size of 250 pages and can be larger, depending on the restart light memory set size and the buffer pool size calculations.

In the following example, two user buffer pools, BP1 with 100 pages and BP2 with 200 pages, have been created for database TESTDB. Member 0 is in restart light mode and Member 1 is not. The example shows a portion of the output from the following db2pd command. Member 1 shows the user-created buffer pools and the hidden buffer pools, whereas Member 0 shows only the four recovery hidden buffer pools.
db2pd -allmembers -db testdb -bufferpools 

Database Member 1--Database TESTDB--Active--Up 0 days 00:00:14--Date 08/12/2010 18:55:19

Bufferpools:
First Active Pool ID      1
Max Bufferpool ID         3
Max Bufferpool ID on Disk 3
Num Bufferpools           7

Address            Id   Name               PageSz     PA-NumPgs  BA-NumPgs  BlkSize    
0x00002AAADE443140 1    IBMDEFAULTBP       4096       1000       0          0          
0x00002AAADE45B080 2    BP1                4096       100        0          0          
0x00002AAADE45F060 3    BP2                4096       200        0          0          
0x00002AAADDB83CC0 4096 IBMSYSTEMBP4K      4096       16         0          0          
0x00002AAADDB13CC0 4097 IBMSYSTEMBP8K      8192       16         0          0          
0x00002AAADDB03CC0 4098 IBMSYSTEMBP16K     16384      16         0          0          
0x00002AAADE453140 4099 IBMSYSTEMBP32K     32768      16         0          0

...NumTbsp    PgsToRemov CurrentSz  PostAlter  SuspndTSCt Automatic
   3          0          1000       1000       0          False    
   0          0          100        100        0          False    
   0          0          200        200        0          False    
   0          0          16         16         0          False    
   0          0          16         16         0          False    
   0          0          16         16         0          False    
   0          0          16         16         0          False    
          

Database Member 0 -- Database TESTDB -- Active -- Up 0 days 00:00:13

Bufferpools:
First Active Pool ID      4096
Max Bufferpool ID         0
Max Bufferpool ID on Disk 3
Num Bufferpools           4

Address            Id   Name               PageSz     PA-NumPgs  BA-NumPgs  BlkSize    
0x00002AAAD9F946E0 4096 IBMSYSTEMBP4K      4096       9954       0          0          
0x00002AAADA743140 4097 IBMSYSTEMBP8K      8192       250        0          0          
0x00002AAADA733140 4098 IBMSYSTEMBP16K     16384      250        0          0          
0x00002AAADA74B080 4099 IBMSYSTEMBP32K     32768      250        0          0          

...NumTbsp    PgsToRemov CurrentSz  PostAlter  SuspndTSCt Automatic
   3          0          9954       9954       0          False    
   0          0          250        250        0          False    
   0          0          250        250        0          False    
   0          0          250        250        0          False    
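
To relate the page counts above to memory, the following Python sketch multiplies PageSz by PA-NumPgs for each buffer pool in the two listings. It counts page data only; any per-pool control structures are not shown in the output and are therefore not included. The variable and function names are hypothetical.

```python
# (name, page size in bytes, number of pages), copied from the db2pd output above.
MEMBER_1_POOLS = [
    ("IBMDEFAULTBP", 4096, 1000),
    ("BP1", 4096, 100),
    ("BP2", 4096, 200),
    ("IBMSYSTEMBP4K", 4096, 16),
    ("IBMSYSTEMBP8K", 8192, 16),
    ("IBMSYSTEMBP16K", 16384, 16),
    ("IBMSYSTEMBP32K", 32768, 16),
]
MEMBER_0_POOLS = [
    ("IBMSYSTEMBP4K", 4096, 9954),
    ("IBMSYSTEMBP8K", 8192, 250),
    ("IBMSYSTEMBP16K", 16384, 250),
    ("IBMSYSTEMBP32K", 32768, 250),
]


def pool_bytes(pools):
    """Sum page size times page count over all pools (page data only)."""
    return sum(page_size * pages for _, page_size, pages in pools)


member_1_bytes = pool_bytes(MEMBER_1_POOLS)
member_0_bytes = pool_bytes(MEMBER_0_POOLS)
print(f"Member 1 (normal):        {member_1_bytes} bytes")
print(f"Member 0 (restart light): {member_0_bytes} bytes")
```

In this small test database the user-defined buffer pools are tiny, so the recovery hidden buffer pools on Member 0 are actually the larger of the two. The reduced memory model is intended for the more typical case in which the user-defined buffer pools are large.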

Memory consumption during a restart light

Ideally, a restart light allows the prompt recovery of a failed member on a host other than that member's home host without affecting that host's resident member. To achieve this, the reserved recovery memory for restart light is used first to perform database recovery operations. However, if the database recovery requires more memory than the restart light memory allocation provides, the restart light makes additional requests for free instance memory. These critical memory requests attempt to reduce the current memory usage of the resident member. If there is still not enough memory to finish the restart light process, DB2 requests additional memory from the operating system. If this occurs, all other noncritical memory requests fail until enough memory has been freed for the recovery operation. Applications running on the resident member can receive out-of-memory failures, but the resident member stays up. After the recovery is completed and the database or databases are consistent, any additional memory used by the guest member, beyond the originally reserved recovery memory, is freed.
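
The order in which these memory sources are used can be pictured with a small model. The following Python sketch is a deliberately simplified illustration, not DB2's allocator: it treats the free instance memory as a fixed number (whereas the critical requests can also cause the resident member to give memory back), and the function name, parameter names, and sample figures are all hypothetical.

```python
def plan_recovery_memory(needed_kb, reserved_kb, free_instance_kb):
    """Split a recovery memory demand across the three sources, in order.

    Simplified model: reserved restart light memory first, then free
    instance memory, then the operating system for whatever remains.
    """
    from_reserved = min(needed_kb, reserved_kb)
    remaining = needed_kb - from_reserved
    from_instance = min(remaining, free_instance_kb)
    from_os = remaining - from_instance
    return {
        "reserved": from_reserved,
        "instance": from_instance,
        "operating_system": from_os,
        # Per the text above, noncritical requests fail once the OS is involved.
        "noncritical_requests_may_fail": from_os > 0,
    }


# Hypothetical figures: recovery needs 1000 KB, 600 KB is reserved,
# and 300 KB of instance memory is free.
plan = plan_recovery_memory(needed_kb=1000, reserved_kb=600, free_instance_kb=300)
print(plan)
```

In this model, the resident member's workload is at risk only when the last source is reached, which is why sizing the reserved memory appropriately matters.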

These additional requests for memory resources for a restart light are temporary, but they can have a negative effect on the resident member's workload. If you find that the reserved recovery memory is insufficient, consider increasing the value of the rstrt_light_mem database manager configuration parameter. The parameter is configurable, but not online, so any change requires a restart. You can either perform a global db2stop and db2start or, if you do not want to stop all of the members at the same time, update rstrt_light_mem on a member-by-member basis by stopping and starting each member and the instance on each member's host, as follows:
db2 update dbm cfg using RSTRT_LIGHT_MEM 5

db2stop member 10
db2stop instance on hostA.torolab.ibm.com
db2start instance on hostA.torolab.ibm.com
db2start member 10

db2stop member 20
db2stop instance on hostB.torolab.ibm.com
db2start instance on hostB.torolab.ibm.com
db2start member 20 
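
For clusters with more than a few members, the same stop and start sequence can be generated rather than typed by hand. The following Python sketch only prints the commands in the order shown above; it does not run them. The function name is hypothetical, and the member numbers and host names are the ones from the example.

```python
def rolling_restart_commands(members):
    """Yield the stop/start commands for each (member_id, host) pair, in order."""
    for member_id, host in members:
        yield f"db2stop member {member_id}"
        yield f"db2stop instance on {host}"
        yield f"db2start instance on {host}"
        yield f"db2start member {member_id}"


# Member and host pairs taken from the example above.
MEMBERS = [(10, "hostA.torolab.ibm.com"), (20, "hostB.torolab.ibm.com")]

commands = list(rolling_restart_commands(MEMBERS))
for command in commands:
    print(command)
```

Each member is fully restarted before the next one is stopped, so the members are never all down at the same time.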

Displaying the restart light memory consumption

There are two ways to display information about the total amount of memory being used on a host by the resident members and guest members:
  1. Running the db2pd command with the -totalmem option on each host. Take the following example:
    • Member 0 on Host A
    • Member 1 on Host B
    • Member 2 on Host C
    Member 0 fails over to Host B in restart light mode. The user runs the db2pd command on Host B, where member 0 is running as a guest member in restart light mode alongside resident member 1. The user then runs the db2pd command on Host C to display the memory for member 2. In the output, member 0 is labeled as "guest" and the memory usage is displayed in kilobytes. The db2pd command does not require a database connection.
    Host B:
    $ db2pd -totalmem
    Total Memory Statistics in KB
    
                         Controller    Memory     Current       HWM       Cached
                         Automatic     Limit       Usage       Usage      Memory
    ==========================================================================+=====
    Member 0 (guest)         Yes       20677572      242496      244032       15168
    Member 1                 Yes       20677572      186624      697088       46080
    Restart Light Memory     Yes         839720      459392      459392       19200
    
    Total current usage:  888512
    Total cached memory:  80448
    Host C:
    $ db2pd -totalmem
    Total Memory Statistics in KB
    
                         Controller    Memory     Current       HWM       Cached
                         Automatic     Limit       Usage       Usage      Memory
    ==========================================================================+=====
    Member 2                 Yes       20153284     4728832     4728832     1055104
    Restart Light Memory     Yes         839720      689088      689088       28800
    
    Total current usage:  5417920
    Total cached memory:  1083904
  2. Using the SQL interface to display the memory usage for all members, including members in restart light mode. The query needs to be run from only one host. However, this method requires a database connection, so it cannot be run from a member in restart light mode, because restart light members do not accept connections. The output does not label a member as "guest", and the memory usage is displayed in bytes.
    $ db2 'SELECT * FROM TABLE (SYSPROC.ADMIN_GET_MEM_USAGE()) AS T'
    
    DBPARTITIONNUM MAX_PARTITION_MEM    CURRENT_PARTITION_MEM PEAK_PARTITION_MEM
    -------------- -------------------- --------------------- --------------------
                 1          21173833728            4605345792           4605345792
                 0          21173833728             248840192            249888768
                 2          20636962816            4651352064           4651352064
    
      3 record(s) selected.
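
Because db2pd reports kilobytes and the SQL interface reports bytes, the two displays are easier to compare after a unit conversion. The following Python sketch converts the three rows of the query output above to kilobytes, assuming 1 KB = 1024 bytes. The variable and function names are hypothetical.

```python
# Rows copied from the ADMIN_GET_MEM_USAGE output above:
# (DBPARTITIONNUM, MAX_PARTITION_MEM, CURRENT_PARTITION_MEM, PEAK_PARTITION_MEM), in bytes.
SQL_ROWS = [
    (1, 21173833728, 4605345792, 4605345792),
    (0, 21173833728, 248840192, 249888768),
    (2, 20636962816, 4651352064, 4651352064),
]


def to_kb(value_in_bytes):
    """Convert bytes to whole kilobytes, assuming 1 KB = 1024 bytes."""
    return value_in_bytes // 1024


for member, max_mem, current, peak in SQL_ROWS:
    print(
        f"member {member}: limit {to_kb(max_mem)} KB, "
        f"current {to_kb(current)} KB, peak {to_kb(peak)} KB"
    )
```

With this conversion, the MAX_PARTITION_MEM values correspond to the Memory Limit column that db2pd shows for the same members on Host B and Host C. The current and peak figures are point-in-time readings and need not match between the two displays.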