IBM Support

IT20105: CONTINUOUS PRIVATE MEMORY GROWTH ON DB2/AIX SYSTEMS WITH VERY LOW ACTIVITY

APAR status

  • Closed as program error.

Error description

  • DB2 instances running on AIX with very low activity (mostly
    idle) may experience steady, continuous growth in private
    memory usage.  This is due to an accounting anomaly in the DB2
    memory manager, where new memory is allocated instead of
    existing decommitted/disclaimed memory being reused.  While most
    of the virtual memory growth is not backed by system memory (it
    has been decommitted using the AIX disclaim API), the malloc
    header for each allocation will be backed by a single operating
    system page (64K / medium page on AIX).  DB2 memory tools will
    not report this decommitted memory as it has been excluded for
    reuse/recommitment under DB2's "commit limit/size" for the DB2
    private memory area, and the DB2 memory manager scope does not
    include operating system process-level metadata.
    
    The problem is rare and only exists for mostly-idle instances
    where there is a high proportion of larger single allocations
    (typically from monitoring activity) relative to regular
    connection and SQL activity.  When larger single allocations are
    released, an inconsistency - which is usually temporary - may be
    created where the memory manager prefers to allocate new memory
    as opposed to reuse/recommit existing decommitted memory.  In
    environments with standard processing (connections, SQL
    activity), the excluded decommitted memory is quickly reabsorbed
    and reused through ongoing volatility in private memory
    usage/management.  However, in environments where SQL activity
    is minimal, yet there is regular specific activity (such as
    monitoring) triggering the problem, the decommitted memory (and
    associated virtual memory allocations) may accumulate.  The
    result is a gradual buildup of private virtual memory
    allocations along with a much slower buildup of real system
    memory usage due to the cost of the associated process malloc
    metadata.
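
    To put the slower real-memory growth in perspective, the
    following sketch estimates a lower bound on the real memory held
    by malloc headers.  It assumes exactly one 64K page per
    accumulated allocation, as described above.  The helper name
    header_cost_mb and the allocation count in the usage line are
    hypothetical and are not taken from any real system.

```shell
# Hypothetical helper: lower-bound real memory (in MB) backing malloc headers,
# assuming one 64 KB page per accumulated allocation.
header_cost_mb() {
    allocations="$1"
    echo $(( allocations * 64 / 1024 ))
}

# Illustrative count only: 16000 allocations x 64 KB = 1000 MB
header_cost_mb 16000
```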
    
    The problem can be confirmed as follows:
    
    1. The system matches the description of vulnerable systems.
    
    2. There is a continuous buildup of system memory usage for the
    database server process (db2sysc), indicated by the SZ/SIZE
    value in ps output:
    ps vg `db2pd -edus | awk '/db2sysc PID/ {print $NF}'`
    
    3. An analysis of the private memory usage for the db2sysc
    (database server) process shows a buildup of private memory
    virtual allocation along with a much smaller amount of system
    memory backing the allocation.
    
    svmon -P `db2pd -edus | awk '/db2sysc PID/ {print $NF}'`
    ...
    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
    ...
      a82ba8        76 work text data BSS heap           m    994     0    0     994
      c42b44        99 work text data BSS heap           m    970     0    0     970
      dc2fdc        88 work text data BSS heap           m    964     0    0     964
    ...
    
    Here we see a large number of private memory segments, each
    256MB, using the expected 64K "medium" page size.  Each has a
    capacity of 4096 64K pages.  But under the "Virtual" column, we
    see only a fraction of the pages are in use.  Normally as
    private memory allocation builds and is actually
    referenced/committed, the Virtual column will approach the full
    4096 capacity for at least most of the segments.  Under normal
    volatility these numbers may fluctuate, but with the problem
    represented by this APAR, the number of these "partially-backed"
    segments will continually grow.
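
    The check described above can be scripted.  The following is a
    minimal sketch, not an IBM-supplied tool: count_partial_segments
    is a hypothetical helper that reads svmon -P output on standard
    input and counts 64K-page work segments whose Virtual value is
    below a cutoff.  It assumes one row per segment, with PSize,
    Inuse, Pin, Pgsp and Virtual as the last five fields.  The
    default cutoff of 2048 pages (half of a segment) is an arbitrary
    assumption and may need adjusting.

```shell
#!/bin/sh
# Hypothetical helper (not part of DB2 or AIX).
# Usage: svmon -P <db2sysc pid> | count_partial_segments [cutoff_pages]
count_partial_segments() {
    cutoff="${1:-2048}"   # assumed cutoff: half of the 4096-page capacity
    awk -v cutoff="$cutoff" '
        # Work-segment rows look like:
        #   Vsid Esid work <description...> PSize Inuse Pin Pgsp Virtual
        # Keep only 64K ("m") work segments with a numeric Virtual column.
        NF >= 8 && $3 == "work" && $(NF-4) == "m" && $NF ~ /^[0-9]+$/ {
            total++
            if ($NF + 0 < cutoff) partial++
        }
        END { printf "partial=%d total=%d\n", partial, total }
    '
}
```

    Running this at intervals and comparing the results gives a
    simple trend: a "partial" count that keeps rising on a
    mostly-idle instance is consistent with the behavior described
    here, while a count that fluctuates suggests normal volatility.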
    
    Note that tools such as
      db2pd -memblocks pid=<db2sysc pid>
      gencore
    should not be run when the process is in this state, as these
    tools will result in the full commitment of allocated process
    private memory, and drastically increase the system memory
    commitment level.  This could result in a system hang or crash
    due to paging space exhaustion.  At the least, it will result in
    the svmon -P output displaying that the private segments are
    fully backed by system memory (~4096 pages), making it harder to
    confirm this APAR.
    

Local fix

  • Periodically recycle a mostly-idle DB2 instance.  The problem
    does not occur for instances with normal activity levels.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * ALL                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to DB2 11.1 Mod 2 Fix Pack 2 or higher               *
    ****************************************************************
    

Problem conclusion

  • First fixed in DB2 11.1 Mod 2 Fix Pack 2
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT20105

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    B10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-04-06

  • Closed date

    2017-06-23

  • Last modified date

    2017-06-23

  • APAR is sysrouted FROM one or more of the following:

    IT18865

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels


Document Information

Modified date:
29 June 2020