IBM Support

LI74256: HIGH SYSTEM OR DB2 MEMORY USAGE DUE TO SORTHEAP OVERALLOCATION B Y HASH JOIN

Subscribe

You can track all active APARs for this component.

 

APAR status

  • Closed as program error.

Error description

  • High memory usage or exhaustion has been occurring in rare
    scenarios starting in version 9 due to hash join drastically
    overallocation sortheap.  One case is when a query contains a
    hash join comparing a very large number of similar values.
    
    For this scenario to occur, the sort heap memory usage for a
    single sortheap will climb to at least 4GB regardless of the
    configured SORTHEAP setting - unless an error occurs first.
    
    Note this sortheap usage will not show up in the standard sort
    memory counters, but will show up in the sortheap memory pool
    reports (in application snapshot, memory tracker).
    
    On version 9.5, if INSTANCE_MEMORY becomes exhausted, the
    following db2diag.log entries will be seen :(note the
    excessively large physical heap size.  The large configured heap
    size is normal as it means "unlimited" - hash join is supposed
    to stay within the configured SORTHEAP limit on it's own)
    
    FUNCTION: DB2 UDB, SQO Memory Management,
    sqloMemLogPoolConditions, probe:30
    DATA #1 : <preformatted>
    Out of memory failure for Sort Heap on node 4.
    Requested block size           : 4000 bytes.
    Physical heap size             : 1091633152 bytes.
    Configured heap size           : 281474976645120 bytes.
    Unreserved memory used by heap : 0 bytes.
    Unreserved memory left in set  : 0 bytes.
    ...
    FUNCTION: DB2 UDB, runtime interpreter, sqlri_hsjnGetTupleBlock,
    probe:10
    MESSAGE : Virtual memory exhausted.
    
    The easiest way to monitor for this condition is by using the
    snap_get_agent_memory_pool routine as follows :
    db2 "SELECT AGENT_ID, AGENT_PID, DBPARTITIONNUM, POOL_ID,
    POOL_CUR_SIZE FROM
    TABLE(SNAP_GET_AGENT_MEMORY_POOL('SAMPLE',-2)) A WHERE
    POOL_ID='SORT' AND POOL_CUR_SIZE > 20000". Replacing 'SAMPLE'
    with the database name and '20000' with a value larger than
    SORTHEAP, eg. 2 * SORTHEAP.  When using STMM, the value may have
    to be larger - monitor the db2diag.log for normal SORTHEAP
    settings and use a value above that.
    

Local fix

  • It is possible that query plans may be improved to avoid the
    conditions required for this APAR by using the runstats utility
    on the tables involved in the query - including distribution
    statistics.
    

Problem summary

  • USERS AFFECTED:
    ALL
    
    PROBLEM DESCRIPTION:
    HIGH SYSTEM OR DB2 MEMORY USAGE DUE TO SORTHEAP OVERALLOCATION
    BY HASH JOIN
    

Problem conclusion

  • Problem was first fixed in Version 9.5 Fix Pack 5
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI74256

  • Reported component name

    DB2 UDE ESE LIN

  • Reported component ID

    5765F4104

  • Reported release

    950

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2009-03-17

  • Closed date

    2010-01-18

  • Last modified date

    2012-02-03

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    LI74257 IC63231

Fix information

  • Fixed component name

    DB2 UDE ESE LIN

  • Fixed component ID

    5765F4104

Applicable component levels

  • R950 PSY

       UP

[{"Business Unit":{"code":"BU048","label":"IBM Software"},"Product":{"code":"SSEPGG","label":"DB2 for Linux, UNIX and Windows"},"Component":"","ARM Category":[],"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"950","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
03 February 2012