IBM Support

LI76564: THREADED DB2FMP PROCESS LOOPS IN ITS SIGNAL HANDLER WHEN IT RECEIVES NESTED SIGNALS

APAR status

  • Closed as program error.

Error description

  • A threaded db2fmp process loops in its signal handler when it
    receives nested signals. Once that happens, db2agent EDUs
    that request new fenced stored procedures can wait
    indefinitely on SQLO_LT_sqlerFmpRow__ipcLatch. The
    db2pd -latches output then shows entries like the following:
    
    
    Database Partition 14 -- Active -- Up 20 days 23:46:45 -- Date 05/31/2011 10:52:58
    Address            Holder     Waiter     Filename LOC        LatchType
    0x07800000007EDE64 0          32946      Unknown  6931       SQLO_LT_sqlerFmpRow__ipcLatch
    0x07800000007EDE64 0          33210      Unknown  6931       SQLO_LT_sqlerFmpRow__ipcLatch
    0x07800000007EDE64 0          36542      Unknown  6931       SQLO_LT_sqlerFmpRow__ipcLatch
    
    The db2agent that holds SQLO_LT_sqlerFmpRow__ipcLatch is
    waiting on a semaphore for a response from the spinning db2fmp
    process. Its stack is similar to the following:
    
    
    <StackTrace>
    -------Frame------ ------Function + Offset------
    0x0900000000247820 semop + 0xC0
    0x0900000010AF631C sqloSSemP + 0x1F8
    0x0900000010D00968 @53@sqlccipcWaitSynch__FP18SQLCC_INITSTRUCT_T + 0x1D4
    0x0900000010D00BD0 sqlccipcinit + 0x218
    0x0900000010D006D0 sqlccinit__FP18SQLCC_INITSTRUCT_TPP17SQLCC_COMHANDLE_TP12SQLCC_COND_TP13SQLO_MEM_POOL + 0x250
    0x090000000E144834 @136@sqlerInitCommsLayer__FP14sqlerFmpHandleP8sqeAgentb + 0xCC
    0x090000000DF5F0A8 @136@sqlerMasterThreadReq__FP13sqlerFmpParmsP13sqlerFmpTableP14sqlerFmpHandleP18sqlerFmpThreadListP8sqeAgentUicT7P5sqlcab + 0x610
    0x090000000DF9FDE4 sqlerGetFmpThreadEntry__FP11sqlerFmpRowP14sqlerFmpHandleP13sqlerFmpParmsb + 0x29C
    0x090000000DF9FA64 sqlerGetFmpThreadEntry__FP11sqlerFmpRowP14sqlerFmpHandleP13sqlerFmpParmsb@glue345 + 0x78
    0x090000000D502A94 sqlerGetFmpFromPool__FP14sqlerFmpHandleP13sqlerFmpParms + 0x400
    0x0900000010CFDB98 sqlerInvokeFencedRoutine__FP13sqlerFmpParms + 0xAC
    0x0900000010C7A358 sqlriInvokeInvoker__FP10sqlri_ufobb + 0x84
    0x090000000D730FB0 sqlriutf__FP8sqlrr_cb + 0xE8
    0x090000000EC927DC sqlri_tfopn__FP8sqlrr_cbP9sqlri_tao + 0x27C
    0x090000000D26C7A4 sqlriopn__FP8sqlrr_cbP9sqlri_taoPi + 0x998
    0x0900000010A3D21C sqlriopn__FP8sqlrr_cbP9sqlri_taoPi@glue220 + 0x74
    0x0900000010A615FC sqlrita__FP8sqlrr_cb + 0x214
    0x0900000010A6536C sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x610
    0x090000000DD5371C sqlrr_dss_router__FP8sqlrr_cb + 0x34
    0x090000000DAB8900 sqlrr_subagent_router__FP8sqeAgentP12SQLE_DB2RA_T + 0x5C0
    0x090000000CD4B664 sqleSubRequestRouter__FP8sqeAgentPUiT2 + 0x5F8
    0x090000000DAB33E0 sqleProcessSubRequest__FP8sqeAgent + 0xA4C
    0x090000000CE59EB4 RunEDU__8sqeAgentFv + 0x2F8
    0x0900000010C1AF90 EDUDriver__9sqzEDUObjFv + 0xDC
    0x0900000010C02770 sqloEDUEntry + 0x260
    </StackTrace>
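
The latch symptom described above can be summarised per partition with a small filter. The following is only a sketch, not an IBM-supplied tool: the function name summarize_latch_waiters is hypothetical, and it assumes that db2pd -latches prints each latch row on a single line in the column order shown above (Address, Holder, Waiter, Filename, LOC, LatchType), preceded by a "Database Partition <n>" header line.

```shell
#!/bin/sh
# Hypothetical helper: count EDUs waiting on the fenced-routine IPC latch,
# grouped by database partition. Reads `db2pd -latches` output on stdin.
summarize_latch_waiters() {
    awk -v latch="SQLO_LT_sqlerFmpRow__ipcLatch" '
        BEGIN { part = "unknown" }
        # Header line, e.g. "Database Partition 14 -- Active -- ..."
        $1 == "Database" && $2 == "Partition" { part = $3; next }
        # Latch row (assumed layout): Address Holder Waiter Filename LOC LatchType
        $1 ~ /^0x/ && $NF == latch {
            if (!(part in count)) order[++n] = part
            count[part]++
            waiters[part] = waiters[part] " " $3
        }
        END {
            # Report partitions in the order they first appeared.
            for (i = 1; i <= n; i++) {
                p = order[i]
                printf "partition %s: %d waiter(s):%s\n", p, count[p], waiters[p]
            }
        }
    '
}
```

It could be fed from a saved file or directly from a pipeline, for example db2_all "db2pd -latches" | summarize_latch_waiters. A partition with a steadily growing waiter count is a candidate for the symptom; the filter itself does not prove that a db2fmp process is spinning.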
    

Local fix

  • To identify which partition is experiencing the symptom:
    
            db2_all "db2pd -latches" | grep -E "Database Partition|Holder|SQLO_LT_sqlerFmpRow__ipcLatch"
    
            Look for the database partition on which the latch is
            listed.
    
    To recover from the symptom:
    
            Log in to the physical host where the affected database
            partition resides:
    
            ps -ef | grep "db2fmp" | grep ") <database partition num>"
    
            From the output, find the PID that keeps consuming CPU
            and terminate it:
                    db2fmpterm <PID>
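
Picking out the db2fmp process that keeps consuming CPU can be sketched as a filter over ps output. This is an illustration under stated assumptions rather than a supported procedure: the function name busiest_db2fmp is hypothetical, the input is assumed to come from ps -eo pid,pcpu,args, and the process title is assumed to end in ") <database partition num>", as the grep pattern above implies. The function only prints a PID; terminating the process with db2fmpterm remains a manual step.

```shell
#!/bin/sh
# Hypothetical helper: print the PID of the db2fmp process with the highest
# CPU percentage for one database partition.
# stdin: output of `ps -eo pid,pcpu,args`; $1: database partition number.
busiest_db2fmp() {
    awk -v part="$1" '
        # Assumed process title shape: "db2fmp (...) <partition>"
        $3 ~ /db2fmp/ && $NF == part && $(NF - 1) ~ /\)$/ {
            if (best == "" || $2 + 0 > cpu) { best = $1; cpu = $2 + 0 }
        }
        # Exit non-zero when no matching db2fmp process was seen.
        END { if (best != "") print best; else exit 1 }
    '
}
```

For example, ps -eo pid,pcpu,args | busiest_db2fmp 14 would print one candidate PID for partition 14. A single pcpu sample can mislead, so it is worth confirming over several samples that the same PID stays busy before running db2fmpterm against it.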
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * DB2 server systems                                           *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrading to DB2 V95FP10 resolves this issue.                *
    ****************************************************************
    

Problem conclusion

  • First fixed in DB2 V95FP10.
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI76564

  • Reported component name

    DB2 UDE ESE LIN

  • Reported component ID

    5765F4104

  • Reported release

    950

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-01-05

  • Closed date

    2012-08-28

  • Last modified date

    2012-08-28

  • APAR is sysrouted FROM one or more of the following:

    IC76825

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 UDE ESE LIN

  • Fixed component ID

    5765F4104

Applicable component levels

  • R950 PSY

       UP

Document Information

Modified date:
28 August 2012