IBM Support

IC84086: DB2 MAY HANG DUE TO LATCH ON SQLO_LT_SQLE_FEDFMP_APP_CB__FMPAPPLATCH AFTER -1131 AND -1042 ERRORS.

APAR status

  • Closed as program error.

Error description

  • DB2 might hang while an agent is holding the
    SQLO_LT_SQLE_FEDFMP_APP_CB__FMPAPPLATCH latch. Other agents
    waiting on this latch might themselves hold latches that are
    critical to database operation (for example,
    SQLO_LT_sqeLocalDatabase__dblatch or
    SQLO_LT_SQLP_TENTRY__tranEntryLatch), which halts the whole
    system.
    
    The db2diag.log might show errors like:
    
    sqleFedFMPManager::disconnect, probe:30
    RETCODE : ZRC=0xFFFFFB95=-1131
    
    and
    
    sqleFedFMPManager::returnFmpToPool, probe:30
    RETCODE : ZRC=0xFFFFFBEE=-1042
    
    
    
    
    The following example shows the stacks of the latch holder and
    of a waiter when this problem occurs:
    
    EDU name     : db2agntdp (DW      ) 0
    EDU ID       : 348
    <LatchInformation>
    Holding Latch type: (SQLO_LT_SQLE_FEDFMP_APP_CB__fmpAppLatch) -
    Address: (0x200dfc4a8), Line: 481, File: sqle_fed_fmp.C
    HoldCount: 2
    </LatchInformation>
    0x00002AAAABF43973 sqloWaitEDUWaitPost + 0x0195
      (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
    0x00002AAAABB58298 _ZN8sqeAgent14WaitAgentEventEPji + 0x0012
      (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
    0x00002AAAABB7576B
    _ZN16sqeAgentServices14GetNextRequestEPjP8sqeAgent + 0x034d
      (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
    0x00002AAAABB556DD _ZN8sqeAgent6RunEDUEv + 0x0443
      (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
    0x00002AAAAC2249C0 _ZN9sqzEDUObj9EDUDriverEv + 0x00a6
      (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
    0x00002AAAAC224917 _Z10sqlzRunEDUPcj + 0x0009
    
    
    EDU name     : db2agent (DW) 0
    EDU ID       : 313
    
    <LatchInformation>
    
    Waiting on latch type: (SQLO_LT_SQLE_FEDFMP_APP_CB__fmpAppLatch)
    - Address: (0x200dfc4a8), Line: 481, File: sqle_fed_fmp.C
    
    Holding Latch type: (SQLO_LT_SQLP_TENTRY__tranEntryLatch) -
    Address: (0x2aaad4138268), Line: 748, File:
    /view/db2_v97fp4_linuxamd64_s110330/vbs/engn/include/sqlpi_inlines.h
    HoldCount: 1
    Holding Latch type: (SQLO_LT_SQLP_SAVEPOINTS__spLatch) -
    Address: (0x2aaad4137d40), Line: 851, File:
    /view/db2_v97fp4_linuxamd64_s110330/vbs/engn/include/sqlpi_inlines.h
    HoldCount: 1
    </LatchInformation>
    0x00002AAAAC8F27E5
    _ZN17sqleFedFMPManager14getFmpAppLatchEP18sqle_FedFMP_app_cbj + 0x0063
      (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
    0x00002AAAAC8F24DB
    _ZN17sqleFedFMPManager14validateFedFmpEP18sqle_FedFMP_app_cbbP14sqlqg_Fmp_InfoP13sqlerFmpParmsP14sqlerFmpHandlePi + 0x006b
      (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
    0x00002AAAAC8F2027
    _ZN17sqleFedFMPManager14getFmpFromPoolEP13sqlerFmpParmsP14sqlqg_Fmp_InfoP14sqlerFmpHandle + 0x0081
      (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
    
    
    A trace when the issue occurs might show the following:
    
    
    17170      | | | | sqlerFedInvokeFencedRoutine entry [eduid 7457
    eduname db2agent]
    17172       | | | | | sqleFedFMPManager::getFmpFromPool entry
    [eduid 7457 eduname db2agent]
    17174       | | | | | | sqleFedFMPManager::validateFedFmp entry
    [eduid 7457 eduname db2agent]
    17176       | | | | | | | sqleFedFMPManager::getFmpAppLatch
    entry [eduid 7457 eduname db2agent]
    17179       | | | | | | | sqleFedFMPManager::getFmpAppLatch exit
    17180       | | | | | | sqleFedFMPManager::validateFedFmp data
    [probe 1]
    17183       | | | | | | | sqleFedFMPManager::addLogRecord entry
    [eduid 7457 eduname db2agent]
    17185       | | | | | | | sqleFedFMPManager::addLogRecord exit
    17187       | | | | | | sqleFedFMPManager::validateFedFmp exit
    [rc = 1]
    17189       | | | | | sqleFedFMPManager::getFmpFromPool exit
    17191       | | | | | sqlerSendFmpStart entry [eduid 7457
    eduname db2agent]
    17227       | | | | | | sqlerRtnWriteFencedArgData entry [eduid
    7457 eduname db2agent]
    17232       | | | | | | sqlerRtnWriteFencedArgData exit
    17238       | | | | | sqlerSendFmpStart data [probe 42]
    17243       | | | | | sqlerSendFmpStart exit
    17247       | | | | | sqeAgent::AgentBreathingPoint entry [eduid
    7457 eduname db2agent]
    17249       | | | | | | sqeAgent::QueryInterrupt entry [eduid
    7457 eduname db2agent]
    17253       | | | | | | sqeAgent::QueryInterrupt data [probe 70]
    17255       | | | | | | sqeAgent::QueryInterrupt exit
    17257       | | | | | sqeAgent::AgentBreathingPoint exit [rc =
    1]
    17259       | | | | | sqlerInterruptFmp entry [eduid 7457
    eduname db2agent]
    17261       | | | | | | sqlerInterruptThreadedFmp entry [eduid
    7457 eduname db2agent]
    17264       | | | | | | | sqlerMasterThreadReq entry [eduid 7457
    eduname db2agent]
    17268       | | | | | | | sqlerMasterThreadReq exit [rc =
    0xFFFFFBEE = -1042]
    17270       | | | | | | sqlerInterruptThreadedFmp error [probe
    10] [ ZRC = 0xFFFFFBEE = -1042]
    17272       | | | | | | sqlerInterruptThreadedFmp exit [rc =
    0xFFFFFBEE = -1042]
    17274       | | | | | sqlerInterruptFmp exit [rc = 0xFFFFFBEE =
    -1042]
    17276       | | | | sqlerFedInvokeFencedRoutine error [probe 70]
    17281       | | | | sqlerFedInvokeFencedRoutine exit [rc =
    0xFFFFFBEE = -1042]
    17284       | | | sqlriFedInvokeInvoker exit [rc = 0x8012006D =
    -2146303891 = SQLR_CA_BUILT]
    
    
    Note that this problem might look similar to APAR IC83795, but
    it can still occur after the fix for IC83795 is applied.
    
    This APAR applies only to federated environments.
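
The return codes quoted above are signed 32-bit values printed in hexadecimal (for example, 0xFFFFFB95 is -1131). The following Python sketch is an illustration only, not an IBM-supplied tool: it decodes such a value and checks a block of diagnostic text for the two function and return-code pairs listed in this description. The helper names (zrc_to_signed, find_signatures) are hypothetical, and the line-by-line matching assumes that entries resemble the excerpts shown here, with the "function, probe:N" line preceding its "ZRC=" line.

```python
import re

def zrc_to_signed(hex_text):
    """Interpret a 32-bit ZRC hex string as a signed two's-complement integer."""
    value = int(hex_text, 16) & 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

# Function / return-code pairs quoted in this APAR's error description.
SIGNATURES = {
    ("sqleFedFMPManager::disconnect", -1131),
    ("sqleFedFMPManager::returnFmpToPool", -1042),
}

# Assumed layout: a "Class::method, probe:N" line, then a later "ZRC=0x..." line.
FUNC = re.compile(r"(\w+::\w+), probe:\d+")
ZRC = re.compile(r"ZRC=(0x[0-9A-Fa-f]{1,8})")

def find_signatures(diag_text):
    """Return the APAR signature pairs that appear in diag_text."""
    found = set()
    last_func = None
    for line in diag_text.splitlines():
        func_match = FUNC.search(line)
        if func_match:
            last_func = func_match.group(1)
        zrc_match = ZRC.search(line)
        if zrc_match and last_func:
            pair = (last_func, zrc_to_signed(zrc_match.group(1)))
            if pair in SIGNATURES:
                found.add(pair)
    return found
```

A match on both pairs is consistent with the symptoms described here but does not by itself confirm this APAR; the latch holder and waiter stacks are still needed to confirm the hang.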
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * ALL                                                          *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Upgrade to DB2 Version 9.7 Fix Pack 7                        *
    ****************************************************************
    

Problem conclusion

  • The problem was first fixed in DB2 Version 9.7 Fix Pack 7.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC84086

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    970

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-06-12

  • Closed date

    2012-10-20

  • Last modified date

    2012-10-20

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • R970 PSN

       UP

Document Information

Modified date:
20 October 2012