IBM Support

PH27200: A BIG SQL WORKER NODE MAY CRASH WITH AN FMP ERROR DETECTING A MEMORY CORRUPTION WHEN RUNNING A STORED PROCEDURE.

APAR status

  • Closed as fixed if next.

Error description

  • The memory corruption is detected in a free memory node in
    the free tree.  The size of the free block has been overwritten
    with an invalid value, 0x000000a3.  For example:
    .
    Corrupt node address: 0x00000008401de0a8
    -- UNKNOWN NODE TYPE --
    00000008401de0a8 : c0 01 84 00 00 00 b0 fa e8 c0 1b 40 08 00 00 00 ...........@....
    00000008401de0b8 : e8 c0 4b 40 08 00 00 00 a3 00 00 00 00 00 00 00 ..K@............
    .
    The db2diag.log will have messages similar to the following:
    2020-05-11-10.37.47.829376-300 I34710150E5236 LEVEL: Warning
    PID : 855037 TID : 140174205904640 PROC : db2sysc 0
    INSTANCE: bigsql NODE : 000 DB : BIGSQL
    APPHDL : 0-15740 APPID: *N0.DB2.200511158888
    UOWID : 1 ACTID: 1
    AUTHID : bigsql HOSTNAME: hostname.com
    EDUID : 38275 EDUNAME: db2agent (BIGSQL) 0
    FUNCTION: DB2 UDB, routine_infrastructure, sqlerReturnFmpToPool, probe:754
    DATA #1 : String, 35 bytes
    Unstable FMP re-marked as unstable.
    DATA #2 : sqlerUnstableReason, PD_TYPE_SQLER_UNSTABLE_REASON, 4 bytes
    SQLER_UNSTABLE_ABORT_FORCED (3) - FMP is forced or aborted
    DATA #3 : sqlerFmpRow, PD_SQLER_TYPE_FMP_ROW, 872 bytes
     fmpPid: 938299
     fmpPoolList Ptr: 0x0000000203dcb940 fmpForcedList Ptr: 0x0000000000000000
     nextFmpCB Ptr: 0x0000000202bcf5a0 prevFmpCB Ptr: 0x0000000202c4fbe0
     fmpIPCList Ptr: 0x0000000203d1f580
     stateFlags: 0x00802013 numFmp32Attaches: 0
     numActiveThreads: 2 numPoolThreads: 4
     fmpCodePage: 1208 fmpRowUseCount: 2
     active: 0x01 rowLoaderValidate: 0x00
     startTimestamp: 2020-05-04-18.39.44
     unstableTimestamp: 2020-05-11-10.37.47
     unstableReason: SQLER_UNSTABLE_INIT_COMMS_FAILED (8) - Failed to communicate with master thread
     ipcLatch:
    0x0000000202BCF924 : 0000 3F00 ..?.
     rowLatch:
    0x0000000202BCF928 : 0000 E501 ....
     fmpAgentList:
    0x0000000202BCF930 : 0800 0700 3300 0000 0000 0000 0000 0000 ....3...........
    0x0000000202BCF940 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
    ...>>
    CALLSTCK: (Static functions may not be resolved correctly, as they are resolved to the nearest symbol)
     [0] 0x00007F8A595945AF _Z20sqlerReturnFmpToPoolccP14sqlerFmpHandleP8sqeAgent + 0x2CF
     [1] 0x00007F8A59584AEA _Z24sqlerInvokeFencedRoutineP13sqlerFmpParms + 0x1D7A
     [2] 0x00007F8A5BEEB9E4 _Z18sqlriInvokeInvokerP10sqlri_ufobb + 0x4A4
     [3] 0x00007F8A5BEEAB90 _Z9sqlricallP8sqlrr_cb + 0x1D0
     [4] 0x00007F8A595C02F5 _Z22sqlerAutonomousSessionPv + 0xA5
     [5] 0x00007F8A591FBD85 _Z26sqleIndCoordProcessRequestP8sqeAgent + 0x1065
     [6] 0x00007F8A59221F21 _ZN8sqeAgent6RunEDUEv + 0x5A1
     [7] 0x00007F8A5CBF37DE _ZN9sqzEDUObj9EDUDriverEv + 0x1BE
     [8] 0x00007F8A5B0F438A sqloEDUEntry + 0x57A
     [9] 0x00007F8A63391EA5 /lib64/libpthread.so.0 + 0x7EA5
     [10] 0x00007F8A512228CD clone + 0x6D
    ...>>
    
    2020-05-11-10.45.36.642015-300 E34771637E625 LEVEL: Info
    PID : 855037 TID : 140178563786496 PROC : db2sysc 0
    INSTANCE: bigsql NODE : 000 DB : BIGSQL
    APPHDL : 0-15756 APPID: *N0.DB2.200511158888
    AUTHID : BIGSQL HOSTNAME: hostname.com
    EDUID : 38272 EDUNAME: db2agent (BIGSQL) 0
    FUNCTION: DB2 UDB, routine_infrastructure, sqlerReturnFmpToPool, probe:8103
    DATA #1 : String, 34 bytes
    Removing FMP from pool FMP handle:
    DATA #2 : sqlerFmpHandle, PD_SQLER_TYPE_FMP_HANDLE, 24 bytes
     fmpPid: 938299 pFmpEntry: 0x000000020330fc20
    ..>>
    
    2020-05-11-10.37.47.882725-300 I34729133E491 LEVEL: Info
    PID : 938299 TID : 139690881844992 PROC : db2fmp (
    INSTANCE: bigsql NODE : 000 DB : BIGSQL
    APPID : *N0.DB2.200511158888
    HOSTNAME: hostname.com
    FUNCTION: DB2 UDB, routine_infrastructure, sqlerFmpListener, probe:350
    RETCODE : ZRC=0xFFFFFB95=-1131
     SQL1131N A stored procedure process has been terminated abnormally.
     Routine name: "". Specific name: "".
    
    2020-05-11-10.45.36.770043-300 E34780048E1739 LEVEL: Severe
    PID : 855037 TID : 140231625926400 PROC : db2sysc 0
    INSTANCE: bigsql NODE : 000
    HOSTNAME: hostname.com
    EDUID : 11 EDUNAME: db2sysc 0
    FUNCTION: DB2 UDB, SQO Memory Management, sqloDiagnoseFreeBlockFailure, probe:999
    MESSAGE : Memory validation failure, diagnostic file dumped.
    DATA #1 : String, 28 bytes
    Corrupt pool free tree node.
    DATA #2 : File name, 42 bytes
    855037.140231625926400.mem_diagnostics.txt
    CALLSTCK: (Static functions may not be resolved correctly, as they are resolved to the nearest symbol)
     [0] 0x00007F8A5B0CA345 _ZN13SQLO_MEM_POOL32diagnoseMemoryCorruptionAndCrashEmPKcb + 0x285
     [1] 0x00007F8A5B0AB817 sqlofmblkEx + 0x477
     [2] 0x00007F8A54B512EC _Z13sqlccFreeIPCsP18SQLCC_INITSTRUCT_Tcc + 0x9C
     [3] 0x00007F8A59593F57 _Z18sqlerDeallocFmpIPCPP18SQLCC_INITSTRUCT_TP8sqeAgentb + 0x57
     [4] 0x00007F8A5958E423 _Z24sqlerCleanThreadResourceP18sqlerFmpThreadListbcc + 0x423
     [5] 0x00007F8A5958DE23 /home/bigsql/sqllib/lib64/libdb2e.so.1 + 0x6E0BE23
     [6] 0x00007F8A595959F9 _Z20sqlerReturnFmpToPoolccP14sqlerFmpHandleP8sqeAgent + 0x1719
     [7] 0x0000000000416914 _Z14sqleCleanupFmpii + 0x414
     [8] 0x000000000041507B _Z25sqleSyscRequestProcessingjP18SQL_SYSCON_REQUESTPi + 0x1AB
     [9] 0x0000000000414775 _Z14sqleRunSysCtlrv + 0x695
     [10] 0x0000000000413F93 _Z11sqleSysCtlrv + 0x1893
     [11] 0x00007F8A5B0EF974 /home/bigsql/sqllib/lib64/libdb2e.so.1 + 0x896D974
     [12] 0x00007F8A5B0EE3A8 sqloRunInstance + 0x928
     [13] 0x000000000040D2A6 DB2main + 0x1236
     [14] 0x00007F8A5B0F3F2B sqloEDUEntry + 0x11B
     [15] 0x00007F8A63391EA5 /lib64/libpthread.so.0 + 0x7EA5
     [16] 0x00007F8A512228CD clone + 0x6D
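
The corrupt-node dump in the error description can be read by hand, but a small script makes the overwritten value easier to spot. The following is a minimal sketch, not an IBM tool: it assumes the dump bytes are in little-endian order (typical for Linux x86-64, which the lib64 paths suggest but do not confirm) and groups them into 8-byte words. The helper name parse_dump is hypothetical, and the internal layout of a free-tree node is not documented here, so which word is the size field is only inferred from the 0xa3 value mentioned in the description.

```python
import struct

# Hex bytes copied from the "Corrupt node address" dump in the error description.
DUMP = """
00000008401de0a8 : c0 01 84 00 00 00 b0 fa e8 c0 1b 40 08 00 00 00
00000008401de0b8 : e8 c0 4b 40 08 00 00 00 a3 00 00 00 00 00 00 00
"""


def parse_dump(text):
    """Return {address: value} for each 8-byte little-endian word in the dump.

    Assumption: bytes are little-endian and the node is made of 8-byte fields.
    """
    words = {}
    for line in text.strip().splitlines():
        addr_text, _, byte_text = line.partition(":")
        base = int(addr_text.strip(), 16)
        raw = bytes.fromhex(byte_text.replace(" ", ""))
        for offset in range(0, len(raw), 8):
            (value,) = struct.unpack_from("<Q", raw, offset)
            words[base + offset] = value
    return words


words = parse_dump(DUMP)
for addr, value in words.items():
    print(f"0x{addr:016x}: 0x{value:016x}")
```

Under these assumptions, the word at address 0x00000008401de0c0 decodes to 0xa3, which matches the invalid size value reported in the description, while the two preceding words look like pointers into the same address range as the node itself.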
    

Local fix

  • NA
    

Problem summary

  • Please see problem description.
    

Problem conclusion

Temporary fix

Comments

APAR Information

  • APAR number

    PH27200

  • Reported component name

    IBM BIG SQL

  • Reported component ID

    5737E7400

  • Reported release

    504

  • Status

    CLOSED FIN

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-07-07

  • Closed date

    2020-09-09

  • Last modified date

    2020-09-09

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels

  • R504 PSY   UP

Document Information

Modified date:
17 September 2021