IBM Support

Avoid setting CF_NUM_CONNS to a fixed and small number

Preventive Service Planning


Abstract

Avoid setting CF_NUM_CONNS to a fixed and small number

Content

If the number of CF connections (CF_NUM_CONNS) is set to a small fixed value that is inadequate for the workload, the DB2 server may crash with one of the following stack traces:

1.
0x0900000000852154 pthread_kill + 0xD4
0x0900000005CC2494 sqloDumpEDU + 0x6C
0x09000000084D10C8 sqle_panic__Fv + 0x218
0x09000000066B31EC SAL_ReadAndRegisterPage__14SAL_GBP_HANDLEFCP12SQLB_GLOBALSCP8SQLB_BPDCPC10PsPageNameCPvCUlCUiT3T5CP19SAL_DIRTY_PAGE_INFOT6T5 + 0x108
0x09000000067F0498 sqlbgbRAR__FP8SQLB_BPDP11SQLB_FIX_CBP13SQLB_PAGE_KEYb + 0x358
0x09000000095BF96C sqlbgbRefreshPage__FP8SQLB_BPDP11SQLB_FIX_CBPUlPb + 0xD70
0x09000000068585F8 sqlbfix__FP11SQLB_FIX_CB + 0x4C4
0x09000000068C101C sqlifix__FP7SQLI_CBP14SQLI_PAGE_DESCUii + 0x80
0x0900000006CFFDC8 sqlischa__FP7SQLI_CBP11SQLI_SAGLOBiUi + 0x3E4
0x0900000006CFF960 @159@next_level__FP7SQLI_CBP11SQLI_SAGLOBiUi + 0x5C
0x0900000006D00038 sqlischa__FP7SQLI_CBP11SQLI_SAGLOBiUi + 0x654
0x0900000006CFF960 @159@next_level__FP7SQLI_CBP11SQLI_SAGLOBiUi + 0x5C
0x0900000006D000A4 sqlischa__FP7SQLI_CBP11SQLI_SAGLOBiUi + 0x6C0
0x0900000006CFD47C sqliaddk__FP8sqeAgentP9SQLD_IXCBP8SQLD_KEY12SQLI_KEYDATAP14SQLP_LOCK_INFOP8SQLP_LRBUlP10SQLI_IXPCRPPv + 0x85C
0x090000000654D068 sqldKeyInsert__FP13SQLD_DFM_WORKP16SQLD_TABLE_CACHET2P13SQLD_TDATARECP15SQLD_TDATAREC32iUl + 0x800
0x0900000006CF8564 sqldRowInsert__FP8sqeAgentUsT2UcUliPP10SQLD_VALUEP8SQLZ_RIDPPv + 0x66C
0x0900000006CF8E08 sqlrinsr__FP8sqlrr_cbUsT2iT2PP10SQLD_VALUEQ3_10sqlri_iudo11t_iudoFlags17t_iudoFlagsKernelP8SQLZ_RIDPPv + 0xC0
0x0900000006CF90B8 sqlriisr__FP8sqlrr_cb + 0x1D8
0x09000000068D57DC sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x40C
0x0900000006C6F9B0 sqlrr_execute__FP14db2UCinterfaceP9UCstpInfo + 0x2A6C
0x0900000006C6D73C sqlrr_execute__FP14db2UCinterfaceP9UCstpInfo + 0x7F8
0x0900000006323BDC sqljsParseRdbAccessed__FP13sqljsDrdaAsCbP13sqljDDMObjectP14db2UCinterface + 0xE84
0x09000000063287A8 @72@sqljsSqlam__FP14db2UCinterfaceP8sqeAgentb + 0xB34
0x09000000063287A8 @72@sqljsSqlam__FP14db2UCinterfaceP8sqeAgentb + 0xB34
0x09000000063283A4 @72@sqljsSqlam__FP14db2UCinterfaceP8sqeAgentb + 0x730
0x0900000006017E7C @72@sqljsDriveRequests__FP8sqeAgentP14db2UCconHandle + 0xA8
0x090000000601896C @72@sqljsDrdaAsInnerDriver__FP18SQLCC_INITSTRUCT_Tb + 0x5E4
0x0900000005DF55D0 RunEDU__8sqeAgentFv + 0x463CC
0x0900000005E2C75C RunEDU__8sqeAgentFv + 0xDC
0x09000000071D2E58 EDUDriver__9sqzEDUObjFv + 0x128
0x09000000061F78D8 sqloEDUEntry + 0x3A0

2.
0x0900000000852154 pthread_kill + 0xD4
0x0900000005CC2494 sqloDumpEDU + 0x6C
0x09000000084D10C8 sqle_panic__Fv + 0x218
0x0900000005D650D0 SAL_GetMaxLSN__14SAL_GBP_HANDLEFCPUlCbCUiCUl + 0x198C
0x0900000008FE096C sqleLSNSyncUpdateFromCA + 0xC8
0x0900000005F0D598 sqleRPCSync + 0x3FC
0x0900000005EE6FEC sqleRPCSync + 0xEC
0x090000000594C210 sqleIndCoordProcessRequest__FP8sqeAgent + 0x1C028
0x090000000591C60C sqleIndCoordProcessRequest__FP8sqeAgent + 0xCD8
0x0900000005E2C770 RunEDU__8sqeAgentFv + 0xF0
0x09000000071D2E58 EDUDriver__9sqzEDUObjFv + 0x128
0x09000000061F78D8 sqloEDUEntry + 0x3A0


The db2diag.log contains the following messages:
-----------------------------------------------
2014-09-04-11.31.03.740614+480 I34484534A526 LEVEL: Event
PID : 5702610 TID : 772 PROC : db2sysc 0
INSTANCE: db2sdin1 NODE : 000
HOSTNAME: b_mem1
EDUID : 772 EDUNAME: db2castructevent 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for CF, sqleRocmNotifEdu::ROCM_StateFailoverMonitor, probe:8300
DATA #1 : String, 60 bytes
ROCM_StateFailoverMonitor: Member completed primary failover
DATA #2 : Database Partition Number, PD_TYPE_NODE, 2 bytes
129

...

2014-09-04-11.47.06.260457+480 I34486222A551 LEVEL: Error
PID : 5702610 TID : 772 PROC : db2sysc 0
INSTANCE: db2sdin1 NODE : 000
HOSTNAME: b_mem1
EDUID : 772 EDUNAME: db2castructevent 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : <preformatted>
xport_close: dat_pz_free failed: 0x80070053
DATA #1 : <preformatted>
If a CF return code is displayed above and you wish to get
more information then please run the following command:

db2diag -cfrc <CF_errcode>

2014-09-04-11.47.06.261086+480 I34486774A552 LEVEL: Error
PID : 5702610 TID : 772 PROC : db2sysc 0
INSTANCE: db2sdin1 NODE : 000
HOSTNAME: b_mem1
EDUID : 772 EDUNAME: db2castructevent 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : <preformatted>
xport_close: dat_ia_close failed: 0x80070050
DATA #1 : <preformatted>
If a CF return code is displayed above and you wish to get
more information then please run the following command:

db2diag -cfrc <CF_errcode>

2014-09-04-11.47.06.261528+480 I34487327A544 LEVEL: Error
PID : 5702610 TID : 772 PROC : db2sysc 0
INSTANCE: db2sdin1 NODE : 000
HOSTNAME: b_mem1
EDUID : 772 EDUNAME: db2castructevent 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : <preformatted>
ClientXport.close failed: 0x400b0002
DATA #1 : <preformatted>
If a CF return code is displayed above and you wish to get
more information then please run the following command:

db2diag -cfrc <CF_errcode>

2014-09-04-11.47.06.261971+480 I34487872A1296 LEVEL: Severe
PID : 5702610 TID : 772 PROC : db2sysc 0
INSTANCE: db2sdin1 NODE : 000
HOSTNAME: b_mem1
EDUID : 772 EDUNAME: db2castructevent 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for CF, SQLE_SINGLE_CA_HANDLE::sqleSingleCaTerminate, probe:3128
MESSAGE : CA RC= 1074462722
DATA #1 : String, 33 bytes
PsClose: close connection failed.
DATA #2 : String, 6 bytes
0 1530
CALLSTCK: (Static functions may not be resolved correctly, as they are resolved to the nearest symbol)
[0] 0x090000000598D0E4 sqleSingleCaTerminate__21SQLE_SINGLE_CA_HANDLEFbT1 + 0x854
[1] 0x0900000007C005CC ROCM_CompleteDepart__16sqleRocmNotifEduFsUib + 0x1810
[2] 0x0900000007BFF1C8 ROCM_CompleteDepart__16sqleRocmNotifEduFsUib + 0x40C
[3] 0x0900000007BFEBEC ROCM_StateS3Arrive__16sqleRocmNotifEduFP16ROCM_DB2_REQUEST + 0x38C
[4] 0x09000000078428A4 ROCM_StateMachineIteration__16sqleRocmNotifEduFP16ROCM_DB2_REQUESTUl + 0x38C
[5] 0x09000000071CC0DC RunEDU__16sqleRocmNotifEduFv + 0x4C60
[6] 0x09000000071D30F0 EDUDriver__9sqzEDUObjFv + 0x3C0
[7] 0x09000000061F78D8 sqloEDUEntry + 0x3A0
[8] 0x0900000000839E10 _pthread_body + 0xF0
[9] 0xFFFFFFFFFFFFFFFC ?unknown + 0xFFFFFFFF
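
The log entries above suggest running db2diag -cfrc for each CF return code. As a minimal sketch, the helper below collects the distinct hexadecimal codes that follow "failed:" in a db2diag.log file so that each one can be looked up. The function name extract_cf_codes and the log file path are assumptions made for illustration; they are not part of DB2. Note that the decimal value in "CA RC= 1074462722" is the same code as 0x400b0002.

```sh
# Hypothetical helper (not a DB2 tool): print the distinct CF return codes
# that appear after "failed:" in the given db2diag.log file, one per line.
extract_cf_codes() {
    grep -E 'failed: 0x[0-9A-Fa-f]+' "$1" \
        | grep -oE '0x[0-9A-Fa-f]+' \
        | sort -u
}

# Example use, assuming db2diag.log is in the current directory:
#   for rc in $(extract_cf_codes db2diag.log); do
#       db2diag -cfrc "$rc"
#   done
```

Stack trace lines are ignored because their addresses are not preceded by "failed:".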

The root cause is contention for CF connections among the different EDUs performing catchup.

The solution is to increase CF_NUM_CONNS or to set it to AUTOMATIC, which is the default. With AUTOMATIC, DB2 creates new connections when it detects that more are needed. With a fixed connection pool size, DB2 does not alter the number of connections.
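
As a sketch of how the setting might be checked and changed from the command line, assuming CF_NUM_CONNS is updated as a database manager configuration parameter and that the commands run as the instance owner:

```sh
# Show the current value of CF_NUM_CONNS
db2 get dbm cfg | grep -i CF_NUM_CONNS

# Return the parameter to the default, self-adjusting behaviour
db2 update dbm cfg using CF_NUM_CONNS AUTOMATIC
```

Check the product documentation for your release to confirm when the change takes effect.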


Document Information

More support for:
Db2 for Linux, UNIX and Windows

Software version:
10.5

Operating system(s):
AIX, HP-UX, Linux, Solaris, Windows

Document number:
518783

Modified date:
16 June 2018

UID

swg21689824
