Fixes are available
APAR status
Closed as program error.
Error description
When a RoCE port that is configured for high availability (HA) encounters issues, one of the members may go down. In this case, the db2diag.log shows the following entries:

2018-09-07-05.27.43.138008+540 I2379A709            LEVEL: Severe
PID     : 15597810             TID : 139862         PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB1
APPHDL  : 0-10933              APPID: *N0.db2inst1.180905095247
AUTHID  : DB2INST2             HOSTNAME: host21
EDUID   : 139862               EDUNAME: db2agent (MYDB1) 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : <preformatted>
xport_send: dat_ep_post_rdma_write of the MCB failed: 0x80040000. EP: 0x1111177d0
DATA #1 : <preformatted>
If a CF return code is displayed above and you wish to get more information
then please run the following command:

...

2018-09-07-05.27.43.152685+540 I8731A746            LEVEL: Error
PID     : 15597810             TID : 102875         PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB1
HOSTNAME: host21
EDUID   : 102875               EDUNAME: db2XInot GBP 2-0 (MYDB1) 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : <preformatted>
link_status_write: do_dequeue for link status Buffer FAILED
dest Address: 0x111b86f68 RKEY = 0x4ee00 len = 4,
src Address: 0x 121146ac LKEY = 0x36700 len = 4
status = 0x80090020, ep = 0x12114c50
DATA #1 : <preformatted>
If a CF return code is displayed above and you wish to get more information
then please run the following command:
db2diag -cfrc <CF_errcode>

...

2018-09-07-05.27.43.154096+540 I10195A6128          LEVEL: Event
PID     : 15597810             TID : 102875         PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB1
HOSTNAME: host21
EDUID   : 102875               EDUNAME: db2XInot GBP 2-0 (MYDB1) 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for CF,
          SAL_GBP_HANDLE::SAL_CheckXiLink, probe:204
MESSAGE : CA RC= 2148073504
DATA #1 : String, 59 bytes
Detected broken XI connection, attempt reset operation now.
DATA #2 : Codepath, 8 bytes
7:15
DATA #3 : unsigned integer, 8 bytes
1
DATA #4 : SAL CF Index, PD_TYPE_SAL_CF_INDEX, 8 bytes
2
DATA #5 : SAL CF Node Number, PD_TYPE_SAL_CF_NODE_NUM, 2 bytes
129
DATA #6 : String, 49 bytes
current xi cf-server/member-devname/adapter-index
DATA #7 : SAL CF Server Name, PD_TYPE_SAL_CF_SERVER_NAME, 13 bytes
host22-en1
DATA #8 : SAL Member Device Name, PD_TYPE_SAL_MEMBER_DEVICE_NAME, 4 bytes
hca0
DATA #9 : Connection pool link adapter number, PD_TYPE_SAL_ADAPTER_NUMBER, 8 bytes
0

...

2018-09-07-05.27.43.156303+540 I17603A738           LEVEL: Error
PID     : 15597810             TID : 101309         PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB1
HOSTNAME: host21
EDUID   : 101309               EDUNAME: db2LLMn2 (MYDB1) 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : <preformatted>
link_status_write: do_dequeue for link status Buffer FAILED
dest Address: 0x111b882e8 RKEY = 0x10500 len = 4,
src Address: 0x 185ac29c LKEY = 0x16800 len = 4
status = 0x80090020, ep = 0x185bd5d0
DATA #1 : <preformatted>
If a CF return code is displayed above and you wish to get more information
then please run the following command:
db2diag -cfrc <CF_errcode>

...

2018-09-07-05.27.43.161216+540 I21396A630           LEVEL: Error
PID     : 15597810             TID : 101309         PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB1
HOSTNAME: host21
EDUID   : 101309               EDUNAME: db2LLMn2 (MYDB1) 0
FUNCTION: DB2 UDB, RAS/PD component, pdLogCaPrintf, probe:876
DATA #1 : <preformatted>
notify_disconnect(close): dat_ep_disconnect failed: 0x80030000, EP: 0x1185bd5d0 Token: 0x1a000
DATA #1 : <preformatted>
If a CF return code is displayed above and you wish to get more information
then please run the following command:
db2diag -cfrc <CF_errcode>

...

2018-09-07-05.27.43.167388+540 I30439A4907          LEVEL: Event
PID     : 15597810             TID : 102106         PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB1
HOSTNAME: host21
EDUID   : 102106               EDUNAME: db2XInot SCA 2-0 (MYDB1) 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for CF,
          SAL_GBP_HANDLE::SAL_CheckXiLink, probe:204
MESSAGE : CA RC= 2148073504
DATA #1 : String, 59 bytes
Detected broken XI connection, attempt reset operation now.

...

2018-09-07-05.27.43.185042+540 E53804A4857          LEVEL: Error
PID     : 15597810             TID : 139862         PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : MYDB1
APPHDL  : 0-10933              APPID: *N0.db2inst1.180905095247
AUTHID  : DB2INST2             HOSTNAME: host21
EDUID   : 139862               EDUNAME: db2agent (MYDB1) 0
FUNCTION: DB2 UDB, Shared Data Structure Abstraction Layer for CF,
          SAL_MANAGEMENT_PORT_HANDLE::SAL_ManagementQueryKillConnection, probe:12678
MESSAGE : ECF=0x94C6004D=-1798963123
DATA #1 : CF RC, PD_TYPE_SD_CF_RC, 4 bytes
2147876941

The stack file shows the following stack of functions:

<StackTrace>
-------Frame------ ------Function + Offset------
0x090000000057FF14 pthread_kill + 0xD4
0x090000000057F764 _p_raise + 0x44
0x0900000000039E68 raise + 0x48
0x0900000000056864 abort + 0xC4
0x0900000004A59CF8 sqloExitEDU + 0x298
0x0900000004ABE0DC sqle_panic__Fi + 0x71C
0x090000000534DC54 SAL_ResetXiConnection__14SAL_GBP_HANDLEFR17SAL_XI_RECONN_EDU + 0x3D54
0x090000000B4C985C SAL_CheckXiLink__14SAL_GBP_HANDLEFR17SAL_XI_RECONN_EDU + 0xC9C
0x090000000B4C9CF4 RunEDU__17SAL_XI_RECONN_EDUFv + 0x34
0x0900000004B5EFA0 EDUDriver__9sqzEDUObjFv + 0x2E0
0x0900000004A53694 sqloEDUEntry + 0x374
</StackTrace>
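
Some return codes in these entries are logged in decimal while the same value appears in hexadecimal elsewhere. For example, `CA RC= 2148073504` is the same 32-bit value as `status = 0x80090020` reported by link_status_write. The following is a minimal Python sketch for correlating them; the helper name `to_hex32` is hypothetical and not part of Db2, and it assumes the codes are 32-bit values. The form in which `db2diag -cfrc` expects its argument is not stated in this record, so check the command's documentation before passing a converted value.

```python
def to_hex32(value: int) -> str:
    """Format a signed or unsigned integer as a 32-bit hexadecimal string."""
    # Masking with 0xFFFFFFFF maps a negative (signed) code onto its
    # unsigned 32-bit representation before formatting.
    return f"0x{value & 0xFFFFFFFF:08X}"


# Decimal codes copied from the db2diag.log entries in this record.
codes = {
    "CA RC (SAL_CheckXiLink)": 2148073504,
    "CF RC (SAL_ManagementQueryKillConnection)": 2147876941,
    "ECF, signed form": -1798963123,
}

for label, value in codes.items():
    print(f"{label}: {value} -> {to_hex32(value)}")
```

The third value confirms the conversion against the log itself, which prints the same code in both forms (`ECF=0x94C6004D=-1798963123`).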
Local fix
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* ALL                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to Db2 11.1 Mod 4 Fixpack 5 or higher                *
****************************************************************
Problem conclusion
First fixed in Db2 11.1 Mod 4 Fixpack 5
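
To check whether an installed 11.1 level is at or above the fixed level, the comparison can be sketched as a tuple comparison. This is an illustrative sketch only: the function name `has_fix` is hypothetical, and it assumes the installed level is available as a dotted string of the form version.release.mod.fixpack (for example `11.1.4.5`). How that string is obtained is outside the sketch. Releases other than 11.1 are rejected because this record states the fix level for 11.1 only.

```python
FIXED_LEVEL = (11, 1, 4, 5)  # Db2 11.1 Mod 4 Fixpack 5, per this APAR


def has_fix(level: str) -> bool:
    """Return True if a dotted 11.1 level string is at or above the fixed level."""
    parts = tuple(int(p) for p in level.split("."))
    if len(parts) != 4:
        raise ValueError(f"expected version.release.mod.fixpack, got {level!r}")
    if parts[:2] != FIXED_LEVEL[:2]:
        # This record only covers the 11.1 release; do not guess for others.
        raise ValueError(f"level {level!r} is not a Db2 11.1 level")
    # Tuples compare element by element, so mod is compared before fix pack.
    return parts >= FIXED_LEVEL


print(has_fix("11.1.4.4"))
print(has_fix("11.1.4.5"))
```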
Temporary fix
Comments
APAR Information
APAR number
IT29277
Reported component name
DB2 FOR LUW
Reported component ID
DB2FORLUW
Reported release
B10
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2019-05-28
Closed date
2020-01-16
Last modified date
2022-03-29
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
DB2 FOR LUW
Fixed component ID
DB2FORLUW
Applicable component levels
RB10 PSN
UP
Document Information
Modified date:
04 May 2022