APAR status
Closed as program error.
Error description
After network issues caused HDR to be shut down, reconnecting the primary to HDR left threads hung on both the primary and the secondary. The threads never released, which prevented HDR from syncing back up.

On the primary, the dr_prsend thread was observed waiting for a dr_gettype thread that was stuck:

Thread CPU Info:
 tid      name        vp    Last Run        CPU Time  #scheds  status
 147      dr_prsend   8cpu  10/12 10:03:51  65.1068   1977253  join wait 3679913
 3679913  dr_gettype  9cpu  10/12 19:11:09  0.2838    33084    cond wait smx pipe1

The stack trace of the dr_gettype thread showed it stuck in smx_connect:

Stack for thread: 3679913 dr_gettype
 base: 0x000000015098d000
 len:  69632
 pc:   0x00000001012c63c0
 tos:  0x000000015099cdd1
 state: cond wait
 vp: 10

 oninit :: yield_processor_mvp
 oninit :: mt_wait
 oninit :: smx_connect
 oninit :: SC_smx_sporadic_connect
 oninit :: SC_maxmsg_ping
 oninit :: GetServerVersionInfo
 oninit :: verify_server_version
 oninit :: dr_whattype
 oninit :: startup

At the same time, on the HDR secondary, a dr_accept thread was also hung. Its stack was:

Stack for thread: 47176 dr_accept

 oninit :: yield_processor_mvp
 oninit :: mt_wait
 oninit :: net_buf_get
 oninit :: recvtli
 oninit :: slSQIrecv
 oninit :: pfRecv
 oninit :: asfRecv
 oninit :: ASF_Call
 oninit :: rsasf_recv_buf
 oninit :: rsasf_recv_with_timeout
 oninit :: dr_asf_recv_with_timeout
 oninit :: dr_session_recv_with_timeout
 oninit :: dr_acceptInt
 oninit :: dr_accept
 oninit :: listen_verify
 oninit :: spawn_thread
 oninit :: th_init_initgls

In this scenario, both the dr_accept thread on the HDR secondary and the dr_gettype thread on the primary had been hung for over 9 hours. Restarting the HDR secondary did not release the dr_gettype thread on the primary, and the primary had to be restarted to sync HDR back up.
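The wait relationship described above (dr_prsend in "join wait" on the tid of dr_gettype) can be picked out of a saved thread listing mechanically. The following is a minimal sketch only, not an IBM tool: it assumes the listing has been saved as text with one thread per row and the column order shown in this APAR (tid, name, vp, last run date and time, CPU time, #scheds, status). The names ROW, parse_threads and join_wait_chains are hypothetical helpers introduced for illustration.

```python
import re

# Sample rows copied from the thread listing in this APAR.
# Assumption: one thread per line, columns in the order shown above.
SAMPLE = """\
147      dr_prsend   8cpu  10/12 10:03:51  65.1068   1977253  join wait 3679913
3679913  dr_gettype  9cpu  10/12 19:11:09  0.2838    33084    cond wait smx pipe1
"""

# Hypothetical row pattern; everything after #scheds is treated as the status.
ROW = re.compile(
    r"^\s*(?P<tid>\d+)\s+(?P<name>\S+)\s+(?P<vp>\S+)\s+"
    r"(?P<date>\d+/\d+)\s+(?P<time>\d+:\d+:\d+)\s+"
    r"(?P<cpu>[\d.]+)\s+(?P<scheds>\d+)\s+(?P<status>.+?)\s*$"
)


def parse_threads(text):
    """Return {tid: {"name": ..., "status": ...}} for every row that matches."""
    threads = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            threads[int(m["tid"])] = {"name": m["name"], "status": m["status"]}
    return threads


def join_wait_chains(threads):
    """List (waiter name, blocker name, blocker status) for each 'join wait <tid>'."""
    chains = []
    for info in threads.values():
        m = re.fullmatch(r"join wait (\d+)", info["status"])
        if m:
            blocker = threads.get(int(m.group(1)))
            chains.append((
                info["name"],
                blocker["name"] if blocker else None,
                blocker["status"] if blocker else None,
            ))
    return chains


if __name__ == "__main__":
    for waiter, blocker, status in join_wait_chains(parse_threads(SAMPLE)):
        print(f"{waiter} is waiting on {blocker} ({status})")
```

Run against the two rows from this APAR, the sketch reports that dr_prsend is waiting on dr_gettype, which is itself in a condition wait. It only identifies the dependency; it does not show how long either thread has been waiting.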
Local fix
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* Users of IDS 12.10.xC10 and earlier versions.                *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* HDR primary dr_gettype thread and secondary dr_accept        *
* threads can hang attempting to reconnect after DR:Turned off *
* on primary server.                                           *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
Problem conclusion
Fixed in IDS 12.10.xC11.
Temporary fix
Comments
APAR Information
APAR number
IT27523
Reported component name
INFORMIX SERVER
Reported component ID
5725A3900
Reported release
C10
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2018-12-24
Closed date
2019-10-07
Last modified date
2019-10-07
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
INFORMIX SERVER
Fixed component ID
5725A3900
Applicable component levels
Document Information
Modified date:
07 October 2019