Fixes are available
9.0.0.5: WebSphere Application Server traditional V9.0 Fix Pack 5
9.0.0.6: WebSphere Application Server traditional V9.0 Fix Pack 6
9.0.0.7: WebSphere Application Server traditional V9.0 Fix Pack 7
9.0.0.8: WebSphere Application Server traditional V9.0 Fix Pack 8
9.0.0.9: WebSphere Application Server traditional V9.0 Fix Pack 9
9.0.0.10: WebSphere Application Server traditional V9.0 Fix Pack 10
9.0.0.11: WebSphere Application Server traditional V9.0 Fix Pack 11
9.0.5.0: WebSphere Application Server traditional Version 9.0.5 Refresh Pack
9.0.5.1: WebSphere Application Server traditional Version 9.0.5 Fix Pack 1
9.0.5.2: WebSphere Application Server traditional Version 9.0.5 Fix Pack 2
9.0.5.3: WebSphere Application Server traditional Version 9.0.5 Fix Pack 3
9.0.5.4: WebSphere Application Server traditional Version 9.0.5 Fix Pack 4
9.0.5.5: WebSphere Application Server traditional Version 9.0.5 Fix Pack 5
WebSphere Application Server traditional 9.0.5.6
9.0.5.7: WebSphere Application Server traditional Version 9.0.5 Fix Pack 7
9.0.5.8: WebSphere Application Server traditional Version 9.0.5.8
9.0.5.9: WebSphere Application Server traditional Version 9.0.5.9
9.0.5.10: WebSphere Application Server traditional Version 9.0.5.10
9.0.5.11: WebSphere Application Server traditional Version 9.0.5.11
APAR status
Closed as program error.
Error description
Segfault when high traffic is coming to the Intelligent Management Enabled plugin and a Liberty member is stopped
Local fix
Problem summary
****************************************************************
* USERS AFFECTED:  All users of IBM WebSphere Application      *
*                  Server Intelligent Managed Plugin           *
****************************************************************
* PROBLEM DESCRIPTION: Segfault when high traffic coming to    *
*                      the Intelligent Management Enabled      *
*                      plugin and a Liberty member is stopped  *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
A segmentation fault (signal 11) occurs when high traffic is arriving at the Intelligent Management Enabled plugin at the same time that a Liberty collective member is stopped. The problem occurs with clusters that have only one member.
When a request arrives, the plugin associates the request with a cluster and saves that cluster in the request context. If the Liberty member is then stopped, the active request cannot be processed because the member is no longer available. The plugin retries the request and, in doing so, must find another cluster that includes the application to receive the request, since the previous member is no longer available. However, the context is not updated with the new cluster on the retry. This results in an attempt to free an already freed request when the context is later freed.
Here is the call stack from the seg fault to assist in diagnosis of the problem:
#0  0x00007fbcaf0591d7 in raise () from /lib64/libc.so.6
#1  0x00007fbcaf05a8c8 in abort () from /lib64/libc.so.6
#2  0x00007fbcaf098f07 in __libc_message () from /lib64/libc.so.6
#3  0x00007fbcaf0a0503 in _int_free () from /lib64/libc.so.6
#4  0x00007fbcad3d34cd in odrFree (p=0x7fbc9c0f7ac0, file=0x7fbcad3fe360 "/home/ibmadmin/odrbuildx/NATV/ws/code/plugins.http/odrlib/src/odrTargetSelector.c", line=999)
    at /home/ibmadmin/odrbuildx/NATV/ws/code/plugins.http/odrlib/src/odrLibUtil.c:105
#5  0x00007fbcad3dcbda in clusterDelete (cluster=0x7fbc9c10fb80)
    at /home/ibmadmin/odrbuildx/NATV/ws/code/plugins.http/odrlib/src/odrTargetSelector.c:999
#6  0x00007fbcad3dec5a in tsTargetInfoDecrementRefCnt (ctx=0x70fc20)
    at /home/ibmadmin/odrbuildx/NATV/ws/code/plugins.http/odrlib/src/odrTargetSelector.c:1847
#7  0x00007fbcad3c927f in clean (ctx=0x70fc20)
    at /home/ibmadmin/odrbuildx/NATV/ws/code/plugins.http/odrlib/src/odrHttpContext.c:238
#8  0x00007fbcad3ccbbf in odrHttpContextRelease (ctx=0x70fc20)
    at /home/ibmadmin/odrbuildx/NATV/ws/code/plugins.http/odrlib/src/odrHttpContext.c:1449
#9  0x00007fbcad96af39 in odrHandleRequest (request=0x7fbc94ff0ba0)
    at /home/ibmadmin/odrbuildx/NATV/ws/code/plugins.http/src/common/odr/lib_odr.c:279
#10 0x00007fbcad97b8f2 in websphereHandleRequest (reqInfo=0x7fbc68006628)
    at /home/ibmadmin/odrbuildx/NATV/ws/code/plugins.http/src/common/ws_common.c:4753
#11 0x00007fbcad960c7f in as_handler (req=0x7fbc68004978)
    at /home/ibmadmin/odrbuildx/NATV/ws/code/plugins.http/src/apache_22/mod_was_ap22_http.c:1625
#12 0x000000000042844e in ihs_run_handler ()
#13 0x000000000043a47c in ap_invoke_handler ()
#14 0x0000000000444ac8 in ap_process_request ()
#15 0x0000000000441dfc in ap_process_http_connection ()
#16 0x000000000043e152 in ap_run_process_connection ()
#17 0x000000000044993f in worker_thread ()
#18 0x00007fbcaf83cdc5 in start_thread () from /lib64/libpthread.so.0
#19 0x00007fbcaf11b73d in clone () from /lib64/libc.so.6
Problem conclusion
Code was corrected to properly update the cluster associated with the request when a retry is necessary because a member became unavailable and changed the target cluster. The fix for this APAR is currently targeted for inclusion in fix pack 9.0.0.5. Please refer to the Recommended Updates page for delivery information: http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
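The corrected behavior can be illustrated with a minimal C sketch. It is not the plugin's source code: the types Cluster and RequestCtx and the functions cluster_new, cluster_release, ctx_bind, ctx_retry and ctx_release are hypothetical names invented for this example, and the sketch assumes that the request context holds the only counted reference to its cluster. It shows one way a retry could rebind the context so that the later context release frees each cluster exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; NOT the plugin's real types. */
typedef struct Cluster {
    int ref_count;     /* number of request contexts holding this cluster */
    int live_members;  /* members currently able to serve requests */
} Cluster;

typedef struct RequestCtx {
    Cluster *cluster;  /* cluster the in-flight request is bound to */
} RequestCtx;

int clusters_freed = 0;  /* instrumentation for the example only */

Cluster *cluster_new(int live_members) {
    Cluster *c = malloc(sizeof *c);
    if (c == NULL) abort();
    c->ref_count = 0;
    c->live_members = live_members;
    return c;
}

/* Drop one reference; free the cluster when the last one goes away. */
void cluster_release(Cluster *c) {
    assert(c->ref_count > 0);
    if (--c->ref_count == 0) {
        free(c);
        clusters_freed++;
    }
}

/* Bind the context to a cluster, releasing any previously bound cluster.
 * The new reference is taken first so rebinding to the same cluster is safe. */
void ctx_bind(RequestCtx *ctx, Cluster *c) {
    c->ref_count++;
    if (ctx->cluster != NULL) cluster_release(ctx->cluster);
    ctx->cluster = c;
}

/* Retry path: the step the APAR says was missing is updating the context
 * with the replacement cluster. Here it is done through ctx_bind. */
void ctx_retry(RequestCtx *ctx, Cluster *replacement) {
    ctx_bind(ctx, replacement);
}

/* Release the context's reference and clear the pointer, so a second
 * release cannot touch the same cluster again. */
void ctx_release(RequestCtx *ctx) {
    if (ctx->cluster != NULL) {
        cluster_release(ctx->cluster);
        ctx->cluster = NULL;
    }
}
```

In this sketch, a request bound to a single-member cluster whose member stops is moved with ctx_retry to a replacement cluster; the old cluster is released at that point, and ctx_release later releases only the replacement. If the retry left ctx->cluster pointing at the old cluster instead, the release would operate on memory that had already been freed, which is the kind of failure the call stack above shows inside _int_free.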
Temporary fix
No workaround is available other than limiting traffic while stopping Liberty members in the collective or, alternatively, ensuring that only some (and not all) members of a cluster are stopped concurrently while requests may be in flight, so that other members of the same cluster can service the request and the issue is avoided.
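The second option can be expressed as a simple pre-stop check. The following C sketch is only an illustration for an administrator's own tooling: the function safe_to_stop and the array-of-flags representation of a cluster are assumptions made for this example and are not part of any WebSphere or Liberty API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: running[i] is true when member i of the cluster is
 * started. Returns true only if some member OTHER than idx is running, so
 * that in-flight requests still have a target in the same cluster. */
bool safe_to_stop(const bool *running, size_t count, size_t idx) {
    assert(running != NULL);
    if (idx >= count) return false;  /* unknown member: refuse */
    for (size_t i = 0; i < count; i++) {
        if (i != idx && running[i]) return true;
    }
    return false;
}
```

A cluster with a single member can never pass this check, which is consistent with the problem summary attributing the failure to clusters that have only one member. For such clusters, limiting traffic before the stop is the remaining option.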
Comments
APAR Information
APAR number
PI85618
Reported component name
WEBS APP SERV N
Reported component ID
5724H8800
Reported release
900
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2017-08-08
Closed date
2017-08-16
Last modified date
2018-08-20
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WEBS APP SERV N
Fixed component ID
5724H8800
Applicable component levels
R900 PSY
UP
Document Information
Modified date:
04 May 2022