A fix is available
APAR status
Closed as program error.
Error description
Customer runs an EJB invocation stress test on a server. The server has one servant region. For some reason, the servant region experiences a timeout.

BBOO0327I IIOP REQUEST TIMEOUT: (022D):(FFFFEEEE):(0000081C): (007BD8B0):(STC01545):(2010/05/04 10:21:37.074040): (2010/05/04 10:21:37.075264):(2010/05/04 10:21:37.075315): (com.abc.def.ejb.EJSRemoteStateless_00000000): (evaluateXXX):(hostname=host.name.com port=33333):():():()

BBOO0232W A request for Class Name 'com.abc.def.ejb.EJSRemoteStateless_00000000' and Method Name 'evaluateXXX', from hostname=host.name.com port=33333, has timed out. The servant process associated with the request will be terminated. Request Id(FFFFEEEE)

Since there is only one servant region, the server issues

BBOO0299I SERVER XXXPLEX/YYYY/SERVERA HAS NO SERVANTS. WORK IS BEING REJECTED.

Two seconds later the controller region ABENDs with S0C4-4. The S0C4 PSW instruction counter points into method ORB_Request::setRTN_RSN_SevereError(ORB_Request::commRSN_Codes).

The call stack looks like this:

ORB_Request::construct_and_send_message(SessionHandle*,GIOP_ +00087E3C BBOBOA
ORB_Request::comm_outbound_response() +00001194 BBOBOA
ORB_Request_Registry::drainRegistry(unsigned int) +000009A2 BBOBOA
cleanupForLastSR() +000008DA BBOBOA
ACR_ExecutionThread::ProcessSrCleanup(acrwObj*) +00000132 BBOBOA
ACR_ExecutionThread::RemoveAndProcessWork(ThreadCleanUp*) +0000132E BBOBOA
ACR_ExecutionRoutine +0000053E BBOBOA
CEEVROND +000011F4 CEEPLPKA
CEEOPCMM +000009A2 CEEOPCMM

The S0C4 occurs because the ORBR being used by method comm_outbound_response() has been freed. DSA for comm_outbound_response + 0x840 -> ORBR control block. The ORBR eyecatcher in the first 8 bytes says RBROOOBB. It should be BBOOORBR while the ORBR is allocated. The letters are reversed just before the storage is released. The Heap Pool Prefix for this ORBR indicates the storage has been freed because the free chain pointer is non-zero.
           hp 3      hp 3      reversed
           element   free      eyecatcher
           |         chain     |
           |         ptr       |
           |         |         |
           V         V         V
5C7B7FD8.  00000003  5C7B96F8  D9C2D9D6 D6D6C2C2  | RBROOOBB|
5C7B7FE0.  00000000  00000000  00000000 00000000  |................|
5C7B7FF0.  00000000  00000000                     |........ |

The eyecatcher should be BBOOORBR.

This ORBR control block is associated with a new request coming into the controller. The controller places the ORBR into the ORBR Registry. Since there are no servant regions, the controller returns it to the requestor and removes it from the ORBR Registry.

Because the servant region has recently ended due to a timeout, another thread in the controller is running ORB_Request_Registry::drainRegistry() to remove every ORBR in the ORBR Registry. There is a small window where drainRegistry() can obtain an ORBR that is being freed concurrently on another thread. This causes the S0C4.
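The reversed-eyecatcher check described above can be reproduced by hand when reading a dump. The following is a minimal sketch, not product code: it decodes the eight EBCDIC bytes shown in the dump (D9C2D9D6 D6D6C2C2) and tests whether they spell the allocated eyecatcher backwards. The function names ebcdicUpperToAscii, decodeEyecatcher and looksFreed are hypothetical helpers introduced only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Decode one EBCDIC uppercase letter to ASCII; anything else becomes '.'.
// EBCDIC uppercase letters occupy three ranges: A-I, J-R and S-Z.
char ebcdicUpperToAscii(std::uint8_t b) {
    if (b >= 0xC1 && b <= 0xC9) return static_cast<char>('A' + (b - 0xC1));
    if (b >= 0xD1 && b <= 0xD9) return static_cast<char>('J' + (b - 0xD1));
    if (b >= 0xE2 && b <= 0xE9) return static_cast<char>('S' + (b - 0xE2));
    return '.';
}

// Decode the 8-byte eyecatcher field of a control block (hypothetical helper).
std::string decodeEyecatcher(const std::array<std::uint8_t, 8>& raw) {
    std::string out;
    for (std::uint8_t b : raw) out.push_back(ebcdicUpperToAscii(b));
    return out;
}

// Hypothetical helper: true when the eyecatcher found in storage is the
// allocated eyecatcher spelled backwards, which the APAR describes as the
// state of an ORBR just before its storage is released.
bool looksFreed(const std::string& found, const std::string& allocated) {
    const std::string reversed(allocated.rbegin(), allocated.rend());
    return found == reversed;
}
```

Feeding the bytes D9 C2 D9 D6 D6 D6 C2 C2 to decodeEyecatcher yields RBROOOBB, and looksFreed reports that this is BBOOORBR reversed.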
Local fix
Run with multiple servant regions to minimize the chance that all servants will be unavailable at the same time.
Problem summary
****************************************************************
* USERS AFFECTED: All users of IBM WebSphere Application       *
*                 Server V7.0                                  *
****************************************************************
* PROBLEM DESCRIPTION: ABEND0C4/ABENDS0C4 in WebSphere         *
*                      Application Server for z/OS             *
*                      Controller following a BBOO0299I        *
*                      message.                                *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************

Message BBOO0299I is issued when a server has lost its last servant and autopause is enabled (control_region_dreg_on_no_srs set to 1). The ORB_Request registry is drained of work at this time (ORB_Request_Registry::drainRegistry). There exists a timing window where drainRegistry can process a newly arrived request at the same time as the request is being forwarded or rejected because of the server's paused state. Both of these threads will attempt to clean up the ORB_Request.

An additional symptom is a dump issued with ERRNO=C9C212F7. This is a dump taken after detecting an incorrect ORB_Request object.
Problem conclusion
Code has been modified in the inbound request paths to not register inbound requests unless we are accepting work for servants. APAR PM13797 is currently targeted for inclusion in Service Level (Fix Pack) 7.0.0.13 of WebSphere Application Server V7.0. Please refer to URL: //www.ibm.com/support/docview.wss?rs=404&uid=swg27006970 for Fix Pack availability.
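The idea behind this change can be sketched in a few lines. This is an illustrative model only, not the WebSphere source: the names Request, RequestRegistry, registerIfAccepting and drain are hypothetical, and the use of std::mutex and std::shared_ptr is an assumption made for the sketch. The point it shows is that the "are we accepting work" test and the registration happen under the same lock as the drain, so a request that arrives while the server is paused never enters the registry and is cleaned up by exactly one thread.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an inbound request control block.
struct Request {
    unsigned id;
};

// Hypothetical registry; not the real ORB_Request_Registry.
class RequestRegistry {
public:
    // Register the request only while work is being accepted.
    // On false, the request was never visible to drain(), so the
    // caller is the only thread that will reject and release it.
    bool registerIfAccepting(std::shared_ptr<Request> req) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!accepting_) return false;
        const unsigned id = req->id;
        entries_[id] = std::move(req);
        return true;
    }

    // Called when the last servant is lost: stop accepting work and
    // hand every registered request to the caller for cleanup.
    std::vector<std::shared_ptr<Request>> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        accepting_ = false;
        std::vector<std::shared_ptr<Request>> drained;
        drained.reserve(entries_.size());
        for (auto& entry : entries_) drained.push_back(std::move(entry.second));
        entries_.clear();
        return drained;
    }

    // Called when a servant becomes available again.
    void resume() {
        std::lock_guard<std::mutex> lock(mutex_);
        accepting_ = true;
    }

private:
    std::mutex mutex_;
    bool accepting_ = true;
    std::unordered_map<unsigned, std::shared_ptr<Request>> entries_;
};
```

In this model, a request thread that gets false back from registerIfAccepting rejects the request itself, while anything returned by drain() is owned by the draining thread, so the two paths never clean up the same object.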
Temporary fix
Comments
APAR Information
APAR number
PM13871
Reported component name
WEBSPHERE FOR Z
Reported component ID
5655I3500
Reported release
700
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2010-05-06
Closed date
2010-07-26
Last modified date
2010-11-03
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WEBSPHERE FOR Z
Fixed component ID
5655I3500
Applicable component levels
R700 PSY UK61114
UP10/10/21 P F010
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
10 February 2022