PM40650: OUT OF MEMORY WHEN SIP SERVLET CONTAINER STUCK ON MULTIPLE TCP CONNECTIONS DURING SIP LOAD

APAR status

Closed as program error.

Error description

Out of Memory when Session Initiation Protocol (SIP) Servlet
Container is stuck on multiple TCP connections during SIP load

With 100 TCP connections in use, WebSphere Application Server
hung with an OutOfMemory (OOM) condition after a sustained load
run of about 30 hours.

Local fix

```
n/a
```

Problem summary

****************************************************************
* USERS AFFECTED:  Session Initiation Protocol (SIP) users     *
*                  of IBM WebSphere Application Server Feature *
*                  Pack for Communications Enabled             *
*                  Applications (CEA)                          *
****************************************************************
* PROBLEM DESCRIPTION: There is a deadlock in the SIP          *
*                      container under heavy TCP load.         *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
The problem occurs when the server attempts to send a large
number of messages over TCP (or TLS) concurrently.

When the container needs to send a SIP message, it places the
outbound message in a queue and calls on one of the worker
threads to transmit the packet to the network. If no thread is
available in the pool, the container thread blocks until a
worker thread becomes available.

In some cases, the thread that initiates the transaction is
itself allocated from the same pool as the worker threads.
Under extremely high load, it is possible to reach a point
where all worker threads are busy and all of them request to
send a message at the same time.

In this situation, each thread in the pool is waiting for one
of the others to become available, which results in a deadlock.
The server remains unresponsive even after traffic slows down.
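
The starvation pattern described above can be reproduced in a
few lines. The following is a minimal sketch, not the SIP
container's code: it assumes a plain java.util.concurrent fixed
thread pool, and the class and method names
(PoolStarvationDemo, stallsWhenPoolIsSaturated) are
hypothetical. Every pool thread hands a "send" task to the same
pool and then waits for it, so no thread is ever free to run
the queued sends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolStarvationDemo {

    /**
     * Saturates a fixed pool with tasks that each wait on a subtask
     * submitted to the same pool. Returns true if the first task is
     * still blocked after timeoutMillis.
     */
    static boolean stallsWhenPoolIsSaturated(int poolSize, long timeoutMillis)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allBusy = new CountDownLatch(poolSize);
        List<Future<Void>> initiators = new ArrayList<>();
        try {
            for (int i = 0; i < poolSize; i++) {
                initiators.add(pool.submit(() -> {
                    // Wait until every thread in the pool is occupied.
                    allBusy.countDown();
                    allBusy.await();
                    // Queue the "send" on the same pool, then block on it.
                    Future<?> send = pool.submit(() -> { /* transmit packet */ });
                    send.get();
                    return null;
                }));
            }
            try {
                initiators.get(0).get(timeoutMillis, TimeUnit.MILLISECONDS);
                return false; // the task finished, so there was no stall
            } catch (TimeoutException stuck) {
                return true;  // every thread is waiting on a queued send
            }
        } finally {
            // Interrupt the blocked threads so the JVM can exit.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stalled: " + stallsWhenPoolIsSaturated(4, 500));
    }
}
```

Reducing the incoming load does not help in this state, because
the queued sends can only run on threads that are themselves
waiting for those sends.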

Problem conclusion

The problem is fixed in the SIP container by changing the
code that initiates message sending. With this fix, the
container first attempts to send the message from the
initiating thread, instead of forcing the allocation of a
worker thread. A worker thread is allocated to complete the
work later only if the message cannot be delivered
immediately. This reduces the chance of draining the thread
pool. It also eliminates the deadlock and allows the container
to recover as soon as traffic slows back down to normal.
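
The approach described above can be sketched as follows. This is
an illustration under stated assumptions, not the shipped fix:
the Transport interface, its trySend and sendBlocking methods,
and the Sender class are hypothetical names introduced here. The
initiating thread tries the send itself and hands off to a
worker only when the message cannot go out immediately, and it
never waits for that worker.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SendFromCallerSketch {

    /** Hypothetical transport; trySend returns false if the write would block. */
    interface Transport {
        boolean trySend(String message);
        void sendBlocking(String message);
    }

    static final class Sender {
        private final Transport transport;
        private final ExecutorService workers;

        Sender(Transport transport, ExecutorService workers) {
            this.transport = transport;
            this.workers = workers;
        }

        /** Returns true if sent on the calling thread, false if deferred. */
        boolean send(String message) {
            if (transport.trySend(message)) {
                return true; // delivered inline, no worker thread needed
            }
            // Defer to a worker without waiting for it to finish.
            workers.execute(() -> transport.sendBlocking(message));
            return false;
        }
    }

    /** Sends three messages; returns {inlineCount, deferredCount}. */
    static int[] runScenario() throws InterruptedException {
        AtomicInteger inline = new AtomicInteger();
        AtomicInteger deferred = new AtomicInteger();
        Transport fake = new Transport() {
            @Override
            public boolean trySend(String message) {
                // Assumption for the demo: "big" messages cannot go out at once.
                if (message.startsWith("big")) {
                    return false;
                }
                inline.incrementAndGet();
                return true;
            }

            @Override
            public void sendBlocking(String message) {
                deferred.incrementAndGet();
            }
        };
        ExecutorService workers = Executors.newFixedThreadPool(1);
        Sender sender = new Sender(fake, workers);
        sender.send("INVITE");
        sender.send("big-NOTIFY");
        sender.send("BYE");
        workers.shutdown();
        workers.awaitTermination(5, TimeUnit.SECONDS);
        return new int[] { inline.get(), deferred.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] counts = runScenario();
        System.out.println("inline=" + counts[0] + " deferred=" + counts[1]);
    }
}
```

Because send never blocks on the pool, a caller that is itself
a pool thread returns promptly and becomes available to run
deferred sends.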

The fix for this APAR is currently targeted for inclusion in
fix pack 1.0.1.11 for the Feature Pack for Communications
Enabled Applications. Please refer to the Recommended Updates
page for delivery information:
http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980

Temporary fix

Comments

APAR Information

APAR number
PM40650
Reported component name
CEA FEATUREPACK
Reported component ID
5724J0855
Reported release
700
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2011-06-02
Closed date
2011-06-09
Last modified date
2011-06-09

APAR is sysrouted FROM one or more of the following:

PM39613

APAR is sysrouted TO one or more of the following:

Fix information

Fixed component name
CEA FEATUREPACK
Fixed component ID
5724J0855

Applicable component levels

R700 PSY
UP


Document Information

Modified date:
09 February 2022
