Fixes are available
APAR status
Closed as program error.
Error description
Environment: All distributed Tivoli Monitoring platforms
Problem Description: A Tivoli Monitoring process can fail in BaseBind with errno EAGAIN. For bind, errno EAGAIN is documented as "Kernel resources to complete the request are temporarily unavailable". The Basic Services caller retries the bind request without bound and without any embedded wait. The result is a loop in Tivoli process initialization which consumes all available CPU cycles and renders the system unusable until the process is killed.
Detailed Recreation Procedure: This defect is rare; no recreate scenario is possible.
Related Files and Output: The following RAS1 log messages are visible (with default KBB_RAS1=ERROR):
"KDEB_BaseBind") Status 1DE00000=KDE1_STC_CANTBIND=11: NULL
"KDEB_BaseBind") <0x110C99248,0x10> bind: status 1DE00000, 00020000 00000000 00000000 00000000
"bind_ep") <0x110C99248,0x10> BSD bind details: Status 1DE00000 Errno 0 00000000 00020000 00000000 00000000 00000000
Local fix
No Known Workaround
Problem summary
A Tivoli monitoring process can fail in BaseBind with Errno EAGAIN. For Bind, Errno EAGAIN is documented as "Kernel resources to complete the request are temporarily unavailable". The caller of Basic Services retries this failure without bound and without any embedded wait. The result is a loop in Tivoli process initialization which consumes all available CPU cycles and renders the system unusable until the process is killed.
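The failure mode described above (retrying a bind that returns EAGAIN with no limit and no wait) can be contrasted with a bounded retry that pauses between attempts. The following is a minimal sketch only, not the code shipped in the fix pack; the function name bind_with_retry and the parameters max_attempts, base_delay, and sleep are hypothetical and chosen for illustration.

```python
import errno
import socket
import time


def bind_with_retry(sock, address, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Bind sock to address, retrying only on EAGAIN.

    The retry is bounded by max_attempts and waits between attempts,
    so a persistent EAGAIN surfaces as an error instead of a CPU-bound loop.
    Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            sock.bind(address)
            return attempt
        except OSError as exc:
            # Retry only the transient case; report anything else at once.
            # Once the attempt budget is spent, re-raise EAGAIN as well.
            if exc.errno != errno.EAGAIN or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    # Usage: bind a TCP socket to an ephemeral loopback port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        attempts = bind_with_retry(s, ("127.0.0.1", 0))
        print("bound after", attempts, "attempt(s) to", s.getsockname())
```

With a limit and a wait in place, a kernel resource shortage that never clears ends in a reported error after a known delay rather than in a loop that consumes all CPU cycles.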
Problem conclusion
Processing has been changed to return socket pairs to the operating system on accept processing failures. This is expected to remedy the transient kernel resource shortage. Additional diagnostics have been instrumented at all BIND calls for First Failure Data Capture in an attempt to diagnose the unbounded calls to Basic Services.
The fix for this APAR is contained in the following maintenance packages:
| fix pack | 6.2.1-TIV-ITM-FP0001
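The general idea of returning sockets to the operating system when accept processing fails can be sketched as follows. This is an illustration under assumptions, not the product's implementation: it assumes that "returning" a socket means closing its descriptor, and the names accept_one and handler are hypothetical.

```python
import socket


def accept_one(listener, handler):
    """Accept one connection and pass it to handler.

    If handler raises, the accepted socket is closed before the error
    propagates, so its descriptor goes back to the operating system
    instead of leaking.
    """
    conn, peer = listener.accept()
    try:
        handler(conn, peer)
    except Exception:
        conn.close()  # release the kernel resource on failure
        raise
    return conn
```

Without the close on the failure path, each failed accept would leave a descriptor allocated, and repeated failures could exhaust the kernel resources that later bind calls need.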
Temporary fix
Recycle (stop and restart) the looping process.
Comments
APAR Information
APAR number
IZ61158
Reported component name
TEMA
Reported component ID
5724C04TE
Reported release
621
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2009-09-21
Closed date
2009-10-15
Last modified date
2009-12-09
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
TEMA
Fixed component ID
5724C04TE
Applicable component levels
R621 PSY
UP
Document Information
Modified date:
30 December 2022