A fix is available
APAR status
Closed as program error.
Error description
Problem Description: TEMS crashes if FTO is not configured correctly.
If the HUB TEMS has a mirror (that is, it is configured with FTO), all agents connected directly to the HUB TEMS or to the mirror TEMS must be configured with the primary and secondary TEMS option, so that they have two different entries in CT_CMSLIST and can switch from one TEMS to the other. If there is no secondary TEMS to switch to, the agent repeatedly reconnects to its primary TEMS, which crashes that TEMS if it is currently running as a mirror.
This APAR covers a TEMA framework code change: the agent should ignore the SWITCH command if there is only one entry in CT_CMSLIST. The agent should write a Universal message and an error message to the agent operations log indicating that the SWITCH command was received, but that the agent is not properly configured for FTO and cannot handle the SWITCH command. The main intention of this fix is to help users detect improper FTO configuration, not to prevent the TEMS crash.
Errors in the TEMS logs:
1. Multiple ON-LINE messages for the same node, issued within the same second or minute, without corresponding OFF-LINE messages. For example:
   (48FCB1EE...) Remote node <LTM1:tb202A:ORA> is ON-LINE.
   (48FCB1EE...) Remote node <LTM1:tb202A:ORA> is ON-LINE.
   (48FCB1F0...) Remote node <LTM1:tb202A:ORA> is ON-LINE.
   (48FCB1F3...) Remote node <LTM1:tb202A:ORA> is ON-LINE.
   (48FCB1F3...) Remote node <LTM1:tb202A:ORA> is ON-LINE.
2. Out of memory errors:
   1DE00001=KDE1_STC_NOMEMORY
3. KPX errors:
   kpxreqic.cpp,226,"timeoutHandler") UseHandle is NULL for IRACommandRequest object handle <handle>
Recreation steps:
The "original" primary TEMS is restarted. It starts up as a mirror. Agents reconnect to it. Agents that connected to the secondary TEMS while the original primary was down switch to it.
KPX then calls KFA_InsertNodestatus() several times per second for every agent node, which generates multiple trace messages:
   kpxreqhb.cpp,773,"HeartbeatInserter") Remote node <node_name> is ON-LINE
   kpxreqic.cpp,226,"timeoutHandler") UseHandle is NULL for IRACommandRequest object handle <handle>.
The TEMS process kdsmain fails within a few minutes of startup.
L3 approver : LK
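The first log symptom in the error description (repeated ON-LINE messages for one node with no OFF-LINE in between) can be looked for mechanically. The following is a minimal sketch, not an IBM tool. It assumes each relevant log line has the shape shown in the excerpts, that is, a parenthesized prefix beginning with an 8-digit hexadecimal timestamp followed by "Remote node <name> is ON-LINE" or "OFF-LINE". The function name, the sample data, and the threshold of 3 are illustrative choices.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the log excerpts in this APAR:
#   (<8 hex digits>...) Remote node <NAME> is ON-LINE.
STATUS_RE = re.compile(
    r"\([0-9A-Fa-f]{8}[^)]*\)\s*Remote node <([^>]+)> is (ON-LINE|OFF-LINE)"
)

def repeated_online_nodes(lines, threshold=3):
    """Return {node: longest run of ON-LINE messages with no OFF-LINE between}
    for every node whose longest run reaches `threshold` (an arbitrary cutoff)."""
    current = defaultdict(int)   # length of the run in progress, per node
    longest = defaultdict(int)   # longest run seen so far, per node
    for line in lines:
        match = STATUS_RE.search(line)
        if not match:
            continue
        node, status = match.groups()
        if status == "ON-LINE":
            current[node] += 1
            longest[node] = max(longest[node], current[node])
        else:
            current[node] = 0    # an OFF-LINE message ends the run
    return {node: run for node, run in longest.items() if run >= threshold}

# Sample lines copied from the error description above.
SAMPLE_LINES = [
    "(48FCB1EE...) Remote node <LTM1:tb202A:ORA> is ON-LINE.",
    "(48FCB1EE...) Remote node <LTM1:tb202A:ORA> is ON-LINE.",
    "(48FCB1F0...) Remote node <LTM1:tb202A:ORA> is ON-LINE.",
    "(48FCB1F3...) Remote node <LTM1:tb202A:ORA> is ON-LINE.",
    "(48FCB1F3...) Remote node <LTM1:tb202A:ORA> is ON-LINE.",
]

if __name__ == "__main__":
    print(repeated_online_nodes(SAMPLE_LINES))
```

A node reported by this check is a candidate for the configuration review described under Local fix. The check alone does not prove that FTO is misconfigured.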
Local fix
To avoid TEMS crashes, use this work-around until the fix is provided:
1) Identify all agents that are configured to connect to the primary HUB or to the FTO TEMS.
2) Check the ini/config files for all of these agents and make sure that CT_CMSLIST has 2 different entries. If not, reconfigure the agent and add a secondary TEMS.
This APAR was inadvertently closed as a duplicate of IZ44590; hence it is being re-opened.
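Step 2 of the work-around can be scripted. The sketch below is an illustration under stated assumptions rather than a supported utility. It assumes that CT_CMSLIST is set on a line of the form CT_CMSLIST=<value> (optionally preceded by "export" and optionally quoted), that entries are separated by semicolons, and that each entry is a protocol prefix followed by a colon and a host name that may start with "#". It also assumes that "2 different entries" means two distinct host names. Verify these assumptions against the actual agent configuration files. The function names and the host names hub1 and hub2 are hypothetical.

```python
import re

# Assumed assignment syntax: [export] CT_CMSLIST=<value>, value optionally quoted.
CMSLIST_RE = re.compile(r"^\s*(?:export\s+)?CT_CMSLIST\s*=\s*(.*)$")

def cmslist_hosts(config_text):
    """Return the distinct TEMS host names found in CT_CMSLIST, in order.

    Assumes semicolon-separated entries shaped like 'PROTOCOL:#host'.
    """
    hosts = []
    for line in config_text.splitlines():
        match = CMSLIST_RE.match(line)
        if not match:
            continue
        value = match.group(1).strip().strip("'\"")
        for entry in value.split(";"):
            entry = entry.strip()
            if not entry:
                continue
            # Drop the protocol prefix and the optional leading '#'.
            host = entry.split(":", 1)[-1].lstrip("#").lower()
            if host and host not in hosts:
                hosts.append(host)
    return hosts

def has_secondary_tems(config_text):
    """True if CT_CMSLIST names at least two different hosts."""
    return len(cmslist_hosts(config_text)) >= 2

if __name__ == "__main__":
    # Hypothetical configuration fragments.
    good = "CT_CMSLIST=IP.PIPE:#hub1;IP.PIPE:#hub2"
    bad = "CT_CMSLIST='ip.pipe:#hub1'"
    print(has_secondary_tems(good), has_secondary_tems(bad))
```

An agent for which has_secondary_tems returns False is one that should be reconfigured with a secondary TEMS as described in step 2.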
Problem summary
****************************************************************
* USERS AFFECTED: All TEMS users.                              *
****************************************************************
* PROBLEM DESCRIPTION: Mirror TEMS crashes if FTO is           *
*                      incorrectly configured.                 *
****************************************************************
* RECOMMENDATION: Apply this PTF.                              *
****************************************************************
Mirror TEMS crashes if remote agents reporting directly to the primary or to the mirror TEMS are not configured with the secondary TEMS option.
Problem conclusion
Code changes were made in the TEMA agent framework to ignore the "SWITCH" to secondary TEMS command if the agent is not configured with a secondary TEMS.
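The behavior described in this conclusion can be illustrated with a short sketch. This is not the TEMA framework code, which is not shown in this APAR. It is a hypothetical model in which the agent holds a list of TEMS entries, receives a SWITCH command, and either moves to the next distinct entry or ignores the command and records an error. The function name, the list-based log, and the message text are all assumptions made for illustration.

```python
def handle_switch(cms_entries, current, operations_log):
    """Model of the fixed behavior: return the TEMS entry the agent should use
    after a SWITCH command.

    cms_entries    -- TEMS entries from CT_CMSLIST (hypothetical representation)
    current        -- entry the agent is connected to now
    operations_log -- list standing in for the agent operations log
    """
    # Collapse duplicates while keeping order, so a repeated entry does not
    # count as a secondary TEMS.
    distinct = list(dict.fromkeys(cms_entries))
    if len(distinct) < 2:
        # No secondary TEMS: ignore the command and report the misconfiguration.
        operations_log.append(
            "SWITCH command received, but the agent is not configured "
            "for FTO; command ignored"
        )
        return current
    # Otherwise move to the next entry, wrapping around at the end of the list.
    index = distinct.index(current) if current in distinct else -1
    return distinct[(index + 1) % len(distinct)]

if __name__ == "__main__":
    log = []
    print(handle_switch(["hub1"], "hub1", log), log)
    print(handle_switch(["hub1", "hub2"], "hub1", []))
```

In this model, a single-entry agent stays on its current TEMS instead of reconnecting, and the logged message gives the administrator a visible sign that FTO is not configured on that agent.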
Temporary fix
Comments
APAR Information
APAR number
OA28771
Reported component name
MGMT SERVER DS
Reported component ID
5608A2800
Reported release
620
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2009-04-25
Closed date
2009-05-05
Last modified date
2009-06-01
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
KRANDREG
Fix information
Fixed component name
MGMT SERVER DS
Fixed component ID
5608A2800
Applicable component levels
R620 PSY UA47228
UP09/05/27 P F905
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
01 June 2009