APAR status
Closed as program error.
Error description
1) At or shortly before Mar 19 04:06:31, "lsadmin limrestart" was executed on the master host houcy1-n-sp099a02.

2) Because the load on the master was high, the original lim had not yet exited and had not released its port. Meanwhile, the new lim failed to start because the port was unavailable. The lim log shows the following messages:

    Mar 19 04:06:31 2015 46069 3 1.2.7 initSock(): chanServSocketExt_(). A socket operation has failed on the configured UDP port <7869> on host <houcy1-n-sp099a02>. Reason: <Address already in use>. Fatal error. Either change the port number in lsf.conf (LSF_LIM_PORT) or terminate the other process that is bound to the port.
    Mar 19 04:06:31 2015 46069 3 1.2.7 initSock: LIM has exited due to a fatal error.

3) Because lim on the master was in an abnormal state, a failover occurred. A job submitted during the failover failed with the following messages:

    LSF is down. Please wait ...
    Connection refused by server. Job not submitted.
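The condition in step 2 can be reproduced outside LSF with two plain UDP sockets. The Python sketch below is an illustration only, not LSF code: the helper bind_udp_with_retry and its retries and interval arguments are hypothetical names chosen for this example and are not lsf.conf parameters. It shows a second bind to an occupied UDP port failing with "Address already in use", and a bounded retry succeeding once the first socket is closed.

```python
import errno
import socket
import time


def bind_udp_with_retry(port, retries=3, interval=0.1, host="127.0.0.1"):
    """Bind a UDP socket, retrying while the port is still in use.

    retries and interval are hypothetical knobs for this sketch; they are
    not LSF parameter names.
    """
    for attempt in range(retries + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.bind((host, port))
            return sock
        except OSError as exc:
            sock.close()
            # Give up on any other error, or when the attempts are used up.
            if exc.errno != errno.EADDRINUSE or attempt == retries:
                raise
            time.sleep(interval)


# The "old" process still holds the port.
old = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
old.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
port = old.getsockname()[1]

# A single bind attempt by the "new" process fails while the port is held.
try:
    bind_udp_with_retry(port, retries=0)
    first_error = None
except OSError as exc:
    first_error = exc.errno

# After the old process releases the port, a later attempt succeeds.
old.close()
new = bind_udp_with_retry(port, retries=3, interval=0.1)
bound_port = new.getsockname()[1]
new.close()
```

With retries=0 the helper behaves like a single bind attempt that fails immediately, which corresponds to the fatal error in the lim log. A nonzero retry count gives the exiting process time to release the port.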
Local fix
n/a
Problem summary
When the system is too busy for the exiting daemon to release its port promptly, a lim or res restart fails because the new daemon cannot initialize its socket. This fix introduces the following parameters in lsf.conf to control the retry behavior:
Problem conclusion
fix it
Temporary fix
Comments
APAR Information
APAR number
P101035
Reported component name
LSF STD LEGACY
Reported component ID
5725G8206
Reported release
911
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2015-04-01
Closed date
2015-06-17
Last modified date
2015-06-17
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
LSF STD LEGACY
Fixed component ID
5725G8206
Applicable component levels
R911 PSY
UP
Document Information
Modified date:
17 June 2015