APAR status
Closed as program error.
Error description
With the JR42993 patch installed and the environment variable DS_GENERATE_UNIQUE_JOBID=1 set, a parallel job that fails at an early stage may hang. The job log includes a FATAL log event, but the job's DSD.RUN and DSD.OshMonitor processes continue running and the job remains in the Running state.
Local fix
Problem summary
With the JR42993 patch installed and the environment variable DS_GENERATE_UNIQUE_JOBID=1 set, a parallel job that fails at an early stage may hang. The job log includes a FATAL log event, but the job's DSD.RUN and DSD.OshMonitor processes continue running and the job remains in the Running state. The underlying bug is in DSD.OshMonitor: it can receive a component error before the component has started. The num_running count is then decremented below 0, but the exit test checks for num_running = 0, so the test is not satisfied and the monitor keeps running. The issue is related to JR42993 only because, when the pid is used as the job id, it is possible to test whether the process is still alive.
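The counter behaviour described above can be illustrated with a small Python model. This is only a sketch: it is not the DSD.OshMonitor source, the function name run_monitor and the event names "started", "finished" and "error" are hypothetical, and the tolerant exit test (num_running <= 0) is just one possible guard. The APAR does not state how the actual fix was implemented.

```python
def run_monitor(events, exit_test):
    """Replay component events against a running-component counter.

    Returns (index, num_running), where index is the position of the event
    after which the exit test was satisfied, or None if it never was
    (the modelled monitor would keep running).
    All names here are hypothetical; this is not the product code.
    """
    num_running = 0
    for i, event in enumerate(events):
        if event == "started":
            num_running += 1
        elif event in ("finished", "error"):
            # An error that arrives before "started" drives the count below 0.
            num_running -= 1
        if exit_test(num_running):
            return i, num_running
    return None, num_running


def strict_test(n):
    # Exit test as described in the Problem summary: exactly zero.
    return n == 0


def tolerant_test(n):
    # Assumed alternative guard: treat zero or below as "nothing running".
    return n <= 0


normal_job = ["started", "finished"]
early_failure = ["error"]  # component error before the component started

normal_result = run_monitor(normal_job, strict_test)
hang_result = run_monitor(early_failure, strict_test)
guarded_result = run_monitor(early_failure, tolerant_test)
```

In this model the normal job satisfies the strict test once its component finishes, while the early failure leaves the counter at -1, so the strict test is never met. With the tolerant test the same early failure lets the monitor exit.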
Problem conclusion
Fixed DSD.OshMonitor to handle a component error and terminate in a timely fashion.
Temporary fix
Comments
APAR Information
APAR number
JR44159
Reported component name
INFO SRVR PLATF
Reported component ID
5724Q3612
Reported release
850
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2012-09-21
Closed date
2012-10-05
Last modified date
2012-10-05
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
INFO SRVR PLATF
Fixed component ID
5724Q3612
Applicable component levels
R850 PSY
UP
Document Information
Modified date:
05 October 2012