APAR status
Closed as program error.
Error description
Some users have experienced a corruption of the TRACKER SUBSYSTEM WTRQ in ECSA in which the chain pointer of the first queue element points to itself. This causes a loop in the EVENT WRITER task as it attempts to process the broken queue.

As the EVENT WRITER passes hundreds of copies of the same event to the CONTROLLER, the CONTROLLER may be unable to handle the volume of invalid data. This can lead to many different problems and may force the NMM to shut down.

The cause of this problem has not yet been determined. In the meantime, code has been developed to detect the corrupted WTRQ and capture diagnostic information. The failing TRACKER started task will also be terminated and the WTRQ cleared, so that a restart of the TRACKER should correct the problem. This code is currently in SMP/E UMOD format and will be repackaged as a formal PTF.

CIRCUMVENTION: If this problem occurs before the PTFs for this APAR are available, the user should STOP the looping TRACKER STARTED TASK, then restart it using the BUILDSSX(REBUILD) keyword in the TRACKER OPCOPTS. If the TRACKER does not respond to a P OPCx command, it may be necessary to CANCEL the address space. After the TRACKER has been restarted, the BUILDSSX() keyword should be removed from the init parms immediately to prevent loss of jobtracking information on future restarts.
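To illustrate the failure mode described at the start of this section, the following is a minimal Python sketch, not OPC code. The names QueueElement and drain, the event identifiers, and the cap of 300 are all hypothetical; the real WTRQ is a chain of control blocks in ECSA whose layout is not given in this APAR. The sketch only shows why a first element whose chain pointer points to itself makes the consumer see the same event repeatedly.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical stand-in for a WTRQ queue element; eq/repr are disabled
# because a self-referencing element would make them recurse forever.
@dataclass(eq=False, repr=False)
class QueueElement:
    event_id: str
    next: Optional["QueueElement"] = None


def drain(head: Optional[QueueElement], limit: int) -> List[str]:
    """Follow chain pointers from head and collect event ids.

    The limit exists only so this demonstration terminates; a consumer
    without such a cap would loop forever on a self-pointing element.
    """
    delivered: List[str] = []
    current = head
    while current is not None and len(delivered) < limit:
        delivered.append(current.event_id)
        current = current.next
    return delivered


# Healthy queue: EV1 -> EV2 -> end of chain.
healthy = QueueElement("EV1", QueueElement("EV2"))

# Corrupted queue: the first element's chain pointer points to itself.
corrupted = QueueElement("EV1")
corrupted.next = corrupted

healthy_events = drain(healthy, 300)
corrupted_events = drain(corrupted, 300)
```

In this sketch the healthy chain yields each event once, while the corrupted chain yields the first event until the cap is reached, which corresponds to the flood of duplicate events sent to the CONTROLLER.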
Local fix
See CIRCUMVENTION, above.
Problem summary
****************************************************************
* USERS AFFECTED: All users of OPC systems.                    *
*                                                              *
*                 FUNCTION=EW Event Writer                     *
****************************************************************
* PROBLEM DESCRIPTION: Some OPC customers have experienced     *
*                      a loop in the Tracker's subsystem       *
*                      WTRQ, the Event Writer queue.           *
*                      This APAR introduces a trap that        *
*                      is supposed to detect the problem       *
*                      and force an abend for diagnostic       *
*                      purposes.                               *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
A few OPC customers have experienced a problem in the Event Writer queue in which the chain pointer of the first queue element points to itself, creating a loop. The cause of this problem is still unknown. The trap introduced by this APAR has two main purposes: to limit the damage for customers by intercepting the loop as soon as it starts, and to help collect useful documentation at the moment the failure occurs.
Problem conclusion
The Event Writer task has been updated to force an abend and a dump if an event returned by the OPC queue routine is found to point to itself as the next event. The abend is an S0C1 abend in the EQQEWEVH module at offset +318. Please save the dump created by this abend. The Event Writer will terminate when this error occurs. It is expected that a restart of the Event Writer will be successful.
100Y 200Y EQQEWEVH
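The check described in this section can be sketched as follows. This is an illustrative Python model under stated assumptions, not the EQQEWEVH code: the names Element, EventQueue, take_next_event, and SelfPointingElementError are hypothetical, and the raised exception merely stands in for the forced S0C1 abend and dump.

```python
from typing import Optional


class SelfPointingElementError(Exception):
    """Hypothetical stand-in for the forced abend taken by the trap."""


class Element:
    """Hypothetical queue element with a chain pointer to the next one."""

    def __init__(self, event_id: str) -> None:
        self.event_id = event_id
        self.next: Optional["Element"] = None


class EventQueue:
    """Hypothetical queue anchored by a pointer to its first element."""

    def __init__(self) -> None:
        self.head: Optional[Element] = None


def take_next_event(queue: EventQueue) -> Optional[str]:
    """Remove and return the first event id, or None if the queue is empty.

    Before unchaining the element, check whether it names itself as the
    next event. If so, stop processing instead of looping on it.
    """
    head = queue.head
    if head is None:
        return None
    if head.next is head:
        raise SelfPointingElementError(
            f"event {head.event_id} points to itself as the next event"
        )
    queue.head = head.next
    return head.event_id
```

A caller would treat the exception as fatal for the event-writing task, keep whatever diagnostics were captured, and rely on a restart, mirroring the behavior described above.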
Temporary fix
Comments
APAR Information
APAR number
PQ21659
Reported component name
TME 10 OPC V2 R
Reported component ID
5697OPC01
Reported release
200
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
1998-11-24
Closed date
1998-12-16
Last modified date
1999-01-03
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UQ24459 UQ24460 PQ22427
Modules/Macros
EQQEWEVH
Fix information
Fixed component name
TME 10 OPC V2 R
Fixed component ID
5697OPC01
Applicable component levels
R100 PSY UQ24459
UP98/12/28 P F812
R200 PSY UQ24460
UP98/12/28 P F812
Document Information
Modified date:
03 January 1999