APAR status
Closed as program error.
Error description
An IBM MQ 9.2.0.6 or 9.3.0.0 Managed File Transfer agent has been configured with one or more resource monitors defined. In order to investigate an issue, the agent is configured to generate a trace from startup using the steps shown in the: Collecting a Managed File Transfer agent trace from startup topic in the MQ 9.2 and 9.3 sections of IBM Documentation (the URI of this topic in the MQ 9.2 section of IBM Documentation is https://www.ibm.com/docs/en/ibm-mq/9.2?topic=tmftam-collecting-m anaged-file-transfer-agent-trace-from-startup). When the fteStartAgent command is run, the agent process appears to start successfully. However, the agent is unresponsive and is unable to perform any managed transfer requests, and the resource monitors that have been defined on it do not run. A Javacore taken when the issue occurs shows that there is a deadlock between two threads, as shown in the example below: Thread 1: ------------- "RecoveryThread" J9VMThread:0x00000000004F5000, ... ... Blocked on: java/util/logging/FileHandler@0x00000000E01E4300 Owned by: "pool-1-thread-1" (J9VMThread:0x000000000056D000, java/lang/Thread:0x00000000FFBEA770) .... Java callstack: at java/util/logging/FileHandler.publish(FileHandler.java:726(Compi led Code)) (entered lock: java/util/logging/FileHandler@0x00000000E01E4300, entry count: 1) at java/util/logging/Logger.log(Logger.java:749(Compiled Code)) at java/util/logging/Logger.doLog(Logger.java:776(Compiled Code)) at java/util/logging/Logger.logp(Logger.java:996) at com/ibm/msg/client/commonservices/j2se/trace/DefaultTracer.trace Data(DefaultTracer.java:414) at com/ibm/msg/client/commonservices/trace/Trace.traceDataInternal( Trace.java:1277(Compiled Code)) at com/ibm/msg/client/commonservices/trace/Trace.data(Trace.java:11 14) at com/ibm/wmqfte/ras/impl/WMQTraceHandler.publish(WMQTraceHandler. java:289(Compiled Code)) at java/util/logging/Logger.log(Logger.java:749(Compiled Code)) at com/ibm/wmqfte/ras/impl/TraceImpl.dataImplementation(TraceImpl.j ava:176) at com/ibm/wmqfte/ras/Trace.data(Trace.java:321) at com/ibm/wmqfte/utils/FTEPlatformUtils.isEmbeddedAgent(FTEPlatfor mUtils.java:168) at com/ibm/wmqfte/ras/impl/ResourceMonitorLogFormatter.format(Resou rceMonitorLogFormatter.java:96) (entered lock: com/ibm/wmqfte/ras/impl/ResourceMonitorLogFormatter@0x00000000E0 219240, entry count: 1) at java/util/logging/StreamHandler.publish(StreamHandler.java:222(C ompiled Code)) (entered lock: com/ibm/wmqfte/ras/impl/SecureFileHandler@0x00000000E041FF98, entry count: 2) at com/ibm/wmqfte/ras/impl/SecureFileHandler.publish(SecureFileHand ler.java:231) (entered lock: com/ibm/wmqfte/ras/impl/SecureFileHandler@0x00000000E041FF98, entry count: 1) at com/ibm/wmqfte/ras/impl/ResourceMonitorLogHandler.publish(Resour ceMonitorLogHandler.java:213) (entered lock: com/ibm/wmqfte/ras/impl/ResourceMonitorLogHandler@0x00000000E004 F3C8, entry count: 1) at java/util/logging/Logger.log(Logger.java:749(Compiled Code)) at com/ibm/wmqfte/ras/impl/TraceImpl.dataImplementation(TraceImpl.j ava:176) at com/ibm/wmqfte/ras/Trace.data(Trace.java:267) at com/ibm/wmqfte/monitor/management/MonitorManager.startMonitor(Mo nitorManager.java:814) at com/ibm/wmqfte/monitor/management/MonitorManager.registerMonitor (MonitorManager.java:1353) at com/ibm/wmqfte/monitor/management/MonitorManager.recoveryMonitor (MonitorManager.java:1193) at com/ibm/wmqfte/agent/AgentRecoveryRunnable.recoverResourceMonito r(AgentRecoveryRunnable.java:1024) at com/ibm/wmqfte/agent/AgentRecoveryRunnable.run(AgentRecoveryRunn able.java:279) at java/lang/Thread.run(Thread.java:830) at com/ibm/wmqfte/thread/FTEThread.run(FTEThread.java:70) Thread 2: ------------- "pool-1-thread-1" J9VMThread:0x000000000056D000, ... ... Blocked on: com/ibm/wmqfte/ras/impl/ResourceMonitorLogHandler@0x00000000E004 F3C8 Owned by: "RecoveryThread" (J9VMThread:0x00000000004F5000, java/lang/Thread:0x00000000E044A1C0) ... Java callstack: at com/ibm/wmqfte/ras/impl/ResourceMonitorLogHandler.publish(Resour ceMonitorLogHandler.java:93) (entered lock: com/ibm/wmqfte/ras/impl/ResourceMonitorLogHandler@0x00000000E004 F3C8, entry count: 1) at java/util/logging/Logger.log(Logger.java:749(Compiled Code)) at com/ibm/wmqfte/ras/impl/TraceImpl.entryImplementation(TraceImpl. java:79(Compiled Code)) at com/ibm/wmqfte/ras/Trace.entry(Trace.java:121) at com/ibm/wmqfte/monitor/configuration/ResourcesConfiguration.getR esourceConfigurations(ResourcesConfiguration.java:436) at com/ibm/wmqfte/monitor/configuration/ResourcesConfiguration.toSt ring(ResourcesConfiguration.java:470) at com/ibm/wmqfte/monitor/configuration/MonitorConfiguration.toStri ng(MonitorConfiguration.java:465) at java/lang/String.valueOf(String.java:3467(Compiled Code)) at java/lang/StringBuilder.append(StringBuilder.java:537) at com/ibm/wmqfte/monitor/management/MonitorImpl.toString(MonitorIm pl.java:346) at java/lang/String.valueOf(String.java:3467(Compiled Code)) at com/ibm/msg/client/commonservices/j2se/trace/TraceUtils$GenericF ormatAdapter.formatObject(TraceUtils.java:492(Compiled Code)) at com/ibm/msg/client/commonservices/j2se/trace/TraceUtils.formatOb ject(TraceUtils.java:97(Compiled Code)) at com/ibm/msg/client/commonservices/j2se/trace/HumanFormatter$Entr yFormatAdapter.formatRecord(HumanFormatter.java:938(Compiled Code)) at com/ibm/msg/client/commonservices/j2se/trace/HumanFormatter$Mess ageFormatAdapter.formatRecord(HumanFormatter.java:1330(Compiled Code)) at com/ibm/msg/client/commonservices/j2se/trace/HumanFormatter.form at(HumanFormatter.java:355(Compiled Code)) at java/util/logging/StreamHandler.publish(StreamHandler.java:222(C ompiled Code)) (entered lock: java/util/logging/FileHandler@0x00000000E01E4300, entry count: 2) at java/util/logging/FileHandler.publish(FileHandler.java:729(Compi led Code)) (entered lock: java/util/logging/FileHandler@0x00000000E01E4300, entry count: 1) at java/util/logging/Logger.log(Logger.java:749(Compiled Code)) at java/util/logging/Logger.doLog(Logger.java:776(Compiled Code)) at java/util/logging/Logger.logp(Logger.java:1022) at com/ibm/msg/client/commonservices/j2se/trace/DefaultTracer.metho dEntry(DefaultTracer.java:389) at com/ibm/msg/client/commonservices/trace/Trace.methodEntryInterna l(Trace.java:810(Compiled Code)) at com/ibm/msg/client/commonservices/trace/Trace.entry(Trace.java:7 45(Compiled Code)) at com/ibm/wmqfte/ras/impl/WMQTraceHandler.publish(WMQTraceHandler. java:266(Compiled Code)) at java/util/logging/Logger.log(Logger.java:749(Compiled Code)) at com/ibm/wmqfte/ras/impl/TraceImpl.entryImplementation(TraceImpl. java:79(Compiled Code)) at com/ibm/wmqfte/ras/Trace.entry(Trace.java:153) at com/ibm/wmqfte/monitor/management/MonitorTimerTask.run(MonitorTi merTask.java:71) at java/util/concurrent/Executors$RunnableAdapter.call(Executors.ja va:522) at java/util/concurrent/FutureTask.runAndReset(FutureTask.java:319) at java/util/concurrent/ScheduledThreadPoolExecutor$ScheduledFuture Task.access$301(ScheduledThreadPoolExecutor.java:191) at java/util/concurrent/ScheduledThreadPoolExecutor$ScheduledFuture Task.run(ScheduledThreadPoolExecutor.java:305) at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExec utor.java:1160) at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExe cutor.java:635) at java/lang/Thread.run(Thread.java:830)
Local fix
Problem summary
**************************************************************** USERS AFFECTED: This issue affects users of: - IBM MQ 9.2.0.6 Managed File Transfer - IBM MQ 9.3.0.0 Managed File Transfer who have agents that: - Have one or more resource monitors defined. - And have been configured with the agent property: trace=<trace_specification> set, to generate trace from startup. Platforms affected: MultiPlatform **************************************************************** PROBLEM DESCRIPTION: The trace mechanism and the resource monitor event log functionality provided by MQ Managed File Transfer makes use of java.util.logging Loggers. When an event needs to be written to either a trace file, or the resource monitor event log (resmonevent0.log), it is passed to each logger in turn. The logger then decides whether to write it out to the appropriate file (either a trace file or the resmonevent0.log file). Now, APAR IT39927, which was included in MQ 9.2.0.6 and 9.3.0.0 Managed File Transfer, added some internal locking to the logger for the resource monitor event log functionality to prevent two threads from writing entries to the log file (resmonevent0.log) at the same time. As a result of this change, if an agent that had resource monitors defined was configured to generate a trace from startup, then when that agent was started the following sequence of events could occur: - The agent's recovery thread (responsible for starting the agent) started a resource monitor on a new thread. - It then obtained an internal lock on the logger for resource monitor event log in order to write a "Monitor started" message to the log file. - A few milliseconds later, the thread running the resource monitor determined that it needed to write a trace statement and obtained an internal lock on the trace file. - While writing out the "Monitor started" message, the agent's recovery thread called a method which was traced. However, the trace file was currently locked by the resource monitor thread, and so the recovery thread became blocked. - Meanwhile, the monitor thread was processing the current trace statement and called into another method that was also traced. - While checking which logger needed to handle this trace statement, the thread tried to call into the logger for the resource monitor event log. However, that logger was locked by the recovery thread. This caused a deadlock within the agent: - The agent's recovery thread had locked the logger for the resource monitor event log, and was waiting for an internal lock on the trace file held by a resource monitor thread. - The resource monitor thread had obtained an internal lock on the trace file, and was waiting for the lock on the logger for the resource monitor event log.
Problem conclusion
To resolve this issue, some changes have been made to internal trace and logging mechanisms to prevent the deadlock reported in this APAR from occurring. --------------------------------------------------------------- The fix is targeted for delivery in the following PTFs: Version Maintenance Level v9.2 LTS 9.2.0.7 v9.3 LTS 9.3.0.1 v9.x CD 9.3.1 The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes' http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037 If the maintenance level is not yet available information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates' http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309 ---------------------------------------------------------------
Temporary fix
Comments
APAR Information
APAR number
IT41603
Reported component name
MQ BASE V9.2
Reported component ID
5724H7281
Reported release
920
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-07-25
Closed date
2022-08-25
Last modified date
2022-10-12
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
MQ BASE V9.2
Fixed component ID
5724H7281
Applicable component levels
[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSYHRD","label":"IBM MQ"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"920","Line of Business":{"code":"LOB45","label":"Automation"}}]
Document Information
Modified date:
12 October 2022