APAR status
Closed as program error.
Error description
When using MQ Managed File Transfer (MFT), a managed transfer that contains a single transfer item and has a transfer recovery timeout of 20 seconds is submitted to a source agent. The source disposition for the transfer item is set to "Delete". While processing the transfer item, the managed transfer goes into recovery. 20 seconds later, the managed transfer is still in recovery, so the source agent stops it and marks it as "Failed".

However, although the "Transfer completed" message for the managed transfer, which is published to the SYSTEM.FTE topic using the topic string:

/Transfers/<agent name>/<transfer identifier>

shows that it failed with result code 69 ("Transfer Recovery Timed out"), the "Transfer progress" message shows that the transfer item was processed successfully and the source item was deleted.

Here are some examples of the "Transfer progress" and "Transfer completed" messages that are published to the SYSTEM.FTE topic when this issue occurs:

----
Transfer progress message:

<?xml version="1.0" encoding="UTF-8"?>
<transaction ID="414d51207061756c745639314c5453206950ca6022c09303" agentRole="sourceAgent" version="6.00"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="TransferLog.xsd">
  <action time="2021-06-25T09:47:46.283Z">progress</action>
  <sourceAgent QMgr="paultV91LTS" agent="V91LTSAGENT1" agentType="STANDARD">
    <systemInfo architecture="amd64" name="Windows 10" version="10.0"/>
  </sourceAgent>
  <destinationAgent QMgr="paultV91LTS" agent="V91LTSPBA" agentType="BRIDGE" bridgeURL="sftp://myserver">
    <systemInfo architecture="amd64" name="Windows 10" version="10.0"/>
  </destinationAgent>
  <originator>
    <hostName>172.29.192.1</hostName>
    <userID>032833866</userID>
    <mqmdUserID>032833866</mqmdUserID>
  </originator>
  <transferSet bytesSent="0" index="0" recoveryTimeout="20" size="1" startTime="2021-06-25T09:46:25.653Z" total="1">
    <item mode="binary">
      <source disposition="leave" type="file">
        <file last-modified="2018-12-10T11:07:40.297Z" size="32500">C:\MFTFiles\Input\agent.properties</file>
        <checksum method="MD5">3d9aa69800a38e256e5f7f4ec185eaa0</checksum>
      </source>
      <destination exist="overwrite" type="file">
        <file size="32500">FyreServer:agent.properties</file>
      </destination>
      <status resultCode="0"/>
    </item>
  </transferSet>
</transaction>

----
Transfer completed:

<?xml version="1.0" encoding="UTF-8"?>
<transaction ID="414d51207061756c745639314c5453206950ca6022c09303" agentRole="sourceAgent" version="6.00"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="TransferLog.xsd">
  <action time="2021-06-25T09:47:46.287Z">completed</action>
  <sourceAgent QMgr="paultV91LTS" agent="V91LTSAGENT1" agentType="STANDARD">
    <systemInfo architecture="amd64" name="Windows 10" version="10.0"/>
  </sourceAgent>
  <destinationAgent QMgr="paultV91LTS" agent="V91LTSPBA" agentType="BRIDGE" bridgeURL="sftp://myserver">
    <systemInfo architecture="amd64" name="Windows 10" version="10.0"/>
  </destinationAgent>
  <originator>
    <hostName>172.29.192.1</hostName>
    <userID>032833866</userID>
    <mqmdUserID>032833866</mqmdUserID>
  </originator>
  <status resultCode="69">
    <supplement>BFGSS0081E: Recovery of transfer ID: '414d51207061756c745639314c5453206950ca6022c09303' timed out after 20 seconds. The managed transfer has been terminated.</supplement>
  </status>
  <transferSet bytesSent="0" recoveryTimeout="20" startTime="2021-06-25T09:46:25.653Z" total="1">
    <metaDataSet>
      <metaData key="com.ibm.wmqfte.SourceAgent">V91LTSAGENT1</metaData>
      <metaData key="com.ibm.wmqfte.DestinationAgent">V91LTSPBA</metaData>
      <metaData key="com.ibm.wmqfte.MqmdUser">032833866</metaData>
      <metaData key="com.ibm.wmqfte.OriginatingUser">032833866</metaData>
      <metaData key="com.ibm.wmqfte.OriginatingHost">172.29.192.1</metaData>
      <metaData key="com.ibm.wmqfte.TransferId">414d51207061756c745639314c5453206950ca6022c09303</metaData>
      <metaData key="com.ibm.wmqfte.Priority">0</metaData>
    </metaDataSet>
  </transferSet>
  <statistics>
    <actualStartTime>2021-06-25T09:46:25.904Z</actualStartTime>
    <retryCount>1</retryCount>
    <numFileFailures>0</numFileFailures>
    <numFileWarnings>0</numFileWarnings>
  </statistics>
</transaction>
----
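The mismatch between the two messages can be checked mechanically. The following is a minimal sketch, not part of the product: it uses Python's standard xml.etree.ElementTree module on trimmed-down copies of the example messages above. The function names, the constant name and the sample strings are illustrative assumptions; only the element names, attribute names and result codes come from the examples.

```python
import xml.etree.ElementTree as ET
from typing import List

RECOVERY_TIMED_OUT = 69  # result code shown in the "Transfer completed" example

# Trimmed-down copies of the example messages (most elements omitted).
PROGRESS_SAMPLE = """<transaction ID="414d51207061756c745639314c5453206950ca6022c09303" agentRole="sourceAgent">
  <action time="2021-06-25T09:47:46.283Z">progress</action>
  <transferSet bytesSent="0" index="0" recoveryTimeout="20" size="1" total="1">
    <item mode="binary">
      <status resultCode="0"/>
    </item>
  </transferSet>
</transaction>"""

COMPLETED_SAMPLE = """<transaction ID="414d51207061756c745639314c5453206950ca6022c09303" agentRole="sourceAgent">
  <action time="2021-06-25T09:47:46.287Z">completed</action>
  <status resultCode="69"/>
  <transferSet bytesSent="0" recoveryTimeout="20" total="1"/>
</transaction>"""


def parse(message: str) -> ET.Element:
    # Encode to bytes so that a leading XML declaration with an encoding is accepted.
    return ET.fromstring(message.encode("utf-8"))


def item_result_codes(progress_xml: str) -> List[int]:
    """Result code of every transfer item in a "Transfer progress" message."""
    root = parse(progress_xml)
    return [int(s.get("resultCode")) for s in root.findall("./transferSet/item/status")]


def transfer_result_code(completed_xml: str) -> int:
    """Overall result code of a "Transfer completed" message."""
    return int(parse(completed_xml).find("./status").get("resultCode"))


def shows_symptom(progress_xml: str, completed_xml: str) -> bool:
    """True if the transfer timed out in recovery but an item was reported as successful."""
    same_transfer = parse(progress_xml).get("ID") == parse(completed_xml).get("ID")
    timed_out = transfer_result_code(completed_xml) == RECOVERY_TIMED_OUT
    return same_transfer and timed_out and 0 in item_result_codes(progress_xml)


print(shows_symptom(PROGRESS_SAMPLE, COMPLETED_SAMPLE))
```

A check of this kind could be run against messages captured from the SYSTEM.FTE topic to identify managed transfers that may have been affected.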
Local fix
Setting the source agent's transfer recovery timeout to a sufficiently large value may give the protocol bridge agent (PBA) enough time to provide the source agent with the audit information it needs, so that the transfer is not identified as successful when the recovery timeout expires.
Problem summary
****************************************************************
USERS AFFECTED:
This issue affects two categories of user:

Category 1:
---------------
Users of MQ 9.1 Managed File Transfer who have enabled the transfer recovery timeout functionality for managed transfers.

Category 2:
---------------
Users of MQ 9.2 Managed File Transfer who have enabled the transfer recovery timeout functionality for managed transfers where:
- The source agent is running that version of the product
- And the destination agent is running MQ 9.1 LTS or earlier.

Platforms affected:
MultiPlatform
****************************************************************
PROBLEM DESCRIPTION:
The MQ Managed File Transfer transfer recovery timeout functionality allows the source agent for a managed transfer to fail that managed transfer if it enters recovery and then stays in recovery for longer than the timeout period.

If a managed transfer fails due to a transfer recovery timeout, the source agent uses local audit information (about the source items) and remote audit information provided by the destination agent (about the destination items) to determine which items were successfully transferred before the managed transfer entered recovery and failed due to the timeout. The source agent then processes the source disposition for those items.

The source agent generates the local audit information for a transfer item after it has sent all of the data for the source item to the destination agent. The destination agent generates the remote audit information after it has received the data and processed it.

When the transfer recovery timeout was used with:
- Either MQ 9.1 Managed File Transfer
- Or managed transfers where the source agent was running MQ 9.2 Managed File Transfer and the destination agent was running MQ 9.1 Managed File Transfer or earlier

the source agent would incorrectly assume that a transfer item had been processed successfully before the managed transfer entered recovery and timed out:
- If there was local audit information for an item
- And there was no remote audit information for that item.

If the source disposition for the transfer item was set to "Delete", the source agent would then delete the source item even though it had not been transferred to the destination.
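The faulty determination described above can be illustrated with a small model. This is a simplified sketch written for this description only, not the product's code: the TransferItem class, its field names and both function names are hypothetical, and the assumption that an item with no local audit information was not treated as successful is not stated in the APAR.

```python
from dataclasses import dataclass


@dataclass
class TransferItem:
    source_path: str
    source_disposition: str   # "delete" or "leave"
    has_local_audit: bool     # source agent has sent all data for the item
    has_remote_audit: bool    # destination agent has received and processed the data


def pre_fix_considers_transferred(item: TransferItem) -> bool:
    # Behaviour described in the APAR: local audit information alone was enough
    # for the item to be treated as successful, even with no remote audit information.
    return item.has_local_audit


def pre_fix_deletes_source(item: TransferItem) -> bool:
    # The source disposition is only processed for items treated as successful.
    return pre_fix_considers_transferred(item) and item.source_disposition == "delete"


# The failing case: data was sent, but the destination never confirmed it.
item = TransferItem(
    source_path="C:\\MFTFiles\\Input\\agent.properties",
    source_disposition="delete",
    has_local_audit=True,
    has_remote_audit=False,
)
print(pre_fix_deletes_source(item))
```

In this model the source file is removed although nothing confirms that it reached the destination, which is the data-loss scenario the APAR addresses.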
Problem conclusion
To resolve this issue, MQ Managed File Transfer has been updated so that, if a managed transfer goes into recovery and subsequently times out, the source agent checks for both local and remote audit information for each of the transfer items in the managed transfer. If either of these pieces of audit information is missing, the source agent marks the item as "Failed" with the new supplementary message:

BFGSS0091E: The item has not been transfered, as the managed transfer has failed due to a transfer recovery timeout.

This means that the source disposition for these items will not be processed.

NOTE: This does not affect users of MQ 9.2 Managed File Transfer who use the transfer recovery timeout functionality for managed transfers where both the source and destination agents are running that version of the product, due to changes in the way transfer recovery timeouts are handled.

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version    Maintenance Level
v9.1 LTS   9.1.0.9
v9.2 LTS   9.2.0.4
v9.x CD    9.2.4

The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
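The corrected rule can be summarised as a small decision function. This is an illustrative sketch under stated assumptions, not the shipped implementation: the ItemOutcome type, the function name and the "Successful" status label are hypothetical, while the "Failed" status and the BFGSS0091E text are quoted from the conclusion above.

```python
from typing import NamedTuple, Optional

# Supplementary message text, quoted as it appears in this APAR.
BFGSS0091E = (
    "BFGSS0091E: The item has not been transfered, as the managed transfer "
    "has failed due to a transfer recovery timeout."
)


class ItemOutcome(NamedTuple):
    status: str                        # "Successful" is an assumed label
    supplement: Optional[str]          # supplementary message, if any
    process_source_disposition: bool   # whether "Delete" would be honoured


def outcome_after_recovery_timeout(has_local_audit: bool, has_remote_audit: bool) -> ItemOutcome:
    # Both pieces of audit information must be present for the item to count as transferred.
    if has_local_audit and has_remote_audit:
        return ItemOutcome("Successful", None, True)
    # Otherwise the item is failed and its source disposition is not processed.
    return ItemOutcome("Failed", BFGSS0091E, False)


# Walk through all four combinations of audit information.
for local in (True, False):
    for remote in (True, False):
        result = outcome_after_recovery_timeout(local, remote)
        print(local, remote, result.status, result.process_source_disposition)
```

Only the combination with both local and remote audit information leads to the source disposition being processed, so a source item is no longer deleted on the strength of local audit information alone.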
Temporary fix
Comments
APAR Information
APAR number
IT36702
Reported component name
IBM MQ BASE MP
Reported component ID
5724H7271
Reported release
910
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-04-26
Closed date
2021-07-08
Last modified date
2021-07-08
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
IBM MQ BASE MP
Fixed component ID
5724H7271
Applicable component levels
R910 PSY
UP
Document Information
Modified date:
16 July 2021