IBM® Connect:Direct® provides facilities to recover from most errors that occur during Process execution. Recovery from the point of failure is usually accomplished quickly. The following types of errors can occur during normal operation:
- Link failure terminates a session between IBM Connect:Direct systems
- File I/O error occurs during Process execution
- IBM Connect:Direct abends because of a hardware or other error
- The TCQ becomes corrupt
IBM Connect:Direct provides the following facilities to address errors:
|Session establishment retry||When one or more Processes run with a node, IBM Connect:Direct establishes a session with that node and begins execution. If IBM Connect:Direct cannot start the session, IBM Connect:Direct retries the session establishment. The initialization parameters, MAXRETRIES and WTRETRIES, determine the number of retries and the interval between retries.
If IBM Connect:Direct cannot establish a session after all retries are exhausted, the Process is placed in the Hold queue in the TCQ with a status of Waiting for Connection (WC). When a session is established with the other node, all other Processes are scanned and the highest priority Process is executed after the previous Process is finished.|
|VTAM automatic session retry||If Process execution is interrupted because of a VTAM session failure, IBM Connect:Direct automatically attempts to restart the session. This recovery facility uses the same parameter values as the session establishment retry facility.
If IBM Connect:Direct cannot establish the session, the Process that is executing and any other Processes that are ready to run with the other node are placed in the Hold queue with a status of Waiting for Connection (WC).|
|TCQ/TCX Repair Utility||When the TCQ becomes corrupt because of an outage or other circumstance, IBM Connect:Direct may abend in production or during the next DTF initialization. The IBM Connect:Direct administrator can use the TCQ/TCX repair utility to remove ambiguous or corrupt data and avoid having to cold start the DTF and reinitialize the TCQ, thus losing any Processes left in the TCQ.|
|Process step checkpoint||As a Process executes, IBM Connect:Direct records which step is executing in the TCQ. If Process execution is interrupted for any reason, the Process is held in the TCQ. When the Process is available for execution again, IBM Connect:Direct automatically begins execution at that step.|
|COPY statement checkpoint/restart||For physical sequential files and partitioned data sets, IBM Connect:Direct collects positioning checkpoint information at specified intervals as a COPY statement executes. Checkpoints are taken for each member that is transferred within a PDS, regardless of the checkpoint interval. If the copying procedure is interrupted for any reason, you can restart it at the last checkpoint.
Note: Whenever a Process step is interrupted and restarted, some data will be retransmitted. Statistics records for the Process step will reflect the actual bytes transferred, and not the size of the file.
The COPY statement checkpoint/restart works in conjunction with step restart. The restart is automatic if IBM Connect:Direct can reestablish a session based on the initialization parameter values for MAXRETRIES and WTRETRIES. See COPY Statement Checkpoint/Restart Facility for more information.
The CHANGE PROCESS command can also invoke the checkpoint/restart facility. See Controlling Processes with Commands for instructions on how to use the CHANGE PROCESS command.
Note: Checkpoint/restart is not supported for I/O exits at this time.|
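
The retry and checkpoint/restart behavior described in the table can be illustrated with a small simulation. This is a minimal sketch in Python, not IBM Connect:Direct code: the function names, the `fail_at` parameter, and the record-count checkpoint interval are hypothetical, and treating the first attempt as separate from the MAXRETRIES retries is an assumption. The `max_retries` and `wait_seconds` arguments stand in for the MAXRETRIES and WTRETRIES initialization parameters.

```python
from typing import Callable, List, Tuple


def establish_session(connect: Callable[[], bool],
                      max_retries: int,
                      wait_seconds: float,
                      sleep: Callable[[float], None]) -> str:
    """Make one attempt plus up to max_retries retries.

    Returns "SESSION" on success, or "WC" (Waiting for Connection)
    when every attempt fails. Counting the first attempt separately
    from the retries is an assumption of this sketch.
    """
    for attempt in range(max_retries + 1):
        if connect():
            return "SESSION"
        if attempt < max_retries:
            sleep(wait_seconds)  # interval between retries
    return "WC"


def copy_from_checkpoint(records: List[str],
                         target: List[str],
                         interval: int,
                         checkpoint: int = 0,
                         fail_at: int = -1) -> Tuple[int, int, bool]:
    """Copy records starting at the last checkpoint.

    Returns (new_checkpoint, records_sent, completed). fail_at simulates
    an interruption just before the record at that index is sent.
    """
    del target[checkpoint:]  # discard anything written after the checkpoint
    sent = 0
    for i in range(checkpoint, len(records)):
        if i == fail_at:
            return checkpoint, sent, False
        target.append(records[i])
        sent += 1
        if (i + 1) % interval == 0:
            checkpoint = i + 1  # record a restart position
    return len(records), sent, True


if __name__ == "__main__":
    waits: List[float] = []
    status = establish_session(lambda: False, 3, 180.0, waits.append)
    print(status, len(waits))

    source = [f"rec{i}" for i in range(10)]
    target: List[str] = []
    cp, sent1, done = copy_from_checkpoint(source, target, 4, fail_at=6)
    cp, sent2, done = copy_from_checkpoint(source, target, 4, checkpoint=cp)
    print(cp, sent1 + sent2, done, target == source)
```

In this simulation the interrupted copy restarts at the last checkpoint instead of at the beginning of the file, so the records sent after that checkpoint are sent a second time. The total number of records sent therefore exceeds the file size, which mirrors the note that statistics reflect the actual bytes transferred.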