Checking that the other end of the channel is still available
You can use the heartbeat interval, the keepalive interval, and the receive timeout to check that the other end of the channel is available.
Heartbeats
You can use the heartbeat interval channel attribute to specify that flows are to be passed from the sending MCA when there are no messages on the transmission queue, as is described in Heartbeat interval (HBINT).
Keep alive
On z/OS®, if you are using TCP/IP as the transport protocol, you can also specify a value for the Keepalive interval channel attribute (KAINT). Give the Keepalive interval a higher value than the heartbeat interval, and a smaller value than the disconnect value. You can use this attribute to specify a time-out value for each channel, as is described in Keepalive Interval (KAINT).
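The recommended ordering of the three values (heartbeat interval below Keepalive interval, Keepalive interval below disconnect value) can be checked before a channel definition is applied. The following Python sketch is illustrative only: the function name check_intervals is hypothetical and is not part of any IBM MQ API, and it assumes that all three values are plain positive numbers of seconds. It does not model special values such as a zero disconnect interval.

```python
from typing import List


def check_intervals(hbint: int, kaint: int, discint: int) -> List[str]:
    """Return a list of problems with the interval ordering (empty if none).

    Assumption: all three arguments are positive numbers of seconds.
    """
    problems = []
    # The Keepalive interval should be higher than the heartbeat interval.
    if kaint <= hbint:
        problems.append("KAINT should be higher than HBINT")
    # The Keepalive interval should be smaller than the disconnect value.
    if kaint >= discint:
        problems.append("KAINT should be smaller than the disconnect value")
    return problems
```

An empty list means that the three values follow the recommendation; any returned string names the comparison that failed.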
On IBM® i, AIX®, Linux®, and Windows systems, if you are using TCP as your transport protocol, you can set keepalive=yes. If you specify this option, TCP periodically checks that the other end of the connection is still available. If it is not, the channel is terminated. This option is described in Keepalive Interval (KAINT).
If you have unreliable channels that report TCP errors, use of the Keepalive option means that your channels are more likely to recover.
You can specify time intervals to control the behavior of the Keepalive option. When you change the time interval, only TCP/IP channels started after the change are affected. Ensure that the value that you choose for the time interval is less than the value of the disconnect interval for the channel.
For more information about using the Keepalive option, see the KAINT parameter in the DEFINE CHANNEL command.
Receive timeout
If you are using TCP as your transport protocol, the receiving end of an idle non-MQI channel connection is also closed if no data is received for a period. This period, the receive time-out value, is determined according to the HBINT (heartbeat interval) value.
- For an initial number of flows, before any negotiation takes place, the receive time-out value is twice the HBINT value from the channel definition.
- After the channels negotiate an HBINT value, if HBINT is set to less than 60 seconds, the receive time-out value is set to twice this value. If HBINT is set to 60 seconds or more, the receive time-out value is set to 60 seconds greater than the value of HBINT.
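The two rules above can be expressed as a small calculation. The following Python sketch is illustrative only: the function name receive_timeout and its negotiated flag are hypothetical, not part of any IBM MQ API, and the sketch handles only positive HBINT values because the rules above do not say what happens when HBINT is zero.

```python
def receive_timeout(hbint: int, negotiated: bool = True) -> int:
    """Return the receive time-out in seconds for a positive HBINT value."""
    if hbint <= 0:
        raise ValueError("this sketch only handles positive HBINT values")
    if not negotiated:
        # Initial flows: twice the HBINT value from the channel definition.
        return 2 * hbint
    if hbint < 60:
        # Negotiated HBINT below 60 seconds: twice the value.
        return 2 * hbint
    # Negotiated HBINT of 60 seconds or more: HBINT plus 60 seconds.
    return hbint + 60
```

For example, a negotiated HBINT of 10 seconds gives a receive time-out of 20 seconds, and a negotiated HBINT of 300 seconds gives 360 seconds.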
On z/OS, the receive time-out value is set as follows:
- For an initial number of flows, before any negotiation takes place, the receive time-out value is twice the HBINT value from the channel definition.
- If RCVTIME is set, the timeout is set to one of the following values, depending on the RCVTTYPE parameter, and subject to any limit imposed by RCVTMIN if it applies:
  - The negotiated HBINT multiplied by a constant (RCVTTYPE is MULTIPLY)
  - The negotiated HBINT plus a constant number of seconds (RCVTTYPE is ADD)
  - A constant number of seconds (RCVTTYPE is EQUAL)
- If either of the values is zero, there is no timeout.
- For connections that do not support heartbeats, the HBINT value is negotiated to zero and hence there is no timeout, so you must use TCP/IP keepalive instead.
- For client connections that use sharing conversations, heartbeats can flow across the channel (from both ends) all the time, not just when an MQGET is outstanding.
- For client connections where sharing conversations are not in use, heartbeats are flowed from the server only when the client issues an MQGET call with wait. Therefore, do not set the heartbeat interval too small for client channels. For example, if the heartbeat is set to 10 seconds, an MQCMIT call fails (with MQRC_CONNECTION_BROKEN) if it takes longer than 20 seconds to commit, because no data flowed during this time. This can happen with large units of work. However, it does not happen if appropriate values are chosen for the heartbeat interval, because only MQGET with wait takes significant periods of time.
Provided SHARECNV is not zero, the client uses a full duplex connection, which means that the client can (and does) heartbeat during all MQI calls.
- Canceling the connection after twice the heartbeat interval is valid because a data or heartbeat flow is expected at least at every heartbeat interval. Setting the heartbeat interval too small, however, can cause problems, especially if you are using channel exits. For example, if the HBINT value is one second, and a send or receive exit is used, the receiving end waits for only 2 seconds before canceling the channel. If the MCA is performing a task such as encrypting the message, this value might be too short.
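The z/OS rules for RCVTIME, RCVTTYPE, and RCVTMIN described above can be sketched as a calculation. The following Python sketch is illustrative only and rests on several assumptions: the function name zos_receive_timeout is hypothetical, "either of the values" is read as the negotiated HBINT or RCVTIME, and RCVTMIN is treated as a lower limit that applies only when the timeout is derived from HBINT (the MULTIPLY and ADD cases). Check these assumptions against the ALTER QMGR description before relying on them.

```python
from typing import Optional


def zos_receive_timeout(hbint: int, rcvtime: int, rcvttype: str,
                        rcvtmin: int = 0) -> Optional[int]:
    """Return the receive time-out in seconds, or None for no timeout."""
    # Assumption: a zero negotiated HBINT or a zero RCVTIME means no timeout.
    if hbint == 0 or rcvtime == 0:
        return None
    if rcvttype == "EQUAL":
        # A constant number of seconds; RCVTMIN is assumed not to apply.
        return rcvtime
    if rcvttype == "MULTIPLY":
        timeout = hbint * rcvtime   # negotiated HBINT times a constant
    elif rcvttype == "ADD":
        timeout = hbint + rcvtime   # negotiated HBINT plus constant seconds
    else:
        raise ValueError("unknown RCVTTYPE: " + rcvttype)
    # Assumption: RCVTMIN acts as a floor on HBINT-relative timeouts.
    return max(timeout, rcvtmin)
```

Under these assumptions, RCVTTYPE(ADD) with RCVTIME(60) and a negotiated HBINT of 300 seconds gives a timeout of 360 seconds, and a negotiated HBINT of zero gives no timeout at all.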
Suggested settings
IBM MQ for z/OS
/cpf ALTER QMGR TCPKEEP(YES) RCVTTYPE(ADD) RCVTIME(60) ADOPTMCA(ALL) ADOPTCHK(ALL)
where cpf is the command prefix for the queue manager subsystem. See ALTER QMGR and IBM MQ network availability for more information on the various parameters.
If the IP address of the sender could translate to more than one address, you might need to set ADOPTCHK to QMNAME rather than ALL.
IBM MQ for Multiplatforms
TCP:
   KeepAlive=Yes
CHANNELS:
   AdoptNewMCA=ALL
   AdoptNewMCACheck=ALL
See ALTER QMGR, Configuration file stanzas for distributed queuing, and Channels stanza of the qm.ini file for more information.
If the IP address of the sender could translate to more than one address, you might need to set AdoptNewMCACheck to QMNAME rather than ALL.