Configuring high availability, recovery and restart
You can make your applications highly available by maintaining queue availability if a queue manager fails, and by recovering messages after server or storage failure.
About this task
On z/OS®, high availability is built into the platform. See Shared queues and queue
sharing groups.
On Multiplatforms, you can improve client application availability by using
client reconnection to switch a client automatically between a group of queue managers, or to the
new active instance of a multi-instance queue manager after a queue manager failure. Automatic
client reconnect is not supported by IBM® MQ classes for Java. A
multi-instance queue manager is configured to run as a single queue manager on multiple servers. You
deploy server applications to this queue manager. If the server running the active instance fails,
execution is automatically switched to a standby instance of the same queue manager on a different
server. If you configure server applications to run as queue manager services, they are restarted
when a standby instance becomes the actively running queue manager instance.
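The reconnection behavior described above can be pictured with a small Python model. This is a sketch only and does not use the IBM MQ client API: `connect_with_failover`, the `connect` callable, and the endpoint strings are hypothetical names introduced for illustration. A real client application would rely on the automatic reconnection that the IBM MQ client provides rather than on hand-written retry logic.

```python
from typing import Callable, Sequence, Tuple


class AllEndpointsFailed(Exception):
    """Raised when no endpoint in the list accepts a connection."""


def connect_with_failover(
    endpoints: Sequence[str],
    connect: Callable[[str], object],
    max_rounds: int = 3,
) -> Tuple[str, object]:
    """Try each endpoint in order, repeating the list up to max_rounds times.

    `connect` is a hypothetical callable that returns a connection object,
    or raises ConnectionError when the queue manager is unreachable.
    """
    for _ in range(max_rounds):
        for endpoint in endpoints:
            try:
                return endpoint, connect(endpoint)
            except ConnectionError:
                continue  # this instance is down; try the next one
    raise AllEndpointsFailed(f"no endpoint reachable after {max_rounds} rounds")


# Illustration with a fake connector: the first host plays a failed active
# instance, the second plays the standby that has taken over.
def fake_connect(endpoint: str) -> object:
    if endpoint == "hostA(1414)":
        raise ConnectionError("active instance failed")
    return {"connected_to": endpoint}


chosen, conn = connect_with_failover(["hostA(1414)", "hostB(1414)"], fake_connect)
print(chosen)
```

The list of endpoints stands in for either a group of queue managers or the servers that host the instances of one multi-instance queue manager. In both cases the client only needs an ordered list of places to try.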
You can also configure IBM MQ as part of a platform-specific clustering solution, such as:
- Microsoft Cluster Server
- HA clusters on IBM i
- PowerHA® for AIX® (formerly HACMP on AIX) and other UNIX and Linux® clustering solutions
On Linux systems, you can configure replicated data queue managers (RDQMs) to
implement high availability or disaster recovery solutions. For high availability, instances of the
same queue manager are configured on each node in a group of three Linux servers. One of the three
instances is the active instance. Data from the active queue manager is synchronously replicated to
the other two instances, so one of these instances can take over in the event of some failure. For
disaster recovery, a queue manager runs on a primary node at one site, with a secondary instance of
that queue manager located on a recovery node at a different site. Data is replicated between the
primary instance and the secondary instance, and if the primary node is lost for some reason, the
secondary instance can be made into the primary instance and started.
Native HA is a high availability solution aimed at containers.
Native HA uses log replication to keep three instances of a queue manager running on different nodes
up to date. One instance is active at any one time and processes messages. The active queue manager
sends its log updates to the other two instances to keep them updated. If the active instance fails,
one of the replica instances automatically takes over the active role.
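The following Python model illustrates the idea of one active instance shipping log updates to two replicas and a replica taking over after a failure. It is a simplified sketch, not the Native HA implementation: the `Instance` and `Group` classes and the instance names are hypothetical, and the rule used to pick the new active instance (the live replica with the longest log) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    name: str
    log: List[str] = field(default_factory=list)  # replicated log records
    alive: bool = True


class Group:
    """Three instances; exactly one is active at any one time."""

    def __init__(self, names: List[str]) -> None:
        self.instances = [Instance(n) for n in names]
        self.active = self.instances[0]

    def put(self, record: str) -> None:
        """The active instance logs the update and sends it to live replicas."""
        self.active.log.append(record)
        for inst in self.instances:
            if inst is not self.active and inst.alive:
                inst.log.append(record)

    def fail_active(self) -> None:
        """Mark the active instance failed and promote a live replica.

        Assumption: the replica with the longest log is promoted. The real
        selection rule is not described in the surrounding text.
        """
        self.active.alive = False
        replicas = [i for i in self.instances if i.alive]
        if not replicas:
            raise RuntimeError("no live replica to take over")
        self.active = max(replicas, key=lambda i: len(i.log))


group = Group(["qm-0", "qm-1", "qm-2"])
group.put("MSG1")
group.put("MSG2")
group.fail_active()   # qm-0 fails; a replica becomes active
group.put("MSG3")     # processing continues on the new active instance
print(group.active.name, group.active.log)
```

Because every update reaches the replicas before the failure, the promoted instance already holds the records it needs to continue processing.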
Another option for a high availability or disaster recovery solution is to
deploy a pair of IBM MQ appliances. See High Availability and Disaster Recovery in the IBM MQ Appliance documentation.
IBM MQ uses its recovery logs for three types of recovery:
- Restart recovery, when you stop IBM MQ in a planned way.
- Failure recovery, when a failure stops IBM MQ.
- Media recovery, to restore damaged objects.
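All three kinds of recovery rest on the same general technique: rebuilding state by replaying a log of recorded operations. The Python sketch below shows that technique in its simplest form. It is illustrative only: the `LogRecord` shape, the `PUT` and `GET` operation names, and the `replay` function are hypothetical and do not reflect the IBM MQ log format or its recovery procedure.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Tuple

# Hypothetical log record: (operation, queue name, message body).
LogRecord = Tuple[str, str, str]


def replay(log: Iterable[LogRecord]) -> Dict[str, Deque[str]]:
    """Rebuild queue contents by applying logged operations in order."""
    queues: Dict[str, Deque[str]] = {}
    for op, queue, body in log:
        q = queues.setdefault(queue, deque())
        if op == "PUT":
            q.append(body)
        elif op == "GET":
            # A logged GET removed the oldest message from the queue.
            if q:
                q.popleft()
        else:
            raise ValueError(f"unknown operation: {op}")
    return queues


log = [
    ("PUT", "ORDERS", "order-1"),
    ("PUT", "ORDERS", "order-2"),
    ("GET", "ORDERS", ""),
    ("PUT", "AUDIT", "login"),
]
state = replay(log)
print({name: list(q) for name, q in state.items()})
```

In this model, recreating a single damaged queue amounts to replaying only the records for that queue, which loosely mirrors the idea of restoring a damaged object from logged data.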