[Linux]

Operating in a disaster recovery environment

There are a number of situations in which you might want to switch over to the secondary queue manager in a disaster recovery configuration.

Disaster recovery
Following the complete loss of the primary queue manager at the main site, you start the secondary queue manager at the recovery site. Applications reconnect to the queue manager at the recovery site and the secondary queue manager processes application messages. The steps taken to revert to the previous configuration depend on the cause of the failure, for example, whether the main node is lost permanently or only temporarily.

For steps to take following a temporary loss of the main site, see Switching over to a recovery node. For steps to take following permanent failure, see Replacing a failed node in a disaster recovery configuration.

Disaster recovery test support
You can test the disaster recovery configuration by temporarily switching over to the secondary instance and checking that applications can successfully connect. You follow the same procedure as when you switch over following a temporary failure of the primary node. See Switching over to a recovery node.

Reverting to snapshot
If the primary node fails while a synchronization is in progress, you can revert to the snapshot that was taken of the secondary queue manager data just before the synchronization started. The secondary is then restored to a consistent state and can be run as the primary. To revert to the snapshot, you make the secondary into the primary, as described in Switching over to a recovery node. Before you start the queue manager, you must use the rdqmstatus command to check that the revert to snapshot has completed.
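
The check before starting the queue manager can be scripted. The following is a minimal sketch only, not a supported procedure. It assumes that rdqmstatus -m prints one "name: value" pair per line, that one of those lines is named "DR status", and that its value contains the word "Reverting" while the revert is still running. Verify these assumptions against the rdqmstatus output on your own system. The function names and the polling interval are hypothetical and are used for illustration only.

```python
import subprocess
import time
from typing import Optional


def parse_dr_status(status_output: str) -> Optional[str]:
    """Return the value of the 'DR status' line, or None if it is absent.

    Assumption: rdqmstatus prints 'name: value' pairs, one per line.
    """
    for line in status_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "dr status":
            return value.strip()
    return None


def revert_in_progress(status_output: str) -> bool:
    """True while the reported DR status still mentions a revert.

    Assumption: the status text contains 'Reverting' during the revert.
    """
    status = parse_dr_status(status_output)
    if status is None:
        # Refuse to guess: without a DR status line we cannot tell
        # whether it is safe to start the queue manager.
        raise ValueError("no 'DR status' line found in rdqmstatus output")
    return "reverting" in status.lower()


def wait_for_revert_then_start(qmgr: str, poll_seconds: int = 10) -> None:
    """Poll rdqmstatus until the revert has finished, then run strmqm."""
    while True:
        result = subprocess.run(
            ["rdqmstatus", "-m", qmgr],
            capture_output=True, text=True, check=True,
        )
        if not revert_in_progress(result.stdout):
            break
        time.sleep(poll_seconds)
    # Start the queue manager only after the revert is no longer reported.
    subprocess.run(["strmqm", qmgr], check=True)
```

You would call wait_for_revert_then_start with the queue manager name on the recovery node after making the secondary into the primary. The sketch raises an error rather than starting the queue manager when it cannot find a DR status line, so an unexpected output format does not lead to an early start.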