IBM® WebSphere® MQ V6 introduces many new features, including improvements to WebSphere MQ queue manager clustering (hereafter referred to as clustering), especially in the area of workload balancing. The WebSphere MQ Migration Information manual contains information on migrating queue managers to WebSphere MQ V6. In addition to the issues related to migrating an individual queue manager, system administrators and architects who use clustering must consider issues relating to migrating a cluster as a whole, including:
- Minimising application outages
- Measuring and verifying migration success and planning for backward migration in the event of migration problems
- Taking advantage of new WebSphere MQ features
- Managing the migration of a cluster within the context of the wider WebSphere MQ network and the organisation's systems architecture
This article concentrates on the clustering specifics of migrating a cluster of queue managers from WebSphere MQ V5.3 to WebSphere MQ V6.0, but the general advice is applicable regardless of the WebSphere MQ version.
Migrating queue managers is generally a simple process, because WebSphere MQ is designed to automatically migrate objects and messages, and support mixed version clusters. However, when planning the migration of a cluster, you need to consider a number of issues, which are described below.
Forward migration involves upgrading an existing queue manager to a later version (such as WebSphere MQ V5.3 to WebSphere MQ V6.0) and is supported on all platforms. You may wish to forward migrate in order to take advantage of new features or because the old version is nearing its end-of-service date.
The control blocks that represent cluster objects can change from one version of WebSphere MQ to another, so when a cluster queue manager is started using the new version for the first time, the objects on the SYSTEM.CLUSTER.REPOSITORY.QUEUE are converted by the repository manager process. In the case of z/OS, this process is part of the channel initiator address space, so this conversion takes place when the queue manager's associated channel initiator is started. Once a queue manager has been forward migrated, all objects and persistent messages owned by the queue manager will still be available, and existing applications should continue to work without the need to rebuild them. Objects or attributes relating to features in the new version will be set to default values that ensure a minimum change in behaviour.
Backward migration is the process of downgrading the version of WebSphere MQ used by an existing queue manager. You may want to do a backward migration if you encounter problems after a forward migration. To return a queue manager on a non-z/OS platform to the earlier version, you must restore a backup taken when the queue manager was running at that version, so only the objects and messages stored in the backup will be available. On z/OS, with appropriate maintenance, WebSphere MQ is designed to backward migrate objects, so a backup is not required (although it is still good practice to take one).
When a z/OS channel initiator is started for the first time at the back-level version (with the backward migration PTF applied), the repository manager converts the cluster objects on the SYSTEM.CLUSTER.REPOSITORY.QUEUE. After a z/OS queue manager has been backward migrated, all objects and attributes supported by the back-level version will be available, as will persistent messages. Objects or attributes relating only to the new version will be lost after a successful backward migration, and therefore you should not backward migrate after running the new version for an extended period.
It is important to test any system change in a test or QA environment before rolling it out in production, especially when migrating software from one version to another. Ideally, an identical migration plan would be executed in both test and production to maximise the chance of finding potential problems in test rather than production. In practice, test and production environments are unlikely to be architected or configured identically or to have the same workloads, so the migration steps carried out in test are unlikely to match exactly those carried out in production. Whether or not the plans and environments differ, problems can still appear when migrating the production cluster queue managers. Techniques for minimising planned and unplanned outages in a migration scenario are detailed in the sections below.
When creating the migration plan, you need to consider general queue manager migration issues, clustering specifics, wider system architecture, and change control policies. Document and test the plan before migrating production queue managers. Here is an example of a basic migration plan for a cluster queue manager:
- Suspend queue manager from the cluster.
SUSPEND QMGR CLUSTER(<cluster name>)
- Monitor traffic to the suspended queue manager. The cluster workload algorithm can choose a suspended queue manager if there are no other valid destinations available or an application has affinity with a particular queue manager.
- Save a record of all cluster objects known by this queue manager. This data will be used after migration to check that objects have been migrated successfully.
DISPLAY CLUSQMGR(*) to view cluster queue managers.
DISPLAY QC(*) to view cluster queues.
- Save a record of the full repositories view of the cluster objects owned by this queue manager.
This data will be used after migration to check that objects have been migrated successfully.
DISPLAY CLUSQMGR(<migrated queue manager name>) on the full repositories.
DISPLAY QC(*) WHERE(CLUSQMGR EQ <migrated queue manager name>) on the full repositories.
- Stop queue manager.
- Take a backup of the queue manager.
- Install the new version of WebSphere MQ.
- Restart queue manager.
- Ensure that all cluster objects have been migrated successfully.
DISPLAY CLUSQMGR(*) to view cluster queue managers and check output against the data saved before migration.
DISPLAY QC(*) to view cluster queues and check output against the data saved before migration.
- Ensure that the queue manager is communicating with the full repositories correctly. Check that cluster channels to full repositories can start.
- Ensure that the full repositories still know about the migrated cluster queue manager and its cluster queues.
DISPLAY CLUSQMGR(<migrated queue manager name>) on the full repositories and check output against the data saved before migration.
DISPLAY QC(*) WHERE(CLUSQMGR EQ <migrated queue manager name>) on the full repositories and check output against the data saved before migration.
- Test that applications on other queue managers can put messages to the migrated cluster queue manager's queues.
- Test that applications on the migrated queue manager can put messages to the other cluster queue managers' queues.
- Resume the queue manager.
RESUME QMGR CLUSTER(<cluster name>)
- Closely monitor the queue manager and applications in the cluster for a period of time.
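The "save a record" and "check output against the data saved before migration" steps in the plan above can be partly automated. The following Python sketch is one possible approach, not part of WebSphere MQ: it assumes the DISPLAY output was saved to text, that each record starts with a line beginning "AMQ", and that attributes appear as NAME(value) pairs. The function names parse_records and compare are hypothetical, and the sample text is made up for illustration.

```python
import re

# Matches NAME(value) pairs such as CLUSQMGR(QM1) or CLWLRANK(0).
ATTR = re.compile(r"([A-Z][A-Z0-9]*)\(([^()]*)\)")


def parse_records(text, keys):
    """Parse saved DISPLAY output into {key values: {attribute: value}}.

    Assumption: the text holds the output of one command, and each record
    starts with a line beginning "AMQ" followed by NAME(value) pairs.
    """
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("AMQ"):
            current = {}          # a new record begins
            continue
        if current is None:
            continue              # skip anything before the first record
        current.update(ATTR.findall(line))
        if all(k in current for k in keys):
            records[tuple(current[k] for k in keys)] = current
    return records


def compare(before, after):
    """Return (missing, changed) relative to the pre-migration snapshot.

    Only attributes present before migration are compared, because the new
    version is expected to add attributes set to default values.
    """
    missing = sorted(k for k in before if k not in after)
    changed = {}
    for key, old in before.items():
        new = after.get(key)
        if new is None:
            continue
        diffs = {a: (v, new.get(a)) for a, v in old.items() if new.get(a) != v}
        if diffs:
            changed[key] = diffs
    return missing, changed


# Illustrative snapshots (made-up data, not real command output).
BEFORE = """\
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(QM1)   CHANNEL(TO.QM1)   CLUSTER(DEMO)
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(QM3)   CHANNEL(TO.QM3)   CLUSTER(DEMO)
"""
AFTER = """\
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(QM1)   CHANNEL(TO.QM1)   CLUSTER(DEMO)   CLWLRANK(0)
"""

missing, changed = compare(
    parse_records(BEFORE, ("CLUSQMGR",)),
    parse_records(AFTER, ("CLUSQMGR",)),
)
print("missing after migration:", missing)
print("changed attributes:", changed)
```

For cluster queues, a key such as ("QUEUE", "CLUSQMGR") would distinguish instances of the same queue hosted on different queue managers. A new attribute that appears only after migration is deliberately not reported as a change.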
Administrators who are confident in the migration process may want to simplify the process by removing the steps to suspend and resume the queue manager. Conversely, administrators who are less confident in the migration process may want to remove the queue manager from the cluster before migrating.
A backout plan should be documented before migrating. It should detail what constitutes a successful migration, the conditions that trigger the backout procedure, and the backout procedure itself. The procedure could involve removing or suspending the queue manager from the cluster, backwards migrating, or keeping the queue manager offline until an external issue is resolved.
Applications can take advantage of the workload balancing features of clustering to reduce the risk of outages caused by queue manager downtime. Outages can be planned (such as migrating a queue manager) or unplanned (such as a disk failure). If a queue manager is the only host for a particular cluster queue, stopping that queue manager can obviously cause application outages. You can avoid these outages by defining the queue on other queue managers in the cluster, so that whilst migrating one queue manager, applications can use the alternate queues. Using multiple queues in this manner does not help if applications have affinities with a particular queue. Two common reasons for affinities are that queues are opened bind on open, or that the queue manager name has been specified on MQOPEN calls. You need to be aware of affinities during normal operations and even more so during migration, when the risk of extended queue manager downtime is increased.
Synchronising migration with application downtime
If there are no alternate destinations available, you can minimise downtime by synchronising queue manager migration with application downtime. Ideally, the cluster queue manager will also be tested before bringing applications back online. In a cluster it can be difficult to detect if remote applications are using local cluster queues, so careful monitoring of queues and channels may be required.
When migrating a cluster, it is sensible to carry out a staged migration, where queue managers are migrated one at a time. It is reasonable to leave days between migrating each queue manager in the cluster, in order to test applications before wholly committing all queue managers to run at the new version.
Cluster queue managers are designed to partake in clusters with queue managers running at different versions, which is why a staged migration is possible. It is advisable, but not required, to migrate full repositories first. If all full repositories are migrated before partial repositories, new features will be exploitable by all queue managers running at the new version. Queue managers at the back-level version cannot take advantage of new version features, and will appear in the repositories of new version queue managers with the default values for any new version attributes.
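As a small illustration of the advice above, a staged plan can be written down as an ordered list with full repositories first and one queue manager per stage. This Python sketch is only an aid to planning; the queue manager names (FR1, PR1 and so on) and the function migration_order are hypothetical.

```python
def migration_order(queue_managers):
    """Order (name, repository type) pairs for a staged migration.

    Full repositories sort first, then partial repositories; names break
    ties so the plan is deterministic. Each entry is its own stage.
    """
    return sorted(queue_managers, key=lambda qm: (qm[1] != "full", qm[0]))


# Hypothetical cluster: two full repositories and three partial repositories.
CLUSTER = [
    ("PR2", "partial"),
    ("FR1", "full"),
    ("PR1", "partial"),
    ("FR2", "full"),
    ("PR3", "partial"),
]

for stage, (name, kind) in enumerate(migration_order(CLUSTER), start=1):
    print(f"Stage {stage}: migrate {name} ({kind} repository), then test")
```

The gap between stages (days, in the approach described above) is a scheduling decision that the sketch leaves out.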
If full repositories are not migrated before partial repositories, then the cluster will continue to work, but not all queue managers will be able to take advantage of new features. This is because of the manner in which cluster objects are published around the cluster, via full repositories, as shown in this example:
|Queue Manager Name|Full or partial repository|Cluster queues|
|QM1|Full| |
|QM2|Full| |
|QM3|Partial|Q1|
|QM4|Partial|Q1|
|QM5|Partial| |
Start with a cluster of five queue managers as shown in Table 1, with all queue managers running at WebSphere MQ V5.3 and then migrate all the partial repositories (QM3, QM4 and QM5) to WebSphere MQ V6.0. One of the reasons for migrating is to use the new V6.0 feature cluster workload priority (CLWLPRTY) on the channels, so the cluster receiver channel on QM3 is altered as follows:
ALTER CHL(TO.QM3) CHLTYPE(CLUSRCVR) CLWLPRTY(5)
When this command is executed, QM3 publishes the change to the full repositories. Each full repository updates its local repository and forwards the original update from QM3 to queue managers that have subscribed to TO.QM3. QM5 has subscribed to TO.QM3, as applications connected to QM5 have previously put messages to Q1. At this point, the full repositories now hold QM3's changed cluster receiver record, but as the full repositories are running at V5.3, the record is a V5.3 record, and therefore does not contain the CLWLPRTY attribute.
QM5 holds QM3's changed cluster receiver record, which includes the CLWLPRTY attribute because QM5 is running at V6.0. Despite QM5 receiving the updated record via the back-level full repositories, it has received the V6.0 attribute data. Full repositories are able to forward publications, so they can publish records with attributes that they themselves do not support. Executing DIS CLUSQMGR(*) CLWLPRTY on QM5 returns:
QM1 CLWLPRTY(0)
QM2 CLWLPRTY(0)
QM3 CLWLPRTY(5)
QM4 CLWLPRTY(0)
QM5 CLWLPRTY(0)
This means that all the partial repositories can be upgraded before the full repositories and still take advantage of new function.
Now suppose we add a new V6.0 queue manager (QM6) to the cluster and an application connects to it and opens Q1. At this point QM6 needs to ask the full repositories about Q1 and where it is hosted. The full repositories publish information for Q1 and the queue managers that host it (QM3 and QM4) to QM6. Since the full repositories hold only V5.3 records, the records they send to QM6 do not contain CLWLPRTY attributes. At this point, executing DIS CLUSQMGR(*) CLWLPRTY on QM6 would return:
QM1 CLWLPRTY(0)
QM2 CLWLPRTY(0)
QM3 CLWLPRTY(0)
QM4 CLWLPRTY(0)
QM6 CLWLPRTY(0)
The CLWLPRTY value that QM6 holds for QM3 (0) does not match the value on the actual TO.QM3 object (5). This mismatch will be resolved the next time QM3 publishes its cluster receiver, as the full repositories will then forward the publication (including V6.0 attribute data) from QM3 straight on to QM6. Mismatches like this arise whenever back-level full repositories publish directly from their own repositories, which happens when a queue is opened for the first time or when REFRESH CLUSTER is issued on a partial repository.
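The behaviour in this example can be mimicked with a toy model. The Python sketch below is not how WebSphere MQ is implemented. It only encodes the two rules described above: a queue manager stores the view of a record that its own version can represent, and a full repository forwards a publication unchanged but answers a first-time lookup from its own stored copy. All class and function names are hypothetical.

```python
# Attributes that exist only at V6.0, with their default values.
NEW_IN_V6 = {"CLWLPRTY": 0}


class QueueManager:
    """Toy queue manager holding a local repository of channel records."""

    def __init__(self, name, version):
        self.name = name
        self.version = version      # e.g. (5, 3) or (6, 0)
        self.repository = {}        # channel name -> stored record

    def store(self, record):
        """Keep the view of `record` that this version can represent."""
        if self.version >= (6, 0):
            # Missing V6.0 attributes take their default values.
            stored = {**NEW_IN_V6, **record}
        else:
            # A V5.3 record has no room for V6.0 attributes.
            stored = {k: v for k, v in record.items() if k not in NEW_IN_V6}
        self.repository[record["CHANNEL"]] = stored


def publish(record, full_repositories, subscribers):
    """Full repositories store their own view and forward the original."""
    for full_repository in full_repositories:
        full_repository.store(record)
    for subscriber in subscribers:
        subscriber.store(record)    # forwarded unchanged


def first_lookup(channel, full_repository, requester):
    """A new subscriber is answered from the full repository's stored copy."""
    requester.store(full_repository.repository[channel])


qm1, qm2 = QueueManager("QM1", (5, 3)), QueueManager("QM2", (5, 3))
qm5, qm6 = QueueManager("QM5", (6, 0)), QueueManager("QM6", (6, 0))
to_qm3 = {"CHANNEL": "TO.QM3", "CLWLPRTY": 5}

publish(to_qm3, [qm1, qm2], [qm5])      # QM3 alters its cluster receiver
first_lookup("TO.QM3", qm1, qm6)        # QM6 joins and opens Q1
print("QM5 sees", qm5.repository["TO.QM3"]["CLWLPRTY"])
print("QM6 sees", qm6.repository["TO.QM3"]["CLWLPRTY"])

publish(to_qm3, [qm1, qm2], [qm5, qm6]) # QM3 publishes again
print("QM6 now sees", qm6.repository["TO.QM3"]["CLWLPRTY"])
```

In the model, QM5 holds priority 5 straight away, QM6 holds the default of 0 after its first lookup, and QM6 is corrected once QM3 publishes again.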
There are also issues with mixed version clusters if IPv6-only capable systems are involved. For more information, see the WebSphere MQ Migration Information manual.
Using new WebSphere MQ V6 features
Once a queue manager has been migrated, you should verify existing function and applications. After this verification, you can introduce new features to the test and then the production environment. WebSphere MQ V6.0 introduced the following new workload balancing features:
- Rank and prioritization of queues and channels
- Use-queue -- Allows workload balancing if a local queue exists
- Most recently used channels -- Reduces the number of active outbound channels
- Channel weighting
These features enable greater out-of-the-box flexibility in cluster workload balancing and reduce the requirement for a number of commonly used cluster workload exits. If the attributes relating to the new features are set to default values, workload balancing behaviour will be the same as in previous versions of WebSphere MQ. In a mixed version cluster, attributes for new features in cluster records that represent back-level queues and queue managers will be set to default values. For example, imagine that queue manager QM53 is running at WebSphere MQ V5.3 and the following command is issued at a queue manager QM6, which is running at WebSphere MQ V6.0:
DISPLAY CLUSQMGR(QM53) CLWLRANK
AMQ8441: Display Cluster Queue Manager details.
CLUSQMGR(QM53) CHANNEL(TO.QM53) CLUSTER(DEMO) CLWLRANK(0)
The actual channel definition for TO.QM53 on QM53 does not have the CLWLRANK attribute, but the cluster queue manager object for QM53 in QM6's repository does, so the repository record contains the default rank (zero).
Since introducing new workload balancing features adds complexity to the cluster workload algorithm, you should introduce new features one-by-one.
WebSphere MQ for z/OS V6 also introduces dynamic cache (already available on distributed platforms), which can extend the size of the cluster cache dynamically at runtime. A dynamic cache improves on the static cache used in previous versions of WebSphere MQ for z/OS, which required a queue manager restart to resize the cache. It also avoids the application errors and error messages associated with a full static cache. Cache type is set using the CLCACHE parameter in the CSQ6SYSP system parameter macro. The default type is static, because dynamic cache is not compatible with some cluster workload exits. To change the cache type, update the system parameter module and restart the queue manager.
Full repository availability
In general, the availability of full repository queue managers is important because all object publications (due to definition changes or the normal 27-day republish cycle) are sent via the full repositories. If there are no full repositories available, publications cannot be propagated around the cluster. Application traffic over existing cluster channels between partial repositories is not affected by full repository availability. It may be acceptable to have all full repositories unavailable if there are few or no definitional changes occurring during an outage.
During migration, when a queue manager is started at the new version, its local cluster objects are changed, since they now have new attributes. The definition change causes the queue manager to publish the changes to the full repositories. If the full repositories are available, they will publish the records to partial repositories that are subscribed to the changed objects. If the full repositories are not available, other partial repositories will not be informed of the change to the object, but they will still have any existing records for the changed object and therefore continue to function.
This article explained how to deal with some of the major issues involved when migrating cluster queue managers. To minimise the impact of these issues, it is important to invest in pre-migration planning and post-migration testing. WebSphere MQ's support of mixed-version clusters, which allows staged migrations, is a powerful asset to companies wishing to move to the next version of WebSphere MQ.
The author would like to thank IBM Developers Andrew Banks and Gavin Beardall for their help in reviewing this article.
WebSphere MQ V6 trial download
A no-charge trial download of WebSphere MQ V6. Includes limited online support for Windows® and Linux® installations at no charge during the trial period.
WebSphere MQ V6 information center
A single Eclipse-based Web portal to all WebSphere MQ V6 documentation, with conceptual, task, and reference information on installing, configuring, and using your WebSphere MQ environment.
WebSphere MQ documentation library
WebSphere MQ product manuals.
WebSphere MQ product page
Product descriptions, product news, training information, support information, and more.
WebSphere MQ SupportPacs
Downloadable code, documentation, and performance reports for the WebSphere MQ family of products.
WebSphere MQ public newsgroup
A non-IBM forum where you can get answers to your WebSphere MQ technical questions and share your WebSphere MQ knowledge with other users.