High Availability for ODM Rule Execution Server Console using VIPA

by James Taylor, IBM ODM z/OS developer.

Introduction

This article is designed for z/OS ODM system administrators.
The Rule Execution Server console (RES console) has two purposes:

Provides a web user interface and API for deploying rules to the runtime
Notifies execution units (XUs) that the rules they are using have changed

For most customers, having high availability (HA) for the RES console is not critical. This is because users who notice an absence of the console can ask a system administrator to restart it. Furthermore, the RES console is not required for rules to run, so the rules will still execute even if the RES console is down. Also, the deployed rules won’t change while the console is down, because the console could not have been used to update them; therefore, XUs will not miss any important rule updates.
However, because there may be a high cost associated with a delay to rule updates, some business environments require reliable live changes. The configuration described in the following sections can help ensure high availability.

What is VIPA?

Virtual IP Address (VIPA) support on z/OS allows multiple logical partitions (LPARs) on a system to communicate with the world via TCPIP by using a single host name and the standard range of ports. This contrasts with the usual approach, in which each LPAR uses its own distinct IP address and acts as a separate entity.
When VIPA is enabled and both LPARs host an application/service on the same port, how this is exposed to the world depends on your VIPA configuration. Configurations fall into two groups:

Workload management: With this configuration, when a connection request is made to the shared VIPA hostname and port, a workload algorithm (for example, round robin) decides which LPAR the request is routed to.
Failover: With this configuration, when a connection request is made to the shared VIPA hostname and port, the request is always routed to the same LPAR until that LPAR, or the service/application it provides, becomes unavailable. When that happens, requests are routed to an alternative LPAR. How requests are routed when the primary LPAR recovers depends on the configuration.
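
To make the difference between the two groups concrete, here is a minimal Python sketch of each routing decision. It is a toy model only, not how z/OS implements VIPA; the function names and the LPAR labels are my own assumptions for illustration. Note that this simple failover function returns to the first LPAR as soon as it is available again, which is only one of the possible recovery behaviours.

```python
def make_round_robin(lpars):
    """Workload management (toy model): rotate new connections across LPARs."""
    state = {"next": 0}

    def route(available):
        # Try each LPAR at most once, starting from the rotation point.
        for _ in range(len(lpars)):
            candidate = lpars[state["next"] % len(lpars)]
            state["next"] += 1
            if candidate in available:
                return candidate
        return None  # nothing is available

    return route


def failover_route(lpars, available):
    """Failover (toy model): always use the first listed LPAR that is up."""
    for lpar in lpars:
        if lpar in available:
            return lpar
    return None  # nothing is available


# Hypothetical LPAR names, used only for this illustration.
lpars = ["LPAR1", "LPAR2"]
route = make_round_robin(lpars)
spread = [route({"LPAR1", "LPAR2"}) for _ in range(4)]  # alternates between LPARs
pinned = failover_route(lpars, {"LPAR1", "LPAR2"})      # stays on LPAR1
```

With both LPARs up, the round-robin router alternates between them, whereas the failover router keeps returning the first one until it drops out of the available set.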

How does VIPA apply to the RES Console?

The RES console does not support workload management. Workload management is not needed, because rule updates involve a relatively small amount of work. Only a failover configuration of VIPA is supported.
With a failover configuration, the XU connections and the web UI console requests coming in to the VIPA shared hostname are routed to a primary LPAR, until the RES console application becomes unavailable on that LPAR. At that time, VIPA switches to providing the service from a secondary LPAR.
At the time of failover, all XUs will automatically reconnect to the secondary LPAR, because they attempt to reconnect to the shared hostname when the connection is lost.
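
The reconnect behaviour can be pictured with a short Python sketch. This is not the XU's actual implementation (the XU handles reconnection itself); it only illustrates why retrying against the shared hostname is enough to land on the secondary LPAR after failover. The function name, the retry interval and the hostname `vipa.example.com` are assumptions for illustration.

```python
import socket
import time


def connect_with_retry(host, port, retry_interval=1.0, max_attempts=None,
                       connect=socket.create_connection, sleep=time.sleep):
    """Keep trying the same shared address until a connection succeeds.

    The client never needs to know which LPAR answers: after failover,
    the same host and port are simply served from the other LPAR.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        try:
            return connect((host, port))
        except OSError:
            # Console not reachable yet (for example, mid-failover); wait and retry.
            sleep(retry_interval)
    raise ConnectionError(f"gave up after {attempt} attempts")


# Usage (hypothetical hostname; 44190 is one of the ports from Figure 1):
# conn = connect_with_retry("vipa.example.com", 44190, retry_interval=1.0)
```

The `connect` and `sleep` parameters exist only so that the loop can be exercised without a real network.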
Users of the RES console web user interface will be logged out by the failover and will need to re-authenticate. This is because sessions cannot persist between the two LPARs after the switch. I believe this to be a small inconvenience, and failover should be very rare.
To ensure a correct VIPA configuration, I recommend following the configuration described in this document exactly, because other variations of failover configuration do not perform as expected owing to the interaction of the console with the XUs. Also disregard the configuration given in the Redbook, as it is incorrect.
For the greatest ease and simplicity, I recommend hosting your VIPA-enabled consoles on a WebSphere Liberty profile server. The advanced features of a traditional WebSphere Application Server are not required and, in the case of clustering, are unsupported.

Supported VIPA configuration

VIPA is configured in the TCPIP profile dataset members of each LPAR.
Both LPARs' TCPIP profiles must contain an entry for each of the two ports used by the console: one port for the web UI and the other for notification to XUs. These ports must be the same on each LPAR. Do not attempt to provide VIPA for just one of the ports.

Figure 1: Two additional lines to the TCPIP profile of each LPAR
34190 TCP OMVS SHAREPORT
44190 TCP OMVS SHAREPORT

Having defined the ports you wish to use on both the primary and secondary LPARs, next you need to add the VIPA configuration, to the primary LPAR only:

Figure 2: VIPA TCPIP profile configuration on the primary LPAR
VIPADYNAMIC
  VIPADEFINE MOVE WHENIDLE 255.255.255.252 9.20.9.53
  VIPADISTRIBUTE DISTM HOTSTANDBY NOAUTOSWITCHBACK SYSPLEXPORTS 9.20.9.53
    PORT 34190 44190
    DESTIP 192.168.5.81 PREFERRED
           192.168.5.82 BACKUP
ENDVIPADYNAMIC

Two of the keywords are explained as follows:

HOTSTANDBY specifies a setup using failover rather than workload management.
NOAUTOSWITCHBACK prevents the situation where a failed primary LPAR console restarts with some persistent ‘XU notification’ connections remaining on the backup host. This is vital, as without this setting any XUs left on the backup will miss all future notifications.
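
Because a missing keyword or port is easy to overlook, a quick sanity check of the profile text can help before you activate it. The following Python sketch is my own illustration, not an IBM tool: the function name is hypothetical, and it only looks for the two keywords above and for the console ports after the PORT keyword. It does not validate the syntax of the profile.

```python
REQUIRED_KEYWORDS = ("HOTSTANDBY", "NOAUTOSWITCHBACK")


def check_vipa_profile(profile_text, ports):
    """Return a list of problems found in the VIPADYNAMIC text (empty if none)."""
    tokens = profile_text.upper().split()
    problems = [f"missing keyword {kw}" for kw in REQUIRED_KEYWORDS
                if kw not in tokens]

    # Collect the numeric tokens that directly follow the PORT keyword.
    distributed = set()
    if "PORT" in tokens:
        for token in tokens[tokens.index("PORT") + 1:]:
            if not token.isdigit():
                break
            distributed.add(token)

    for port in ports:
        if str(port) not in distributed:
            problems.append(f"port {port} is not distributed")
    return problems


# The configuration from Figure 2, laid out one statement per line.
FIGURE_2 = """
VIPADYNAMIC
  VIPADEFINE MOVE WHENIDLE 255.255.255.252 9.20.9.53
  VIPADISTRIBUTE DISTM HOTSTANDBY NOAUTOSWITCHBACK SYSPLEXPORTS 9.20.9.53
    PORT 34190 44190
    DESTIP 192.168.5.81 PREFERRED 192.168.5.82 BACKUP
ENDVIPADYNAMIC
"""

problems = check_vipa_profile(FIGURE_2, [34190, 44190])
```

An empty list means that both keywords and both console ports were found; anything else names what is missing.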

IMPORTANT NOTE:
The two consoles being used in the VIPA configuration must have different SSIDs.

Making use of the VIPA configuration

The following are some noteworthy points in making use of the configuration once you have set it up.

Do not directly access the LPAR consoles once VIPA is enabled

Once you have configured and enabled VIPA, you must not allow users to directly access the LPARs via their own unique IP addresses/hostnames (for console access or XU notification). Users must only access the Console via the VIPA address. Mixing and matching between direct access and VIPA access to the LPARs will cause unpredictable results.
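
One simple way to confirm that both console ports answer on the VIPA address is a TCP connection probe. The Python sketch below is only an illustration; the function name and timeout are my assumptions, and a successful TCP connection shows that something is listening on the port, not that the console behind it is healthy.

```python
import socket


def probe_ports(host, ports, timeout=3.0):
    """Return {port: True/False} depending on whether a TCP connect succeeds."""
    results = {}
    for port in ports:
        try:
            # Open and immediately close a connection to the shared address.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


# Usage, with the VIPA address and ports from Figures 1 and 2:
# print(probe_ports("9.20.9.53", [34190, 44190]))
```

Always point the probe at the VIPA address, never at the individual LPAR addresses.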

Selecting your primary and secondary LPARs

With the above configuration, browser console sessions and XU connections will be serviced by whichever console you start first (PREFERRED and BACKUP are ignored when NOAUTOSWITCHBACK is used). So simply start the console on your preferred primary LPAR first, followed by the console on your secondary (backup) LPAR.

Sending traffic back to your primary LPAR after it recovers from failure

When the primary LPAR console is restarted after a failure, VIPA remains bound to the backup server. All existing and future traffic continues to be routed to the backup. This is intentional: it ensures that persistent XU connections remain on the same LPAR as the console sessions.
When you wish to move traffic back to your primary LPAR, simply stop and restart your backup console. This period of backup server unavailability will instantly switch VIPA back to serving from the primary LPAR console. It will also disconnect all the long-standing XU connections and force their reconnection.
This is the only supported way to switch back.
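
The start-up and switch-back behaviour described in the last two sections can be summarised in a small Python model. This is a toy model of the behaviour described in this article, not of z/OS itself; the class name and the "primary"/"backup" labels are assumptions for illustration. The rule it encodes is that traffic stays with the console that has been running longest and moves only when that console stops.

```python
class HotStandbyModel:
    """Toy model of HOTSTANDBY with NOAUTOSWITCHBACK as described above."""

    def __init__(self):
        # Consoles that are currently up, in the order they were started.
        self.running = []

    def start(self, lpar):
        if lpar not in self.running:
            self.running.append(lpar)

    def stop(self, lpar):
        if lpar in self.running:
            self.running.remove(lpar)

    @property
    def active(self):
        """The console that currently receives all traffic, or None."""
        return self.running[0] if self.running else None


model = HotStandbyModel()
model.start("primary")   # started first, so it serves all traffic
model.start("backup")
model.stop("primary")    # failure: traffic moves to the backup
model.start("primary")   # recovery: traffic stays on the backup
model.stop("backup")     # stopping the backup console moves traffic back
model.start("backup")    # the backup returns as the standby
```

At the end of this sequence `model.active` is the primary again, which mirrors the stop-and-restart procedure above.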

XU access via JMX or TCPIP?

When using VIPA to provide high availability for the console, all XUs connect for console notification using TCPIP rather than JMX. Details of how to configure this are provided in the knowledge centre topic “Configuring execution units (XU) to connect to a TCP/IP management server”.
