Configuring z/OSMF for high availability

In general, z/OSMF cannot be implemented in a high-availability cluster. Each z/OSMF server controls access to its own persistence data store, which cannot be shared with other z/OSMF servers. However, it is possible to create a configuration of multiple servers to provide a level of redundancy for a subset of z/OSMF functions. This topic describes an example use case.

For applications that rely on z/OSMF REST services, it is possible to configure extra z/OSMF servers to ensure availability of those services. The goal of this configuration is to ensure that one z/OSMF server is always available to provide the REST services to your applications.

The following z/OSMF REST services are supported for high availability:

z/OS jobs REST interface services
z/OS data set and file REST service "Retrieve the contents of a z/OS data set or member."

These services are described in IBM z/OS Management Facility Programming Guide.
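
As an illustration of how an application might address these two services, the following Python sketch builds (but does not send) the corresponding requests. The host name and port are the example values used later in this topic; the user ID, password, owner filter, and data set name are hypothetical. The URL paths and the `X-CSRF-ZOSMF-HEADER` header follow the conventions described in the Programming Guide, so verify them against the level of z/OSMF you run.

```python
import base64
import urllib.parse
import urllib.request

# Example host name and port from this topic; replace with your own values.
ZOSMF_BASE = "https://zosmfha.yourcompany.com:34111"


def build_request(path, user, password):
    """Build (but do not send) a GET request for a z/OSMF REST service."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        # Assumption: z/OSMF expects this header on REST requests; any value works.
        "X-CSRF-ZOSMF-HEADER": "true",
    }
    return urllib.request.Request(ZOSMF_BASE + path, headers=headers, method="GET")


def list_jobs_request(owner, user, password):
    """z/OS jobs REST interface: list the jobs that belong to one owner."""
    query = urllib.parse.urlencode({"owner": owner})
    return build_request(f"/zosmf/restjobs/jobs?{query}", user, password)


def read_data_set_request(data_set, user, password):
    """z/OS data set and file REST interface: retrieve a data set or member."""
    name = urllib.parse.quote(data_set, safe="()")
    return build_request(f"/zosmf/restfiles/ds/{name}", user, password)


if __name__ == "__main__":
    # Hypothetical credentials and names, shown only to print the resulting URLs.
    print(list_jobs_request("IBMUSER", "ibmuser", "secret").full_url)
    print(read_data_set_request("IBMUSER.JCL(IEFBR14)", "ibmuser", "secret").full_url)
```

Because the requests target a single host name, the application does not need to know which z/OSMF server answers them.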

To ensure that these services remain available to your applications, you can establish a pair of z/OSMF servers on different systems in your sysplex, with each server defined to its own autostart group. z/OSMF includes one autostart group by default. To have more z/OSMF servers autostarted in a sysplex, you must associate each server—and the systems it serves—with a unique autostart group name. Information about defining autostarted servers is provided in Autostart concepts in z/OSMF.

Use a workload router to switch between the z/OSMF servers if an outage occurs. Figure 1 shows how a z/OSMF workload can be distributed between multiple z/OSMF servers.

Figure 1. You can run multiple servers of z/OSMF, and use Sysplex Distributor to route requests to specific z/OSMF servers. In this type of configuration, critical REST services remain available to your applications across system outages.

In this scenario, a Sysplex Distributor (or a similar workload router) is used to route application requests to a high availability (HA) server on a particular LPAR. If the HA server at system IP address 172.1.1.1 becomes unavailable, the Sysplex Distributor can redirect REST requests to the HA server at system IP address 172.1.1.2. Each HA server maintains its own copy of the z/OSMF data file system, which contains persistence data. No persistence data is shared between the HA instances.

How to set up z/OSMF for high availability

Assume that two z/OS systems in a sysplex, SYS1 and SYS2, are updated, as follows:

Define an IP address for both systems:
1. Add the following statement to the TCPIP profile for the SYS1 system:
```
IPCONFIG DYNAMICXCF 172.1.1.1 255.255.255.0 3
```
2. Add the following statement to the TCPIP profile for the SYS2 system:
```
IPCONFIG DYNAMICXCF 172.1.1.2 255.255.255.0 3
```
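
As a quick sanity check on the two statements above, the following Python sketch parses each one and reports whether both addresses fall in the same subnet under the given mask. It assumes that you want the two dynamic XCF addresses in one subnet; the helper name `xcf_interface` is hypothetical, and the final value on each statement (3 in the example) is ignored.

```python
import ipaddress


def xcf_interface(statement):
    """Parse 'IPCONFIG DYNAMICXCF <address> <mask> <value>' into an IPv4Interface."""
    keyword, option, address, mask, _last = statement.split()
    if (keyword.upper(), option.upper()) != ("IPCONFIG", "DYNAMICXCF"):
        raise ValueError(f"not an IPCONFIG DYNAMICXCF statement: {statement!r}")
    # ip_interface accepts the dotted-decimal mask form, for example /255.255.255.0.
    return ipaddress.ip_interface(f"{address}/{mask}")


sys1 = xcf_interface("IPCONFIG DYNAMICXCF 172.1.1.1 255.255.255.0 3")
sys2 = xcf_interface("IPCONFIG DYNAMICXCF 172.1.1.2 255.255.255.0 3")

# True when both addresses belong to the same network under their masks.
same_subnet = sys1.network == sys2.network
print(sys1.network, sys2.network, same_subnet)
```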

Define a dynamic VIPA (DVIPA) for both SYS1 and SYS2:

VIPADYNAMIC
...
;
;-------------------------------------------------------------
; Test HA for zOSMF
;
 VIPADEFINE 255.255.255.0 10.1.1.1
 VIPADISTRIBUTE DEFINE DISTM HOTSTANDBY 10.1.1.1 PORT 34111
   DESTIP 172.1.1.1 PREFERRED
          172.1.1.2 BACKUP
ENDVIPADYNAMIC

In this example, the VIPADEFINE statement is used to define the DVIPA 10.1.1.1. The VIPADISTRIBUTE statement with PREFERRED and BACKUP settings is used to enable automatic dynamic VIPA takeover to occur, if needed. The system SYS1 is defined as the preferred system and the system SYS2 is defined as the backup system.

These statements are added to the TCPIP profiles for both SYS1 and SYS2.

Register the DVIPA with one hostname so that z/OSMF can be bound to that hostname. Define the z/OSMF hostname in your name server so that it resolves to the DVIPA, for example 10.1.1.1.
In the active IZUPRMxx parmlib member for SYS1 and SYS2, define the z/OSMF hostname and port, for example:
```
HOSTNAME('zosmfha.yourcompany.com')
   HTTP_SSL_PORT(34111)
```

Now assume that both SYS1 and SYS2 are active. Each system has an active z/OSMF server with its own data directory (sometimes called the user directory). Both z/OSMF servers are bound to the DVIPA 10.1.1.1. With both z/OS systems active in the sysplex, the preferred z/OSMF server, on SYS1, receives all new incoming requests. If the SYS1 system fails, new work requests for z/OSMF are routed to the server on SYS2. When SYS1 resumes normal operations, new work requests for z/OSMF are routed to SYS1 again. This behavior occurs because the IP parameter AUTOSWITCHBACK is in effect by default.
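
The routing behavior described in this scenario can be modeled with a short Python sketch. This is a simplified illustration of preferred and backup selection with automatic switchback, not the Sysplex Distributor implementation; the class name `HotStandbyRouter` and its `auto_switchback` flag are hypothetical, and the addresses are the example values from this topic.

```python
from dataclasses import dataclass, field


@dataclass
class HotStandbyRouter:
    """Toy model: one preferred target, one backup target."""

    preferred: str
    backup: str
    auto_switchback: bool = True
    active: str = field(init=False)

    def __post_init__(self):
        # New requests start on the preferred target.
        self.active = self.preferred

    def route(self, available):
        """Return the target for a new request, given the set of available targets."""
        if self.preferred in available and (
            self.auto_switchback or self.active == self.preferred
        ):
            # Preferred target is up; with switchback, it takes over again.
            self.active = self.preferred
        elif self.active not in available:
            # Current target is down; move to whichever target is still up.
            if self.backup in available:
                self.active = self.backup
            elif self.preferred in available:
                self.active = self.preferred
            else:
                raise RuntimeError("no z/OSMF server is available")
        return self.active


if __name__ == "__main__":
    sys1, sys2 = "172.1.1.1", "172.1.1.2"
    router = HotStandbyRouter(preferred=sys1, backup=sys2)
    print(router.route({sys1, sys2}))  # both systems active
    print(router.route({sys2}))        # SYS1 fails
    print(router.route({sys1, sys2}))  # SYS1 resumes normal operations
```

In this model, setting `auto_switchback=False` keeps new requests on the backup target after the preferred target recovers, which shows why the default setting produces the return to SYS1.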

For more information about network configuration, see the following documents: