Configuring z/OSMF for high availability

Normally, z/OSMF cannot be implemented in a high-availability cluster. However, for applications that use a critical subset of z/OSMF REST services, it is possible to configure a second z/OSMF server to ensure availability of those services. In this special configuration, one z/OSMF server is always available to provide the REST services to your applications.

The following z/OSMF REST services are supported for high availability:
  • z/OS jobs REST interface services
  • z/OS data set and file REST service "Retrieve the contents of a z/OS data set or member."

These services are described in IBM z/OS Management Facility Programming Guide.
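
As a rough illustration of how an application might address these two services, the following Python sketch builds the request URLs. It makes several assumptions: the host name and port are borrowed from the example configuration in this topic, the resource paths (/zosmf/restjobs/jobs and /zosmf/restfiles/ds/) and the X-CSRF-ZOSMF-HEADER request header should be verified against the Programming Guide for your release, and the function names are hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumed values, taken from the example configuration in this topic.
ZOSMF_HOST = "zosmfha.yourcompany.com"
ZOSMF_PORT = 34111

# Header that z/OSMF REST requests typically carry (verify for your release).
COMMON_HEADERS = {"X-CSRF-ZOSMF-HEADER": "true"}


def base_url(host=ZOSMF_HOST, port=ZOSMF_PORT):
    """Return the root URL of the z/OSMF REST services."""
    return f"https://{host}:{port}/zosmf"


def list_jobs_url(owner, prefix):
    """URL for a z/OS jobs REST interface request that lists jobs."""
    query = urlencode({"owner": owner, "prefix": prefix})
    return f"{base_url()}/restjobs/jobs?{query}"


def read_data_set_url(data_set_name):
    """URL for retrieving the contents of a data set or member."""
    # Keep the parentheses of a member name such as HLQ.PDS(MEMBER) readable.
    return f"{base_url()}/restfiles/ds/{quote(data_set_name, safe='()')}"
```

Because the application addresses only the shared host name, it does not need to know which z/OSMF server is currently handling its requests.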

To ensure that these services remain available to your applications, you can establish a pair of z/OSMF servers on different systems in your sysplex, and use a workload router to switch between them if an outage occurs. Figure 1 shows how a z/OSMF workload can be distributed between multiple instances.
Figure 1. You can run multiple z/OSMF servers, and use Sysplex Distributor to route requests to specific z/OSMF servers.

In this type of configuration, critical REST services remain available to your applications across system outages.

In this scenario, a Sysplex Distributor (or a similar workload router) is used to route application requests to a high availability (HA) server on a particular LPAR. If the HA server at system IP address 172.1.1.1 becomes unavailable, the Sysplex Distributor can redirect REST requests to the HA server at system IP address 172.1.1.2. Each HA server maintains its own copy of the z/OSMF data file system, which contains persistence data. No persistence data is shared between the HA instances.

How to set up z/OSMF for high availability

Assume that two z/OS systems in a sysplex, SYS1 and SYS2, are configured as follows:
  1. Define an IP address for each system:
    1. Add the following statement to the TCPIP profile for the SYS1 system:
      IPCONFIG DYNAMICXCF 172.1.1.1 255.255.255.0 3
    2. Add the following statement to the TCPIP profile for the SYS2 system:
      IPCONFIG DYNAMICXCF 172.1.1.2 255.255.255.0 3
  2. Define a dynamic VIPA (DVIPA) for both SYS1 and SYS2:
    VIPADYNAMIC
    ...
    ;
    ;-------------------------------------------------------------
    ; Test HA for zOSMF
    ;
     VIPADEFINE 255.255.255.0 10.1.1.1
     VIPADISTRIBUTE DEFINE DISTM HOTSTANDBY 10.1.1.1 PORT 34111
       DESTIP 172.1.1.1 PREFERRED
              172.1.1.2 BACKUP
    ENDVIPADYNAMIC
    

    In this example, the VIPADEFINE statement defines the DVIPA 10.1.1.1. The VIPADISTRIBUTE statement, with its PREFERRED and BACKUP settings, enables automatic dynamic VIPA takeover, if needed. The system SYS1 is defined as the preferred system and the system SYS2 is defined as the backup system.

    Add these statements to the TCPIP profiles for both SYS1 and SYS2.

  3. Register the DVIPA with one host name so that z/OSMF can be bound to that host name. Define the z/OSMF host name in your name server so that it resolves to the DVIPA, for example, 10.1.1.1.
  4. In the active IZUPRMxx parmlib member for SYS1 and SYS2, define the z/OSMF host name and port, for example:
    HOSTNAME('zosmfha.yourcompany.com')
    HTTP_SSL_PORT(34111)
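
The steps above depend on a few values agreeing with each other: the port in the VIPADISTRIBUTE statement must match HTTP_SSL_PORT, and the DESTIP targets must be the addresses that were defined for SYS1 and SYS2. The following Python sketch checks those two points against the example values. It is a hypothetical helper written for illustration, not an IBM tool, and its simple pattern matching assumes statements laid out as in this example.

```python
import re

# Example statements from this topic, reduced to the lines that are checked.
TCPIP_PROFILE = """\
VIPADYNAMIC
 VIPADEFINE 255.255.255.0 10.1.1.1
 VIPADISTRIBUTE DEFINE DISTM HOTSTANDBY 10.1.1.1 PORT 34111
   DESTIP 172.1.1.1 PREFERRED
          172.1.1.2 BACKUP
ENDVIPADYNAMIC
"""

IZUPRM = """\
HOSTNAME('zosmfha.yourcompany.com')
HTTP_SSL_PORT(34111)
"""

# Dynamic XCF addresses defined in step 1.
XCF_ADDRESSES = {"SYS1": "172.1.1.1", "SYS2": "172.1.1.2"}


def find_problems(profile, parmlib, xcf_addresses):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []

    # The distributed port must be the port that z/OSMF listens on.
    distributed = re.search(r"\bPORT\s+(\d+)", profile)
    configured = re.search(r"HTTP_SSL_PORT\((\d+)\)", parmlib)
    if not distributed or not configured:
        problems.append("could not find both port definitions")
    elif distributed.group(1) != configured.group(1):
        problems.append(
            f"VIPADISTRIBUTE port {distributed.group(1)} differs from "
            f"HTTP_SSL_PORT {configured.group(1)}"
        )

    # Every system address must appear as a PREFERRED or BACKUP target.
    targets = set(re.findall(r"(\d+(?:\.\d+){3})\s+(?:PREFERRED|BACKUP)", profile))
    for address in sorted(set(xcf_addresses.values()) - targets):
        problems.append(f"{address} is not a DESTIP target")

    return problems
```

Calling find_problems(TCPIP_PROFILE, IZUPRM, XCF_ADDRESSES) on the example values returns an empty list.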
    

Now assume that both SYS1 and SYS2 are active. Each system has an active z/OSMF server with its own data directory (sometimes called the user directory). Both z/OSMF servers are bound to the DVIPA 10.1.1.1. With both z/OS systems active in the sysplex, the preferred z/OSMF server receives all new incoming requests. If the SYS1 system fails, new work requests for z/OSMF are routed to the server on SYS2. When SYS1 resumes normal operations, new work requests for z/OSMF are routed to SYS1 again. This behavior occurs because the IP parameter AUTOSWITCHBACK is in effect by default.
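
The routing behavior described above can be pictured with a small model. The following Python sketch is a simplified illustration only, not the Sysplex Distributor algorithm: the function name pick_active and its parameters are hypothetical, and the model considers nothing but whether each target is available.

```python
def pick_active(preferred, backups, available, current=None, auto_switch_back=True):
    """Return the target that receives new requests, or None if all are down."""
    # Stay on the current target while it is up, unless switch-back to an
    # available preferred target is enabled.
    if current in available and not (auto_switch_back and preferred in available):
        return current
    # Otherwise take the first available target, preferred first.
    for target in [preferred, *backups]:
        if target in available:
            return target
    return None


SYS1, SYS2 = "172.1.1.1", "172.1.1.2"

# Both systems active: the preferred server receives new requests.
active = pick_active(SYS1, [SYS2], {SYS1, SYS2})
# SYS1 fails: new requests go to the backup server on SYS2.
after_failure = pick_active(SYS1, [SYS2], {SYS2}, current=active)
# SYS1 recovers: new requests return to SYS1 because switch-back is enabled.
after_recovery = pick_active(SYS1, [SYS2], {SYS1, SYS2}, current=after_failure)
```

In this model, passing auto_switch_back=False in the last call keeps new requests on SYS2 after SYS1 recovers.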

For more information about network configuration, see the following documents: