Recommended setup for high availability connections between client and server

This section describes the recommended setup for OSPF and subnet configuration and for the VIPA and source VIPA functions on remote application servers.

OSPF and subnet configuration aspects

In an SAP on IBM Z environment, transparent recoveries from NIC failures with OSPF can be achieved only if:

  • All NICs on a machine belong to different subnets and
  • VIPAs are set up on all machines in the system, on the database servers as well as on the application servers.

A host in a subnet is either directly accessible in its local subnet or it is in a remote subnet and the first gateway in the path to that subnet is directly accessible. OSPF does not change a subnet route if a host in a directly accessible subnet becomes inaccessible but other hosts in the subnet are still accessible.

OSPF changes a route to a subnet only in the following two cases:

  • Case A, where both OSA adapters/NICs are in the same subnet: If OSPF's own primary NIC connecting to a directly accessible subnet fails, OSPF switches the route to that subnet to the backup (secondary) NIC. For OSPF, the primary NIC connecting to a subnet is the adapter that is used to exchange OSPF data. If the secondary NIC fails, OSPF does not have to change its current route to that subnet because it can still reach the subnet over its primary NIC. However, in a 'one subnet' environment with VIPA support and two separate connection paths, OSPF's primary NIC may not be the NIC over which the SAP database traffic flows.

    The problem can be solved if OSPF recognizes each adapter on a machine as its primary NIC to a subnet. This can be achieved by running each NIC on a machine in its own subnet.

  • Case B, where both OSA adapters/NICs are in the same subnet: OSPF recalculates the route to a subnet/host that is not directly accessible ('remote') if its 'gateway' to the remote subnet/host is down.

    Therefore, if the NIC on a non-z/OS® application server fails, OSPF on z/OS does not recalculate its routing table because the directly accessible subnet to which the failed NIC belongs is still reachable (case A) and this subnet has no gateway to another remote subnet.

    However, on the application server, OSPF does recalculate the route for the outbound traffic to the z/OS VIPA subnet because its gateway to the remote z/OS VIPA subnet has failed. As a result, the routing tables on the two sides differ and the users who are connected to this application server experience downtime.

    The problem can be solved by ensuring that a remote subnet/host becomes inaccessible when the NIC on the application server fails. This can be achieved by defining a VIPA on the non-z/OS application server: OSPF on z/OS then also recalculates its routing table and the routing tables converge.

For the configuration shown in Figure 1, this means that six different subnets are needed to exploit VIPA on both sides, on the z/OS database server and on the application servers on AIX® and Linux® on IBM Z.

Optionally, you can run with only four subnets, with two subnets for the OSAs, one for the z/OS VIPAs, and one for the remote application server VIPAs, if you define the VIPAs as hosts (/32 netmask) to OSPF.

VIPA and source VIPA functions on remote application servers

Because each SAP work process on an application server initiates a TCP/IP connection to the z/OS database server, and because of the way TCP/IP handles connection establishment, an additional VIPA feature, the so-called Source VIPA function, is needed on the application server side:

  • Without Source VIPA: When the Source VIPA function is not used and a request to set up a connection is processed on the application server, the IP address of the NIC of the application server is put into the 'request' IP packet as source IP address before it is sent to z/OS. z/OS sends its response to exactly that source IP address. This behavior does not allow the exploitation of VIPAs on the application server side because, viewed from the z/OS side, the application server VIPA never shows up as the IP address of a connection that 'originates' on the application server. This makes transparent recovery from adapter failures on the application server impossible.
  • With Source VIPA: When the Source VIPA function is used, the VIPA is put into the IP header of an IP packet as the source IP address. The exploitation of VIPA on the application server then allows transparent recovery from NIC failures on the application server.

The VIPA function is available in AIX. The administrator can control for which interface(s) the VIPA is used as source address for outgoing packets (source VIPA).

The VIPA function is available on Linux on IBM Z via the so-called dummy device. For detailed information concerning the definition of a VIPA under Linux on IBM Z, see VIPA - minimize outage due to adapter failure in Linux on IBM Z - Device Drivers, Features, and Commands, SC33-8411, available from https://www.ibm.com/docs/en/linux-on-systems?topic=linuxonibm/liaaf/lnz_r_devdd.htm.
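For illustration only, a dummy device carrying a VIPA can be created non-persistently with iproute2 commands such as the following (the interface name and the address 10.1.100.1 match the qethconf example below); the referenced manual and the distribution-specific topics describe the persistent definition:

ip link add dummy0 type dummy            # create the dummy device
ip addr add 10.1.100.1/32 dev dummy0     # assign the VIPA with a /32 host mask
ip link set dev dummy0 up                # activate the interface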

With RHEL 8.x, you may use the NetworkManager CLI to define a dummy interface. Read Static VIPA definitions required for Red Hat Enterprise Linux for a sample of how to create such a dummy interface.
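As a minimal sketch (the connection name, interface name, and VIPA address are only examples), such a dummy interface might be created with nmcli like this:

nmcli connection add type dummy ifname dummy0 con-name vipa-dummy0 \
     ipv4.method manual ipv4.addresses 10.1.100.1/32
nmcli connection up vipa-dummy0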

It is not recommended to use the Source VIPA utility as described because of its dependency on the LD_PRELOAD feature, which for security reasons is disabled for any processes running with UID=0.

In the Device Drivers manual, the section on Standard VIPA is relevant. Of special importance is the qethconf command, which must be used to register any Linux VIPAs with all OSA ports that are used as gateways for the VIPAs on the local interfaces. For example, if you have a dummy0 interface with a VIPA of 10.1.100.1 and two local OSA interfaces eth0 and eth1, then the following commands must be issued:

qethconf vipa add 10.1.100.1 eth0
qethconf vipa add 10.1.100.1 eth1
Note: The qethconf command supports only layer 3 OSA interfaces, not layer 2. Therefore, the two interfaces that are mentioned in the example must be defined as layer 3. You can check this by running the lsqeth command under root authority.
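For example, the layer setting of an interface might be checked like this (the exact output format depends on the s390-tools level):

lsqeth eth0 | grep layer2      # a value of 0 indicates a layer 3 interface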
Note: The interface names are from a SLES system. For RHEL 8.x, replace interface names like eth<x> or hsi<x> with enc<device number>, because the qeth device driver assigns the same naming scheme, enc<device number>, to both Ethernet and HiperSockets devices.

If the two commands shown above are not issued, inbound IP packets with a destination IP address of the VIPA 10.1.100.1 are dropped by the OSA card(s). This is because the OSA card operates at the layer 3 level and supports multiple IP addresses with a single MAC address. If the VIPA is not registered, the OSA does not know the device numbers to which the IP packet should be forwarded. See Static VIPA definitions required for SUSE for a solution that shows how to issue the qethconf command when an eth<x> interface is displayed at boot time (in this solution the setvipa script was used).

With SLES 12 and RHEL 5.1 or higher, Quagga provides the ability to set the source IP entry in any routes that it adds to the IP stack's routing table.

Starting with RHEL 8.x, Quagga is deprecated. It has been replaced by Free Range Routing (FRR); see https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/8.1_release_notes/rhel-8_1_0_release.

To enable FRR for RHEL 8.x:

  1. Install FRR package on the system (yum install frr).
  2. Edit the /etc/frr/daemons configuration file to enable the required daemons (ospfd=yes) as the default is 'no' for all daemons.
  3. If zebra is in the list of daemons, also enable 'zebra=yes' to use FRR.
  4. Create ospfd.conf and zebra.conf files in /etc/frr directory. For sample contents of the files under Linux on IBM Z, see Linux on IBM Z network settings. For sample contents of the files under Linux on IBM® Power® Systems, see Tips when using OSPF routing in Red Hat Enterprise Linux on IBM Power Systems.
  5. Enable the frr services: systemctl enable frr.
  6. Start the frr services: systemctl start frr.
  7. Check that frr daemon services are running: systemctl status frr.
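A condensed sketch of these steps on a RHEL 8.x system (the exact content of /etc/frr/daemons depends on the FRR package level):

yum install frr                  # step 1: install the FRR package
# steps 2 and 3: in /etc/frr/daemons enable the required daemons, for example
#   ospfd=yes
#   zebra=yes   (only if zebra appears in the list of daemons)
# step 4: create /etc/frr/ospfd.conf and /etc/frr/zebra.conf with the sample
#         contents from the topics referenced above
systemctl enable frr             # step 5: enable the frr service
systemctl start frr              # step 6: start the frr service
systemctl status frr             # step 7: check that the frr daemons are running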

Routes that are learned by the ospfd daemon are first passed to the zebra daemon, which can process them before passing them to the IP stack via the NETLINK interface.

The zebra daemon has a well-established route-map and prefix-list filter feature, to which the ability to set a source IP via a 'set src' subcommand has been added.

For this example, assume that all z/OS VIPAs are in the subnet 10.1.100.0/24 and that you want to set the source IP of your own VIPA address 10.1.200.1 only for routes to these z/OS VIPAs.
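The following zebra.conf fragment is a minimal sketch of such a configuration; the prefix-list and route-map names are only examples, and the VIPA 10.1.200.1 must be defined on a local (dummy) interface:

! match all routes to the z/OS VIPA subnet 10.1.100.0/24, including /32 host routes
ip prefix-list ZOS-VIPAS seq 10 permit 10.1.100.0/24 le 32
!
! set the local VIPA as source IP for all matching routes
route-map SET-SRC permit 10
 match ip address prefix-list ZOS-VIPAS
 set src 10.1.200.1
!
! apply the route-map to the OSPF routes that zebra installs into the IP stack
ip protocol ospf route-map SET-SRC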

Recommended setup for a high availability network

Figure 1 shows the recommended setup for a high availability network between the SAP application server and the z/OS database server (or NFS, SCS, and so on) that results from the previous considerations in this topic:

  • Db2® data sharing (for DB server)
  • Duplicate network hardware components
  • Db2 connection failover (for ABAP and Java™ application servers)
  • Different subnets for OSPF
  • VIPA exploitation on z/OS
  • VIPA and Source VIPA exploitation on the application server side.
Important:

This recommended HA network setup allows transparent recovery from most kinds of network outages. If you use OSPF, any failure in the network path is detected and can be handled transparently to the highest degree, because the failing OSPF heartbeats that probe the network path reveal the outage. If you do not have such demanding requirements for your network availability, read Alternative network setup, which describes a simpler setup.

Figure 1. Recommended setup for a high availability network

In this configuration, all NICs on one machine (z/OS and remote application server) and all VIPAs belong to different subnets. This generates the following routing alternatives:

  • VIPA 10.96.1.1 (of subnet 10.96.1.x) on z/OS A can be reached from SAP application server A by normal IP routing over subnet 10.1.1.x (10.1.1.3 - Switch 1 - 10.1.1.1) or subnet 10.1.2.x (10.1.2.3 - Switch 2 - 10.1.2.1).
  • Source VIPA 10.98.1.1 (of subnet 10.98.1.x) on SAP application server A can be reached from z/OS A by normal IP routing over subnet 10.1.1.x (10.1.1.1 - Switch 1 - 10.1.1.3) or subnet 10.1.2.x (10.1.2.1 - Switch 2 - 10.1.2.3), accordingly.

Alternatively, you can run with only four subnets if you define the VIPAs as hosts (/32 netmask) to OSPF: two subnets for the OSAs, one for the z/OS static VIPAs and the remote application server VIPAs, and one for the dynamic z/OS VIPAs.

The following table shows the recovery attributes of the recommended setup.

Table 1. Recovery attributes of the recommended setup
Failing network component     Recovery mechanism        Impact on SAP users
NIC on application server     OSPF/VIPA                 Transparent
NIC on z/OS, switch, cable    OSPF/VIPA                 Transparent
z/OS TCP/IP stack             Db2 connection failover   Reconnect (directly or after one connect timeout)

The remote application server detects the failure of the switch no later than the end of OSPF's dead router interval, which is 40 seconds by default. If a shorter interval is required, it is recommended to use a value of 10 seconds (or a different value that fits your requirements, after careful investigation).
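With FRR, for example, these timers can be tuned per interface in ospfd.conf; the interface name and the values below are only illustrations, and the hello and dead intervals must match those of the OSPF neighbors on the same network segment:

interface enc1000
 ! send hellos every 2 seconds, declare the neighbor dead after 10 seconds without hellos
 ip ospf hello-interval 2
 ip ospf dead-interval 10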

Additional considerations

Assume that the primary DB server (static VIPA 10.96.1.1) of application server (AS) A is the data-sharing member DSN1 running in the z/OS A host, and its secondary DB server (static VIPA 10.97.1.1) is the z/OS B host running DSN2. Also, assume that AS A's DB connections go to DSN1.

If the z/OS A host is down or not reachable, AS A running OSPF quickly detects that there is no longer a route to the VIPA 10.96.1.1. Reconnection requests initiated by the application server's Db2 connection failover functionality are routed or forwarded to the default gateway of the AS. If the z/OS VIPA belongs to a private network, such as 10.x.x.x, the gateway does not forward such a packet; it drops it and sends a 'connection refused' reply.

Because the Db2 connection failover functionality makes three retries to the VIPA 10.96.1.1 of its primary DB server, it quickly receives three 'connection refused' replies and almost immediately tries to connect to the VIPA 10.97.1.1 of its secondary DB server, the z/OS B host (which runs the DSN2 data-sharing member).

A failover to the secondary DB server does not happen immediately, however, if the static VIPA of z/OS A belongs to a subnet that is forwardable by the default gateway; with the TCP default settings it takes minutes. In this case the connect IP packet is not dropped by the default gateway and therefore does not generate a 'connection refused' reply.

Because there is no reply, the AS reconnection attempt runs into a connection timeout, which by default is about 75 seconds on AIX and 180 seconds under Linux. The Db2 connection failover functionality makes three retries before it eventually tries to connect to the static VIPA of the secondary DB server.

If you cannot implement a setup where the default gateway drops the connection request packet, you should first consider adapting the CLI connection timeout parameters ConnectionTimeout and tcpipConnectTimeout as described in subsection Recommended Usage in topic Setup of CLI Failover with the SAP Failover Configuration Tool of the Database Administration Guide for SAP on IBM Db2 for z/OS. This is the recommended way. If this is not an option, you must adapt the following TCP parameters to achieve an acceptable time for an application server Db2 connection failover:

On supported Linux platforms and distributions:
sysctl -w net.ipv4.tcp_syn_retries=2
or adapt the setting permanently in /etc/sysctl.conf:
net.ipv4.tcp_syn_retries=2
On supported AIX releases:
no -p -o tcp_keepinit=40
Make the setting permanent with an entry in /etc/tunables/nextboot.
Note: The described settings apply to all outgoing TCP V4 connections that are established on the application server system.

An extended failover time is also observed if OMPROUTE under z/OS is set up to advertise a subnet route for the static VIPA to its neighbors, and the static VIPA fails but the TCP stack remains operational, which is an unlikely scenario.