Step 1: Configure the Advisor and Agents to automatically restart in case of application or system failure (optional)
Although this step is optional, performing it improves the availability of your target applications. If an Agent fails, the Advisor indicates that it has no information for any applications running on that system. As a result, in most cases, target applications on the failing system stop receiving new workload requests until the Agent is restarted. Automatically restarting the Agent on the same system minimizes this perceived outage. You can accomplish this by using automation software or by defining an automatic restart manager (ARM) policy. For more information about defining ARM policies, see z/OS MVS Setting Up a Sysplex. In a sysplex subplexing environment, this step requires additional actions. For information about the changes to this step, see Considerations for automatic restart in a subplexing environment.
The Agent registers with ARM using the following values:
ELEMTYPE=SYSTCPIP
ELEMNAME=EZBsyscloneLBAGENT
TERMTYPE=ELEMTERM
where sysclone is a 1- or 2-character shorthand notation for the name of the MVS™ system. For example, if the sysclone value is 02, the resulting ELEMNAME value is EZB02LBAGENT. For a complete description of the SYSCLONE static system symbol, see z/OS MVS Initialization and Tuning Reference.
The TERMTYPE=ELEMTERM value indicates that if the Agent fails on this system, it should be restarted on this system only.
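The ELEMNAME rule described above can be sketched in a few lines of Python. This is an illustration only: the helper name agent_elemname is hypothetical and is not part of any z/OS interface.

```python
def agent_elemname(sysclone: str) -> str:
    """Build the Agent's ARM element name, EZB<sysclone>LBAGENT.

    Hypothetical helper for illustration; not a z/OS API.
    """
    # The text describes SYSCLONE as a 1- or 2-character value.
    if not 1 <= len(sysclone) <= 2:
        raise ValueError("sysclone must be 1 or 2 characters long")
    return f"EZB{sysclone}LBAGENT"


print(agent_elemname("02"))  # EZB02LBAGENT
```

Because the sysclone value differs on each system, each Agent in the sysplex registers under a different element name.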
If the Advisor or its underlying system fails, the load balancer might continue to distribute workload requests according to the last set of information received from the Advisor, resort to preconfigured weights, or even stop distributing new work requests to the cluster. (The behavior depends on the load balancer implementation; consult the load balancer documentation for details.) Therefore, it is important to restart the Advisor as soon as possible after a failure, so that it can resume communicating with the load balancer and workload request distribution can return to normal. This restart capability should cover both failures of the Advisor itself and failures of the system that the Advisor is running on. The Advisor can run on any system in the sysplex, and can therefore be restarted on any system in the sysplex, as long as it is configured to use dynamic VIPAs and dynamic routing is in effect.
The Advisor registers with ARM using the following values:
ELEMTYPE=SYSTCPIP
ELEMNAME=EZBLBADV
TERMTYPE=ALLTERM
The TERMTYPE=ALLTERM value indicates that the Advisor should be restarted on the same system if the Advisor itself fails, and on a different system if its system fails. Using an ARM policy, you can indicate which systems are eligible to run the Advisor after a system failure. You also need to ensure that the specified backup systems have all the necessary configuration in place so that the Advisor can be restarted there.
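The difference between the two TERMTYPE values can be summarized as a small decision function. The following Python sketch is a simplified model of the behavior described in this step, not of ARM itself: the function name restart_target, the system names, and the rule of choosing the first eligible backup system are all assumptions made for illustration.

```python
from typing import List, Optional


def restart_target(termtype: str, failure: str, home: str,
                   eligible: List[str]) -> Optional[str]:
    """Return the system where the element would be restarted, or None.

    termtype: "ELEMTERM" (Agent) or "ALLTERM" (Advisor).
    failure:  "element" if the program failed, "system" if its system failed.
    home:     the system the element was running on.
    eligible: hypothetical list of backup systems allowed by the ARM policy.
    """
    if termtype not in ("ELEMTERM", "ALLTERM"):
        raise ValueError(f"unknown TERMTYPE: {termtype}")
    if failure == "element":
        # Both values restart the element in place when only the element fails.
        return home
    if failure == "system":
        if termtype == "ELEMTERM":
            # ELEMTERM elements are not moved to another system.
            return None
        # Simplification: pick the first eligible system other than the failed one.
        for system in eligible:
            if system != home:
                return system
        return None
    raise ValueError(f"unknown failure kind: {failure}")


print(restart_target("ELEMTERM", "element", "SYS1", []))               # SYS1
print(restart_target("ELEMTERM", "system", "SYS1", ["SYS2"]))          # None
print(restart_target("ALLTERM", "system", "SYS1", ["SYS2", "SYS3"]))   # SYS2
```

The actual choice of target system is made by ARM according to the active policy; the sketch only shows which cases allow a cross-system restart.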
Special considerations apply when ARM is used and the TCP/IP stack address space terminates, whether because of a failure or a planned operation. When the TCP/IP stack becomes unavailable, the Advisor also terminates, because it can no longer establish any TCP/IP communications. An ARM restart of the Advisor is then likely to fail, because the TCP/IP stack is not available when the restart is attempted. You can handle these scenarios in the following ways:
- Planned outages of the TCP/IP stack
Manually start the Advisor on another system, as soon as the Advisor terminates on the system where TCP/IP is stopped.
- Unplanned outages of the TCP/IP stack
Ensure that an ARM policy (or other automation) is in place to quickly restart the TCP/IP stack on the same system. The Advisor also needs to be quickly restarted on the same system. This can be done by using an automation software package, or by using the TCP/IP profile AUTOLOG statement.
The AUTOLOG statement also has some important considerations:
- You should place the Advisor in the AUTOLOG statement list to ensure that it is started when TCP/IP is started on that system. However, you should specify the NOAUTOLOG parameter on the PORT reservation statements for the Advisor ports in the TCP/IP profile. This prevents TCP/IP from monitoring and attempting to restart the Advisor, as that could interfere with your automation logic or the ARM policy that you have put in place.
- The AUTOLOG function works best on systems where a single TCP/IP stack is active (INET environment). For CINET considerations, see Considerations for automatic restart in a CINET environment.
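To register with ARM, the user IDs under which the Advisor and Agent run (LBADV and LBAGENT in this example) need UPDATE access to the corresponding IXCARM.elemtype.elemname profiles in the FACILITY class. The following RACF commands illustrate this. Because the Agent's ELEMNAME includes the sysclone value (for example, EZB02LBAGENT), verify that the Agent profile name covers the element name that is registered on each system.
RDEFINE FACILITY IXCARM.SYSTCPIP.EZBLBADV UACC(NONE)
RDEFINE FACILITY IXCARM.SYSTCPIP.EZBLBAGENT UACC(NONE)
PERMIT IXCARM.SYSTCPIP.EZBLBADV CLASS(FACILITY) ID(LBADV) ACCESS(UPDATE)
PERMIT IXCARM.SYSTCPIP.EZBLBAGENT CLASS(FACILITY) ID(LBAGENT) ACCESS(UPDATE)
SETROPTS RACLIST(FACILITY) REFRESH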
- If you use AUTOLOG for the Agent, code the NOAUTOLOG parameter on the PORT reservation statement for the Agent port in the TCP/IP profile. Because the Agent does not listen on the port, omitting NOAUTOLOG would cause TCP/IP to cancel and restart the Agent automatically.
- If the Advisor is using IPv6 for the load balancer connections, or if any Agents are using IPv6 to connect to the Advisor, movement of the Advisor is limited to IPv6-enabled TCP/IP stacks.
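The IPv6 restriction in the last item can be expressed as a filter over candidate backup systems. The following Python sketch is illustrative only; the Stack type, the advisor_targets function, and the system names are hypothetical and do not correspond to any z/OS interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stack:
    system: str          # hypothetical system name
    ipv6_enabled: bool   # whether the TCP/IP stack on that system is IPv6-enabled


def advisor_targets(stacks: List[Stack], advisor_uses_ipv6: bool,
                    any_agent_uses_ipv6: bool) -> List[str]:
    """List the systems to which the Advisor could be moved."""
    # IPv6 anywhere in the Advisor's connections restricts movement
    # to IPv6-enabled stacks; otherwise every candidate qualifies.
    needs_ipv6 = advisor_uses_ipv6 or any_agent_uses_ipv6
    return [s.system for s in stacks if s.ipv6_enabled or not needs_ipv6]


candidates = [Stack("SYS1", True), Stack("SYS2", False), Stack("SYS3", True)]
print(advisor_targets(candidates, advisor_uses_ipv6=False, any_agent_uses_ipv6=True))
# ['SYS1', 'SYS3']
```

A check of this kind can help when you decide which systems to name as restart targets in the ARM policy.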