Running the rolling kernel switch while automation is active

The rolling kernel switch (RKS) itself is capable of restarting SAP with a new kernel by smoothly restarting one SAP instance at a time without any automation solution being active. Therefore, one option is to switch off automation during RKS operations.

Nevertheless, you might not want to lose the failover capabilities that SA z/OS® offers while RKS is running. The following paragraphs describe the conditions under which you can keep SA z/OS automation active while running RKS. See also SAP Note 2131873: z/OS: Automated Rolling Kernel Switch in HA environment for further prerequisites and restrictions.

RKS with automation for Central Services

If your SAP central services are automated via an SA z/OS policy, you can keep this policy active while RKS runs, provided that you meet both of the following conditions:

  1. Use EnqCF replication with a matching SA z/OS policy (see Option 2: Using EnqCF replication only).
  2. Use the HA interface (see SAP HA Interface for SA z/OS).

RKS restarts the SAP central services (including the SAP enqueue server) in-place. The SA z/OS *SAPSRV add-on policy allows this restart in-place only if you use EnqCF replication and have no ERS instance configured.

Using the HA interface ensures that RKS commands to stop and start the Central Services are processed by SA z/OS.

If you use the traditional replication mechanism with a separate ERS instance, the SA z/OS policy does not allow a restart in-place. Instead, the ASCS instance is restarted on the LPAR where ERS was running in order to avoid losing SAP enqueue locks. While this restart is active, RKS cannot run successfully. Therefore, you must disable automation before starting RKS. To do so, either suspend automation or set the automation flag in SA z/OS to NO for the SAP<SID>ASCSX and SAP<SID>AER_X sysplex move groups. After RKS has finished successfully, make sure that you reset the automation flags to YES.
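
The conditions described in this subsection can be summarized as a small decision check. The following Python sketch is illustrative only: the class and function names are hypothetical and are not part of SA z/OS or SAP tooling. It encodes just the three conditions stated above (EnqCF replication in use, no ERS instance configured, HA interface in use).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CentralServicesSetup:
    """Hypothetical description of a central services configuration."""
    enqcf_replication: bool  # EnqCF replication with a matching SA z/OS policy
    ers_configured: bool     # a separate ERS instance is configured
    ha_interface: bool       # the SAP HA interface for SA z/OS is in use


def can_keep_automation_active(setup: CentralServicesSetup) -> bool:
    """Return True if the conditions in this section allow RKS to run
    while the central services policy stays active."""
    return (
        setup.enqcf_replication
        and not setup.ers_configured
        and setup.ha_interface
    )


def advice(setup: CentralServicesSetup) -> str:
    """Translate the check into the action described in the text."""
    if can_keep_automation_active(setup):
        return "keep automation active during RKS"
    return "disable automation before starting RKS"


if __name__ == "__main__":
    # Traditional replication with a separate ERS instance.
    traditional = CentralServicesSetup(
        enqcf_replication=False, ers_configured=True, ha_interface=True
    )
    print(advice(traditional))
```

The check is deliberately conservative: any setup that does not meet all three conditions is treated as one where automation must be disabled first. The prerequisites and restrictions in SAP Note 2131873 still apply and are not modeled here.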

RKS with automation for SAP application servers

If your SAP application servers are automated using SA z/OS (see SAP application servers as proxy resources), take the following into account before running RKS.

SAP recommends that you do not keep automation active for your SAP application server instances when running RKS (see SAP Note 2077934: Rolling kernel switch in HA environments).

One reason for this recommendation is that RKS stops the application server instances with specialized soft shutdown procedures, whereas the automation stops them with the standard (hard) shutdown. In this scenario, disable automation for the SA z/OS proxy resources by setting the automation flag to NO for the sysplex move groups SAP<SID>RM0_X, SAP<SID>RM1_X, …, and so on. After RKS has finished successfully, that is, after it has stopped and restarted the application server instances, reset the automation flag to YES. SA z/OS then recognizes that the application servers have already been restarted, and the status of the SA resources is AVAILABLE.
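
If you maintain a checklist of the move groups whose automation flag must be set to NO and later back to YES, the names can be derived from the naming pattern shown above. The following Python sketch is a hypothetical helper, not part of SA z/OS. It assumes that the groups are numbered consecutively from 0 and that the SID and the number of groups are known. The SID "HA1" in the usage example is a placeholder.

```python
from typing import List


def app_server_move_groups(sid: str, group_count: int) -> List[str]:
    """Build the sysplex move group names SAP<SID>RM<n>_X for n = 0..group_count-1.

    Assumption: groups are numbered consecutively starting at 0, following
    the SAP<SID>RM0_X, SAP<SID>RM1_X pattern shown in the text.
    """
    if group_count < 0:
        raise ValueError("group_count must not be negative")
    return [f"SAP{sid}RM{n}_X" for n in range(group_count)]


if __name__ == "__main__":
    # "HA1" is a placeholder SID used for illustration only.
    for name in app_server_move_groups("HA1", 3):
        print(name)
```

Verify the generated names against your actual SA z/OS policy before using them, because the policy is the authoritative source for the resource names.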

If you have no special requirements for a soft shutdown, you can choose to keep automation for application server instances active during RKS. In this case, you should configure the HA interface for SAP application servers (see SAP HA Interface for SA z/OS).

Resolving PROBLEM / ZOMBIE states at restart of sapstartsrv

When RKS restarts SAP instances, the new SAP kernel is copied from the global executable directory to the local instance executable directory. Although the SAP instance is then started with the updated executables, the sapstartsrv process continues to run at the old level. To pick up the new code level, the sapstartsrv executable has the following built-in automatic restart mechanism:

Whenever a running sapstartsrv detects that the file in the local instance executable directory from which it was started has changed, the running sapstartsrv process restarts itself after five minutes, using the new executable. This restart takes place in-place and is fast. Under certain timing conditions, SA z/OS is not quick enough to detect it. As a consequence, the SAP<SID>A_SRV resource is first shown with an observed status of STOPPING and later with a PROBLEM status (the SA z/OS agent status is ZOMBIE). To resolve this, check that the z/OS UNIX sapstartsrv process is running. If it is, set the SA z/OS state of the resource to UP.
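
The check for a running sapstartsrv process can be scripted. The following Python sketch is one possible approach and rests on several assumptions: a Python interpreter is available in the z/OS UNIX environment, the ps command accepts the POSIX options -e and -o pid,args, and the process appears in the listing under the executable name sapstartsrv. The function names are hypothetical. The sketch only reports process IDs. Setting the SA z/OS state of the resource to UP remains a separate operator action.

```python
import os
import subprocess
from typing import List


def find_sapstartsrv_pids(ps_output: str) -> List[int]:
    """Extract the PIDs of sapstartsrv processes from 'ps -e -o pid,args' output."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)  # PID, then the full command line
        if len(fields) < 2:
            continue
        pid_text, args = fields
        if not pid_text.isdigit():
            continue  # skip the header line
        executable = args.split()[0]
        # Match on the executable name only, so that a command such as
        # "grep sapstartsrv" is not counted.
        if os.path.basename(executable) == "sapstartsrv":
            pids.append(int(pid_text))
    return pids


def running_sapstartsrv_pids() -> List[int]:
    """Run ps and return the PIDs of all sapstartsrv processes it lists."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid,args"],
        capture_output=True, text=True, check=True,
    )
    return find_sapstartsrv_pids(result.stdout)


if __name__ == "__main__":
    pids = running_sapstartsrv_pids()
    if pids:
        print("sapstartsrv is running with PID(s):", pids)
    else:
        print("no sapstartsrv process found")
```

If several SAP instances run on the same system, each has its own sapstartsrv process, so confirm that a reported PID belongs to the instance behind the affected resource, for example by inspecting its command line arguments.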