Running the rolling kernel switch while automation is active
The rolling kernel switch (RKS) itself is capable of restarting SAP with a new kernel by smoothly restarting one SAP instance at a time without any automation solution being active. Therefore, one option is to switch off automation during RKS operations.
Nevertheless, you might not want to lose the failover capabilities that SA z/OS® offers during RKS. The following paragraphs describe the conditions under which you can keep SA z/OS automation active while running RKS. See also SAP Note 2131873: z/OS: Automated Rolling Kernel Switch in HA environment for further prerequisites and restrictions.
RKS with automation for Central Services
If your SAP central services are automated via an SA z/OS policy, you can keep this policy active while RKS runs, provided that you meet both of the following conditions:
- Use EnqCF replication with a matching SA z/OS policy (see Option 2: Using EnqCF replication only).
- Use the HA interface (see SAP HA Interface for SA z/OS).
RKS restarts the SAP central services (including the SAP enqueue server) in-place. The SA z/OS *SAPSRV add-on policy allows this restart in-place only if you use EnqCF replication and have no ERS instance configured.
Using the HA interface ensures that RKS commands to stop and start the Central Services are processed by SA z/OS.
If you use the traditional replication mechanism with a separate ERS instance, the SA z/OS policy does not allow a restart in-place. Instead, the ASCS instance is restarted on the LPAR where ERS was running in order to avoid losing SAP enqueue locks. While this restart is active, RKS cannot run successfully. Therefore, you must disable automation before starting RKS.
To do so, either suspend the SAP<SID>ASCSX and SAP<SID>AER_X sysplex move groups or change their automation flag in SA z/OS to NO. After RKS has finished successfully, make sure that you reset the automation flags to YES.
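The conditions above can be summarized as a small decision helper. This is only a sketch: the function name, its parameters, and the returned dictionary are hypothetical and not part of SA z/OS or SAP. The case of EnqCF replication without the HA interface is not spelled out in the text, so the sketch assumes that automation should not stay active in that case and names no groups for it.

```python
from typing import Dict, List, Union


def central_services_rks_plan(
    sid: str, uses_ers_instance: bool, ha_interface_enabled: bool
) -> Dict[str, Union[bool, List[str]]]:
    """Decide whether SA z/OS automation may stay active for central services.

    Hypothetical helper that mirrors the rules described in the text.
    """
    if uses_ers_instance:
        # Traditional replication: no in-place restart is allowed, so
        # automation must be disabled for these two sysplex move groups.
        return {
            "keep_automation": False,
            "disable_groups": [f"SAP{sid}ASCSX", f"SAP{sid}AER_X"],
        }
    if not ha_interface_enabled:
        # Assumption: without the HA interface, RKS stop/start commands are
        # not processed by SA z/OS, so automation should not stay active.
        # The text names no groups for this case.
        return {"keep_automation": False, "disable_groups": []}
    # EnqCF replication only, plus the HA interface: both conditions are met.
    return {"keep_automation": True, "disable_groups": []}
```

For example, calling `central_services_rks_plan("HA1", True, True)` for a hypothetical SID `HA1` reports that automation must be disabled for `SAPHA1ASCSX` and `SAPHA1AER_X` before RKS is started.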
RKS with automation for SAP application servers
If your SAP application servers are automated using SA z/OS (see SAP application servers as proxy resources) then you should take the following into account before running RKS.
SAP recommends that you do not have automation active for your SAP application server instances when running RKS (see SAP Note 2077934: Rolling kernel switch in HA environments).
One reason for this recommendation is that RKS uses specialized soft shutdown procedures to stop the application server instances, instead of the standard (hard) shutdown that the automation uses. In this scenario, disable automation for the SA z/OS proxy resources by setting the automation flag to NO for the sysplex move groups SAP<SID>RM0_X, SAP<SID>RM1_X, …, and so on. After RKS has finished successfully and has stopped and restarted the application server instances, reset the automation flag to YES. SA z/OS then recognizes that the application servers have already been restarted, and the status of the SA resources is AVAILABLE.
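To build the list of move groups whose automation flag is to be set to NO and later back to YES, the naming pattern above can be expanded programmatically. The following is a minimal sketch: the function name is hypothetical, and it assumes that the groups are numbered contiguously from 0, which you should verify against your own SA z/OS policy.

```python
from typing import List


def app_server_move_groups(sid: str, instance_count: int) -> List[str]:
    """Return the names SAP<SID>RM0_X, SAP<SID>RM1_X, ... for one SAP system.

    Assumption: group indexes run contiguously from 0 to instance_count - 1.
    """
    if instance_count < 0:
        raise ValueError("instance_count must not be negative")
    return [f"SAP{sid}RM{index}_X" for index in range(instance_count)]
```

For a hypothetical SID `HA1` with three application server instances, the result is `SAPHA1RM0_X`, `SAPHA1RM1_X`, and `SAPHA1RM2_X`.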
If you have no special requirements for a soft shutdown, you can choose to keep automation for application server instances active during RKS. In this case, you should configure the HA interface for SAP application servers (see SAP HA Interface for SA z/OS).
Resolving PROBLEM / ZOMBIE states at restart of sapstartsrv
When RKS restarts SAP instances, the new SAP kernel is copied from the global executable directory to the local instance executable directory. While the SAP instance is now started using the updated executables, the sapstartsrv process continues to run with the old level. To pick up the new code level, SAP has the following automatic restart mechanism that is built into the sapstartsrv executable:
Whenever a running sapstartsrv detects that the file in the local instance executable directory from which it was started has changed, the running sapstartsrv process restarts itself after five minutes, using the new executable. This restart takes place in-place and is rather fast. Under certain timing conditions, SA z/OS is not quick enough to detect this. As a consequence, the SAP<SID>A_SRV resource is first shown with an observed status of STOPPING and is later shown with a PROBLEM status (SA z/OS agent status is ZOMBIE). To resolve this, check that the z/OS UNIX sapstartsrv process is running. If so, set the SA z/OS state of the resource to UP.
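The check for a running sapstartsrv process can be scripted. The sketch below is an illustration, not a supported tool: the function names are hypothetical, and it assumes that Python is available in the z/OS UNIX environment and that the standard `ps -e -o pid,args` invocation is permitted for your user. It only reports the process IDs; setting the resource state to UP remains a manual SA z/OS action.

```python
import subprocess
from typing import List


def sapstartsrv_pids(ps_output: str) -> List[int]:
    """Extract the PIDs of sapstartsrv processes from 'ps -e -o pid,args' output."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # skip the header line and malformed lines
        pid, args = fields
        executable = args.split()[0]
        # Compare only the file name, so that commands such as
        # 'grep sapstartsrv' are not counted as a running sapstartsrv.
        if executable.rsplit("/", 1)[-1] == "sapstartsrv":
            pids.append(int(pid))
    return pids


def list_processes() -> str:
    """Return the process list of the local system (POSIX ps options)."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

If `sapstartsrv_pids(list_processes())` returns a non-empty list on the LPAR where the instance runs, a sapstartsrv process is running. If several SAP instances run on the same LPAR, inspect the command arguments as well to confirm that the process belongs to the affected instance.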