How To
Summary
In a scale-up SLES for SAP cost-optimized configuration, non-production HANA databases run on the secondary node. If there is a problem with the production database resource on the primary node, it fails over to the secondary node. The non-production database resources are then stopped until the production database can be failed back to the primary node.
Objective
The aim is to return to normal operation, with all the resources running in the usual location.
Environment
- SLES for SAP cost-optimized configuration
- The production HANA database resource runs as a master/slave resource across both nodes for high availability
- One or more non-production HANA databases run on the secondary node
Steps
- Check that both cluster nodes are online and that the resources have failed over. Here is an example; you can run the command as root on either node.
# crm_mon -A -1
Stack: corosync
Current DC: saphana-node4-kvm (version 1.1.19+20181105.ccd6b5b10-3.16.1-1.1.19+20181105.ccd6b5b10) - partition with quorum
Last updated: Thu Jul  2 08:05:59 2020
Last change: Thu Jul  2 08:05:27 2020 by root via crm_attribute on saphana-node4-kvm

2 nodes configured
7 resources configured

Online: [ saphana-node3-kvm saphana-node4-kvm ]

Active resources:

 rsc_ip_HA1_HDB10       (ocf::heartbeat:IPaddr2):       Started saphana-node4-kvm
 Master/Slave Set: msl_SAPHana_HA1_HDB10 [rsc_SAPHana_HA1_HDB10]
     Masters: [ saphana-node4-kvm ]
     Slaves: [ saphana-node3-kvm ]
 Clone Set: cln_SAPHanaTopology_HA1_HDB10 [rsc_SAPHanaTopology_HA1_HDB10]
     Started: [ saphana-node3-kvm saphana-node4-kvm ]
 stonith-sbd    (stonith:external/sbd): Started saphana-node3-kvm

Node Attributes:
* Node saphana-node3-kvm:
    + hana_ha1_clone_state            : DEMOTED
    + hana_ha1_op_mode                : logreplay
    + hana_ha1_remoteHost             : saphana-node4-kvm
    + hana_ha1_roles                  : 4:S:master1:master:worker:master
    + hana_ha1_site                   : SITEA
    + hana_ha1_srmode                 : syncmem
    + hana_ha1_sync_state             : SOK
    + hana_ha1_version                : 2.00.040.00.1553674765
    + hana_ha1_vhost                  : saphana-node3-kvm
    + lpa_ha1_lpt                     : 30
    + maintenance                     : off
    + master-rsc_SAPHana_HA1_HDB10    : 100
* Node saphana-node4-kvm:
    + hana_ha1_clone_state            : PROMOTED
    + hana_ha1_op_mode                : logreplay
    + hana_ha1_remoteHost             : saphana-node3-kvm
    + hana_ha1_roles                  : 4:P:master1:master:worker:master
    + hana_ha1_site                   : SITEB
    + hana_ha1_srmode                 : syncmem
    + hana_ha1_sync_state             : PRIM
    + hana_ha1_version                : 2.00.040.00.1553674765
    + hana_ha1_vhost                  : saphana-node4-kvm
    + lpa_ha1_lpt                     : 1593691527
    + master-rsc_SAPHana_HA1_HDB10    : 150
In this case, node3 is the primary node and node4 is the secondary.
Both nodes are online, but the "master" database resource is running on the secondary node, and the "slave" resource is running on the primary.
Check the hana_<sid>_sync_state attribute. In this case, it is PRIM on the failover node and SOK on the primary node. Note: If the sync state is not SOK, failing back is not possible.
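With SID HA1 as in this example, you can filter the same output down to just the sync state attribute, for example:

# crm_mon -A -1 | grep -E "\* Node|sync_state"
* Node saphana-node3-kvm:
    + hana_ha1_sync_state             : SOK
* Node saphana-node4-kvm:
    + hana_ha1_sync_state             : PRIM

If the SAPHanaSR package provides it on your system, SAPHanaSR-showAttr prints a similar summary of these attributes.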
- Also check that AUTOMATED_REGISTER is set to "true" for the production database resource primitive. Run this command as root on either node, and look for AUTOMATED_REGISTER.
/usr/sbin/crm configure show
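For reference, the production database primitive typically looks something like the sketch below. The timeouts and other parameter values here are illustrative and will differ on your system; AUTOMATED_REGISTER="true" is the setting that matters for this procedure. You can also limit the output to the one resource, for example with crm configure show rsc_SAPHana_HA1_HDB10.

primitive rsc_SAPHana_HA1_HDB10 ocf:suse:SAPHana \
        params SID=HA1 InstanceNumber=10 PREFER_SITE_TAKEOVER=true \
        DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=true \
        op start timeout=3600 op stop timeout=3600 \
        op promote timeout=3600 \
        op monitor interval=60 role=Master timeout=700 \
        op monitor interval=61 role=Slave timeout=700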
- If the sync state is SOK on the intended primary node, put the failover node into standby mode. You can run this command on either cluster node; the -w option waits for the resources to move before the command finishes.
crm -w node standby saphana-node4-kvm
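Before you continue, you can confirm that the node really is in standby; the output should look something like this:

# crm_mon -1 | grep -i standby
Node saphana-node4-kvm: standby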
- Next, run this command as root on either node to monitor the resources and watch the production database resource fail back to the primary node.
crm_mon -r -n
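Once the failback has completed, the resource section of the output should look roughly like this (trimmed). Note that node4 is still in standby at this point, so the resource instances on it are stopped:

Node saphana-node4-kvm: standby
Online: [ saphana-node3-kvm ]

 rsc_ip_HA1_HDB10       (ocf::heartbeat:IPaddr2):       Started saphana-node3-kvm
 Master/Slave Set: msl_SAPHana_HA1_HDB10 [rsc_SAPHana_HA1_HDB10]
     Masters: [ saphana-node3-kvm ]
     Stopped: [ saphana-node4-kvm ]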
- Check the hana_<sid>_sync_state once more to confirm that it is now PRIM on the primary node and SOK on the secondary.
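You can also cross-check from the database side. Assuming SID HA1 as in this example, run hdbnsutil as the <sid>adm user on the primary node; the output below is trimmed and illustrative:

# su - ha1adm -c "hdbnsutil -sr_state"
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~
online: true
mode: primary
site name: SITEA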
- When that is done, you can take the secondary node out of standby.
crm node online saphana-node4-kvm
- Check the status of the resources once more.
crm_mon -A -1
- We expect these results:
  - The production database msl resource is master on the primary node and slave on the secondary node.
  - The primary node is PROMOTED and the secondary node is DEMOTED.
  - The hana_<sid>_sync_state is PRIM on the primary node and SOK on the secondary.
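Taken together, the relevant parts of the crm_mon -A -1 output should now mirror the first example with the roles swapped, for instance (trimmed):

Online: [ saphana-node3-kvm saphana-node4-kvm ]

 Master/Slave Set: msl_SAPHana_HA1_HDB10 [rsc_SAPHana_HA1_HDB10]
     Masters: [ saphana-node3-kvm ]
     Slaves: [ saphana-node4-kvm ]

* Node saphana-node3-kvm:
    + hana_ha1_clone_state            : PROMOTED
    + hana_ha1_sync_state             : PRIM
* Node saphana-node4-kvm:
    + hana_ha1_clone_state            : DEMOTED
    + hana_ha1_sync_state             : SOK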
Additional Information
Here are some general tips:
- Read this page carefully.
- Make sure you understand the concepts before you make changes to a production system.
- Make one change at a time.
- Allow time for each action to finish.
- Monitor the cluster resources for some time to make sure there are no unintended consequences before you continue with the next step.
Document Location
Worldwide
[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SGMV168","label":"SUSE Linux Enterprise Server"},"ARM Category":[],"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Version(s)","Line of Business":{"code":"LOB57","label":"Power"}}]
Was this topic helpful?
Document Information
More support for: SUSE Linux Enterprise Server
Software version: All Version(s)
Document number: 6243390
Modified date: 01 April 2021
UID: ibm16243390