How To
Summary
How to identify Shared Storage Pool (SSP) node roles, and the initial checks and steps to perform before a software or hardware upgrade on a VIOS node.
Objective
In an SSP cluster, some nodes perform key roles that are handled in specific layers.
These layers are classified as:
1. Cluster Aware AIX (CAA)
2. Database (DBN)
3. Message Format Service (MFS)
There are other layers, such as RSCT, which are part of every node.
In a single-node cluster, all three roles above are performed by that node.
In a cluster with more than one node, the roles can change over a period of time when a node re-election is triggered. For example:
a) If there is a problem with a VIOS node
b) If the cluster services are stopped on a VIOS
c) If the VIOS node is rebooted
These are some of the events that can trigger a different node to become the CAA "Leader Node", "DBN Node" or "MFS Node".
When planning a hardware or software upgrade on a VIOS node, including the installation of interim fixes, cluster services on that VIOS node must be stopped before the change is implemented and restarted afterwards, using the clstartstop command.
How to determine the current CAA LEADER node
$ oem_setup_env
# pooladm dump node | grep -i leader
Run the command on each node. The node that returns "amILeader=1" is the leader node.
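The leader check above can be wrapped in a small helper for use in a pre-upgrade script. This is a minimal sketch, not an IBM-supplied tool: the function name is_caa_leader is hypothetical, and the only thing assumed about the pooladm output is that it contains the literal string "amILeader=1" on the leader node, as stated above.

```shell
# Hypothetical helper: reads "pooladm dump node" output on stdin.
# Returns 0 if the text contains "amILeader=1", 1 otherwise.
# Assumption: the leader is identified only by that literal string;
# nothing else about the output layout is relied on.
is_caa_leader() {
    grep -q 'amILeader=1'
}
```

From the oem_setup_env shell it might be used as: pooladm dump node | is_caa_leader && echo "This node is the CAA leader"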
How to determine the current MFS node
$ oem_setup_env
# pooladm pool lsmfs /var/vio/SSP/[CLUSTER_NAME]/D_E_F_A_U_L_T_061310
How to determine the current DBN node
$ cluster -status -verbose | grep -p DBN
How to stop cluster services on the node to be upgraded
$ clstartstop -stop -n clustername -m vios_hostname
If the VIOS is the DBN, LEADER, and/or MFS Node, this will cause a new node to take over the role(s).
If an issue is encountered, stop and contact your IBM Support Representative.
To restart cluster services:
$ clstartstop -start -n clustername -m vios_hostname
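The stop and start commands above can be sketched as two small wrappers that refuse to continue silently when the command fails. The function names are hypothetical. The clstartstop command and its -stop, -start, -n and -m options are taken from the text above. That clstartstop returns a non-zero exit status on failure is an assumption, so check the command output as well.

```shell
# Hypothetical wrappers around the clstartstop commands shown above.
# $1 = cluster name, $2 = VIOS hostname.
# Assumption: clstartstop exits non-zero when it fails.
stop_cluster_services() {
    [ $# -eq 2 ] || { echo "usage: stop_cluster_services clustername vios_hostname" >&2; return 2; }
    clstartstop -stop -n "$1" -m "$2" || {
        echo "Stopping cluster services failed; do not proceed with the change." >&2
        return 1
    }
}

start_cluster_services() {
    [ $# -eq 2 ] || { echo "usage: start_cluster_services clustername vios_hostname" >&2; return 2; }
    clstartstop -start -n "$1" -m "$2" || {
        echo "Starting cluster services failed; contact IBM Support." >&2
        return 1
    }
}
```

A maintenance script would call stop_cluster_services before the upgrade and start_cluster_services after it, and stop at the first non-zero return.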
NOTE: All of the above commands work only when the cluster services are healthy. If there is a problem, it may not be possible to determine whether the VIOS is the DBN, LEADER, or MFS node. In that case, capture a snap from all nodes and contact your IBM Support Representative for investigation. Do not proceed with the hardware/software change.
When there are many nodes, a cluster-wide snap can be collected instead. This is done using the clffdc command from the oem_setup_env shell:
# clffdc -c FULL -p 3
If you notice any FFDC entries in the error logs, save them as well. They may be needed for root cause analysis (RCA) or to resolve the problem.
To collect a padmin snap, run:
$ snap
The data is saved in /home/padmin/snap.pax.Z
Environment
PowerVM VIOS 3.1
Document Location
Worldwide
Document Information
Modified date:
29 August 2024
UID
ibm16982661