Performing offline upgrade or excluding nodes from upgrade using installation toolkit
Starting in the IBM Spectrum Scale 5.0.2 release, the installation toolkit can upgrade and tolerate nodes being in an offline state, and it can exclude some nodes from the upgrade.
Upgrade when nodes are unhealthy
By using the IBM Spectrum Scale offline upgrade, you can upgrade your cluster even if one or more nodes are unhealthy. A node is called unhealthy when the services are down but it is reachable through ping commands.
When you designate a node as offline in the cluster configuration, the installation toolkit upgrades all installed packages on that node during the upgrade run. However, it does not attempt to stop or restart the respective services. You must manually restart the previously offline services by using these commands: mmces service start for protocol components and mmstartup for the GPFS daemon.
For example, you try to upgrade a 5-node cluster whose nodes are node1, node2, node3, node4, and node5 (protocol node).
- node3 is reachable but NFS is down
- node5 is reachable but SMB is down
- node2 is reachable but all services including GPFS are down
- Designate node3 as offline. During and after the upgrade, the installation toolkit does not restart any services on this node, including NFS, but it upgrades all installed packages (GPFS, NFS, SMB, OBJ, and others) on node3.
- Designate node5 as offline. During and after the upgrade, the installation toolkit does not restart any services on this node, including SMB, but it upgrades all installed packages (GPFS, NFS, SMB, OBJ, and others) on node5.
- Designate node2 as offline. All installed packages are upgraded, but the installation toolkit does not attempt to restart any of the services.
- If you designate all nodes in a cluster as offline, then a full offline upgrade is performed on all nodes, leading to an upgrade of all installed packages without any services being started or stopped.
- If you try to designate a node that is already excluded as offline, then the exclude designation of the node is cleared and the offline designation is added. For example:
./spectrumscale upgrade config offline -N vm1
[ INFO ] The node vm1.ibm.com was added in excluded list previously. Clearing this from excluded list.
[ INFO ] Adding vm1.ibm.com as smb offline
- After an offline upgrade, you must ensure that all unhealthy services are manually started (by using mmces service start for protocol components and mmstartup for GPFS).
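The five-node example above can be sketched as a dry run. The following script only prints the toolkit commands that would be issued for node2, node3, and node5; it does not execute them. The plan_offline_upgrade function is a hypothetical helper introduced for illustration and is not part of the installation toolkit.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the toolkit commands for the example
# in which node2, node3, and node5 are reachable but have services down.
# plan_offline_upgrade is a hypothetical helper, not a toolkit command.
plan_offline_upgrade() (
    if [ "$#" -eq 0 ]; then
        echo "usage: plan_offline_upgrade NODE..." >&2
        exit 1
    fi
    nodes=""
    for n in "$@"; do
        # Build the comma-separated list that the -N option expects.
        nodes="${nodes:+$nodes,}$n"
    done
    echo "./spectrumscale upgrade config offline -N $nodes"
    echo "./spectrumscale upgrade config list"
    echo "./spectrumscale upgrade run"
)

plan_offline_upgrade node2 node3 node5
```

Printing the commands first makes it possible to review the node list before the upgrade configuration is changed.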
Designating nodes as offline in the upgrade configuration
- To designate a node as offline, issue this command:
./spectrumscale upgrade config offline -N nodename
An offline upgrade is performed on this node, which means that all installed packages are upgraded without any services being restarted.
Important: Before designating a node as offline, you must ensure that none of the components are active and, if the node is a protocol node, that it is suspended.
- To check the status of the GPFS daemon, issue the mmgetstate command.
- To stop the GPFS daemon, issue the mmshutdown command.
- To check the status of protocol components, issue the mmces service list command.
- To suspend the protocol node and stop the protocol services, issue the mmces node suspend --stop command. If you are upgrading from IBM Spectrum Scale version 5.0.2.0 or earlier, issue the following commands to suspend the protocol node and stop the protocol services:
mmces node suspend
mmces service stop Protocol
- To designate all nodes as offline and do a full offline upgrade across the cluster, issue this command:
./spectrumscale upgrade config offline -N node1,node2,node3.....,noden
All installed packages are upgraded on all the nodes in the cluster, but no services are restarted on any of the nodes.
- Clearing the offline designations
- To clear all offline designations from a specific node, issue this command:
./spectrumscale upgrade config offline -N nodename --clear
- To clear all the offline designations from all the nodes, issue this command:
./spectrumscale upgrade config offline --clear
- To clear both the offline and exclude configurations, issue this command:
./spectrumscale upgrade config clear
- To view all configurations that are done for offline upgrade, issue this command:
./spectrumscale upgrade config list
This includes the nodes that are excluded and the nodes where the components are designated as offline. An offline upgrade is initiated based on this configuration. For example:
./spectrumscale upgrade config list
[ INFO ] GPFS Node SMB NFS OBJ GPFS
[ INFO ]
[ INFO ] Phase1: Non Protocol Nodes Upgrade
[ INFO ] nsd001st001 - - -
[ INFO ] nsd002st001 - - -
[ INFO ] nsd003st001 - - -
[ INFO ] nsd004st001 - - -
[ INFO ]
[ INFO ] Phase2: Protocol Nodes Upgrade
[ INFO ] prt002st001
[ INFO ] prt003st001
[ INFO ] prt004st001
[ INFO ] prt006st001
[ INFO ] prt008st001
[ INFO ] prt009st001
[ INFO ] prt011st001
[ INFO ]
[ INFO ] Excluded Nodes : prt007st001,prt001st001,prt010st001,prt005st001
[ INFO ]
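When the listing is long, it can help to pull out only the excluded nodes. The following is a minimal sketch that assumes the listing has the same layout as the sample output above, in particular a line that begins with "[ INFO ] Excluded Nodes :". The excluded_nodes function is a hypothetical helper, not a toolkit command.

```shell
#!/bin/sh
# Sketch: extract the excluded nodes from "upgrade config list" output.
# excluded_nodes is a hypothetical helper. It reads the listing on standard
# input and prints one node name per line. The line layout is assumed to match
# the sample output shown in this topic.
excluded_nodes() {
    sed -n 's/^\[ INFO ] Excluded Nodes *: *//p' | tr ',' '\n'
}

# Sample lines copied from the example listing.
sample='[ INFO ] Phase2: Protocol Nodes Upgrade
[ INFO ] prt002st001
[ INFO ] Excluded Nodes : prt007st001,prt001st001,prt010st001,prt005st001'

printf '%s\n' "$sample" | excluded_nodes
```

Assuming that the installation toolkit writes the listing to standard output, the same function could be fed directly from ./spectrumscale upgrade config list through a pipe.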
Upgrade when nodes are not reachable
- It is not recommended to exclude only a subset of protocol nodes. For example, if you have 3 protocol nodes, either exclude all 3 nodes together or exclude none of them; do not exclude only 1 or 2 nodes. If you exclude a subset, the installation toolkit issues a warning. For example:
./spectrumscale upgrade config exclude -N vm1
[ INFO ] Adding node vm1.ibm.com in excluded list.
[ WARN ] Protocol nodes should all be upgraded together if possible, since mixed versions of the code are not allowed in CES components (SMB/OBJ). You may add the remaining protocol node(s) : vm2.ibm.com in the excluded list or clear node(s): vm1.ibm.com with the ./spectrumscale config exclude --clear option so that no protocol nodes are excluded.
- Ensure that not all admin nodes are excluded and that at least one admin node is available in the non-excluded list. For example, if you have 3 admin nodes in a cluster that you want to upgrade, then you can exclude a maximum of 2 admin nodes only. If you have only one admin node, then it must not be excluded.
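The two rules above can be checked before the exclude list is applied. The following is a minimal sketch in which the caller supplies the node lists as space-separated strings; it does not query the cluster. The check_exclude_plan function and its message texts are hypothetical and are not produced by the installation toolkit.

```shell
#!/bin/sh
# Sketch: sanity-check a planned exclude list against the two rules above.
# check_exclude_plan is a hypothetical helper, not a toolkit command.
#   $1: nodes to exclude   $2: all protocol nodes   $3: all admin nodes
# Each argument is a space-separated list supplied by the caller.
check_exclude_plan() (
    excluded=" $1 "
    prot_total=0
    prot_excl=0
    for n in $2; do
        prot_total=$((prot_total + 1))
        case "$excluded" in *" $n "*) prot_excl=$((prot_excl + 1)) ;; esac
    done
    admin_left=0
    for n in $3; do
        case "$excluded" in *" $n "*) ;; *) admin_left=$((admin_left + 1)) ;; esac
    done
    status=0
    # Rule 1: exclude all protocol nodes together or none of them (warning only).
    if [ "$prot_excl" -gt 0 ] && [ "$prot_excl" -lt "$prot_total" ]; then
        echo "WARN: only $prot_excl of $prot_total protocol nodes excluded"
    fi
    # Rule 2: at least one admin node must stay outside the exclude list.
    if [ "$admin_left" -eq 0 ]; then
        echo "ERROR: no admin node left outside the exclude list"
        status=1
    fi
    exit "$status"
)

check_exclude_plan "vm1" "vm1 vm2" "vm1 vm3"
```

The subset rule is treated as a warning because the text only recommends against it, whereas excluding every admin node is treated as an error.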
Excluding nodes from the upgrade configuration
- To exclude one or more nodes from the upgrade configuration, issue this command:
./spectrumscale upgrade config exclude -N node1,node2
This ensures that the installation toolkit does not perform any action on node1 and node2 during upgrade.
- Clearing the exclude designations
- To clear the exclude configuration from specific nodes, issue this command:
./spectrumscale upgrade config exclude -N node1,node2 --clear
Note: It is not recommended to clear only a subset of the protocol nodes that are designated as offline.
- To clear the exclude configuration from all nodes, issue this command:
./spectrumscale upgrade config exclude --clear
Upgrading the excluded nodes or offline designated nodes
- Ensure that the nodes on which you want to perform offline upgrade are reachable through ping commands.
- For nodes that are designated as excluded, clear the exclude designation of the nodes in the cluster definition file by using this command:
./spectrumscale upgrade config exclude -N node1,node2 --clear
- For nodes that were earlier designated as excluded, designate them as offline if all the required services are not running by using this command:
./spectrumscale upgrade config offline -N node1,node2
- Run the upgrade procedure on the offline designated nodes by using this command:
./spectrumscale upgrade run
The installation toolkit upgrades the packages and restarts the services only for an online upgrade. For an offline upgrade, the installation toolkit only upgrades the packages that are currently installed on the offline designated nodes.
- After the upgrade procedure is completed, do the following:
- Restart the GPFS daemon by using the mmstartup command on each offline designated node.
- If the object protocol is configured, perform the post-upgrade object configuration by using the following command from one of the protocol nodes:
mmobj config manage --version-sync
- Resume the protocol node and restart the protocol services by using the mmces node resume --start command for every offline designated node that is a protocol node. If you are upgrading from IBM Spectrum Scale version 5.0.2.0 or earlier, issue the following commands to resume the protocol node and start the protocol services:
mmces node resume
mmces service start Protocol
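The manual steps that follow the upgrade run can be summarized as a dry run. The following sketch prints the commands for one offline designated node in the order given above; it does not execute them. The post_upgrade_cmds function and its argument keywords are hypothetical, and Protocol remains a placeholder for the service name as in the text.

```shell
#!/bin/sh
# Dry-run sketch: print the manual post-upgrade commands for one offline
# designated node. post_upgrade_cmds is a hypothetical helper.
#   $1: "protocol" if the node is a protocol node
#   $2: "object" if the object protocol is configured
#   $3: "legacy" when upgrading from version 5.0.2.0 or earlier
post_upgrade_cmds() (
    echo "mmstartup"                      # restart the GPFS daemon
    [ "${1:-}" = "protocol" ] || exit 0   # nothing more for other nodes
    if [ "${2:-}" = "object" ]; then
        # Issued from one of the protocol nodes only.
        echo "mmobj config manage --version-sync"
    fi
    if [ "${3:-}" = "legacy" ]; then
        echo "mmces node resume"
        # "Protocol" is a placeholder for the service name.
        echo "mmces service start Protocol"
    else
        echo "mmces node resume --start"
    fi
)

post_upgrade_cmds protocol object
```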
Populating cluster configuration when nodes are designated as offline in the upgrade configuration
- Extract the installation image that you want to use for doing the upgrade.
- Use the ./spectrumscale config populate command or copy the old cluster definition file or create a cluster definition file using the ./spectrumscale command.
- Shut down the node(s).
- Use the ./spectrumscale upgrade config command to designate the node(s) as offline in the upgrade configuration.
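The last two steps can be sketched as a dry run for a single node. The sketch assumes that shutting down the node refers to stopping the GPFS daemon and the protocol services with the commands described earlier in this topic, and that protocol services are suspended before the daemon is stopped; both points are assumptions. The offline_prep_cmds function is a hypothetical helper that only prints commands.

```shell
#!/bin/sh
# Dry-run sketch: stop services on a node, then designate it as offline.
# offline_prep_cmds is a hypothetical helper; it prints commands and does not
# execute them. The ordering of the printed commands is an assumption.
#   $1: node name
#   $2: "protocol" if the node is a protocol node
#   $3: "legacy" when upgrading from version 5.0.2.0 or earlier
offline_prep_cmds() (
    if [ -z "${1:-}" ]; then
        echo "usage: offline_prep_cmds NODE [protocol] [legacy]" >&2
        exit 1
    fi
    echo "# run on $1:"
    echo "mmgetstate"                     # check the GPFS daemon state
    if [ "${2:-}" = "protocol" ]; then
        echo "mmces service list"         # check the protocol components
        if [ "${3:-}" = "legacy" ]; then
            echo "mmces node suspend"
            # "Protocol" is a placeholder for the service name.
            echo "mmces service stop Protocol"
        else
            echo "mmces node suspend --stop"
        fi
    fi
    echo "mmshutdown"                     # stop the GPFS daemon
    echo "# run where the installation toolkit is installed:"
    echo "./spectrumscale upgrade config offline -N $1"
)

offline_prep_cmds node5 protocol
```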
Limitations
- You cannot exclude a node if the SMB service is running on it. This is not applicable for NFS or object. For example:
./spectrumscale upgrade config exclude -N vm1
[ FATAL ] In order to exclude a protocol node running SMB from the current upgrade, the SMB service must first be stopped on that node. Please stop SMB using the mmces command and retry.