SevOne NMS Upgrade Process Guide
ABOUT
This document describes the SevOne NMS upgrade process. You may perform an upgrade using either the Command Line Interface or the GUI installer.
As of SevOne NMS 7.0.0, SevOne is distributed using container technology, allowing a more confident deployment of the software. To run administrative commands on a SevOne appliance, the administrator must now execute commands in the context of the intended container.
By default, the container deployment of SevOne is set to be read-only.
DO NOT upgrade to SevOne NMS 7.0.0 or above (regardless of which prior SevOne NMS version you are on) until you have reviewed the list of Deprecated / Removed Features & Functions. For guidance, please reach out to your IBM Technical Account Team, IBM SevOne Support, or IBM Expert Labs.
To upgrade to SevOne NMS 7.1.0, the minimum mandatory version you must be on is SevOne NMS 7.0.1.
Upgrade to SevOne NMS 7.1.x - when performing an upgrade to SevOne NMS 7.1.0, you must be on SevOne NMS 7.0.1+. If you are on a version prior to SevOne NMS 7.0.1, you must first upgrade to SevOne NMS 7.0.1 before continuing with the upgrade to SevOne NMS 7.1.0.
Downgrade from SevOne NMS 7.1.0 - you can only downgrade to SevOne NMS 7.0.1. For example, if you upgraded from SevOne NMS 7.0.2 to SevOne NMS 7.1.0 and now want to downgrade, you will downgrade to SevOne NMS 7.0.1 and not to SevOne NMS 7.0.2.
This document provides details on how to upgrade/downgrade using:
- Command Line Interface
- for Upgrade - please refer to Upgrade using Command Line Interface.
- for Downgrade - please refer to Downgrade using Command Line Interface.
- GUI Installer - please refer to section Upgrade using GUI Installer.
For all platforms, if / is greater than 80GB, then 45GB of free disk space is required.
In addition to this,
- on a cluster without Openstack,
- free disk space on / must be greater than 20GB.
- on a cluster with Openstack,
- free disk space on /data must be greater than 20GB.
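To verify the available free space before you begin, you can check the relevant mount points; the Avail column must meet the thresholds above. For example:

df -h / /data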
In this guide, any reference to master - including a CLI command that contains master or output that contains master - means leader. Likewise, any reference to slave means follower.
Ansible can be utilized to upgrade SevOne NMS. Please make a request on SevOne Support Portal for the latest forward / reverse migration tarball files along with the signature tools required for the upgrade.
Before starting the upgrade, ensure that:
- all the required ports are open. Please refer to SevOne NMS Port Number Requirements Guide for details.
- Port 60006 is required during the upgrade pre-checks.
- you have the required CPUs, total vCPU cores, RAM (GB), Hard Drives, etc. based on SevOne NMS Installation Guide - Virtual Appliance > section Hardware Requirements. Note: Due to technology advancements, resource requirements may change.
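As a quick, hedged spot-check that a required port is reachable between appliances (for example, port 60006, which is used during the upgrade pre-checks), you can probe it from a peer; the target below is a placeholder:

nc -zv <peer IP address> 60006

Refer to the SevOne NMS Port Number Requirements Guide for the authoritative list of ports.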
Performing a SevOne NMS version upgrade on a virtual machine deployment means that the virtual machine was deployed on a prior SevOne NMS version. As such, the virtual machine's current hardware specifications match the SevOne NMS version it was previously deployed on or upgraded to. If the target SevOne NMS version of this upgrade has different hardware specifications than the current configuration of the virtual machine, the hardware resources must be aligned with the documented requirements of the target version before starting the upgrade.
If CPUs, total vCPU cores, and RAM (GB) do not match the target version requirements, please discuss with your infrastructure team to align the resources required for your virtual machine.
If the Hard Disk space requires an increase to align with SevOne NMS' target version requirements, the following must be considered.
- contact your infrastructure team to increase the disk space on your virtual machine at the hypervisor end.
- once the disk space has been resized successfully at the hypervisor level, and before starting the upgrade, complete the procedure in section expand logical volume; a rough sketch follows below.
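The authoritative steps are in section expand logical volume. As a rough illustration only, growing an LVM-backed volume after the hypervisor resize typically looks like the following, where the device and volume names are assumptions and will differ on your appliance:

pvresize /dev/sdb
lvextend -r -l +100%FREE /dev/mapper/<volume group>-<logical volume>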
If your virtual machine has any custom requirements/specifications higher than the documented specifications in SevOne NMS Installation Guide - Virtual Appliance > section Hardware Requirements, please contact your Technical Account Manager to discuss the details of your custom requirements.
PREPARE FOR UPGRADE
You must be on SevOne NMS 7.0.1 to upgrade to SevOne NMS 7.1.0.
Example# 1: SevOne NMS 7.0.1 to SevOne NMS 7.1.0 Upgrade
- Cluster Setup
- Cluster consists of 4 x vPAS 20Ks, no DNCs, no HSAs
- Cluster is monitoring 800 devices, 79,200 objects, and 548,800 indicators
- Backfilled with 3 months of data
- SDB export configured
- Timings
- Upgrade takes ~30 minutes
Total Time of Upgrade | Time Taken |
---|---|
Instant Graph Downtime | 5 minutes |
Device Manager Availability | 5 minutes |
Collection Outage (SNMP / ICMP) - Data Loss | 5 minutes |
Alerting Outage | 5 minutes |
Trap Generation | 5 minutes |
SDB Publishing Outage | 5 minutes |
Reboot Cluster | 3-5 minutes |

- Polling outage is ~2-5 minutes
Example# 2: SevOne NMS 7.0.1 to SevOne NMS 7.1.0 Upgrade
- Cluster Setup
- Cluster consists of 20 x (PAS 200Ks, DNCs, HSAs)
- Cluster is monitoring 800 devices, 79,200 objects, and 548,800 indicators
- Backfilled with 3 months of data
- SDB export configured
- Timings
- Upgrade takes ~2 hours 15 minutes
Total Time of Upgrade | Time Taken |
---|---|
Instant Graph Downtime | 3-5 minutes |
Device Manager Availability | 3-5 minutes |
Collection Outage (SNMP / ICMP) - Data Loss | 0 seconds |
Alerting Outage | 3-5 minutes |
Trap Generation | Undetected |
SDB Publishing Outage | n/a |
Reboot Cluster | 8-10 minutes |

- Polling outage is ~2-5 minutes
NOTE: Timings may vary based on your cluster.
As of SevOne NMS 6.5.0, if your flow template contains field 95, then you will see the following name changes in the flow template.
Field # | Pre-existing Field Name | New Field Name |
---|---|---|
95 | Application Tag | Application ID |
45010 | Engine ID-1 | Application Engine ID |
45011 | Application ID | Application Selector ID |
- Using ssh, log in to SevOne NMS appliance (Cluster Leader of SevOne NMS cluster) as
support.
ssh support@<NMS appliance>
- You need to run as root. Enter the following command to run/switch to
root.
sudo su
- Go to /data directory.
cd /data
- Check if the upgrade directory exists. If not, create it.
mkdir -p upgrade
- Change directory to /data/upgrade
cd /data/upgrade
- Check the version your NMS appliance is running on. For example, the output below indicates that
your NMS appliance is on SevOne NMS 7.0.1. You must be on SevOne NMS 7.0.1 or
above to proceed with the upgrade to SevOne NMS 7.1.0. This command will take you to
your NMS container and retrieve the SevOne NMS version you are currently on.
Example

podman exec -it nms-nms-nms SevOne-show-version

Output:

SevOne version: 7.0.1
kernel version: 4.18.0-553.8.1.el8_10.x86_64 #1 SMP Fri Jun 14 03:19:37 EDT 2024
nginx version: 1.14.1
MySQL version: 10.6.18-MariaDB
PHP version: 8.3.9
SSH/SSL version: OpenSSH_8.0p1, OpenSSL 1.1.1k FIPS 25 Mar 2021
REST API version: 2.1.47, Build time 2024-05-16T06:53:00+0000, Hash 07f225e
Intel(R) Xeon(R) CPU
2 cores @ 2199.998MHz
8GB RAM
4GB SWAP
150 GB / Partition
150 GB /data partition
- Verify the OS version. Confirm that the version is Red Hat Enterprise Linux (RHEL).
cat /etc/redhat-release

Output: Red Hat Enterprise Linux release 8.10 (Ootpa)
- Using curl -kO, copy the signature tools checksum file (signature-tools-<latest version>-build.<###>.tgz.sha256.txt) received from SevOne to /data/upgrade directory. For example, signature-tools-2.0.3-build.1.tgz.sha256.txt.
- Using curl -kO, copy the signature tools file (signature-tools-<latest version>-build.<###>.tgz) received from SevOne to /data/upgrade directory. For example, signature-tools-2.0.3-build.1.tgz.
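For example, assuming the files are staged on an HTTPS server provided by SevOne Support (the download URL below is a placeholder):

cd /data/upgrade
curl -kO https://<download server>/signature-tools-2.0.3-build.1.tgz.sha256.txt
curl -kO https://<download server>/signature-tools-2.0.3-build.1.tgz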
- Verify the signature tools checksum in the /data/upgrade directory.
cd /data/upgrade
sha256sum --check signature-tools-<latest version>-build.<###>.tgz.sha256.txt
Example
sha256sum --check signature-tools-v2.0.3-build.1.tgz.sha256.txt

Output: signature-tools-v2.0.3-build.1.tgz: OK
- Extract the signature tools tar
file.
tar -xzvf signature-tools-<latest version>-build.<###>.tgz -C /
Example
tar -xzvf signature-tools-v2.0.3-build.1.tgz -C /
- Using curl -kO, copy the forward tarball file (<forward SevOne NMS tarball>.tar.gz) received from SevOne to /data/upgrade directory. For example, the tarball file for SevOne NMS 7.1.0.
- Using curl -kO, copy the checksum file (<forward SevOne NMS tarball>.sha256.txt) received from SevOne to /data/upgrade directory.
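For example, with a placeholder download server:

cd /data/upgrade
curl -kO https://<download server>/v7.1.0-build<enter build number>.tar.gz
curl -kO https://<download server>/v7.1.0-build<enter build number>.tar.gz.sha256.txt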
- Change directory.
cd /data/upgrade
- (optional) Validate the signature for the forward tarball to ensure a valid & trusted certificate is used.
/usr/local/bin/SevOne-validate-image -i v7.1.0-build<enter build number>.tar.gz -s v7.1.0-build<enter build number>.tar.gz.sha256.txt

Output:

INFO: Extracting code-signing certificates from image file...
Image signed by SevOne Release on Tue, 15 Oct 2024 16:50:34 +0000.
The certificate is trusted.
Certificate subject=
commonName = International Business Machines Corporation
organizationalUnitName = IBM CCSS
organizationName = International Business Machines Corporation
localityName = Armonk
stateOrProvinceName = New York
countryName = US
Certificate issuer=
commonName = DigiCert Trusted G4 Code Signing RSA4096 SHA384 2021 CA1
organizationName = DigiCert, Inc.
countryName = US
INFO: Checking the signature of the image
The image can be installed.

Please contact SevOne Support Team if the certificate is not trusted or the signature does not match.
Once you have prepared the upgrade setup, continue with either Upgrade using Command Line Interface or Upgrade using GUI Installer.
➤ Upgrade using Command Line Interface
- Before starting the upgrade, clean up /data/upgrade directory.
rm /data/upgrade/installRPMs.tar
rm /data/upgrade/DigiCert-cs.crt
rm /data/upgrade/ibm-sevone-cs.crt
rm -rf /data/upgrade/ansible
- Change directory to /data/upgrade
cd /data/upgrade
- Extract the upgrade tar file.
tar -xzvf v7.1.0-build<enter build number>.tar.gz
- Change directory to /data/upgrade/ansible.
cd /data/upgrade/ansible
- Run the upgrade script to initiate the upgrade from SevOne NMS 7.0.1 to SevOne NMS
7.1.0.
./upgrade.sh
Important: Depending on the size of the cluster, the upgrade time may vary; please wait until the upgrade completes successfully.
Note: The following flags are case-sensitive parameters. If more than one flag / parameter is passed, you must pass them in single / double-quotes.
IMPORTANT: For pre-upgrade flags -e and -f,
Flag -e can be added to skip the pre-upgrade errors.
Flag -f can be added to skip the pre-upgrade checks and to force the install.
However, certain pre-checks are not skipped even if the -e or -f flag is passed. For example, when performing an upgrade, your current SevOne NMS version must be prior to the SevOne NMS version you are upgrading to. Otherwise, you will get the message 'Starting version of NMS should be less than the forward version of NMS.'
- -a: Avoid playbook tags to run.
- -c: Prevents hosts.ini from being automatically regenerated. If this flag is not passed as an option, hosts.ini will be automatically regenerated.
- -e: Skip pre-upgrade errors if found, applicable only when run WITHOUT -f option.
- -f: Skip pre-upgrade checks and force install.
- -n: Don't start in a screen session. Used for automated builds.
- -s: Run the upgrade without the UI logger.
- -x: Run pre-upgrade checks with --hub-spoke-network option.
- -h: Show this help.
For example,
To run the upgrade by skipping pre-upgrade checks altogether,
./upgrade.sh -f
To run the upgrade with the hub-spoke flag,
./upgrade.sh -x
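Per the note above, multiple flags must be quoted together; for example, to skip pre-upgrade errors and run with the hub-spoke flag (a hedged illustration of the quoting rule):

./upgrade.sh '-e -x'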
- The installer starts a screen session named ansible-{version}.
Important: The screen session on the terminal must not be detached, as the ongoing process is in memory and packages may no longer be available on the appliance during the upgrade.
After the ansible-playbook execution completes in the screen, you must exit the screen session.
Using a separate ssh connection to the boxes while upgrading may show some PHP / other warnings on the terminal. SevOne recommends waiting until the upgrade process completes successfully.
- Packages on all peers / hosts are updated at the same time. [tag: prepare_rpm, install_rpm, docker_setup]
- Database migrations are run on the clustermaster (mysqlconfig and mysqldata) and active peers (mysqldata only). [tag: database]
- System patches are run on all peers at the same time. [tag: systempatches]
- Cleanup actions on all hosts are performed last. [tag: cleanup]
- SevOne NMS 7.1.0 is installed on your machine.
Upgrade opens a TUI (Text-based User Interface) window which splits the progress into 3 columns.
Important: If you are unable to interact with the screen session, you can still view the progress in Column 1. Allow the upgrade to complete.
- Column 1: Host panel - shows the progress per host.
- Column 2: Tasks panel - shows the tasks being executed.
- Column 3: Logs panel - shows the logs associated with each task.
Important: FYI
- Press F1 to show / hide HELP.
- Ctrl+C - kills ansible execution.
To navigate between panel / column, press:
- 1 - to select Column 1 (Host panel)
- 2 - to select Column 2 (Tasks panel)
- 3 - to select Column 3 (Logs panel)
- up arrow - to move cursor / log up
- down arrow - to move cursor / log down
- left arrow - to move cursor / log left
- right arrow - to move cursor / log right
To detach from the screen session, press Ctrl+A followed by the letter d. To reattach to the screen session, enter screen -r.
Upgrade
Press F1 to show the HELP menu in the logger.
- After successfully upgrading to SevOne NMS 7.1.0, check the version to ensure that you are on SevOne NMS 7.1.0, as shown in the example below.
Example
podman exec -it nms-nms-nms SevOne-show-version

Output:

SevOne version: 7.1.0
kernel version: 4.18.0-553.8.1.el8_10.x86_64 #1 SMP Fri Jun 14 03:19:37 EDT 2024
nginx version: 1.24.0
MySQL version: 10.6.19-MariaDB
PHP version: 8.3.12
SSH/SSL version: OpenSSH_8.0p1, OpenSSL 1.1.1k FIPS 25 Mar 2021
REST API version: 2.1.47, Build time 2024-08-26T14:05:20+0000, Hash 6d6a9c5
Intel(R) Xeon(R) CPU
2 cores @ 2199.998MHz
8GB RAM
4GB SWAP
150 GB / Partition
150 GB /data partition
For post-Upgrade steps, please refer to section post-Upgrade Stage below.
➤ Upgrade using GUI Installer
- From a web browser of your choice, enter the URL of your SevOne NMS appliance. For example, https://<SevOne NMS appliance IP address>
Click Cluster in the cluster hierarchy on the left and select the Cluster Upgrade tab on the right to upgrade the cluster using the graphical user interface. This tab will contain all the details for the SevOne NMS Graphical User Interface installer and the upgrade history.
- If you have already executed the steps in section
PREPARE FOR UPGRADE, you have the required upgrade
files and you may skip the fields under Step 1: Get Upgrade Artifact via SFTP Server.
If not and you do not have the required upgrade files in /data/upgrade folder, either
execute the steps in section PREPARE FOR UPGRADE
or enter the values in the following fields for the SevOne NMS being upgraded.
- Server IP - The IP Address or hostname of the SFTP server for SevOne NMS to use.
- Port - The port number on which the SFTP server is running on the remote server. The default value is port 22. SevOne NMS connects to the SFTP server on this port.
- Username - The username for copying the artifact from the remote server.
- Password - The password SevOne NMS needs to authenticate onto the SFTP server.
- FilePath to upgrade artifact - The path to the artifact on the remote SFTP server from where you wish to download the tar file. The user must have read permissions to the artifact.
- Click on Get Upgrade Artifact button to get the artifact to be used by SevOne NMS for the
upgrade. The artifact is put in /data/upgrade directory of the Cluster Leader.
Note: If you have already configured the SFTP server on Cluster Manager > Cluster Settings tab > SFTP subtab, the same settings will be fetched except for the path. You may use the same server or configure a different one here.
Depending on the size of the artifact, this step may take some time.
In case you do not have SFTP, you may copy the artifacts directly to Cluster Leader's /data/upgrade directory.
- Under section Step 2: Add Domain name, enter the value in the following
field for the SevOne NMS being upgraded.
- Domain Names - Enter comma-separated domain names without https://. For example, test.sevone.com,test2.sevone.com.
- Click Save Domain Names button to save the domain names.
- Under section Step 3: Run Installer to use newly downloaded Upgrade Artifact and view
URL, click Run Installer button to upgrade the SevOne NMS with the latest version
available in the artifact. The following is processed in the background.
- The latest installer from the artifact is extracted.
- The installer is upgraded to the latest version.
- A URL for the installer is generated.
You may proceed to the generated URL to initiate the upgrade via the Graphical User Interface. Follow the steps in section Upgrade Stages.
- Section Cluster Upgrade History shows the cluster upgrade history for all the previous upgrades done using the Graphical User Interface installer. The following details are available.
- Starting Version - The SevOne NMS version of the cluster prior to the upgrade.
- Forward Version - The SevOne NMS version of the cluster that it is upgraded to.
- Status - The status of the upgrade. i.e., it indicates whether the upgrade is in progress, successful, or has failed.
- Upgrade completion time - This field shows the time it took to complete the upgrade.
Upgrade Stages
- Using the browser of your choice, enter the URL that the installer has returned in the step above. For example, https://10.49.10.156:9443/.
- Enter the login credentials to launch the upgrade stages.
Check for Upgrade Stage
This stage checks whether an update is available for the SevOne NMS cluster.
- Theme toggle icon - denotes that you can toggle to the dark theme.
- Current Version - denotes the current version of your SevOne NMS cluster.
- Upgrade Available - denotes the version of the available update. The upgrade artifact (tarball) must be located in /data/upgrade directory of the Active Cluster Leader for this stage to detect an available update.
- Read the release notes - provides the link to the release notes of SevOne NMS version you are upgrading to.
- Limited functionality - provides the upgrade statistics for the cluster and testing
parameters such as,
- Total upgrade time
- Estimated disruption to polling
- Estimated disruption to netflow
- Estimated disruption to alerting
- Estimated disruption to SURF UI (user interface)
- Estimated disruption to SURF reporting
- Estimated disruption to reporting from DI (Data Insight)
- Provided testing parameters for the testing environment of each estimation above. Note: Statistics may vary based on your cluster size and testing environment variables.
- If there is an artifact for a version higher than your current SevOne NMS version, you may proceed to the next stage by clicking on Continue to Pre-upgrade.
pre-Upgrade Stage
The Pre-Upgrade stage runs only pre-upgrade checks against your SevOne NMS cluster to ensure your system is ready for the upgrade.
Click Run Pre-Upgrade to ensure that SevOne NMS cluster is in good health for the upgrade. Some of the checks include:
- Interpeer connectivity
- MySQL replication and overall NMS health
- Free disk space
Note: Running the Pre-upgrade checks may take a few minutes.
- The top part shows the overall state and progress of the pre-upgrade checks. The status can be in progress, successful, or failed.
- Under Peers, is a list of peers, peer-wise status, and the completion progress. The
status of a peer can be:
- Unreachable - denotes the peer is unreachable while running the pre-checks.
- Failed - denotes that some checks have failed on the peer.
- Completed - denotes that the checks have completed
successfully.
Example
Note: By selecting a row in the Peers section, you can view the status of each task on the individual peer. The search box allows you to search the tasks in the list. Each download icon in the screenshot above performs a different download. You may download:
- a peer log
- log for each task in the peer
- all logs in the cluster
Click Download System Log to download the system log to a file.
All downloaded files are saved in your default download folder.
When completed, you can view and download logs for each task.
When you click the log icon, you get a Log Viewer pop-up. Click Copy to clipboard to copy the contents of the log viewer and paste them to a file.
Log Viewer
[
  {
    "content": {
      "changed": true,
      "cmd": "podman exec nms-nms-nms SevOne-act check checkout",
      "delta": "0:00:33.941077",
      "end": "2024-10-16 04:16:01.286613",
      "rc": 0,
      "start": "2024-10-16 04:15:27.345536",
      "stderr_lines": [ "" ],
      "stdout_lines": [ "[ OK ] No Errors Detected" ]
    },
    "ended": "2024-10-16 04:16:01.405758+00:00",
    "peer_name": "127.0.0.1",
    "started": "2024-10-16 04:15:26.609710+00:00",
    "status": "ok",
    "task_name": "Run SevOne-act check checkout"
  }
]
- Summary - the bottom summary of the stage indicates the breakup of the tasks for the
selected peer and the entire cluster.
- Total - denotes the total number of tasks on the peer or overall tasks on the cluster.
- Ok - denotes the number of tasks which have run successfully.
- Skipped - denotes the number of tasks skipped. Not all tasks may run on all peers. Some tasks may run only on the Cluster Leader or the Active appliances and some may not run on certain appliance types such as, DNC. In such cases, there may be skipped tasks for certain peers.
- Failed - denotes the number of tasks which have failed. You can see individual logs for each task for any selected peer.
- Ignored - denotes the tasks/checks for which failures are ignored. Failure of these tasks/checks will not cause the stage to fail.
- Unreachable - denotes the number of tasks which have failed because the peer was unreachable. This is the first task after the peer has become unreachable and the remaining tasks will not be executed.
- Unexecuted - denotes the number of tasks that were not executed. This can be because the peer was unreachable and/or the checks were stopped in between.
Important: Some checks such as md5-hack and lsof may fail. At this time, the results from these two checks are ignored in the overall check status. If either of these two checks fails, the pre-upgrade stage will still show as Passed. However, if any other check is failing and an upgrade needs to be forced, the upgrade can be performed using the CLI.
It is highly recommended to contact the SevOne Support if any pre-check is failing.
After the pre-upgrade stage has completed successfully, click Continue to go to the Backup stage.
Backup Stage
The backup is run before the actual upgrade is performed. Click Run Backup and wait until the backup has completed successfully. This stage executes a few scripts to back up the database and a few folders critical to the system. This stage runs on the Cluster Leader and is optional.
When a peer is selected, it displays the list of tasks for the selected peer. The search box provides the capability to search in the task list.
When backup has completed successfully, click Continue.
Upgrade Stage
At this stage, the actual NMS upgrade to the latest version is performed. You will have limited functionality while the upgrade is in progress. Click Run Upgrade. The User Interface workflow is identical to the pre-Upgrade Stage. You may run the upgrade from the User Interface only if the pre-checks have succeeded. Most of the User Interface components are the same as the Pre-Upgrade checks. You can view individual peers and the overall cluster status. The bottom summary panel shows the peer and overall status. During the upgrade, execution stops on a peer at the first failed task; the remaining tasks show as Unexecuted in the bottom summary panel.
The search box under Tasks provides the capability to search in the task list.
After the upgrade stage has completed successfully, you are now ready to perform the Health Check. Click Continue.
Health Check Stage
IMPORTANT
The target release includes a new kernel version. Prior to performing the health check, you must reboot the machine to load the new kernel and start all services. Please do not skip this step.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
Click Run Health Check to run SevOne NMS health checks after a successful upgrade. The checks are identical to the pre-upgrade checks.
When a peer is selected, it displays the list of tasks for the selected peer. The search box provides the capability to search in the task list.
From the Command Line Interface, confirm that the NMS appliance is running SevOne NMS 7.1.0.
Example
podman exec -it nms-nms-nms SevOne-show-version
Output:
SevOne version: 7.1.0
kernel version: 4.18.0-553.22.1.el8_10.x86_64 #1 SMP Wed Sep 11 18:02:00 EDT 2024
nginx version: 1.24.0
MySQL version: 10.6.19-MariaDB
PHP version: 8.3.12
SSH/SSL version: OpenSSH_8.0p1, OpenSSL 1.1.1k FIPS 25 Mar 2021
REST API version: 2.1.47, Build time 2024-08-26T14:05:20+0000, Hash 6d6a9c5
Intel(R) Xeon(R) CPU
2 cores @ 2199.998MHz
8GB RAM
4GB SWAP
150 GB / Partition
150 GB /data partition
post-Upgrade cleanup
Clean up /data/upgrade directory.
rm /data/upgrade/installRPMs.tar
rm /data/upgrade/DigiCert-cs.crt
rm /data/upgrade/ibm-sevone-cs.crt
rm -rf /data/upgrade/ansible
systemctl restart sevone-installer-gunicorn.service
post-Upgrade Stage
After all the upgrade stages have completed successfully, please refer to section POST-UPGRADE STEPS below.
PREPARE FOR DOWNGRADE
When downgrading from SevOne NMS 7.1.0, you can only downgrade to SevOne NMS 7.0.1.
Example# 1: SevOne NMS 7.1.0 to SevOne NMS 7.0.1 Downgrade
- Cluster Setup
- Cluster consists of 4 x vPAS 20Ks, no DNCs, no HSAs
- Cluster is monitoring 800 devices, 79,200 objects, and 548,800 indicators
- Backfilled with 3 months of data
- SDB export configured
- Timings
- Downgrade takes ~44 minutes
- System is unavailable for ~6 minutes for the pods to restart
Example# 2: SevOne NMS 7.1.0 to SevOne NMS 7.0.1 Downgrade
- Cluster Setup
- Cluster consists of 20 x (PAS 200Ks, DNCs, HSAs)
- Cluster is monitoring 800 devices, 79,200 objects, and 548,800 indicators
- Backfilled with 3 months of data
- SDB export configured
- Timings
- Downgrade takes ~1 hour 30 minutes
- System is unavailable for ~6 minutes for the pods to restart
NOTE: Timings may vary based on your cluster.
- Using ssh, log in to SevOne NMS appliance (Cluster Leader of SevOne NMS cluster) as
support.
ssh support@<NMS appliance>
- You need to run as root. Enter the following command to run/switch to
root.
sudo su
- Go to /data directory.
cd /data
- Check if the upgrade directory exists. If not, create it.
mkdir -p upgrade
- Change directory to /data/upgrade
cd /data/upgrade
- Check the version your NMS appliance is running on. For example, the output below indicates that your NMS appliance is on SevOne NMS 7.1.0. When you proceed with the downgrade, you will downgrade to SevOne NMS 7.0.1. For example, if you upgraded from SevOne NMS 7.0.2 and now want to proceed with the downgrade, you will downgrade to SevOne NMS 7.0.1 and not SevOne NMS 7.0.2. This command will take you to your NMS container and retrieve the SevOne NMS version you are currently on.
Example

podman exec -it nms-nms-nms SevOne-show-version

Output:

SevOne version: 7.1.0
kernel version: 4.18.0-553.8.1.el8_10.x86_64 #1 SMP Fri Jun 14 03:19:37 EDT 2024
nginx version: 1.24.0
MySQL version: 10.6.19-MariaDB
PHP version: 8.3.12
SSH/SSL version: OpenSSH_8.0p1, OpenSSL 1.1.1k FIPS 25 Mar 2021
REST API version: 2.1.47, Build time 2024-08-26T14:05:20+0000, Hash 6d6a9c5
Intel(R) Xeon(R) CPU
2 cores @ 2199.998MHz
8GB RAM
4GB SWAP
150 GB / Partition
150 GB /data partition
- Verify the OS version. Confirm that the version is Red Hat Enterprise Linux (RHEL).
cat /etc/redhat-release

Output: Red Hat Enterprise Linux release 8.10 (Ootpa)
- Using curl -kO, copy the signature tools checksum file (signature-tools-<latest version>-build.<###>.tgz.sha256.txt) received from SevOne to /data/upgrade directory. For example, signature-tools-2.0.3-build.1.tgz.sha256.txt.
- Using curl -kO, copy the signature tools file (signature-tools-<latest version>-build.<###>.tgz) received from SevOne to /data/upgrade directory. For example, signature-tools-2.0.3-build.1.tgz.
- Verify the signature tools checksum in the /data/upgrade directory.
cd /data/upgrade
sha256sum --check signature-tools-<latest version>-build.<###>.tgz.sha256.txt
Example
sha256sum --check signature-tools-v2.0.3-build.1.tgz.sha256.txt

Output: signature-tools-v2.0.3-build.1.tgz: OK
- Extract the signature tools tar
file.
tar -xzvf signature-tools-<latest version>-build.<###>.tgz -C /
Example
tar -xzvf signature-tools-v2.0.3-build.1.tgz -C /
- Using curl -kO, copy the reverse tarball file (<reverse SevOne NMS tarball>.tar.gz) received from SevOne to /data/upgrade directory. For example, the reverse tarball file for SevOne NMS 7.1.0 to SevOne NMS 7.0.1.
- Using curl -kO, copy the checksum file (<reverse SevOne NMS tarball>.sha256.txt) received from SevOne to /data/upgrade directory.
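For example, with a placeholder download server:

cd /data/upgrade
curl -kO https://<download server>/v7.1.0-to-v7.0.1-build<enter build number>.tar.gz
curl -kO https://<download server>/v7.1.0-to-v7.0.1-build<enter build number>.tar.gz.sha256.txt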
- Change directory.
cd /data/upgrade
- (optional) Validate the signature for the reverse tarball to ensure a valid & trusted certificate is used.
/usr/local/bin/SevOne-validate-image -i v7.1.0-to-v7.0.1-build<enter build number>.tar.gz -s v7.1.0-to-v7.0.1-build<enter build number>.tar.gz.sha256.txt

Output:

INFO: Extracting code-signing certificates from image file...
Image signed by SevOne Release on Tue, 15 Oct 2024 16:11:10 +0000.
The certificate is trusted.
Certificate subject=
commonName = International Business Machines Corporation
organizationalUnitName = IBM CCSS
organizationName = International Business Machines Corporation
localityName = Armonk
stateOrProvinceName = New York
countryName = US
Certificate issuer=
commonName = DigiCert Trusted G4 Code Signing RSA4096 SHA384 2021 CA1
organizationName = DigiCert, Inc.
countryName = US
INFO: Checking the signature of the image
The image can be installed.

Please contact SevOne Support Team if the certificate is not trusted or the signature does not match.
Once you have prepared the downgrade setup, you can continue with Downgrade using Command Line Interface.
➤ Downgrade using Command Line Interface
- Before starting the downgrade, clean up /data/upgrade directory.
rm /data/upgrade/installRPMs.tar
rm /data/upgrade/DigiCert-cs.crt
rm /data/upgrade/ibm-sevone-cs.crt
rm -rf /data/upgrade/ansible
- Change directory to /data/upgrade
cd /data/upgrade
- Extract the downgrade tar file.
tar -xzvf v7.1.0-to-v7.0.1-build<enter build number>.tar.gz
- Change directory to /data/upgrade/ansible.
cd /data/upgrade/ansible
- Run the downgrade script to initiate the downgrade from SevOne NMS 7.1.0 to SevOne NMS
7.0.1.
./reverse.sh
Important: Depending on the size of the cluster, the downgrade time may vary; please wait until the downgrade completes successfully.
Note: The following flags are case-sensitive parameters. If more than one flag / parameter is passed, you must pass them in single / double-quotes.
- -a: Avoid playbook tags to run.
- -e: Skip pre-upgrade errors if found, applicable only when run WITHOUT -f option.
- -f: Skip pre-upgrade checks and force install.
- -n: Don't start in a screen session. Used for automated builds.
- -s: Run the upgrade without the UI logger.
- -x: Run pre-upgrade checks with --hub-spoke-network option.
- -h: Show this help.
For example,
To run the downgrade by skipping pre-upgrade checks altogether,
./reverse.sh -f
To run the downgrade with the hub-spoke flag,
./reverse.sh -x
To run the downgrade by skipping pre-upgrade check errors,
./reverse.sh -e
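Per the note above, multiple flags must be quoted together; for example (a hedged illustration of the quoting rule):

./reverse.sh '-e -x'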
- The installer starts a screen session named ansible-{version}.
Important: The screen session on the terminal must not be detached, as the ongoing process is in memory and packages may no longer be available on the appliance during the downgrade.
After the ansible-playbook execution completes in the screen, you must exit the screen session.
Using a separate ssh connection to the boxes while downgrading may show some PHP / other warnings on the terminal. SevOne recommends waiting until the downgrade process completes successfully.
- Packages on all peers / hosts are updated at the same time. [tag: prepare_rpm, install_rpm, docker_setup]
- Reverse database migrations are run on the clustermaster (mysqlconfig and mysqldata) and active peers (mysqldata only). [tag: database]
- System patches are run on all peers at the same time. [tag: systempatches]
- Cleanup actions on all hosts are performed last. [tag: cleanup]
- If you were on SevOne NMS 7.1.0, SevOne NMS 7.0.1 is now installed on your machine as the
upgrade to SevOne NMS 7.1.0 was from SevOne NMS 7.0.1.
Downgrade opens a TUI (Text-based User Interface) window which splits the progress into 3 columns.
Important: If you are unable to interact with the screen session, you can still view the progress in Column 1. Allow the downgrade to complete.
- Column 1: Host panel - shows the progress per host.
- Column 2: Tasks panel - shows the tasks being executed.
- Column 3: Logs panel - shows the logs associated with each task.
Important: FYI
- Press F1 to show / hide HELP.
- Ctrl+C - kills ansible execution.
To navigate between panel / column, press:
- 1 - to select Column 1 (Host panel)
- 2 - to select Column 2 (Tasks panel)
- 3 - to select Column 3 (Logs panel)
- up arrow - to move cursor / log up
- down arrow - to move cursor / log down
- left arrow - to move cursor / log left
- right arrow - to move cursor / log right
To detach from the screen session, press Ctrl+A followed by the letter d. To reattach to the screen session, enter screen -r.
Downgrade
Press F1 to show the HELP menu in the logger.
- After successfully downgrading to SevOne NMS 7.0.1, check the version to ensure that you are on
SevOne NMS 7.0.1 as shown in the example below.
Example
podman exec -it nms-nms-nms SevOne-show-version

Output:

SevOne version: 7.0.1
kernel version: 4.18.0-553.22.1.el8_10.x86_64 #1 SMP Wed Sep 11 18:02:00 EDT 2024
nginx version: 1.14.1
MySQL version: 10.6.18-MariaDB
PHP version: 8.3.9
SSH/SSL version: OpenSSH_8.0p1, OpenSSL 1.1.1k FIPS 25 Mar 2021
REST API version: 2.1.47, Build time 2024-05-16T06:53:00+0000, Hash 07f225e
Intel(R) Xeon(R) CPU
2 cores @ 2199.998MHz
8GB RAM
4GB SWAP
150 GB / Partition
150 GB /data partition
- Check the kernel packages installed.
Check kernel version
podman exec -it nms-nms-nms SevOne-show-version

OR

rpm -qa | grep kernel

OR

uname -r
Important: The kernel is automatically updated as part of the downgrade, and not every NMS release has a new kernel. Depending on the NMS release, kernel versions must be:

SevOne NMS Version | Kernel Version |
---|---|
NMS 7.0.1 | 4.18.0-553.el8_10.x86_64 |
NMS 7.1.0 | 4.18.0-553.8.1.el8_10.x86_64 |

The kernel version must match the NMS release. If it does not, you must reboot the entire cluster by executing the step below to apply the new kernel; otherwise, a reboot is not required.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
- Confirm that the NMS appliance is running SevOne NMS 7.0.1.
podman exec -it nms-nms-nms SevOne-show-version

Output:

SevOne version: 7.0.1
kernel version: 4.18.0-553.8.1.el8_10.x86_64 #1 SMP Fri Jun 14 03:19:37 EDT 2024
nginx version: 1.14.1
MySQL version: 10.6.18-MariaDB
PHP version: 8.3.9
SSH/SSL version: OpenSSH_8.0p1, OpenSSL 1.1.1k FIPS 25 Mar 2021
REST API version: 2.1.47, Build time 2024-05-16T06:53:00+0000, Hash 07f225e
Intel(R) Xeon(R) CPU
2 cores @ 2199.998MHz
8GB RAM
4GB SWAP
150 GB / Partition
150 GB /data partition
- Clean up /data/upgrade directory.
rm /data/upgrade/installRPMs.tar
rm /data/upgrade/DigiCert-cs.crt
rm /data/upgrade/ibm-sevone-cs.crt
rm -rf /data/upgrade/ansible
- Execute the following command to identify errors, if
any.
SevOne-act check checkout --full-cluster --verbose
Log Files
Log File can be found in /var/log/SevOne/ansible-reverse/<toVersion>/<timestamp>/<peerIP>.log.
For example, /var/log/SevOne/ansible-reverse/v7.1.0/<timestamp>/<peerIP>.log
- Each peer will have its own log file located on the cluster leader
- A new log file will be created for each run of the downgrade and is split by the timestamp folder
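To locate the most recent downgrade run, you can list the timestamp folders, for example:

ls -lt /var/log/SevOne/ansible-reverse/v7.1.0/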
POST-UPGRADE STEPS
Option 1: Reboot appliances by performing failover between Primary / Secondary
Option 2: Reboot appliances without failover
IMPORTANT: In either option, please make sure to successfully restart all the other appliances first before performing the restart operation on the Cluster Leader active appliance.
If the updater process is running, the command to shutdown/reboot an appliance will not proceed and you will get a message suggesting you use the --force option.
It is not recommended to use the force option as it can lead to short-term data loss. The updater is scheduled to start every even hour at 30 minutes past the hour (00:30, 02:30, 04:30, and so on). The updater process is expected to run for approximately 1800 seconds (30 minutes); however, on very large and busy appliances, it can sometimes take a few minutes longer. Due to this, plan to reboot the appliances at times when the updater process is not running.
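This guide does not name a specific command for checking the updater. One hedged way to see whether it is currently running, assuming the updater appears as a process whose name contains updater inside the NMS container, is:

podman exec nms-nms-nms pgrep -af updater

If this prints a matching process, wait for it to finish before rebooting.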
Find Cluster Leader
Prior to rebooting the appliances in the cluster using one of the options mentioned above, make note of the appliance that is the Cluster Leader in the cluster. In Administration > Cluster Manager > Cluster Overview tab, the Cluster Leader field provides the name of the appliance that is the leader. For example, pandora-01, as shown in the screenshot below, is the Cluster Leader.
(Option 1) Reboot appliances by performing failover between Primary / Secondary
It is not necessary to perform failover / failback while performing reboot of appliances, but to minimize the polling downtime due to a reboot, this option can be used to perform failover, reboot, and failback between the primary / secondary appliances.
The failover operation must be done manually on a per-peer basis; it is not recommended to perform failover on more than one peer at the same time. However, the reboot of multiple appliances can be done in batches of 4 to 5 appliances at a time. It is important to note that the failover steps for the Cluster Leader pair must be done last, once all other appliances have been rebooted.
Reboot Secondary appliances first (including Cluster Leader Secondary appliance)
Identify and confirm the passive appliance of the pair. From Administration > Cluster Manager > left navigation bar, expand the peer to identify which appliance of the pair is currently passive. For example, 10.129.13.121, as shown in the screenshot below, is the passive appliance of the pair.
- Using ssh, log in to each SevOne NMS passive appliance as support, including
the Cluster Leader passive
appliance.
ssh support@<NMS 'passive' appliance>
- Reboot the passive appliance.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
Note: Repeat the steps above for each passive appliance, including the Cluster Leader passive appliance.
Important: Multiple passive appliances of different peers can be restarted at the same time in batches. SevOne recommends restarting no more than 4 to 5 appliances at a time to keep the operation manageable.
Check and confirm replication status after reboot of Secondary appliances
Once the passive appliance is back up after reboot, confirm the replication is good for each pair - replication can take a few minutes. Execute the following commands to check the system uptime and replication status for the appliance that was rebooted.
podman exec -it nms-nms-nms /bin/bash
uptime
SevOne-act check replication
SevOne-masterslave-status
Perform the failover operation to make Secondary the active appliance (all peers except Cluster Leader)
You may now perform the failover operation. From Administration > Cluster Manager > left navigation bar, expand the peer to select the active appliance of the pair. In the upper-right corner, open the menu and select option Fail Over. For additional details, please refer to Cluster Manager > section Appliance Level Actions.
Check replication state after failover of the appliances
Once the failover is complete, confirm that the replication is good for each pair. If replication is lagging, it may take a few minutes to catch up. Execute the following commands to check the system uptime and replication status for the appliance that was rebooted.
podman exec -it nms-nms-nms /bin/bash
SevOne-act check replication
SevOne-masterslave-status
You are now ready to restart the primary appliances.
Perform restart of Primary appliance(s) (all peers except Cluster Leader)
After the failover, the secondary appliances that were rebooted in the previous step will be the current active appliances, and the primary appliances will now be in the passive state. Identify and confirm the passive appliance of each pair from the User Interface > Administration > Cluster Manager. In the left navigation bar, expand the peer to identify which appliance of the pair is currently passive.
Refresh the browser to confirm that the failovers are successful and the primary appliances are now reported as passive.
Now, log in using SSH to the passive appliances as support and perform the reboot.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
Perform failover operation to make Primary the active appliance (all peers except Cluster Leader)
You may now perform the failover operation. From Administration > Cluster Manager > left navigation bar, expand the peer to select the active appliance of the pair. In the upper-right corner, open the menu and select option Fail Over. For additional details, please refer to Cluster Manager > section Appliance Level Actions.
Check replication state after failover of the appliances
Once the failover is complete, confirm the replication is good for each pair - replication can take a few minutes. Execute the following commands to check the system uptime and replication status for the appliance that was rebooted.
podman exec -it nms-nms-nms /bin/bash
uptime
SevOne-act check replication
SevOne-masterslave-status
Reboot all peers with single appliance (peers that do not have a Secondary)
Identify all appliances that do not have a secondary appliance. From Administration > Cluster Manager > left navigation bar, expand the peers to identify which peers have a single appliance. For example, 10.129.15.139, as shown in the screenshot below, does not have an associated passive appliance.
Now, log in using SSH to all single primary appliances as support and perform the reboot.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
Perform failover operation to make Cluster Leader Secondary as the active appliance
Before failing over the Cluster Leader, perform a cluster wide check on replication to ensure no errors are reported.
podman exec -it nms-nms-nms /bin/bash
SevOne-act check replication --full-cluster
You may now perform the failover operation for the Cluster Leader peer. From Administration > Cluster Manager > left navigation bar, expand the peer to select the active appliance of the Cluster Leader pair. In the upper-right corner, open the menu and select option Fail Over. For additional details, please refer to Cluster Manager > section Appliance Level Actions.
Check replication state after failover of the Cluster Leader appliance
Once the passive appliance is back up after reboot, confirm the replication is good for each pair - replication can take a few minutes. Execute the following commands from the Primary appliance using the Command Line Interface to confirm the replication status.
podman exec -it nms-nms-nms /bin/bash
SevOne-act check replication
SevOne-masterslave-status
Perform reboot of Cluster Leader Primary appliance
After the failover, the Cluster Leader Secondary appliance will be the active appliance, and the Primary appliance will be in the passive state. Identify and confirm the passive appliance of the pair from the User Interface > Administration > Cluster Manager. In the left navigation bar, expand the Cluster Leader peer to identify which appliance of the pair is currently passive.
Refresh the browser to confirm that the failovers are successful and the Cluster Leader primary appliance is now reported as passive.
Now, log in using SSH to Cluster Leader passive appliance as support and perform the reboot.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
Perform failover operation to make Cluster Leader Primary the active appliance
You may now perform the failover operation. From Administration > Cluster Manager > left navigation bar, identify the Cluster Leader pair and select its active appliance. In the upper-right corner, open the menu and select option Fail Over. For additional details, please refer to Cluster Manager > section Appliance Level Actions.
Check replication state after failover of the appliances
Once the failover is complete, confirm the replication is good for the Cluster Leader peer - replication can take a few minutes. Execute the following commands from the Secondary appliance using the Command Line Interface to confirm the replication status.
podman exec -it nms-nms-nms /bin/bash
SevOne-act check replication
SevOne-masterslave-status
(Option 2) Reboot appliances without failover
If the upgrade is in a complete maintenance window and a polling outage is acceptable for the duration of the restarts, you can restart the appliances in any order. However, SevOne always recommends restarting the passive appliances first and then restarting the active appliances for all the peers.
Reboot Secondary appliances first (including Cluster Leader Secondary appliance)
Identify and confirm the passive appliance of the pair. From Administration > Cluster Manager > left navigation bar, expand the peer to identify which appliance of the pair is currently passive. For example, 10.129.13.121, as shown in the screenshot below, is the passive appliance of the pair.
- Using ssh, log in to each SevOne NMS passive appliance as support, including
the Cluster Leader passive
appliance.
ssh support@<NMS 'passive' appliance>
- Reboot the passive appliance.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
Note: Repeat the steps above for each passive appliance, including the Cluster Leader passive appliance.
Important: Multiple passive appliances of different peers can be restarted at the same time in batches. SevOne recommends restarting no more than 4 to 5 appliances at a time to keep the operation manageable.
Perform restart of the Primary appliance(s) (all peers except Cluster Leader)
Identify and confirm the active appliance of the pair from the User Interface > Administration > Cluster Manager. In the left navigation bar, expand the peer to identify which appliance of the pair is currently active.
Now, log in using SSH to the active appliance(s) as support and perform the reboot.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
Perform reboot of Cluster Leader Primary appliance
Once all other appliances in the cluster have been restarted, the Cluster Leader primary appliance must be restarted. Identify and confirm the active appliance of the Cluster Leader pair from the User Interface > Administration > Cluster Manager. In the left navigation bar, expand the Cluster Leader peer to identify which appliance of the pair is currently active.
Now, log in using SSH to the Cluster Leader active appliance as support and perform the reboot.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
Confirm all appliances in cluster have restarted
Execute the following script from the active Cluster Leader.
for IP in $(SevOne-peer-list); do echo -en "IP: $IP \t"; ssh $IP 'echo -e "Hostname: $(hostname) \t System Uptime: $(uptime)"'; done
Cluster Leader does not need to be restarted again.
Load Kernel & Start Services
- Clean up /data/upgrade directory.
rm /data/upgrade/installRPMs.tar
rm /data/upgrade/DigiCert-cs.crt
rm /data/upgrade/ibm-sevone-cs.crt
rm -rf /data/upgrade/ansible
- After the reboot completes successfully, SSH back to your active Cluster Leader of SevOne
NMS cluster to check the installed kernel packages. Please see the table below to confirm
that you have the correct kernel version.
Check kernel version
podman exec -it nms-nms-nms /bin/bash
SevOne-show-version

OR

rpm -qa | grep kernel

OR

uname -r
Important: The kernel is automatically updated as part of the upgrade, and not every NMS release has a new kernel.
Depending on the NMS release, kernel versions must be:

SevOne NMS Version | Kernel Version |
---|---|
NMS 7.1.0 | 4.18.0-553.8.1.el8_10.x86_64 |
NMS 7.0.1 | 4.18.0-553.el8_10.x86_64 |

The kernel version must match the NMS release. If it does not, you must reboot the entire cluster by executing the step below to apply the new kernel; otherwise, a reboot is not required.
podman exec -it nms-nms-nms /bin/bash
SevOne-shutdown reboot
- Verify the Operating System version. Confirm that version is Red Hat Enterprise Linux
after the upgrade.
cat /etc/redhat-release

Output: Red Hat Enterprise Linux release 8.9 (Ootpa)
- Execute the following command to identify errors, if
any.
SevOne-act check checkout --full-cluster --verbose
Log Files
Log files can be found in /var/log/SevOne/ansible-upgrade/<fromVersion-toVersion>/<timestamp>/<peerIP>.log.
For example, /var/log/SevOne/ansible-upgrade/v7.0.1-v7.1.0/<timestamp>/<peerIP>.log
- Each peer will have its own log file located on the Cluster Leader.
- A new log file will be created for each run of the upgrade and is split by the timestamp folder.