Online upgrade of IBM Cloud Pak for Watson AIOps (console method)
Use these instructions to upgrade IBM Cloud Pak® for Watson AIOps 3.7.0 or later to 4.1.0.
This procedure can be used on an online deployment of IBM Cloud Pak for Watson AIOps 3.7.0 or later, and can still be used if the deployment has had hotfixes applied. If you have an offline deployment, follow the instructions in Upgrading IBM Cloud Pak for Watson AIOps (offline).
Before you begin
- Some steps must still be performed with the Red Hat® OpenShift® Container Platform command-line interface (CLI). For these steps, ensure that you are logged in to your Red Hat OpenShift Container Platform cluster with oc login.
- Red Hat OpenShift Container Platform requires a user with cluster-admin privileges for the following operations:
Warnings:
- Custom patches, labels, and manual adjustments to IBM Cloud Pak for Watson AIOps resources are lost when IBM Cloud Pak for Watson AIOps is upgraded, and must be manually reapplied after upgrade. For more information, see Manual adjustments are not persisted.
- If you previously increased the size of the Kafka PVC directly, then you must follow the correct procedure that is supplied in Increasing the Kafka PVC to ensure that the size is updated by the operator. Failure to do so before upgrading IBM Cloud Pak for Watson AIOps causes the operator to attempt to restore a lower default value for the Kafka PVC, and causes an error in your IBM Cloud Pak for Watson AIOps deployment.
Restrictions:
- You cannot use these instructions to upgrade deployments of IBM Cloud Pak for Watson AIOps 3.6.2 or earlier. For example, you cannot upgrade from IBM Cloud Pak for Watson AIOps 3.6.0 or 3.6.2 to 4.1.0.
- The upgrade cannot be removed or rolled back.
- If you are planning to upgrade to Red Hat OpenShift Container Platform 4.12 as part of an upgrade to IBM Cloud Pak for Watson AIOps 4.1.0, you must complete the IBM Cloud Pak for Watson AIOps upgrade before you upgrade to Red Hat OpenShift Container Platform 4.12.
Upgrade procedure
Follow these steps to upgrade your online IBM Cloud Pak for Watson AIOps deployment.
1. Ensure cluster readiness
Recommended: Take a backup before upgrading. For more information, see Backup and restore.
- Ensure that your cluster still meets all of the prerequisites for deployment. For more information, see Planning.
  In IBM Cloud Pak for Watson AIOps 4.1.0, the storage requirements for Kafka have increased to 300 GB (3 persistent volumes (PVs) of 100 GB each) for production deployments, and to 60 GB for starter deployments. Your PVs are already configured with volume expansion enabled, as stated in the Storage class requirements, but you must ensure that there is adequate space for the Kafka PVs to expand before you start the upgrade. You increase the storage allocation for Kafka in step 7, Increase Kafka storage size.
  Note: IBM Cloud Pak for Watson AIOps requires Red Hat OpenShift Container Platform version 4.10.46 or higher.
- Run the IBM Cloud Pak for Watson AIOps prerequisite checker script.
  The prerequisite checker script ensures that your Red Hat OpenShift Container Platform cluster is correctly set up for an IBM Cloud Pak for Watson AIOps upgrade. You must run the script in the same project (namespace) that IBM Cloud Pak for Watson AIOps is installed in.
  For more information about the script, including how to download and run it, see github.com/IBM.
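The minimum Red Hat OpenShift Container Platform version can also be checked from the command line before you begin. The following is a minimal sketch and not part of the official procedure. It assumes that the running release is reported by the ClusterVersion resource named version. The functions version_ge and check_ocp_version are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when dotted version $1 is greater than or equal to $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, "."); nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      xi = (i <= na) ? x[i] + 0 : 0   # missing fields count as 0
      yi = (i <= nb) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Assumption: the ClusterVersion resource named "version" reports the running release.
check_ocp_version() {
  current=$(oc get clusterversion version -o jsonpath='{.status.desired.version}') || return 1
  if version_ge "$current" "4.10.46"; then
    echo "OpenShift $current meets the 4.10.46 minimum"
  else
    echo "OpenShift $current is below the 4.10.46 minimum" >&2
    return 1
  fi
}
```

After you log in with oc login, calling check_ocp_version reports whether the cluster meets the minimum. The comparison is done field by field so that, for example, 4.10.5 is treated as lower than 4.10.46.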
2. Configure automatic catalog polling
Ensure that your catalog is set to automatically poll for the latest images.
Your ibm-operator-catalog CatalogSource object can be configured to automatically poll for the latest catalog version, and to retrieve it if one is available. Polling for updates is enabled by configuring the polling attribute, spec.updateStrategy.registryPoll.
You might have already elected to automatically accept updates by adding the polling attribute to your ibm-operator-catalog YAML when you installed IBM Cloud Pak for Watson AIOps, installed an IBM Cloud Pak for Watson AIOps hotfix from IBM Support, or installed another IBM Cloud Pak®.
Use the following steps to check whether you already have a polling attribute set, and to configure it if you do not.
Note: ibm-operator-catalog also contains the catalogs for other IBM Cloud Paks. If you have multiple IBM Cloud Paks installed on your cluster and you enable the polling attribute, then automatic update is configured for all of them.
- Log in to your OpenShift cluster's console.
- Go to Administration > Cluster Settings.
- Click Configuration, and then scroll down and click OperatorHub.
- Select Sources, and then scroll down and click ibm-operator-catalog.
- Click YAML to switch to the YAML view.
- If there is no spec.updateStrategy section, or spec.image is not set to icr.io/cpopen/ibm-operator-catalog:latest, then update the YAML to have the following in the spec block, and save it.
  spec:
    displayName: ibm-operator-catalog
    publisher: IBM Content
    sourceType: grpc
    image: icr.io/cpopen/ibm-operator-catalog:latest
    updateStrategy:
      registryPoll:
        interval: 45m
3. Update foundational services
IBM Cloud Pak® foundational services, which is part of your IBM Cloud Pak for Watson AIOps deployment, must be at version 3.23 or higher before you upgrade IBM Cloud Pak for Watson AIOps.
Use the following steps to verify that your ibm-common-service-operator subscription is set to version 3.23 or higher, and to set it to a qualifying version if it is not.
- Run the following command to find out what version of foundational services you have installed.
  oc get csv -A | grep ibm-common-service-operator
  If the version returned is v3.23 or higher, then you do not need to update foundational services. Skip the rest of this section and proceed to step 4, Create a network policy for log anomaly detection.
- Download the Common Services upgrade script, upgrade_common_services.sh, from github.com/IBM.
- Run the following command from the directory that you downloaded the Common Services upgrade script to. This script must be run by a user with cluster-admin privileges.
  ./cp4waiops-samples/upgrade/upgrade_common_services.sh -a -c v3.23
  Important: Run this script only if your version of foundational services is lower than v3.23.
- When upgrade_common_services.sh completes, run the following commands to verify that the ibm-common-service-operator channel is set to version 3.23 or higher in the subscription and in the ClusterServiceVersion (CSV) before you continue.
  - Check the subscription.
    oc get subscription ibm-common-service-operator -n ibm-common-services -o jsonpath='{.spec.channel}'
    Example output:
    oc get subscription ibm-common-service-operator -n ibm-common-services -o jsonpath='{.spec.channel}'
    'v3.23'
  - Check the CSV.
    oc get csv -A | grep ibm-common-service-operator
    Example output:
    oc get csv -A | grep ibm-common-service-operator
    ibm-common-services   ibm-common-service-operator.v3.23.7   IBM Cloud Pak foundational services   3.23.7   Succeeded
    cp4waiops             ibm-common-service-operator.v3.23.7   IBM Cloud Pak foundational services   3.23.7   Succeeded
- The foundational services upgrade starts, and takes approximately 30 to 60 minutes.
  You can run the following command to check the status of ZenService. When the foundational services upgrade is complete, the command returns a STATUS of Completed. Do not proceed until the upgrade has completed.
  oc get zenservice -A -o custom-columns='KIND:.kind,NAME:.metadata.name,NAMESPACE:.metadata.namespace,VERSION:.status.currentVersion,STATUS:.status.zenStatus,PROGRESS:.status.Progress,MESSAGE:.status.ProgressMessage'
  Example output from a successful foundational services upgrade:
  KIND         NAME                 NAMESPACE   VERSION   STATUS      PROGRESS   MESSAGE
  ZenService   iaf-zen-cpdservice   cp4waiops   4.8.0     Completed   100%       The Current Operation Is Completed
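Because the foundational services upgrade can take up to an hour, it might be convenient to poll the ZenService status instead of rerunning the command by hand. The following is one possible sketch: it reads the same .status.zenStatus field as the command above, while the helper names, the retry count, and the delay are assumptions chosen for illustration. A ZenService that reports no status at all is not detected by this check.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when at least one status is given
# and every whitespace-separated status is "Completed".
all_completed() {
  count=0
  for status in $1; do
    [ "$status" = "Completed" ] || return 1
    count=$((count + 1))
  done
  [ "$count" -gt 0 ]
}

# Polls every ZenService until all report Completed, or gives up after max_tries checks.
wait_for_zenservice() {
  max_tries=${1:-90}   # assumed limit: 90 checks
  delay=${2:-60}       # assumed delay between checks, in seconds
  try=0
  while [ "$try" -lt "$max_tries" ]; do
    statuses=$(oc get zenservice -A -o jsonpath='{.items[*].status.zenStatus}')
    if all_completed "$statuses"; then
      echo "ZenService upgrade completed"
      return 0
    fi
    try=$((try + 1))
    sleep "$delay"
  done
  echo "ZenService did not reach Completed after $max_tries checks" >&2
  return 1
}
```

With the assumed defaults, wait_for_zenservice checks once a minute for up to 90 minutes, which leaves some margin over the 30 to 60 minutes that the upgrade is expected to take.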
4. Create a network policy for log anomaly detection
If you plan to use log anomaly detection for new or existing log connections, run the following commands. Replace the AIOPS_NAMESPACE value with the name of the project in which IBM Cloud Pak for Watson AIOps is installed.
AIOPS_NAMESPACE="cp4waiops"
cat << EOF | oc apply -n $AIOPS_NAMESPACE -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  labels:
    app: flink
    cluster: cp4waiops-eventprocessor-eve-29ee-ep
    component: taskmanager
  name: cp4waiops-eventprocessor-eve-29ee-ep-tm-patch
spec:
  egress:
  - {}
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: flink
          cluster: cp4waiops-eventprocessor-eve-29ee-ep
          component: taskmanager
    - podSelector:
        matchLabels:
          app: flink
          cluster: cp4waiops-eventprocessor-eve-29ee-ep
          component: jobmanager
  - ports:
    - port: 9248
      protocol: TCP
    - port: 6122
      protocol: TCP
    - port: 6121
      protocol: TCP
  podSelector:
    matchLabels:
      app: flink
      cluster: cp4waiops-eventprocessor-eve-29ee-ep
      component: taskmanager
  policyTypes:
  - Ingress
  - Egress
EOF
cat << EOF | oc apply -n $AIOPS_NAMESPACE -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  labels:
    app: flink
    cluster: cp4waiops-eventprocessor-eve-29ee-ep
    component: jobmanager
  name: cp4waiops-eventprocessor-eve-29ee-ep-jm-patch
spec:
  egress:
  - {}
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: flink
          cluster: cp4waiops-eventprocessor-eve-29ee-ep
          component: taskmanager
    - podSelector:
        matchLabels:
          app: flink
          cluster: cp4waiops-eventprocessor-eve-29ee-ep
          component: jobmanager
  - ports:
    - port: 8081
      protocol: TCP
    - port: 6123
      protocol: TCP
    - port: 6125
      protocol: TCP
    - port: 8080
      protocol: TCP
    - port: 6124
      protocol: TCP
    - port: 9249
      protocol: TCP
  podSelector:
    matchLabels:
      app: flink
      cluster: cp4waiops-eventprocessor-eve-29ee-ep
      component: jobmanager
  policyTypes:
  - Ingress
  - Egress
EOF
5. Update the operator subscription
Use the following steps to update the spec.channel value of the IBM Cloud Pak for Watson AIOps subscription to the release that you want to upgrade to, v4.1.
- Log in to your OpenShift cluster's console.
- Select Home > Search.
- From the Project list, select the project (namespace) that your IBM Cloud Pak for Watson AIOps subscription is deployed in. This is your IBM Cloud Pak for Watson AIOps project if your deployment is namespace scoped, or openshift-operators if your deployment has a cluster-wide scope.
- In the Resources list, select SUB Subscription. A list of subscriptions is displayed.
- Click the subscription that has a Name of ibm-aiops-orchestrator. A new window with the subscription details for ibm-aiops-orchestrator is displayed.
- Click the value in the Update channel box. A new window called Change Subscription update channel is displayed.
- Change the channel to v4.1 and click Save.
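The console steps above change a single field, spec.channel, on the ibm-aiops-orchestrator subscription. For readers who prefer the CLI, an equivalent change might look like the following sketch. It is an illustration and not the documented method: channel_patch and set_aiops_channel are hypothetical helpers, and the sketch assumes that the subscription is an Operator Lifecycle Manager Subscription with the name shown in the console.

```shell
#!/bin/sh
# Hypothetical helper: prints a JSON merge patch that sets spec.channel to $1.
channel_patch() {
  printf '{"spec":{"channel":"%s"}}' "$1"
}

# $1: project that holds the subscription (your AIOps project, or openshift-operators)
# $2: target channel, v4.1 for this upgrade
set_aiops_channel() {
  oc patch subscription.operators.coreos.com ibm-aiops-orchestrator \
    -n "$1" --type merge -p "$(channel_patch "$2")"
}
```

For a namespace-scoped deployment in a project named cp4waiops, the call would be set_aiops_channel cp4waiops v4.1. The fully qualified resource name is used so that the command does not pick up a different Subscription kind that might exist on the cluster.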
6. Verify the deployment
Use the following procedure to check the status of your upgraded IBM Cloud Pak for Watson AIOps deployment.
- Log in to your OpenShift cluster's console.
- Click Operators > Installed Operators.
- From the Project list, select the project (namespace) that IBM Cloud Pak for Watson AIOps is deployed in if your deployment is namespace scoped, or openshift-operators if your deployment has a cluster-wide scope.
- Locate IBM Cloud Pak for Watson AIOps AI Manager in the list, and verify that the annotation underneath it shows 4.1.0.
- Select IBM Cloud Pak for Watson AIOps AI Manager and then click the IBM Cloud Pak for Watson AIOps AI Manager tab.
- Under Installations, look for the entry with the name that you specified for your IBM Cloud Pak for Watson AIOps instance, and verify that it has a Status of Phase: Running, which means that your deployment is complete and successful.
(Optional): If you want to see more detail about the status of your deployment's components, select the entry with the name that you specified for your IBM Cloud Pak for Watson AIOps instance, and then switch to the YAML view. Scroll down to the Status section near the end of the YAML. A component's installation is complete and successful when the component has a value of Ready.
Example YAML:
status:
  size: small
  customProfileConfigmap: aiops-custom-size-profile
  customProfileValidationStatus: >-
    Custom profile configmap not found, continue installation process without customization
  storageclasslargeblock: rook-ceph-rbd
  componentstatus:
    issueresolutioncore: Ready
    kafka: Ready
    aiopsanalyticsorchestrator: Ready
    aiopsedge: Ready
    tunnel: Ready
    lifecycleservice: Ready
    zenservice: Ready
    vaultaccess: Ready
    vaultdeploy: Ready
    flinkcluster: Ready
    cluster: Ready
    elasticsearch: Ready
    kong: Ready
    aiopsui: Ready
    redissentinel: Ready
  <...>
(Optional) You can also download and run a status checker script to see information about the status of your deployment. For more information about how to download and run the script, see github.com/IBM.
If the upgrade fails, or is not complete and is not progressing, then see Troubleshooting installation and upgrade and Known Issues to help you identify any installation problems.
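When an upgrade is not progressing, it can help to list only the components that are not yet Ready instead of scanning the whole status block. The sketch below is an assumption-laden illustration: it assumes that the installation is served under the resource name installations.orchestrator.aiops.ibm.com and that component states are under .status.componentstatus as in the example YAML. The helpers not_ready_components and list_not_ready are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "component: state" lines on stdin and prints
# the name of each component whose state is not Ready.
not_ready_components() {
  awk -F':' 'NF == 2 {
    name = $1; state = $2
    gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim surrounding whitespace
    gsub(/^[ \t]+|[ \t]+$/, "", state)
    if (state != "Ready") print name
  }'
}

# Assumption: the installation is served as installations.orchestrator.aiops.ibm.com.
# $1: project, $2: installation name
list_not_ready() {
  oc get installations.orchestrator.aiops.ibm.com "$2" -n "$1" \
    -o go-template='{{range $k, $v := .status.componentstatus}}{{$k}}: {{$v}}{{"\n"}}{{end}}' \
    | not_ready_components
}
```

An empty result from list_not_ready means that every listed component reports Ready. Any names that are printed are a starting point for the troubleshooting topics mentioned above.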
7. Increase Kafka storage size
Increase the storage allocation for Kafka.
- Log in to your OpenShift cluster's console.
- Go to Home > Search, click the Resources dropdown, and search for Kafka.
- Select the checkbox next to Kafka.
- Click iaf-system, and then click YAML to switch to the YAML view.
- Locate spec > kafka > storage > size, and change the value to 100Gi for a production deployment, or 60Gi for a starter deployment. Save your changes and exit.
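The console edit above sets one field, spec.kafka.storage.size, on the Kafka resource named iaf-system. A CLI sketch of the same change is shown here for illustration only. The helper names are hypothetical, and the short resource name kafka is an assumption: if more than one Kafka kind is installed on your cluster, the fully qualified resource name might be needed instead.

```shell
#!/bin/sh
# Hypothetical helper: maps a deployment type to the Kafka storage size given in this step.
kafka_size_for() {
  case "$1" in
    production) echo "100Gi" ;;
    starter)    echo "60Gi" ;;
    *) echo "unknown deployment type: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: prints a JSON merge patch that sets spec.kafka.storage.size to $1.
kafka_storage_patch() {
  printf '{"spec":{"kafka":{"storage":{"size":"%s"}}}}' "$1"
}

# $1: project that IBM Cloud Pak for Watson AIOps is installed in
# $2: production or starter
resize_kafka() {
  size=$(kafka_size_for "$2") || return 1
  oc patch kafka iaf-system -n "$1" --type merge -p "$(kafka_storage_patch "$size")"
}
```

For example, resize_kafka cp4waiops production would request 100Gi per Kafka volume in a project named cp4waiops. A merge patch is used so that only the size field is touched and the rest of the Kafka spec is left as the operator set it.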
8. Post upgrade actions
- If you previously set up backup or restore on your deployment, then you must follow the instructions in Upgrading IBM Cloud Pak for Watson AIOps backup and restore artifacts.
- If the EXPIRY_SECONDS environment variable was set for configuring log anomaly alerts, the environment variable is not retained during the upgrade. After the upgrade is completed, set the environment variable again. For more information about setting the variable, see Configuring expiry time for log anomaly alerts.
- If the Access Control page displays custom roles with deprecated permissions after upgrade, see Custom roles with deprecated permissions after upgrade.
- (Optional) A new field is available in IBM Cloud Pak for Watson AIOps 4.1.0 that you can use to specify the terminology for collections of topology resources as application or service. The default is application. If you want to use service as the terminology for your topology resource collections, then use the following steps.
  - Log in to your Red Hat OpenShift cluster's console, and then click Operators > Installed Operators.
  - From the Project list, select the project (namespace) that IBM Cloud Pak for Watson AIOps is deployed in.
  - Click IBM Cloud Pak for Watson AIOps AI Manager.
  - On the Operator Details page, click the IBM Cloud Pak for Watson AIOps AI Manager tab, and then click the returned IBM Cloud Pak for Watson AIOps installation name.
  - Click the YAML tab, and then add topologyModel: service under the spec block. Save your changes.
    Example YAML excerpt:
    spec:
      topologyModel: service
- (Optional) Delete the persistent volume claim (PVC) for training job state data that is no longer required. For more information, see Deleting a persistent volume claim.