Upgrading API Connect on OpenShift

Upgrade your API Connect installation to a newer version on OpenShift in an online (connected to the internet) environment.

Before you begin

Review and complete all steps that are documented in Planning your API Connect upgrade on OpenShift and Pre-upgrade preparation and checks on OpenShift.

  • If any subsystem database backups are running or are scheduled to run within a few hours, do not start the upgrade process. A sketch for checking backup status follows this list.
  • Do not perform maintenance tasks such as updating certificates, restoring subsystem databases from backup, or triggering subsystem database backups at any time while you are upgrading API Connect.
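One way to confirm that no subsystem database backups are currently running is to check the backup custom resources before you begin. The following is a minimal sketch, assuming the ManagementBackup and PortalBackup CRs that are used by 10.0.5.x deployments:

  # List backup CRs and confirm that no backup reports a Running status
  oc get managementbackup -n <mgmt_namespace>
  oc get portalbackup -n <portal_namespace>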

Procedure

  1. If you did not run the pre-upgrade health checks recently, run them now.
  2. If you have a two data center disaster recovery deployment, then upgrade the warm-standby data center first.
  3. Upgrading from 10.0.5.x: Install the cert-manager operator for Red Hat OpenShift by completing the following steps:
    1. Log in to the OpenShift Container Platform web console.

    2. Click Operators > OperatorHub.

    3. In the filter box, type: cert-manager Operator for Red Hat OpenShift.

    4. Select cert-manager Operator for Red Hat OpenShift and click Install.

    5. On the Install Operator page, complete the following steps:
      1. If needed, change the Update channel. The channel defaults to stable-v1, which installs the latest stable release of the cert-manager Operator for Red Hat OpenShift.

      2. Select the Installed Namespace for the operator.

        The default operator namespace is cert-manager-operator; if that namespace does not exist, it is created for you.

      3. Select an Update approval strategy:
        • Automatic: allow Operator Lifecycle Manager (OLM) to automatically update the operator when a new version is available.
        • Manual: require a user with the appropriate credentials to approve all operator updates.
      4. Click Install.
    6. Verify the new cert-manager installation by completing the following steps:
      1. Click Operators > Installed Operators.

      2. Verify that cert-manager Operator for Red Hat OpenShift is listed with a Status of Succeeded in the cert-manager-operator namespace.

      3. Verify that cert-manager pods are up and running with:
        oc get pods -n cert-manager

        For a successful installation, the response looks like the following example:

        NAME                                       READY   STATUS    RESTARTS   AGE
        cert-manager-bd7fbb9fc-wvbbt               1/1     Running   0          3m39s
        cert-manager-cainjector-56cc5f9868-7g9z7   1/1     Running   0          4m5s
        cert-manager-webhook-d4f79d7f7-9dg9w       1/1     Running   0          4m9s
    7. Remove the obsolete cert-manager operator (which was provided by IBM Cloud Pak foundational services).

      If you previously deployed the IBM Cloud Pak foundational services operator and the API Connect operator in the same namespace, you might need to manually remove the instance of ibm-cert-manager-operator that was installed in ibm-common-services. Complete the following steps to check for the operator and then remove it:

      1. Run the following command to get the list of installed cert-manager operators:
        oc get pods -A | grep cert-manager-operator

        If the response lists the ibm-cert-manager-operator operator, then proceed to the next step to remove it.

        cert-manager-operator   cert-manager-operator-controller-manager-68ccfc6dd9-2ww4t      2/2     Running     0     47h
        ibm-common-services     ibm-cert-manager-operator-7c848f5d6c-xtdsb                     1/1     Running     0     2d
      2. Delete the ibm-cert-manager-operator operator by running the following commands:
        oc delete subs ibm-cert-manager-operator -n ibm-common-services
        oc get csv -n ibm-common-services | grep ibm-cert-manager-operator | awk '{ print $1 }' | xargs oc delete csv -n ibm-common-services
  4. Update the operators for the API Connect deployment.
    API Connect uses three operators, which you update by selecting a newer version of the operator channel. Update the channels in the following sequence:
    1. DataPower: set the channel to v1.11-sc2
    2. API Connect: set the channel to v5.4-sc2
    3. Foundational services: set the channel to v4.6

      Be sure to update the Foundational services operator channel in all namespaces.

    Complete the following steps to update a channel:

    1. Click Operators > Installed Operators.
    2. Select the operator to be upgraded.
    3. Select the Subscription tab.
    4. In the Update channel list, select the new channel version.

    If you previously chose automatic subscriptions, the operator version upgrades automatically when you update the operator channel. If you previously chose manual subscriptions, OpenShift OLM notifies you that an upgrade is available and then you must manually approve the upgrade before proceeding.
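    If you prefer to update the channels from the command line instead of the web console, you can patch each operator subscription. The following is a minimal sketch; the subscription name shown is an assumption, because names vary by installation:

      # List the operator subscriptions in the namespace to find their exact names
      oc get subscription -n <APIC_namespace>

      # Example: set the DataPower operator channel (subscription name is an assumption)
      oc patch subscription datapower-operator -n <APIC_namespace> \
        --type merge -p '{"spec":{"channel":"v1.11-sc2"}}'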

    Wait for the operators to update, for the pods to restart, and for the instances to display the Ready status. Neither the API Connect operator nor the DataPower operator begins its upgrade until the channel is changed for both; the upgrade of both operators starts when both channels are updated.

    Troubleshooting: See Troubleshooting upgrade on OpenShift.

    Upgrading from V10.0.5.x: Upgrading to V10.0.8 involves migrating from Crunchy Postgres to EDB. After the API Connect operator upgrades, the apiconnectcluster CR is in a pending state, and the managementcluster CR is in a warning state with the following message:
    message: management CR is not tracking status, please change CR spec.version
        as soon as possible. Refer https://shortlink.url for details
    reason: DatabaseSwitchMode

    This issue applies only to upgrades from V10.0.5.x. The problem is temporary and resolves when you upgrade the API Connect management subsystem in a later step.

    You can check the status from the UI or by running the following command:
    oc get managementcluster -o yaml
    Note: If analytics backups are configured, then after the API Connect operator is upgraded, the analytics CR reports a status of Blocked until the analytics subsystem is upgraded to V10.0.8. With top-level CR deployments, the apiconnectcluster CR reports Pending until the analytics CR is no longer Blocked. For more information, see Analytics backup changes.
  5. Ensure that the operators and operands are working correctly before proceeding.

    If you are upgrading from 10.0.5.x, skip this step because of the issue that is noted in the previous step.

    • Operators: Verify that the OpenShift UI indicates that all operators are in the Succeeded state without any warnings.
    • If you are using the top-level CR: To verify that your API Connect cluster is working correctly, run the following command:
      oc get apiconnectcluster
      Confirm that the apiconnectcluster CR reports all pods as READY.
      NAME        READY   STATUS   VERSION    RECONCILED VERSION   MESSAGE                        AGE
      apic-ocp    n/n     Ready    10.0.8.x   10.0.8.x-258         API Connect cluster is ready   56d
    • If you are using individual subsystem CRs: To verify the health of each subsystem, run the following commands:
      oc get ManagementCluster -n <mgmt_namespace>
      oc get GatewayCluster -n <gway_namespace>
      oc get PortalCluster -n <portal_namespace>
      oc get AnalyticsCluster -n <analytics_namespace>
      Check that all pods in each subsystem are READY, for example:
      oc get PortalCluster
      NAME     READY   STATUS    VERSION          RECONCILED VERSION   AGE
      portal   n/n     Running   10.0.8.2-ifix2   10.0.8.2-ifix2-95    57m
    Note: If analytics backups are configured, then after the API Connect operator is upgraded, the analytics CR reports a status of Blocked until the analytics subsystem is upgraded to V10.0.8. With top-level CR deployments, the apiconnectcluster CR reports Pending until the analytics CR is no longer Blocked. For more information, see Analytics backup changes.
  6. If you are using the top-level CR (including Cloud Pak for Integration deployments), update the top-level apiconnectcluster CR:

    The spec section of the apiconnectcluster looks like the following example:

    apiVersion: apiconnect.ibm.com/v1beta1
    kind: APIConnectCluster
    metadata:
      labels:
        app.kubernetes.io/instance: apiconnect
        app.kubernetes.io/managed-by: ibm-apiconnect
        app.kubernetes.io/name: apiconnect-production
      name: prod
      namespace: <APIC_namespace>
    spec:
      license:
        accept: true
        use: production
        license: L-DZZQ-MGVN8V
      profile: n12xc4.m12
      version: 10.0.8.2-ifix2
      storageClassName: rook-ceph-block
    1. Edit the apiconnectcluster CR by running the following command:
      oc -n <APIC_namespace> edit apiconnectcluster
    2. Update the spec.version property to the version you are upgrading to.
    3. If you are upgrading to a version of API Connect that requires a new license, update the spec.license property accordingly.

      For the list of licenses, see API Connect licenses.
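      After you update the properties, the relevant part of the spec section might look like the following sketch; the version and license ID are taken from the earlier example, so substitute the values that apply to your deployment:

        spec:
          version: 10.0.8.2-ifix2
          license:
            accept: true
            use: production
            license: L-DZZQ-MGVN8V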

    4. Optional: Set the size of the local analytics backups PVC.
      By default, the new PVC that is created for the analytics subsystem's local database backups is set to 150Gi. If you want to specify a larger size, add the following to the CR spec.analytics section:
      
        storage:
          backup:
            volumeClaimTemplate:
              storageClassName: <storage class>
              volumeSize: <size>
      where:
        • <storage class> is the storage class to use for the backup PVC.
        • <size> is the size of the backup PVC, for example 250Gi.
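      For example, with the placeholders filled in, the spec.analytics section might look like the following sketch; the storage class and 250Gi size are hypothetical values:

        spec:
          analytics:
            storage:
              backup:
                volumeClaimTemplate:
                  storageClassName: rook-ceph-block   # hypothetical storage class
                  volumeSize: 250Gi                   # hypothetical size, larger than the 150Gi default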
    5. Optional: If you are upgrading API Connect in Cloud Pak for Integration and want to preserve the Cloud Pak endpoints during the upgrade, add the deprecatedCloudPakRoute object to the spec.management section of the CR:

      Beginning with version 10.0.7.0, API Connect no longer uses the Cloud Pak cpd routes for endpoints when deployed as a component of Cloud Pak for Integration. Instead, the API Management component uses the typical default API Connect routes (or the custom endpoints that are configured in the CR). You can enable the use of Cloud Pak endpoints when you upgrade the API Management component in Cloud Pak for Integration 2023.4.1 or later.

      Attention: If you choose to accept the new endpoints (without the Cloud Pak cpd routes), skip this step. However, if you previously configured your own OIDC user registry, be sure to update the redirect_uri to reflect the new API Connect endpoint.

      To preserve the Cloud Pak endpoints during the upgrade, insert the deprecatedCloudPakRoute object into the spec.management section as shown in the following example:

      apiVersion: apiconnect.ibm.com/v1beta1
      kind: APIConnectCluster
      metadata:
        labels:
          app.kubernetes.io/instance: apiconnect
          app.kubernetes.io/managed-by: ibm-apiconnect
          app.kubernetes.io/name: apiconnect-minimum
        name: <name_of_your_instance>
        namespace: <APIC_namespace>
      spec:
        license:
          accept: true
          license: L-MMBZ-295QZQ
          metric: PROCESSOR_VALUE_UNIT
          use: nonproduction
        profile: n1xc7.m48
        version: 10.0.8.0
        storageClassName: <default-storage-class>
        management:
          deprecatedCloudPakRoute:
            enabled: true
            cloudPakEndpoint:
              hosts:
              - name: cpd-apic.apps.admin-<domain>.com
      Tip: If you want to maintain the same CPD endpoint, ensure that the cloudPakEndpoint hostname is identical to the legacy CPD route name.
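      To find the legacy cpd route name and its host, you can list the routes in the API Connect namespace; a minimal sketch (the grep filter is only a convenience):

        # Look for the route that serves the cpd endpoint and note its HOST/PORT value
        oc get routes -n <APIC_namespace> | grep cpd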
    6. Optional: If you chose to preserve the Cloud Pak endpoints during the upgrade and want to add a custom certificate for the Cloud Pak route, complete the following steps:
      1. Create a Cloud Pak certificate that is signed by a Cloud Pak CA, as in the following example:
        ---
        apiVersion: cert-manager.io/v1
        kind: Issuer
        metadata:
          name: custom-cpd-ca
          namespace: apic
        spec:
          selfSigned: {}
        ---
        apiVersion: cert-manager.io/v1
        kind: Certificate
        metadata:
          name: custom-cpd
          namespace: apic
        spec:
          commonName: small-mgmt-cpd
          dnsNames:
          - cpd-apic.apps.admin-apickeycloak.cp.fyre.ibm.com
          duration: 17520h0m0s
          issuerRef:
            kind: Issuer
            name: custom-cpd-ca
          privateKey:
            algorithm: RSA
            rotationPolicy: Always
            size: 2048
          renewBefore: 720h0m0s
          secretName: custom-cpd
          usages:
          - key encipherment
          - digital signature
          - server auth
      2. In the CR, provide the secret name within the cloudPakEndpoint property of the new deprecatedCloudPakRoute object; for example:
        
        spec:
          management:
            deprecatedCloudPakRoute:
              enabled: true
              cloudPakEndpoint:
                hosts:
                - name: cpd-apic.apps.admin-<domain>.com
                  secretName: custom-cpd
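        Before you save the CR, you can optionally confirm that cert-manager issued the certificate and created the secret. A sketch, assuming the example names and namespace from the previous step:

        # The certificate should report READY as True, and the secret should exist
        oc get certificate custom-cpd -n apic
        oc get secret custom-cpd -n apic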
    7. In the spec.gateway section, delete any template or dataPowerOverride override sections.

      If the CR contains an override, then you cannot complete the upgrade.

    8. [This step is not applicable if you are upgrading from 10.0.8.0 or 10.0.8.1 to 10.0.8.2-ifix1] If you have a two data center disaster recovery installation, and are upgrading the warm-standby, then add the annotation:
      apiconnect-operator/dr-warm-standby-upgrade-data-deletion-confirmation: "true"
      For more information about two data center disaster recovery upgrade, see Upgrading a 2DCDR deployment on Kubernetes and Red Hat OpenShift from V10.0.5.
    9. Save and close the CR to apply your changes.
      The response looks like the following example:
      apiconnectcluster.apiconnect.ibm.com/prod configured

      Troubleshooting: If you see an error message when you attempt to save the CR, see Troubleshooting upgrade on OpenShift.

    10. Run the following command to verify that the upgrade is completed and the status of the top-level CR is READY:
      oc get apiconnectcluster -n <APIC_namespace>
      Important: If you need to restart the deployment, wait until all portal sites complete the upgrade.
      After the portal subsystem upgrade is complete, each portal site is upgraded. You can monitor the site upgrade progress from the MESSAGE column in the oc get ptl output. You can still use the portal while sites are upgrading, although a maintenance page is shown for any sites that are being upgraded. When the site upgrades are complete, the oc get ptl output shows how many sites the portal is serving:
      NAME     READY   STATUS    VERSION          RECONCILED VERSION   MESSAGE              AGE
      portal   3/3     Running   <version>        <version>            Serving 2 sites      22h

      On two data center disaster recovery deployments, the sites are not upgraded until both data centers are upgraded.

      Troubleshooting: If the upgrade appears to be stuck, showing the status of Pending for a long time, then check the management CR status for errors:
      oc -n <namespace> get mgmt -o yaml

      Refer to Troubleshooting upgrade on OpenShift, for known issues.

  7. If you are using individual subsystem CRs: Start with the management subsystem, and update the management CR as follows:
    1. Edit the ManagementCluster CR:
      oc edit ManagementCluster -n <mgmt_namespace>
    2. Update the spec.version property to the version you are upgrading to.
    3. If you are upgrading to a version of API Connect that requires a new license, update the spec.license property accordingly.

      For the list of licenses, see API Connect licenses.
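      After the edit, the relevant part of the ManagementCluster spec might look like the following sketch; the version and license ID are examples only, so use the values that apply to your deployment:

        spec:
          version: 10.0.8.2-ifix2
          license:
            accept: true
            use: production
            license: L-DZZQ-MGVN8V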

    4. [This step is not applicable if you are upgrading from 10.0.8.0 or 10.0.8.1 to 10.0.8.2-ifix1] If you have a two data center disaster recovery installation, and are upgrading the warm-standby, then add the annotation:
      apiconnect-operator/dr-warm-standby-upgrade-data-deletion-confirmation: "true"
      For more information about two data center disaster recovery upgrade, see Upgrading a 2DCDR deployment on Kubernetes and Red Hat OpenShift from V10.0.5.
    5. Save and close the CR to apply your changes.
      The response looks like the following example:
      managementcluster.management.apiconnect.ibm.com/management edited

      Troubleshooting: If you see an error message when you attempt to save the CR, see Troubleshooting upgrade on OpenShift.

    6. Wait until the management subsystem upgrade is complete before you proceed to the next subsystem. Check the upgrade status by running the following command, and wait until all pods are running at the new version. For example:
      oc -n <mgmt_namespace> get ManagementCluster 
      NAME         READY   STATUS    VERSION          RECONCILED VERSION    AGE
      management   18/18   Running   10.0.8.2-ifix2   10.0.8.2-ifix2-1281   97m
      Troubleshooting: If the upgrade appears to be stuck, showing the status of Pending for a long time, then check the management CR status for errors:
      oc -n <namespace> get mgmt -o yaml

      Refer to Troubleshooting upgrade on OpenShift, for known issues.

    7. Repeat the process for the remaining subsystem CRs in your preferred order: GatewayCluster, PortalCluster, AnalyticsCluster.
      Important: In the GatewayCluster CR, delete any template or dataPowerOverride override sections. If the CR contains an override, then you cannot complete the upgrade.
      Note:
      Upgrades to V10.0.8.0: By default, the new PVC that is created for the analytics subsystem's local database backups is set to 150Gi. If you want to specify a larger size, add the following to the CR spec section:
      
        storage:
          backup:
            volumeClaimTemplate:
              storageClassName: <storage class>
              volumeSize: <size>
      where:
        • <storage class> is the storage class to use for the backup PVC.
        • <size> is the size of the backup PVC, for example 250Gi.
      Important: If you need to restart the deployment, wait until all portal sites complete the upgrade.
      After the portal subsystem upgrade is complete, each portal site is upgraded. You can monitor the site upgrade progress from the MESSAGE column in the oc get ptl output. You can still use the portal while sites are upgrading, although a maintenance page is shown for any sites that are being upgraded. When the site upgrades are complete, the oc get ptl output shows how many sites the portal is serving:
      NAME     READY   STATUS    VERSION          RECONCILED VERSION   MESSAGE              AGE
      portal   3/3     Running   <version>        <version>            Serving 2 sites      22h

      On two data center disaster recovery deployments, the sites are not upgraded until both data centers are upgraded.

      Troubleshooting: If the upgrade appears to be stuck, showing the status of Pending for a long time, then check the subsystem CR status for errors:
      oc -n <namespace> get <subsystem cr> -o yaml

      Refer to Troubleshooting upgrade on OpenShift, for known issues.

  8. Upgrade your deployment to the latest version of OpenShift that your current release of API Connect supports.

    For more information, see Supported versions of OpenShift.
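    Before and after the OpenShift upgrade, you can confirm the cluster version and the available update paths from the CLI; for example:

      # Show the current OpenShift version and update status
      oc get clusterversion

      # Show the configured channel and the available updates
      oc adm upgrade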

  9. Optional: If upgrading from 10.0.5.x or 10.0.7.0: Provide a custom endpoint for the Consumer Catalog.

    Skip this step if you are upgrading from 10.0.8.0 or later.

    Beginning with version 10.0.8.0, API Connect includes the Consumer Catalog feature. If your deployment uses cert-manager, then during the upgrade from 10.0.5.x or 10.0.7.0, a new endpoint is added to the CR for the Consumer Catalog. If you use the top-level APIConnectCluster CR, the new endpoint is added to the spec.management section. If you use individual CRs, the new endpoint is added to the management CR.

    The following definition is generated and automatically added to the CR during the upgrade, using a certificate from cert-manager:

      consumerCatalogEndpoint:
        annotations:
          cert-manager.io/issuer: ingress-issuer
        hosts:
        - name: consumer-catalog.<domain>
          secretName: <name>-consumer-catalog-endpoint
    1. Make the following optional changes to the endpoint definition in the CR (a sketch follows these steps):
      • Change the name to a host name of your own choosing.
      • Specify a custom cert-manager certificate.
    2. Save and close the CR to apply your changes.
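    A customized endpoint definition might look like the following sketch, in which the host name, issuer, and secret name are placeholder values that you replace with your own:

      consumerCatalogEndpoint:
        annotations:
          cert-manager.io/issuer: custom-ingress-issuer      # hypothetical custom issuer
        hosts:
        - name: consumer-catalog.mycompany.example.com       # hypothetical custom host name
          secretName: custom-consumer-catalog-cert           # hypothetical custom certificate secret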
  10. Switch from deprecated deployment profiles.

    If you have a top-level CR installation, and were using the n3xc16.m48 or n3xc12.m40 deployment profiles, then switch to the new replacement profiles n3xc16.m64 or n3xc12.m56. For information about switching profiles, see Changing deployment profiles on OpenShift top-level CR.

    If you are using individual subsystem CRs, and your analytics subsystem uses the n3xc4.m16 profile, then update the profile in your analytics CR to n3xc4.m32. For information about switching profiles, see Changing deployment profiles on Kubernetes and OpenShift.
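    In both cases, the switch is made by editing the spec.profile property in the appropriate CR; for example, for a top-level CR that previously used n3xc16.m48, a minimal sketch:

      spec:
        profile: n3xc16.m64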

What to do next

Update your toolkit CLI by downloading it from Fix Central or from the Cloud Manager UI. For more information, see Installing the toolkit.