Enabling gateway peering and verifying cluster status

Scale the gateway replicas down to 0 and back up to reset gateway peering.

Before you begin

This task is needed only in certain scenarios, such as when you rotate the root CA; it is not automatically required as part of an upgrade.

About this task

Follow these steps to reset gateway peering. When the ingress issuer changes, the gateway pods must be scaled down and then scaled back up. Expect 5 to 10 minutes of API downtime while the pods restart.

Procedure

  1. After the operand upgrade completes, wait for the Management, Portal, and Analytics subsystems to report a Running status.
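    For example, you can check the subsystem status with the same command that step 5 uses to verify the upgrade:

    kubectl get apic -n <namespace>

    Proceed when the Management, Portal, and Analytics entries report a STATUS of Running.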
  2. Scale down the gateway firmware containers by editing the Gateway CR and setting replicaCount to 0:
    kubectl edit gw <gw-cr-name>

    For example:

    ...
    spec:
      replicaCount: 0
    ...
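
    Optionally, if you prefer a non-interactive command over an editor, a merge patch on the same field should achieve the same result:

    kubectl patch gw <gw-cr-name> -n <namespace> --type merge -p '{"spec":{"replicaCount":0}}'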
  3. Wait for the Gateway firmware pods to scale down and terminate. Ensure that all of the Gateway firmware pods are terminated before you move to the next step.
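    For example, list the pods and repeat the command until no Gateway firmware pods remain (this assumes the pod names include the Gateway CR name):

    kubectl get pods -n <namespace> | grep <gw-cr-name>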
  4. Scale the Gateway firmware containers back up by restoring replicaCount to its original value, or remove the replicaCount field if it was not set before:
    kubectl edit gw <gw-cr-name>
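
    If the replicaCount field was not set before, you can remove it non-interactively with a JSON patch instead of using the editor; for example:

    kubectl patch gw <gw-cr-name> -n <namespace> --type json -p '[{"op": "remove", "path": "/spec/replicaCount"}]'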
  5. All of your operands should now be at the upgraded version. To verify, wait for the cluster status to become Running and for the Reconciled version to show the new version number.

    For example:

    kubectl get apic -n <namespace>
    
    NAME                                                          READY   STATUS    VERSION    RECONCILED VERSION   AGE
    analyticscluster.analytics.apiconnect.example.com/analytics   8/8     Running   10.0.5.7   10.0.5.7-1074        121m

    NAME                                          PHASE     READY   SUMMARY                           VERSION    AGE
    datapowerservice.datapower.example.com/gw1    Running   True    StatefulSet replicas ready: 1/1   10.0.5.7   100m

    NAME                                          PHASE     LAST EVENT   WORK PENDING   WORK IN-PROGRESS   AGE
    datapowermonitor.datapower.example.com/gw1    Running                false          false              100m

    NAME                                                 READY   STATUS    VERSION    RECONCILED VERSION   AGE
    gatewaycluster.gateway.apiconnect.example.com/gw1    2/2     Running   10.0.5.7   10.0.5.7-1074        100m

    NAME                                                      READY   STATUS    VERSION    RECONCILED VERSION   AGE
    managementcluster.management.apiconnect.example.com/m1    16/16   Running   10.0.5.7   10.0.5.7-1074        162m

    NAME                                                  READY   STATUS    VERSION    RECONCILED VERSION   AGE
    portalcluster.portal.apiconnect.example.com/portal    3/3     Running   10.0.5.7   10.0.5.7-1074        139m
    
    Note: Troubleshooting if the Gateway pods are not in sync with Management after an upgrade

    A rare failure has been seen during upgrade, where some API Connect manager tasks are unable to run after the upgrade completes. Check whether the API Connect 'taskmanager' pod logs contain the following error message. The message first appears 15 minutes after the upgrade and repeats every 15 minutes for each stuck task.

    TASK: Stale claimed task set to errored state:
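
    For example, list the taskmanager pods and search their logs for the message (the exact pod names vary by deployment):

    kubectl get pods -n <namespace> | grep taskmanager
    kubectl logs -n <namespace> <taskmanager-pod-name> | grep "Stale claimed task"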

    If these errors are reported, restart all of the management-natscluster pods (for example, management-natscluster-1):

    kubectl -n <namespace> delete pod management-natscluster-1 management-natscluster-2 management-natscluster-3