Enabling gateway peering and verifying cluster status
To complete the operand upgrade, restart the gateway pods to enable gateway peering.
About this task
Follow these steps to enable gateway peering. Because of the ingress issuer changes, the gateway pods must be scaled down and then scaled back up. This process causes 5 to 10 minutes of API downtime.
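Before you begin, it can help to record the current replicaCount on the Gateway CR so that you can restore it later. A minimal check, assuming your Gateway CR is named <gw-cr-name> and deployed in <namespace>:

kubectl get gw <gw-cr-name> -n <namespace> -o jsonpath='{.spec.replicaCount}'

If the command prints nothing, the field is not set, and you can remove it again after the scale-up step.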
Procedure
- After the operand upgrade completes, wait for the Management, Portal, and Analytics subsystems to report Running status.
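  One way to check is to reuse the same kubectl get apic listing that the verification step below relies on:

  kubectl get apic -n <namespace>

  Confirm that the managementcluster, portalcluster, and analyticscluster resources all show STATUS Running before you continue.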
- Scale down the gateway firmware containers by editing the Gateway CR and setting replicaCount to 0:

  kubectl edit gw <gw-cr-name>

  For example:

  ...
  spec:
    replicaCount: 0
  ...
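  If you prefer not to edit the CR interactively, the same change can be sketched as a merge patch; this is an alternative to the documented edit, not a replacement for it:

  kubectl patch gw <gw-cr-name> -n <namespace> --type merge -p '{"spec":{"replicaCount":0}}'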
- Wait for the Gateway firmware pods to scale down and terminate. Ensure that all Gateway firmware pods are terminated before you move to the next step.
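  You can watch the pods drain with a standard kubectl listing, or block until a specific pod is gone; the pod name below is a placeholder for your gateway pod:

  kubectl get pods -n <namespace> --watch

  kubectl wait --for=delete pod/<gateway-pod-name> -n <namespace> --timeout=10m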
- Scale up the Gateway firmware containers back to their original value, or remove the replicaCount field if it was not set before:

  kubectl edit gw <gw-cr-name>
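  As with the scale-down, a non-interactive sketch; the replica value of 3 is only an example, use the value you recorded earlier:

  kubectl patch gw <gw-cr-name> -n <namespace> --type merge -p '{"spec":{"replicaCount":3}}'

  or, to remove the field entirely:

  kubectl patch gw <gw-cr-name> -n <namespace> --type json -p '[{"op":"remove","path":"/spec/replicaCount"}]'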
- Your operands should all have an upgraded version. To verify, wait for the cluster status to become Running and the Reconciled version to show the new version number.

  For example:

  kubectl get apic -n <namespace>

  NAME                                                          READY   STATUS    VERSION    RECONCILED VERSION   AGE
  analyticscluster.analytics.apiconnect.example.com/analytics   8/8     Running   10.0.4.0   10.0.4.0-1074        121m

  NAME                                         PHASE     READY   SUMMARY                           VERSION    AGE
  datapowerservice.datapower.example.com/gw1   Running   True    StatefulSet replicas ready: 1/1   10.0.4.0   100m

  NAME                                         PHASE     LAST EVENT   WORK PENDING   WORK IN-PROGRESS   AGE
  datapowermonitor.datapower.example.com/gw1   Running                false          false              100m

  NAME                                                READY   STATUS    VERSION    RECONCILED VERSION   AGE
  gatewaycluster.gateway.apiconnect.example.com/gw1   2/2     Running   10.0.4.0   10.0.4.0-1074        100m

  NAME                                                      READY   STATUS    VERSION    RECONCILED VERSION   AGE
  managementcluster.management.apiconnect.example.com/m1    16/16   Running   10.0.4.0   10.0.4.0-1074        162m

  NAME                                                 READY   STATUS    VERSION    RECONCILED VERSION   AGE
  portalcluster.portal.apiconnect.example.com/portal   3/3     Running   10.0.4.0   10.0.4.0-1074        139m
Note: Troubleshooting if Gateway pods are not in sync with Management after upgrade

A rare failure has been seen during upgrade where some API Connect manager tasks are unable to run after the upgrade. Check whether the API Connect 'taskmanager' pod logs contain the following error message. It starts about 15 minutes after the upgrade and repeats every 15 minutes for each stuck task:
TASK: Stale claimed task set to errored state:
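One way to scan for the message; the pod name is a placeholder, list the pods with kubectl get pods -n <namespace> to find the taskmanager pod in your deployment:

kubectl logs <taskmanager-pod-name> -n <namespace> | grep "Stale claimed task"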
If these errors are reported, restart all the management-natscluster pods (for example, management-natscluster-1):

kubectl -n <namespace> delete pod management-natscluster-1 management-natscluster-2 management-natscluster-3
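The deleted pods are recreated automatically. A quick sanity check, assuming the pods keep the management-natscluster naming shown above, is to confirm that all three return to Running status:

kubectl get pods -n <namespace> | grep natscluster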