Scale the gateway replicas to 0 and back up to reset gateway peering.
Before you begin
This task is only needed in certain scenarios, such as when you rotate the root CA, and is not automatically required for an upgrade.
About this task
Follow these steps to reset gateway peering. When the ingress issuer changes, the gateway pods must be scaled down and then back up. This process causes 5 to 10 minutes of API downtime.
Procedure
- After the operand upgrade completes, wait for the Management, Portal, and Analytics subsystems to report Running status.
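To check the subsystem status, you can use the same kubectl get apic command that is shown in the verification step at the end of this procedure:
kubectl get apic -n <namespace>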
- Scale down the gateway firmware containers by editing the Gateway CR and setting replicaCount to 0:
kubectl edit gw <gw-cr-name>
For example:
...
spec:
  replicaCount: 0
...
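If you prefer not to edit the CR interactively, a merge patch can make the same change. This is a sketch, assuming your Gateway CR exposes spec.replicaCount as shown above and that <namespace> is the namespace of the Gateway CR:
kubectl patch gw <gw-cr-name> -n <namespace> --type merge -p '{"spec":{"replicaCount":0}}'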
- Wait for the Gateway firmware pods to scale down and terminate. Ensure that all Gateway firmware pods are terminated before moving to the next step.
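To confirm that the pods have terminated, you can list the pods in the namespace and check that no gateway firmware pods remain. This assumes the gateway pod names contain the Gateway CR name, which is typical for StatefulSet-managed pods:
kubectl get pods -n <namespace> | grep <gw-cr-name>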
- Scale the Gateway firmware containers back up to their original value, or remove the replicaCount field if it was not set before:
kubectl edit gw <gw-cr-name>
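As with the scale-down step, a patch can be used instead of an interactive edit. The first command is a sketch that assumes the original value was 3 (substitute your own value); the second removes the replicaCount field entirely if it was not set before:
kubectl patch gw <gw-cr-name> -n <namespace> --type merge -p '{"spec":{"replicaCount":3}}'
kubectl patch gw <gw-cr-name> -n <namespace> --type json -p '[{"op":"remove","path":"/spec/replicaCount"}]'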
- Your operands should all report the upgraded version. To verify, wait for the cluster status to become Running and for the Reconciled version to show the new version number. For example:
kubectl get apic -n <namespace>
NAME                                                          READY   STATUS    VERSION    RECONCILED VERSION   AGE
analyticscluster.analytics.apiconnect.example.com/analytics   8/8     Running   10.0.5.7   10.0.5.7-1074        121m

NAME                                          PHASE     READY   SUMMARY                           VERSION    AGE
datapowerservice.datapower.example.com/gw1    Running   True    StatefulSet replicas ready: 1/1   10.0.5.7   100m

NAME                                          PHASE     LAST EVENT   WORK PENDING   WORK IN-PROGRESS   AGE
datapowermonitor.datapower.example.com/gw1    Running                false          false              100m

NAME                                                READY   STATUS    VERSION    RECONCILED VERSION   AGE
gatewaycluster.gateway.apiconnect.example.com/gw1   2/2     Running   10.0.5.7   10.0.5.7-1074        100m

NAME                                                      READY   STATUS    VERSION    RECONCILED VERSION   AGE
managementcluster.management.apiconnect.example.com/m1    16/16   Running   10.0.5.7   10.0.5.7-1074        162m

NAME                                                 READY   STATUS    VERSION    RECONCILED VERSION   AGE
portalcluster.portal.apiconnect.example.com/portal   3/3     Running   10.0.5.7   10.0.5.7-1074        139m
Note: Troubleshooting when Gateway pods are not in sync with Management after an upgrade
A rare failure has been seen during upgrade where some API Connect manager tasks are unable to run after the upgrade. Check whether the API Connect taskmanager pod logs contain the following error message. It first appears about 15 minutes after the upgrade and repeats every 15 minutes for each stuck task:
TASK: Stale claimed task set to errored state:
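One way to check for this message, assuming the taskmanager pod names contain taskmanager (substitute the actual pod name from kubectl get pods):
kubectl -n <namespace> logs <taskmanager-pod-name> | grep "Stale claimed task"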
If these errors are reported, restart all of the management-natscluster pods (for example, management-natscluster-1):
kubectl -n <namespace> delete pod management-natscluster-1 management-natscluster-2 management-natscluster-3
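After deleting the pods, you can confirm that they are recreated and return to Running status. This check assumes the replacement pods keep the management-natscluster prefix:
kubectl -n <namespace> get pods | grep management-natscluster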