Management operations not reflected on Gateway or Portal
Publishes or deletes executed on the Management subsystem might not be reflected on the gateway or portal within a few minutes. These problems are often caused by a network failure between the Management subsystem and the other subsystems, or by TLS handshake failures.
- SSH to one of your subsystem VMs as the user 'apicadm', for example:
ssh apicadm@mgmt1.example.com
- Sudo to the root user:
sudo -i
- From here you can run kubectl commands, for example:
# kubectl get pods
NAME                                              READY   STATUS    RESTARTS      AGE
abc-management-analytics-proxy-5f8bcd74c4-jbhch   1/1     Running   1 (97d ago)   113d
...
- API Connect pods run in the default namespace on VMware, so you do not need to specify the namespace with -n in the kubectl commands.
For the Gateway, the first place to check is the processing status page: see Reviewing a gateway's processing status.
On the Management subsystem, check the logs of the taskmanager pods at the time of the publish attempt. These pods send the publish operations to the gateways and portals via webhooks. If there is a network problem you may see socket timeout errors in the logs of these pods, or TLS errors if there is a problem with your certificates. A common problem is load balancers doing TLS termination instead of passthrough (where JWT security is not used; see Using JWT instead of mTLS on Kubernetes and OpenShift, or Using JWT instead of mTLS on OVA).
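For example, one quick way to scan the taskmanager logs for such errors is shown in the following sketch; it assumes the app.kubernetes.io/name=taskmanager label used by the restart commands later in this topic, and the time window and search terms are illustrative only:
kubectl get pods | grep taskmanager
kubectl logs -l app.kubernetes.io/name=taskmanager --all-containers --since=1h | grep -iE 'timeout|tls|certificate'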
If the logs suggest that the webhook was sent successfully, check the gateway director (gwd) logs on the Gateway, and the admin container logs of the www pod on the Portal, for example:
kubectl logs abc-portal-site1-www-0 -c admin
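If the output is large, you can narrow it to the time of the publish attempt and search for errors. This is a sketch using the illustrative pod name from the example above; substitute your own pod name and adjust the time window:
kubectl logs abc-portal-site1-www-0 -c admin --since=2h --timestamps | grep -i error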
The taskmanager pods execute 'send' tasks to deliver publish operations to the gateways and portals. They log messages about the progress of each task, so you should see log messages for a send task in one of the taskmanager pods soon after applying a change on the Management subsystem. A "send running in pod" message like the following example means the send task is running:
2023-05-05 14:02:37.270 management-taskmanager-f5d6fd77-dmnnt taskmanager apim:taskmanager:info:taskProcessor [0d30189c-db39-4260-bb1f-a75a3ac5f1ba] task id: b3933261-5799-473d-9b83-9e29c245fde8 / kind: send running in pod: management-taskmanager-f5d6fd77-dmnnt / containerId: cri-o://d5afe6d3960e333c1ea5d8f95b79564e769bc2ed5de9d8baebebd84b3f3ad4be 269ms after claimed
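To trace a specific publish through the logs, you can search the taskmanager pod for its task id. The pod name and task id below are the illustrative values from the log message above; substitute the values from your own logs:
kubectl logs management-taskmanager-f5d6fd77-dmnnt | grep b3933261-5799-473d-9b83-9e29c245fde8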
If the taskmanager logs an error containing 'dispatching task failed for task' (logged when attempting to start a task), or an error containing 'Stale claimed task' (logged 15 minutes later), API Connect might have a problem running tasks. A small number of these errors might be reported from time to time, and the system recovers automatically. However, if the errors continue to be reported, restart all of the natscluster pods together, and then restart the taskmanager pods, by running the following commands:
kubectl -n <namespace> delete po -l app.kubernetes.io/name=natscluster
kubectl -n <namespace> delete po -l app.kubernetes.io/name=taskmanager
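After the restarts, you can confirm that the replacement pods return to a Running state by listing them with the same label selectors, for example:
kubectl -n <namespace> get po -l app.kubernetes.io/name=natscluster
kubectl -n <namespace> get po -l app.kubernetes.io/name=taskmanager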
When raising a support case, be sure to include logs from both the Management subsystem and the affected Gateway or Portal subsystem, and state when the publish attempt was made.
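For example, one way to capture the logs is to redirect the kubectl logs output to files on each subsystem VM; the pod names below are the illustrative ones used earlier in this topic, and the --timestamps flag makes it easier to correlate the entries with the publish time:
kubectl logs management-taskmanager-f5d6fd77-dmnnt --timestamps > taskmanager.log
kubectl logs abc-portal-site1-www-0 -c admin --timestamps > portal-www-admin.log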