Management operations not reflected on Gateway or Portal

Publishes or deletes executed on the Management subsystem are not reflected on the Gateway or Portal within a few minutes. These problems are often caused by a network failure between the Management subsystem and the other subsystems, or by TLS handshaking failures.
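A quick way to distinguish these two causes is to attempt a TLS handshake from the Management subsystem to the affected endpoint and inspect the certificate that comes back. The following is a sketch only: the hostname gwd.example.com and port 3000 are example values, so substitute your own Gateway or Portal endpoint. A connection failure points to a network problem; an unexpected subject or issuer (for example, the load balancer's own certificate) points to a certificate or TLS-termination problem.

```shell
# Attempt a TLS handshake to the (example) gateway endpoint and print the
# certificate that is actually presented. Replace host and port with your own.
openssl s_client -connect gwd.example.com:3000 -servername gwd.example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates 2>/dev/null \
  || echo "TLS handshake failed - check network connectivity and certificates"
```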

Note: If you are on VMware, some of the following checks involve running kubectl commands on your VMs. To access a VM and run kubectl commands:
  1. SSH to one of your subsystem VMs as the user 'apicadm', for example: ssh apicadm@mgmt1.example.com
  2. Sudo to the root user: sudo -i
  3. From here you can run kubectl commands, for example:
    # kubectl get pods
    NAME                                                              READY   STATUS      RESTARTS      AGE
    abc-management-analytics-proxy-5f8bcd74c4-jbhch                   1/1     Running     1 (97d ago)   113d
    ...
  4. API Connect pods run in the default namespace on VMware, so you do not need to specify the namespace with -n in the kubectl commands.

For the Gateway, the first place to check is the processing status page: see Reviewing a gateway's processing status.

For both Portal and Gateway, check the logs of the taskmanager pods at the time of the publish attempt. These pods send the publish operations to the gateways and portals via webhooks. If there is a network problem, you may see socket timeout errors in the logs of these pods, or TLS errors if there is a problem with your certificates. A common problem is load balancers doing TLS termination instead of passthrough (where JWT security is not used; see Using JWT instead of mTLS on Kubernetes and OpenShift, and Using JWT instead of mTLS on OVA). If the logs suggest that the webhook was sent successfully, then check the gateway director (gwd) logs on the Gateway, and the www pod admin container logs on the Portal, for example:
kubectl logs abc-portal-site1-www-0 -c admin
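To scan the taskmanager logs for these symptoms, you can search recent log output from every taskmanager pod for timeout and TLS errors. This is a sketch: the pod label is taken from the restart commands later in this topic, the namespace is an example, and the grep pattern is only a starting point.

```shell
NS=apic   # replace with your namespace; on VMware the pods run in 'default'

# Search the last two hours of taskmanager logs for socket timeouts and TLS errors
for pod in $(kubectl -n "$NS" get pods -l app.kubernetes.io/name=taskmanager -o name); do
  kubectl -n "$NS" logs "$pod" --since=2h | grep -Ei 'timeout|tls|certificate|socket'
done
```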

The taskmanager pods run 'send' tasks to deliver publish operations to the gateways and portals. The taskmanager pods log messages about the progress of tasks, so you should see messages for a send task in a taskmanager pod soon after applying a change on the Management subsystem. A "send running in pod" message like the following example means the send task is running:

2023-05-05 14:02:37.270 management-taskmanager-f5d6fd77-dmnnt taskmanager apim:taskmanager:info:taskProcessor [0d30189c-db39-4260-bb1f-a75a3ac5f1ba] task id: b3933261-5799-473d-9b83-9e29c245fde8 / kind: send running in pod: management-taskmanager-f5d6fd77-dmnnt / containerId: cri-o://d5afe6d3960e333c1ea5d8f95b79564e769bc2ed5de9d8baebebd84b3f3ad4be 269ms after claimed
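To follow a single publish end to end, you can search all taskmanager pods for the task id from such a message. This is a sketch: the task id shown is the one from the example log message above, and the label and namespace are assumptions to replace with your own values.

```shell
NS=apic                                        # replace with your namespace
TASK_ID=b3933261-5799-473d-9b83-9e29c245fde8   # task id from the example log message

# Print every log line for this task across all taskmanager pods
for pod in $(kubectl -n "$NS" get pods -l app.kubernetes.io/name=taskmanager -o name); do
  kubectl -n "$NS" logs "$pod" | grep "task id: $TASK_ID"
done
```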

If the taskmanager logs an error containing 'dispatching task failed for task' (logged when attempting to start a task), or an error containing 'Stale claimed task' (logged 15 minutes later), API Connect might have a problem running tasks. A small number of these errors might be reported from time to time, and the system recovers automatically. However, if the errors continue to be reported, recover by restarting all the natscluster pods together, and then restarting the taskmanager pods, with the following commands:

kubectl -n <namespace> delete po -l app.kubernetes.io/name=natscluster 
kubectl -n <namespace> delete po -l app.kubernetes.io/name=taskmanager
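After deleting the pods, confirm that the replacements come back up before retrying the publish. The helper below is a sketch that checks a kubectl pod listing for pods that are Running and fully Ready; the namespace is an example value.

```shell
NS=apic   # replace with your namespace; on VMware the pods run in 'default'

# Succeeds only if every pod in the listing is Running and fully Ready (e.g. 1/1)
all_running() {
  awk 'NR > 1 { split($2, r, "/"); if ($3 != "Running" || r[1] != r[2]) bad = 1 } END { exit bad }'
}

kubectl -n "$NS" get pods -l app.kubernetes.io/name=taskmanager | all_running \
  && echo "taskmanager pods recovered" \
  || echo "taskmanager pods still restarting"
```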

When raising a support case, be sure to include logs from both the Management subsystem and the affected Gateway or Portal subsystem, and state when the publish attempt was made.