Secrets constantly created by Platform API operator
Symptoms
You experience the following symptoms:
- secret-watcher pod in OOMKilled/CrashLoopBackOff state.
- cert-manager-cainjector pod in CrashLoopBackOff state.
- cert-manager-cainjector events show liveness and readiness probes that failed.
Cause
Starting from foundational services version 3.8.0, a known issue exists where secrets are constantly being created by the Platform API operator. These secrets have the names ibm-platform-api-operand-token-* and ibm-platform-api-operand-dockercfg-*.
The behavior of this issue is similar to that of a memory leak because secrets are constantly created, but never deleted. Eventually thousands of unnecessary secrets are created, overwhelming applications that watch secrets. This situation impacts the performance of the cluster overall.
The root cause is the operator-sdk version that is used by the Platform API operator. It unnecessarily causes the operator to attempt to upgrade the operand. Each of these empty upgrades causes a secret to be created.
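To gauge how far the leak has progressed, it can help to count the secrets that match the two name patterns above. The following is a minimal sketch, not part of the documented procedure: the helper name count_operand_secrets is hypothetical, and it assumes input in the default table layout of oc get secrets --no-headers, where the secret name is the first column.

```shell
# count_operand_secrets (hypothetical helper):
# reads `oc get secrets --no-headers` style output on stdin and prints
# how many secret names start with each of the two leaked prefixes.
count_operand_secrets() {
  awk '
    $1 ~ /^ibm-platform-api-operand-token-/     { token++ }
    $1 ~ /^ibm-platform-api-operand-dockercfg-/ { dockercfg++ }
    END { printf "token=%d dockercfg=%d\n", token, dockercfg }
  '
}

# Assumed usage against a live cluster (namespace taken from the commands in this topic):
#   oc get secrets -n ibm-common-services --no-headers | count_operand_secrets
```

A healthy cluster would be expected to report only a handful of matches; counts in the hundreds or thousands suggest that the issue described here is present.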
Solution
This issue is fixed in foundational services version 3.10.0.
If you cannot upgrade to foundational services version 3.10.0, complete the following steps to work around the problem.
- Scale the ibm-platform-api-operator deployment from 1 to 0 pods. The platform-api operand will still function properly. Run the following command:
oc scale deployment -n ibm-common-services ibm-platform-api-operator --replicas=0
- Save the name of the service account secret that the Platform API operand uses by running the following command:
export SECRET_NAME=$(oc get pods -n ibm-common-services -o name | grep "pod/platform-api-" | xargs oc get -n ibm-common-services -o yaml | grep ibm-platform-api-operand-token | head -1 | awk '{print $2}')
The secret name should resemble ibm-platform-api-operand-token-vs2t5.
- (Optional) To back up the secret, run the following command:
oc get secret "$SECRET_NAME" -n ibm-common-services -o yaml > backup-ibm-platform-api-operand-token.yaml
- Delete all extra secrets with the following command:
oc get secrets -n ibm-common-services | grep ibm-platform-api-operand-token | grep -v "${SECRET_NAME}" | cut -d ' ' -f1 | xargs oc delete secret -n ibm-common-services
The command performs the following actions:
  - Get all secrets from the namespace where foundational services is deployed.
  - Filter the list of secrets only for ones where the name contains ibm-platform-api-operand-token.
  - Filter out the secret that you saved.
  - Filter out extra columns to get only the names of the secrets.
  - Delete the secrets.
Deleting the extra secrets can take a long time because thousands of secrets can exist. If the command times out because of the large number of secrets, run it again until all of the extra secrets are deleted.
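Because a single delete call over thousands of names can time out, one possible variation is to select the names first and then delete them in smaller batches. The following is a sketch under stated assumptions rather than the documented procedure: the helper name list_extra_token_secrets and the batch size of 100 are illustrative choices. Unlike the grep-based command, it matches the name prefix exactly, compares the saved name for equality, and refuses to run when no saved name is passed.

```shell
# list_extra_token_secrets KEEP (hypothetical helper):
# reads `oc get secrets --no-headers` style output on stdin and prints every
# ibm-platform-api-operand-token-* secret name except the one named KEEP.
list_extra_token_secrets() {
  keep="$1"
  # Guard: without a saved name, the secret in use could not be excluded.
  if [ -z "$keep" ]; then
    echo "refusing to continue: no secret name to keep was given" >&2
    return 1
  fi
  awk -v keep="$keep" \
    '$1 ~ /^ibm-platform-api-operand-token-/ && $1 != keep { print $1 }'
}

# Assumed usage: pass at most 100 names to each `oc delete` call.
#   oc get secrets -n ibm-common-services --no-headers \
#     | list_extra_token_secrets "$SECRET_NAME" \
#     | xargs -n 100 oc delete secret -n ibm-common-services
```

Running the selection step alone, without the final xargs stage, prints the names that would be deleted, which gives a chance to review the list first.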
This issue is fixed in foundational services version 3.10.0. After you upgrade to version 3.10.0 or later, be sure to scale the Platform API operator back up to 1 pod. You can use the following command to achieve this goal:
oc scale deployment -n ibm-common-services ibm-platform-api-operator --replicas=1