Troubleshooting installation on OpenShift
Review the following known issues and troubleshooting tips if you encounter a problem while installing API Connect on OpenShift, including as a component of IBM Cloud Pak for Integration (CP4I).
Failed integration-ibm-cloud-native-postgresql CatalogSource on ROKS 4.14 and OpenShift Container Platform 4.15
The API Connect operator creates the EDB catalog source in the same namespace as the API Connect operator. The catalog source can fail with a status similar to the following:
Status:
Message: couldn't ensure registry server - error ensuring pod: : error creating new pod: integration-ibm-cloud-native-postgresql-: pods "integration-ibm-cloud-native-postgresql-hnjbn" is forbidden: violates PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "registry-server" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "registry-server" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "registry-server" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "registry-server" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Reason: RegistryServerError
The problem occurs when the namespace is set to enforce the restricted pod security admission policy with the pod-security.kubernetes.io/enforce: restricted label. ROKS 4.14 and some OpenShift Container Platform versions, such as 4.15, have enforce set to restricted.
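To confirm that the namespace enforces the restricted policy, the label can be read directly. The following is a minimal sketch, assuming the oc CLI is logged in to the cluster; the function name psa_enforce_level is made up for this example and is not part of any product tooling.

```shell
# Hypothetical helper: print the pod security "enforce" level set on a namespace.
# Prints nothing if the label is not set.
psa_enforce_level() {
  # Dots inside the label key are escaped so that jsonpath treats it as one key.
  oc get namespace "$1" \
    -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}'
}

# Example (replace <namespace> with the API Connect operator namespace):
#   psa_enforce_level <namespace>
```

If the helper prints restricted, the namespace matches the condition described above.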
To resolve the problem, patch the integration-ibm-cloud-native-postgresql CatalogSource to use the restricted security context constraint:
oc patch CatalogSource integration-ibm-cloud-native-postgresql --type merge --patch '{"spec":{"grpcPodConfig":{"securityContextConfig":"restricted"}}}'
One or more pods in CrashLoopBackOff or Error state, and report a certificate error in the logs
The logs might contain errors such as the following:
- javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
- Error: unable to verify the first certificate
- ERROR: openssl verify failed to verify the Portal CA tls.crt, ca.crt chain signed the Portal Server tls.crt cert
To resolve the problem, complete the following steps:
- Use apicops (v10 version 0.10.57+ required) to validate the certificates in the system:
  apicops upgrade:stale-certs -n <namespace>
- If any certificate that is managed by cert-manager fails the validation, delete the stale certificate secret:
  oc delete secret <stale-secret> -n <namespace>
  Cert-manager automatically generates a new certificate to replace the one you deleted.
- Use apicops to make sure all certificates can be verified successfully:
  apicops upgrade:stale-certs -n <namespace>
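Before deleting a secret, it can help to look at the certificate it holds. The following is a sketch only, assuming the secret stores its certificate under the usual tls.crt key and that oc, base64, and openssl are available on the workstation; show_secret_cert is a hypothetical name chosen for this example.

```shell
# Hypothetical helper: print subject, issuer, and expiry of the certificate in a TLS secret.
# Usage: show_secret_cert <secret-name> <namespace>
show_secret_cert() {
  # Read the base64-encoded certificate from the secret, decode it,
  # and let openssl print the fields of interest.
  oc get secret "$1" -n "$2" -o jsonpath='{.data.tls\.crt}' \
    | base64 -d \
    | openssl x509 -noout -subject -issuer -enddate
}

# Example:
#   show_secret_cert <stale-secret> <namespace>
```

This only inspects the certificate; it does not replace the apicops validation in the steps above.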
You see the denied: insufficient scope error during an air-gapped deployment
Problem: You encounter the denied: insufficient scope message while mirroring images during an air-gapped installation or upgrade.
Reason: This error occurs when a problem is encountered with the entitlement key that is used for obtaining images.
Solution: Obtain a new entitlement key by completing the following steps:
- Log in to the IBM Container Library.
- In the Container software library, select Get entitlement key.
- After the Access your container software heading, click Copy key.
- Copy the key to a safe location.
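After obtaining a new key, a quick login against the entitled registry can show whether the key is accepted before mirroring is retried. The following sketch assumes podman is installed, that the entitled registry is cp.icr.io with the user name cp (IBM's general convention, not stated in this section), and that the key is held in an environment variable named ENTITLEMENT_KEY, a name chosen for this example.

```shell
# Hypothetical helper: try to log in to the entitled registry with the new key.
# The key is passed on stdin so that it does not appear in the process list.
verify_entitlement_key() {
  printf '%s' "$ENTITLEMENT_KEY" \
    | podman login cp.icr.io --username cp --password-stdin
}

# Example:
#   ENTITLEMENT_KEY='<entitlement-key>' verify_entitlement_key
```

A successful login suggests that the key itself is valid; the mirroring procedure might still need its own authentication file updated with the new key.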
Apiconnect operator pod fails
Problem: During installation (or upgrade), the apiconnect operator fails with the following message:
panic: unable to build API support: unable to get Group and Resources: unable to retrieve the complete list of server APIs: packages.operators.coreos.com/v1: the server is currently unable to handle the request
goroutine 1 [running]:
github.ibm.com/velox/apiconnect-operator/operator-utils/v2/apiversions.GetAPISupport(0x0)
operator-utils/v2/apiversions/api-versions.go:89 +0x1e5
main.main()
ibm-apiconnect/cmd/manager/main.go:188 +0x4ee
- The apiconnect operator pod is in CrashLoopBackOff status.
- Kube apiserver pods log the following information:
  E1122 18:02:07.853093 18 available_controller.go:437] v1.packages.operators.coreos.com failed with: failing or missing response from https://10.128.0.3:5443/apis/packages.operators.coreos.com/v1: bad status from https://10.128.0.3:5443/apis/packages.operators.coreos.com/v1: 401
- The IP logged here belongs to the package server pod present in the openshift-operator-lifecycle-manager namespace.
- Package server pods log the following error message, which shows that the /apis/packages.operators.coreos.com/v1 API call is rejected with a 401 response:
  E1122 18:10:25.614179 1 authentication.go:53] Unable to authenticate the request due to an error: x509: certificate signed by unknown authority
  I1122 18:10:25.614224 1 httplog.go:90] verb="GET" URI="/apis/packages.operators.coreos.com/v1" latency=161.243µs resp=401 UserAgent="Go-http-client/2.0" srcIP="10.128.0.1:41370":
- The problem is intermittent.
- If you find the exact symptoms as described, the solution is to delete the package server pods in the openshift-operator-lifecycle-manager namespace.
- New package server pods log the 200 Success message for the same API call.
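The check and the pod deletion described above can be sketched as two small helpers. This assumes the oc CLI is logged in with sufficient privileges and that the package server pods carry the label app=packageserver; that label and the function names are assumptions made for this example, so confirm the label on your cluster before deleting anything.

```shell
OLM_NS=openshift-operator-lifecycle-manager

# Show whether the aggregated packages API is currently reported as available.
check_packages_api() {
  oc get apiservice v1.packages.operators.coreos.com
}

# Delete the package server pods so that new pods are created in their place.
# Assumes the pods are labeled app=packageserver.
restart_packageserver() {
  oc delete pod -n "$OLM_NS" -l app=packageserver
}

# Example:
#   check_packages_api
#   restart_packageserver
```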
Disabling the Portal web endpoint check
The following errors can appear in the portal-www pod, admin container logs, if the endpoint cannot be reached:
An error occurred contacting the provided portal web endpoint: example.com
The provided Portal web endpoint example.com returned HTTP status code 504
In this instance, you can disable the Portal web endpoint check so that the Developer Portal service can be created successfully.
- On Kubernetes, OpenShift, and IBM® Cloud Pak for Integration, add the following section to the Portal custom resource (CR) template:
  spec:
    template:
    - containers:
      - env:
        - name: PORTAL_SKIP_WEB_ENDPOINT_VALIDATION
          value: "true"
        name: admin
      name: www
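To see which status code the endpoint returns from your own workstation, a request can be sent with curl. The following is a minimal sketch, assuming curl is installed and the endpoint is served over HTTPS; portal_endpoint_status is a hypothetical name, and certificate verification is skipped for diagnosis only.

```shell
# Hypothetical helper: print only the HTTP status code returned by an endpoint.
# Usage: portal_endpoint_status <portal-web-endpoint-host>
portal_endpoint_status() {
  # --insecure skips certificate verification; use it for diagnosis only.
  curl --silent --insecure --output /dev/null \
    --write-out '%{http_code}' "https://$1"
}

# Example:
#   portal_endpoint_status example.com
```

The result from a workstation can differ from what the portal-www pod sees, because the request takes a different network path.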
Enabling 2DCDR in API Agent
The issue arises from enabling API Agent on the passive node before the active node. This causes the passive node to attempt to create the database, which fails because the database does not exist.
- Disable API Agent on the passive node.
- Remove both api-agent jobs.
- Enable API Agent on the active node, and wait for the active node's management to be ready.
- Enable API Agent on the passive node.
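The job removal step above can be sketched as follows. This assumes the oc CLI is logged in and that the jobs to remove have api-agent in their names; the function names are hypothetical, and the sketch does not cover how API Agent is enabled or disabled.

```shell
# Hypothetical helper: list jobs in a namespace whose names contain "api-agent".
# Usage: list_api_agent_jobs <namespace>
list_api_agent_jobs() {
  # -o name prints one resource per line in the form job.batch/<name>.
  oc get jobs -n "$1" -o name | grep 'api-agent'
}

# Hypothetical helper: delete every job found by list_api_agent_jobs.
# Usage: delete_api_agent_jobs <namespace>
delete_api_agent_jobs() {
  for job in $(list_api_agent_jobs "$1"); do
    oc delete -n "$1" "$job"
  done
}

# Example:
#   list_api_agent_jobs <namespace>
#   delete_api_agent_jobs <namespace>
```

Running the list helper first makes it possible to check that only the two expected jobs match before deleting them.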