Troubleshooting installation on OpenShift
Review the following troubleshooting tips if you encounter a problem while installing or upgrading API Connect on OpenShift, including as a component of IBM Cloud Pak for Integration (CP4I).
- One or more pods are in CrashLoopBackOff or Error state and report a certificate error in the logs
- You see the denied: insufficient scope error during an air-gapped deployment
- Apiconnect operator crashes
- Disabling the Portal web endpoint check
Note: In the Help page of the Cloud Manager, API Manager, and API Designer user interfaces, there's a Product information tile that you can click to find out information about your product versions, as well as Git information about the package versions being used. Note that the API Designer product information is based on its associated management server, but the Git information is based on where it was downloaded from.
One or more pods are in CrashLoopBackOff or Error state and report a certificate error in the logs
In rare cases, cert-manager might detect a certificate in a bad state right after it has been
issued, and then re-issue the certificate. If a CA certificate has been issued twice, any
certificate that was signed by the previously issued CA is left stale and cannot be validated by
the newly issued CA. In this scenario, one of the following messages appears in the log:
- javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
- Error: unable to verify the first certificate
- ERROR: openssl verify failed to verify the Portal CA tls.crt, ca.crt chain signed the Portal Server tls.crt cert
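To look for these messages in a pod's logs, a small filter can help. The following is a minimal sketch, not part of the product: the function name find_cert_errors is hypothetical, and the patterns are taken from the three messages listed above.

```shell
# Hypothetical helper: reads log text on stdin and prints only the lines
# that contain one of the three certificate error messages listed above.
# Returns non-zero when no such line is found (grep's usual behavior).
find_cert_errors() {
  grep -E 'certificate_unknown|unable to verify the first certificate|openssl verify failed to verify'
}
```

For example, pipe the logs of a failing pod through it: oc logs <pod> -n <namespace> | find_cert_errors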
Resolve the problem by completing the following steps:
- Use apicops (v10 version 0.10.57+ required) to validate the certificates in the system:
  apicops upgrade:stale-certs -n <namespace>
- If any certificate that is managed by cert-manager fails the validation, delete the stale certificate secret:
  oc delete secret <stale-secret> -n <namespace>
  Cert-manager automatically generates a new certificate to replace the one you deleted.
- Use apicops to make sure all certificates can be verified successfully:
  apicops upgrade:stale-certs -n <namespace>
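The delete and re-check steps above can be combined in one helper. This is a sketch under stated assumptions: the function name replace_stale_secret is hypothetical, and it uses only the two commands shown in the steps. It does not wait for cert-manager, so the re-check might need to be repeated if the new certificate is not yet issued.

```shell
# Hypothetical helper wrapping the delete and re-validate steps above.
# Usage: replace_stale_secret <namespace> <stale-secret>
replace_stale_secret() {
  namespace="$1"
  stale_secret="$2"
  # Delete the stale secret; cert-manager generates a replacement certificate.
  oc delete secret "$stale_secret" -n "$namespace" || return 1
  # Validate the certificates again with apicops.
  apicops upgrade:stale-certs -n "$namespace"
}
```

Run the first validation step on its own beforehand to find out which secret, if any, is stale.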
You see the denied: insufficient scope error during an air-gapped deployment
Problem: You encounter the denied: insufficient scope message while mirroring images during an air-gapped installation.
Reason: This error occurs when there is a problem with the entitlement key that is used to obtain the images.
Solution: Obtain a new entitlement key by completing the following steps:
- Log in to the IBM Container Library.
- In the Container software library, select Get entitlement key.
- After the Access your container software heading, click Copy key.
- Copy the key to a safe location.
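Before you retry the mirroring, you can check that the registry accepts the new key. The following is a sketch only: the function name check_entitlement_key is hypothetical, and the registry host cp.icr.io, the user name cp, and the use of podman are assumptions about your environment rather than details from this page. A successful login shows that the key is accepted; it does not prove that the key is entitled to every image.

```shell
# Hypothetical helper: log in to the entitled registry with the new key.
# Assumptions: registry cp.icr.io, user name "cp", podman as the client.
# Usage: check_entitlement_key <entitlement-key>
check_entitlement_key() {
  key="$1"
  [ -n "$key" ] || { echo "usage: check_entitlement_key <entitlement-key>" >&2; return 2; }
  # Pass the key on stdin so that it does not appear in the process list.
  printf '%s' "$key" | podman login cp.icr.io --username cp --password-stdin
}
```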
Apiconnect operator crashes
Problem: During installation, the Apiconnect operator crashes with the following message:
panic: unable to build API support: unable to get Group and Resources: unable to retrieve the complete list of server APIs: packages.operators.coreos.com/v1: the server is currently unable to handle the request
goroutine 1 [running]:
github.ibm.com/velox/apiconnect-operator/operator-utils/v2/apiversions.GetAPISupport(0x0)
operator-utils/v2/apiversions/api-versions.go:89 +0x1e5
main.main()
ibm-apiconnect/cmd/manager/main.go:188 +0x4ee
Additional symptoms:
- Apiconnect operator is in CrashLoopBackOff status
- Kube apiserver pods log the following information:
  E1122 18:02:07.853093 18 available_controller.go:437] v1.packages.operators.coreos.com failed with: failing or missing response from https://10.128.0.3:5443/apis/packages.operators.coreos.com/v1: bad status from https://10.128.0.3:5443/apis/packages.operators.coreos.com/v1: 401
  The IP logged here belongs to the package server pod present in the openshift-operator-lifecycle-manager namespace.
- Package server pods log the following (the /apis/packages.operators.coreos.com/v1 API call is being rejected with 401):
  E1122 18:10:25.614179 1 authentication.go:53] Unable to authenticate the request due to an error: x509: certificate signed by unknown authority
  I1122 18:10:25.614224 1 httplog.go:90] verb="GET" URI="/apis/packages.operators.coreos.com/v1" latency=161.243µs resp=401 UserAgent="Go-http-client/2.0" srcIP="10.128.0.1:41370":
- Problem is intermittent
Solution:
- If you find the exact symptoms as described, the solution is to delete package server pods in the openshift-operator-lifecycle-manager namespace.
- New package server pods will log the 200 Success message for the same API call.
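The solution above can be sketched as a short helper. The function name restart_package_server is hypothetical, and the label selector app=packageserver is an assumption about how the package server pods are labeled in your cluster; confirm it with oc get pods -n openshift-operator-lifecycle-manager --show-labels before you rely on it. The second command displays the APIService that the kube apiserver log reports as failing.

```shell
# Hypothetical helper: delete the package server pods so that they are
# re-created, then show the status of the packages APIService.
restart_package_server() {
  ns=openshift-operator-lifecycle-manager
  # Assumption: the package server pods carry the label app=packageserver.
  oc delete pods -n "$ns" -l app=packageserver || return 1
  # v1.packages.operators.coreos.com is the APIService named in the
  # kube apiserver log; check its availability after the pods restart.
  oc get apiservice v1.packages.operators.coreos.com
}
```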
Disabling the Portal web endpoint check
When you create or register a Developer Portal service, the Portal subsystem checks that the Portal web endpoint is accessible. However, the endpoint sometimes cannot be reached, for example because of the complexity of public and private networks.
The following example shows the errors that you might see in the admin container logs of the portal-www pod if the endpoint cannot be reached:
An error occurred contacting the provided portal web endpoint: example.com
The provided Portal web endpoint example.com returned HTTP status code 504
In this instance, you can disable the Portal web endpoint check so that the Developer Portal service can be created successfully. To disable the endpoint check, complete the following update:
- On Kubernetes, OpenShift, and IBM® Cloud Pak for Integration
- Add the following section to the Portal custom resource (CR) template:
spec:
  template:
  - containers:
    - env:
      - name: PORTAL_SKIP_WEB_ENDPOINT_VALIDATION
        value: "true"
      name: admin
    name: www
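Before you disable the check, it can be worth confirming that the endpoint errors shown earlier are what the admin container reports. The following is a minimal sketch: the function name portal_endpoint_errors is hypothetical, and the pod name must be taken from your own cluster.

```shell
# Hypothetical helper: print the Portal web endpoint errors from the
# admin container of a portal-www pod.
# Usage: portal_endpoint_errors <namespace> <portal-www-pod>
portal_endpoint_errors() {
  namespace="$1"
  pod="$2"
  # Case-insensitive match, because the two example messages capitalize
  # "Portal web endpoint" differently.
  oc logs "$pod" -c admin -n "$namespace" | grep -i 'portal web endpoint'
}
```

If the function prints nothing, the creation failure probably has another cause, and disabling the endpoint check is unlikely to help.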