Troubleshooting installation issues
Troubleshooting IBM Fusion HCI System installation issues.
Failed to pull SCA certs error on OpenShift Container Platform console
- Problem statement
- The following error might appear in the output of the oc get co command or on the OpenShift® Container Platform console:
Failed to pull SCA certs from https://api.openshift.com/api/accounts_mgmt/v1/certificates: OCM API https://api.openshift.com/api/accounts_mgmt/v1/certificates returned HTTP 500: {"code":"ACCT-MGMT-9","href":"/api/accounts_mgmt/v1/errors/9","id":"9","kind":"Error","operation_id":"316316c3-771a-4bb1-b6f2-3a52084fcbd1","reason":"400 Bad Request"}
- Resolution
- For the resolution, see the Red Hat Customer Portal.
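To see which cluster operator reports the error, the output of oc get co can be filtered for the error text. The following is a minimal sketch, assuming a logged-in oc session; the helper name sca_error_rows is hypothetical and not part of any product tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `oc get co` output on stdin and prints the
# header row plus every row that mentions the SCA certificate error.
sca_error_rows() {
  awk 'NR == 1 || /Failed to pull SCA certs/'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get co | sca_error_rows
```

If only the header row is printed, no operator currently reports the error.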
Empty node list in Network precheck wizard
- Problem statement
- The Network validation wizard page (Network setup stage 1) of the IBM Fusion HCI System installation can show an empty node list with the Finish button enabled. The InlineNotification status might also show a Connection complete! status with a green checkmark, which suggests that you can proceed to the next step.
Similarly, the Network precheck wizard page (Red Hat OpenShift installation stage 2) of IBM Fusion HCI System might show an empty node list with the Next button enabled.
- Resolution
- If you run into this scenario, confirm that the nodes are connected before you proceed:
- Check the response of the endpoint https://<host IP address>:3000/api/v1/verifyDHCP and find the failed nodes from the response.
- Manually verify the configuration of each failed node in DHCP and DNS. The IP addresses of the nodes can either be incorrect or not reachable.
- Fix the issue and restart Network setup (stage 1) installation or Red Hat OpenShift installation (stage 2).
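The checks above can be sketched in shell. This is a rough illustration and not the installer's own tooling: the helper names are hypothetical, the --insecure flag assumes that the endpoint presents a self-signed certificate, and the response format is not described here, so the response is printed unmodified.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the verifyDHCP URL for a given host IP address.
verify_dhcp_url() {
  printf 'https://%s:3000/api/v1/verifyDHCP\n' "$1"
}

# Hypothetical helper: fetch the raw response so that failed nodes can be
# read from it. --insecure is an assumption (self-signed certificate);
# drop it if the endpoint has a trusted certificate.
fetch_verify_dhcp() {
  curl --silent --show-error --insecure "$(verify_dhcp_url "$1")"
}

# Hypothetical helper: check that a node name resolves and answers one ping.
check_node() {
  local node="$1"
  getent hosts "$node" || { echo "DNS lookup failed: $node" >&2; return 1; }
  ping -c 1 -W 2 "$node" >/dev/null || { echo "Not reachable: $node" >&2; return 1; }
}
```

For example, run fetch_verify_dhcp with the host IP address, then run check_node for each node name that the response reports as failed.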
Nodes added as local host or local domain to Red Hat OpenShift Container Platform cluster
- Resolution
- If the Red Hat® OpenShift installation fails, retry the OpenShift installation wizard. If the problem persists, contact IBM Support.
ISF node exporter pod in container creating state
- Problem statement
- You might encounter the ISF node exporter pod in container creating state at the end of stage 2 installation. An example error log for a pod in container creating state:
2022-05-16T12:06:07.538Z ERROR controller-runtime.manager.controller.node Reconciler error {"reconciler group": "storage.isf.ibm.com", "reconciler kind": "Node", "name": "isf-node-exporter", "namespace": "openshift-monitoring", "error": "pod is not in ready state isf-node-exporter-5bc7bb6587-vcr2n Skipping the installation process"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214
- Resolution
- Ignore this message and proceed with the next steps, because the error resolves by itself.
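To confirm that the pod eventually leaves the container creating state, its status can be checked in the openshift-monitoring namespace that the log entry names. A minimal sketch follows; the helper name is hypothetical, and it assumes the default oc get pods column layout (NAME, READY, STATUS, and so on).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `oc get pods` output on stdin and prints the
# names of isf-node-exporter pods whose STATUS column is ContainerCreating.
pending_exporter_pods() {
  awk '$1 ~ /^isf-node-exporter/ && $3 == "ContainerCreating" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get pods -n openshift-monitoring | pending_exporter_pods
```

An empty result means that no ISF node exporter pod is still in the container creating state.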
ImagePull failure during an installation
- Problem statement
- The ImagePull failure can occur due to intermittent network or registry issues.
- Resolution
- If an ImagePull failure occurs due to an intermittent network or registry issue during IBM Fusion HCI installation, restart the pod and retry. If the issue persists, contact IBM Support.
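One common way to restart a pod that a controller manages is to delete it so that the controller re-creates it. The sketch below only prints the oc delete pod commands for pods in an image pull error state, so that they can be reviewed before anything is run. The helper name is hypothetical, and it assumes the column layout of oc get pods -A (NAMESPACE, NAME, READY, STATUS, and so on).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `oc get pods -A` output on stdin and prints one
# `oc delete pod` command per pod whose STATUS ends in ImagePullBackOff or
# ErrImagePull. Nothing is deleted; the commands are only printed.
image_pull_restart_cmds() {
  awk '$4 ~ /(ImagePullBackOff|ErrImagePull)$/ {
    printf "oc delete pod %s -n %s\n", $2, $1
  }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get pods -A | image_pull_restart_cmds
```

Printing the commands instead of running them keeps the step reviewable; a standalone pod without a controller is not re-created after deletion.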
Pull a container image from the registry.connect.redhat.com
- Problem statement
- If you pull a container image from registry.connect.redhat.com, the request is redirected to AWS S3. It is a known issue in Red Hat.
- Diagnostic steps
- To verify the error message, do the following steps:
- Log in to Red Hat OpenShift Container Platform control node.
- Use the following commands to manually pull an image from registry.connect.redhat.com:
$ podman login registry.connect.redhat.com
$ podman pull registry.connect.redhat.com/seldonio/seldon-core-operator-bundle:latest
Note: An example image is used for illustration.
- Resolution
- A portion of the content on registry.connect.redhat.com is hosted by using the following AWS S3 bucket: rhc4tp-prod-z8cxf-image-registry-us-east-1-evenkyleffocxqvofrk.s3.dualstack.us-east-1.amazonaws.com. Allow this domain in your firewall so that OpenShift Container Platform can access it.
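After the firewall rule is in place, reachability of the bucket domain can be checked from a control node. The following is a minimal sketch, assuming that curl is available on the node; the helper name is hypothetical. Any HTTP status code in the output, even an error status, indicates that the connection itself is allowed.

```shell
#!/usr/bin/env bash
# Bucket domain taken from the resolution text above.
S3_BUCKET_HOST="rhc4tp-prod-z8cxf-image-registry-us-east-1-evenkyleffocxqvofrk.s3.dualstack.us-east-1.amazonaws.com"

# Hypothetical helper: prints the HTTP status code returned by the host.
# curl exits with a non-zero status if the connection is blocked or times out.
can_reach_host() {
  curl --silent --output /dev/null --max-time 10 \
    --write-out '%{http_code}\n' "https://$1"
}

# Usage from a control node:
#   can_reach_host "$S3_BUCKET_HOST"
```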
Known issues
- During Red Hat OpenShift cluster creation, you cannot download logs but can monitor them on the user interface and download them after installation.
- If you observe the error Configmap fusion platform not found in fusion namespace in the logs of the isf-prereq-operator-controller-manager-xxxx pod, the error does not have any impact and can be ignored.