Cluster bootstrapping failed

Message: Bootstrapping failed for Red Hat® OpenShift® Container Platform cluster.

Diagnostics

Note: Run all commands as the kni user from the provisioner node (also known as RU7 or compute-1-ru7).
  1. Check the .openshift_install.log file for errors.
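    For example, assuming the installation directory is /home/kni (a hypothetical location; use your actual installation directory), you can filter the log for errors:
      # Hypothetical path: .openshift_install.log is written to the installation directory
      grep -iE 'error|fatal' /home/kni/.openshift_install.log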
  2. If the log does not provide the necessary information, then continue with the steps in this procedure for further diagnostics.
  3. As the kni user, run oc get nodes on the provisioner node (RU7 or compute-1-ru7).
  4. Check whether the following three nodes are listed in the output with a Ready status:
    NAME                                        STATUS   ROLES                         AGE   VERSION
    control-1-ru2.rackm03.rtp.raleigh.ibm.com   Ready    control-plane,master,worker   55m   v1.27.3+4aaeaec
    control-1-ru3.rackm03.rtp.raleigh.ibm.com   Ready    control-plane,master,worker   56m   v1.27.3+4aaeaec
    control-1-ru4.rackm03.rtp.raleigh.ibm.com   Ready    control-plane,master,worker   56m   v1.27.3+4aaeaec
  5. Identify the node that did not boot. If any of the three listed nodes do not appear in the output, then note down the missing node(s).
  6. Identify the IMM of the missing node. Here is the mapping of nodes to IMMs:
    control-1-ru2 ==> imm_ru2
    control-1-ru3 ==> imm_ru3
    control-1-ru4 ==> imm_ru4
  7. For each node that did not show up in the output of step 4, run the corresponding IMM command to connect to its IMM. For example, if node control-1-ru3 did not show up, then from the RU7/provisioner node (compute-1-ru7), run the imm_ru3 command as the kni user. Wait for the successful connection to the IMM.
  8. At the IMM system prompt, run the resetsp command, as shown in the sample session below.
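    The following is an illustrative session; the exact IMM prompt text (shown here as system>) might differ on your system:
      [kni@compute-1-ru7 ~]$ imm_ru3    # connect to the IMM of control-1-ru3
      system> resetsp                   # restart the service processor (IMM)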
  9. Wait for 10 minutes and retry the installation.
  10. If none of the nodes show up, ping the control nodes by using the IP address and hostname that are reserved for each node in DHCP.
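    For example, to ping control-1-ru3 by its reserved hostname and by its reserved IP address (replace the placeholder with the address that is reserved in DHCP):
      ping -c 3 control-1-ru3.rackm03.rtp.raleigh.ibm.com
      ping -c 3 <reserved IP of control-1-ru3>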
  11. If the ping does not work, then do the following steps to access the IMM user interface of the bare metal node:
    1. Find the IPv6 address of the node to debug by using the following command:
      oc get bmh -A -o wide | grep control
      Output:
      openshift-machine-api   control-1-ru2   OK       externally provisioned   isf-rackae4-2bztv-master-0   ipmi://[fd8c:215d:178e:c0de:a94:efff:fefe:2f95]:623   unknown            true             3h28m
      openshift-machine-api   control-1-ru3   OK       externally provisioned   isf-rackae4-2bztv-master-1   ipmi://[fd8c:215d:178e:c0de:a94:efff:fefd:cecd]:623   unknown            true             3h28m
      openshift-machine-api   control-1-ru4   OK       externally provisioned   isf-rackae4-2bztv-master-2   ipmi://[fd8c:215d:178e:c0de:a94:efff:fefe:3031]:623   unknown            true             3h28m
    2. Note down the IPv6 address of the node you are checking.
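      Alternatively, assuming the standard BareMetalHost layout where the BMC address is stored in spec.bmc.address, you can print only the address of one node:
        # Assumes the usual BareMetalHost field layout; verify on your cluster
        oc get bmh -n openshift-machine-api control-1-ru3 -o jsonpath='{.spec.bmc.address}{"\n"}'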
    3. Go to the /home/kni/isfconfig folder and open the kickstart-.json file to find the password of the IMM user USERID of the node.
    4. In the file, look for the string OCPRole with a value equal to the hostname of the node (for example, control-1-ru3).
      Sample section:
      "ipv6ULA": "XXXXXXXXXXXXXXXXXXXXXX", "ipv6LLA": "XXXXXXXXXXXXX", "serialNum": "J1025PXX", "mtm": "7D2XCTO1WW", "ibmSerialNumber": "rackae402", "ibmMTM": "9155-C01", "type": "storage", "OCPRole": "control-1-ru3", "location": "RU2", "name": "IMM_RU2", "bootDevice": "/dev/sda", "users": [ { "user": "CEUSER", "password": "XXXXX", "group": "Administrator", "number": 2 }, { "user": "ISFUSER", "password": "XXXXXX", "group": "Administrator", "number": 3 }, { "user": "USERID", "password": "XXXXXX", "group": "Administrator", "number": 1 }
    5. Run the following command to create a tunnel to the IMM of the node:
      ssh -N -f -L :1443:[IPv6 address of IMM]:443 -L :3900:[IPv6 address of IMM]:3900 root@localhost
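      For example, using the IPv6 address of control-1-ru3 from the sample output in step 1:
        # Forwards local ports 1443 and 3900 on the provisioner node to the IMM of control-1-ru3
        ssh -N -f -L :1443:[fd8c:215d:178e:c0de:a94:efff:fefd:cecd]:443 -L :3900:[fd8c:215d:178e:c0de:a94:efff:fefd:cecd]:3900 root@localhost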
    6. To access the IMM user interface, open your browser and go to the following URL by using the provisioner node IP address:
      https://<provisioner ip>:1443
      • User - Enter `USERID`.
      • Password - Use the value obtained in the previous step.
  12. From the IMM user interface page, open the remote console and check whether the node is up and shows a valid hostname prompt.
    • If the console shows localhost, then review your DHCP/DNS settings to ensure that the correct reservation is made for the node.
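      For example, from the provisioner node you can verify the forward and reverse DNS records for the node (the hostname follows the sample output above; replace the placeholder with the reserved IP address):
        nslookup control-1-ru3.rackm03.rtp.raleigh.ibm.com
        nslookup <reserved IP of control-1-ru3>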
    • If the node shows a Red Hat Linux login prompt instead of CoreOS, then it indicates that the node's baseboard management controller (BMC) is not responsive.
    Do the following steps to resolve the issue:
    1. Identify the IMM of the missing node. Here is the mapping of nodes to IMMs:
      
      control-1-ru2 ==> imm_ru2
      control-1-ru3 ==> imm_ru3
      control-1-ru4 ==> imm_ru4
    2. For each node that did not show up in the previous steps, run the corresponding command to connect to its IMM. For example, if node `control-1-ru3` did not show up, then from the RU7/provisioner node (compute-1-ru7), run the imm_ru3 command as the kni user.
    3. Wait for the successful connection to IMM.
    4. At the IMM system prompt, run the `resetsp` command.
    5. Wait 10 minutes.

Next actions

In the installation user interface, click Retry to restart the installation.