Failed control nodes inspection
Message: Inspection failed for one or more control nodes.
Diagnostics
Note: Run all commands as the kni user from the provisioner node (also known as RU7 or compute-1-ru7).
- Check the .openshift_install.log file for errors.
- If the logs do not provide the necessary information, then continue with the steps in this procedure for further diagnostics.
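  For example, one quick way to surface recent errors in the log; the installation directory /home/kni/ocp used here is an assumption, so substitute your own installation directory:
    # Assumed log location; replace /home/kni/ocp with your installation directory
    grep -iE 'error|fatal' /home/kni/ocp/.openshift_install.log | tail -n 20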
- Ping the node by using the IP address and hostname that are reserved for the node in DHCP.
- If the ping does not work, then do the following steps to access the IMM user interface of the bare metal node:
- Find the IPv6 address of the node by using the following command:
    oc get bmh -A -o wide | grep control

    openshift-machine-api  control-1-ru2  OK  externally provisioned  isf-rackae4-2bztv-master-0  ipmi://[fd8c:215d:178e:c0de:a94:efff:fefe:2f95]:623  unknown  true  3h28m
    openshift-machine-api  control-1-ru3  OK  externally provisioned  isf-rackae4-2bztv-master-1  ipmi://[fd8c:215d:178e:c0de:a94:efff:fefd:cecd]:623  unknown  true  3h28m
    openshift-machine-api  control-1-ru4  OK  externally provisioned  isf-rackae4-2bztv-master-2  ipmi://[fd8c:215d:178e:c0de:a94:efff:fefe:3031]:623  unknown  true  3h28m
- Note down the IPv6 address of the node you are checking.
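  If you prefer to print only the BMC address of a specific node, a jsonpath query against the BareMetalHost resource can be used; the node name here is taken from the sample output above:
    # Prints only the ipmi://[IPv6]:623 address of the named node
    oc get bmh control-1-ru3 -n openshift-machine-api -o jsonpath='{.spec.bmc.address}{"\n"}'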
- Go to the /home/kni/isfconfig folder and open the kickstart-.json file to find the password of the IMM user USERID of the node.
- Open the file and look for the string OCPRole with a value equal to the hostname of the node (for example, control-1-ru3).
  Sample section:
    "ipv6ULA": "XXXXXXXXXXXXXXXXXXXXXX",
    "ipv6LLA": "XXXXXXXXXXXXX",
    "serialNum": "J1025PXX",
    "mtm": "7D2XCTO1WW",
    "ibmSerialNumber": "rackae402",
    "ibmMTM": "9155-C01",
    "type": "storage",
    "OCPRole": "control-1-ru3",
    "location": "RU2",
    "name": "IMM_RU2",
    "bootDevice": "/dev/sda",
    "users": [
      {
        "user": "CEUSER",
        "password": "XXXXX",
        "group": "Administrator",
        "number": 2
      },
      {
        "user": "ISFUSER",
        "password": "XXXXXX",
        "group": "Administrator",
        "number": 3
      },
      {
        "user": "USERID",
        "password": "XXXXXX",
        "group": "Administrator",
        "number": 1
      }
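  As an optional shortcut, the following grep prints the block that follows the matching OCPRole entry so that the USERID password in the users list is visible; the file name pattern and the number of context lines are assumptions, so adjust them to your file:
    # Show the lines after the matching OCPRole entry, including the users list
    grep -A 40 '"OCPRole": "control-1-ru3"' /home/kni/isfconfig/kickstart-*.json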
- Run the following command to create a tunnel to the IMM of the node:
    ssh -N -f -L :1443:[IPv6 address of IMM]:443 -L :3900:[IPv6 address of IMM]:3900 root@localhost
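  Optionally, before you open the browser, confirm that the forwarded ports are listening on the provisioner node; this check is only a sketch and not part of the documented procedure:
    # Both forwarded ports should show up as listening TCP sockets
    ss -ltn | grep -E ':1443|:3900'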
- To access the IMM user interface by using the provisioner node IP address, open your browser and go to the following URL:
    https://<provisioner ip>:1443
  - user: Enter USERID.
  - password: Use the value obtained in the previous step.
- From the IMM user interface page, open the remote console and check whether the node is up and shows a valid hostname prompt.
- If the console shows localhost, then review your DHCP/DNS settings to ensure that the correct reservation is made for the node.
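  For example, you can spot-check the DNS records from the provisioner node; the hostname, domain, and IP address below are placeholders for the values reserved for your node:
    # Replace the placeholders with the hostname, domain, and IP reserved in DHCP/DNS
    host control-1-ru3.<cluster domain>
    host <reserved node IP>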
- If the node shows a Red Hat Linux login prompt instead of CoreOS, the node baseboard management controller (BMC) is not responsive. Do the following steps to resolve the issue:
- Identify the IMM of the missing node. Here is the mapping of nodes to IMMs:
    control-1-ru2 ==> imm_ru2
    control-1-ru3 ==> imm_ru3
    control-1-ru4 ==> imm_ru4
- For any node that did not show up in the previous steps, run the corresponding IMM command to connect to its IMM. For example, if node control-1-ru3 did not show up, then from RU7/provisioner (compute-1-ru7), run the imm_ru3 command as the kni user.
- Wait for the successful connection to the IMM.
- In the system prompt, run resetsp on the IMM prompt.
- Wait for 10 minutes.
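  One optional way to confirm that the BMC is responding again is an IPMI query from the provisioner node; this is a sketch that assumes ipmitool is installed and uses the IMM IPv6 address and USERID password gathered in the earlier steps:
    # Substitute the IMM IPv6 address and the USERID password from the kickstart file
    ipmitool -I lanplus -H fd8c:215d:178e:c0de:a94:efff:fefd:cecd -p 623 -U USERID -P '<USERID password>' mc info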
Next actions
In the installation user interface, click Retry to restart the installation.