The etcd component fails to start
During installation, the etcd component fails to start.
Symptoms
Installation exits after 10 minutes of waiting for the etcd component to start.
Note: By default, the installation waits for 10 minutes before it exits. You can change the wait time by updating the wait_for_timeout parameter in the /<installation_directory>/cluster/config.yaml file. Specify the parameter value in seconds.
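To see which wait time is currently in effect, you can read the parameter straight from the configuration file. The following is a minimal sketch, not part of the product: the function name `get_wait_timeout` and the `INSTALL_DIR` variable are hypothetical, and it assumes the parameter appears on its own line as `wait_for_timeout: <seconds>`. Because the value is in seconds, a 10-minute wait corresponds to 600.

```shell
#!/usr/bin/env bash

# Print the wait_for_timeout value (in seconds) found in the given config.yaml.
# Prints nothing if the parameter is absent or commented out.
# Assumption: the parameter is written as "wait_for_timeout: <integer>".
get_wait_timeout() {
    local config_file="$1"
    sed -n 's/^[[:space:]]*wait_for_timeout:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$config_file" | head -n 1
}

# Example usage (INSTALL_DIR is a hypothetical variable that holds your
# installation directory):
#   get_wait_timeout "${INSTALL_DIR}/cluster/config.yaml"
```

If the function prints nothing, the parameter is likely not set in the file and the default wait time applies.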
Causes
- The kubelet container fails to start.
- The etcd image cannot be pulled.
- The etcd container cannot start.
- The etcd port is not accessible. The default etcd port is 4001.
Resolving the problem
- Check whether kubelet is running:
- Log in to your master node.
- Run the following command to check kubelet status:
systemctl status kubelet
If kubelet is not running, run the following command to get the logs:
journalctl -u kubelet &> kubelet.log
- In an IBM® Cloud Private-CE environment, ensure that you have internet access and can pull the etcd image.
- In IBM® Cloud Private Enterprise and Cloud Native environments, ensure that the etcd image is copied to the boot node. Run the following command to verify that the etcd image is loaded:
docker images | grep etcd
- Check whether the etcd container was started:
- Log in to your master node as a user with root permission.
- Run the following command to check the etcd container status:
docker ps | grep etcd
If the etcd container was not started, run the following commands to get the logs:
- Get the etcd container ID:
docker ps -a | grep etcd
- Run the command to get the logs:
docker logs <etcd container ID> &> etcd.log
- Check whether you can connect to etcd. Run the following command:
telnet <master node IP or cluster virtual IP address> 4001
- Ensure that your firewall is not blocking port 4001.
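The checks above can also be run as one script on the master node. The following is a sketch under stated assumptions, not an official tool: the function names are hypothetical, `systemctl is-active` stands in for `systemctl status`, and bash's `/dev/tcp` redirection (with the GNU `timeout` command) replaces the interactive telnet session. If more than one container matches `etcd`, only the first match is inspected.

```shell
#!/usr/bin/env bash
# Run as a user with root permission on the master node.

# Report kubelet status; save its journal to kubelet.log if it is not active.
check_kubelet() {
    if systemctl is-active --quiet kubelet; then
        echo "kubelet: running"
    else
        echo "kubelet: not running, writing kubelet.log"
        journalctl -u kubelet --no-pager &> kubelet.log
    fi
}

# Report etcd container status; save the logs of a container that exists
# but is not running to etcd.log.
check_etcd_container() {
    local id
    if docker ps | grep -q etcd; then
        echo "etcd container: running"
        return 0
    fi
    # The first column of `docker ps -a` output is the container ID.
    id=$(docker ps -a | grep etcd | awk '{print $1}' | head -n 1)
    if [ -n "$id" ]; then
        echo "etcd container: not running, writing etcd.log"
        docker logs "$id" &> etcd.log
    else
        echo "etcd container: not found"
    fi
}

# Try to open a TCP connection to the etcd port (4001 unless overridden).
# Returns non-zero if the connection fails or takes longer than 5 seconds.
check_etcd_port() {
    local host="$1" port="${2:-4001}"
    if timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "etcd port ${port} on ${host}: reachable"
    else
        echo "etcd port ${port} on ${host}: not reachable"
        return 1
    fi
}

# Example usage (replace the placeholder with the master node IP or
# cluster virtual IP address):
#   check_kubelet
#   check_etcd_container
#   check_etcd_port "<master node IP or cluster virtual IP address>"
```

An unreachable port does not by itself identify the cause: etcd might not be listening, or a firewall might be blocking port 4001, so check both.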