Master nodes show NotReady state
Master nodes show NotReady state after upgrade.
Symptoms
After the upgrade-k8s operation completes, master nodes go into NotReady state.
Causes
In IBM Cloud Private versions earlier than 3.2.1 on Amazon Web Services (AWS), you could install IBM Cloud Private even when the node name provided by AWS was different from the hostname. The AWS node name is based on the internal DNS name. If you upgrade such a cluster to version 3.2.1, master nodes can become NotReady after the upgrade-k8s operation completes. This happens because the kubelet certificates on master nodes are generated from the hostname that is taken from the Ansible built-in variables. The mismatch between the node name and the hostname prevents the kubelet service on all master nodes from joining the cluster, so the nodes remain in NotReady state.
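The mismatch described above can be checked on a master node before you upgrade. The following is a minimal sketch, not part of the product: the function name check_node_name is hypothetical, and it assumes that you supply the node name (for example, as reported by kubectl get nodes) and the hostname (for example, from the hostname command) yourself.

```shell
# Hypothetical helper: compare a Kubernetes node name with an OS hostname.
# $1 = node name, $2 = hostname.
# Prints a one-line summary; returns 0 on a match and 1 on a mismatch.
check_node_name() {
  if [ "$1" = "$2" ]; then
    echo "match: $1"
    return 0
  fi
  echo "mismatch: node name '$1' differs from hostname '$2'"
  return 1
}
```

For example, you might run check_node_name "<node name>" "$(hostname)" on each master node. A mismatch suggests that the node is affected by the problem described here.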
Resolving the problem
As a workaround to resolve this issue, regenerate the kubelet certificate after you run the upgrade-k8s operation. This action corrects the certificate before you run the upgrade-chart operation.
-
Access the installer container on your boot node. For example,
    docker run -e LICENSE=accept --rm -it --net host -v $(pwd):/installer/cluster ibmcom/icp-inception:3.2.1-ee bash
Replace the image name with your image name.
-
Regenerate Kubernetes certificates in a different directory.
    cd playbook/roles/kubernetes-certs/files/
    export CERT_DIR=/installer/cluster/cfc-certs/aws-kubernetes
    export ROOT_CA_CRT=/installer/cluster/cfc-certs/root-ca/ca.crt
    export ROOT_CA_KEY=/installer/cluster/cfc-certs/root-ca/ca.key
    ./make-ca-cert.sh "ip-10-38-74-106.ap-southeast-2.compute.internal ip-10-38-74-157.ap-southeast-2.compute.internal ip-10-38-74-34.ap-southeast-2.compute.internal" "127.0.0.1" "DNS:kubernetes,DNS:kubernetes.default,DNS:kubernetes.default.svc"
-
Verify that the Kubernetes certificates that you created are on your boot node. For example,
    # ls -l cluster/cfc-certs/aws-kubernetes/
    total 108
    -rw------- 1 root root 6109 Jun 17 20:18 kube-controller-manager.crt
    -rw------- 1 root root 1704 Jun 17 20:18 kube-controller-manager.key
    -rw------- 1 root root 6080 Jun 17 20:18 kube-proxy.crt
    -rw------- 1 root root 1708 Jun 17 20:18 kube-proxy.key
    -rw------- 1 root root 6088 Jun 17 20:18 kube-scheduler.crt
    -rw------- 1 root root 1704 Jun 17 20:18 kube-scheduler.key
    -rw------- 1 root root 6108 Jun 17 20:18 kubecfg.crt
    -rw------- 1 root root 1704 Jun 17 20:18 kubecfg.key
    -rw------- 1 root root 6073 Jun 17 20:18 kubelet-client.crt
    -rw------- 1 root root 1704 Jun 17 20:18 kubelet-client.key
    -rw------- 1 root root 6228 Jun 17 20:18 kubelet-ip-10-38-74-106.ap-southeast-2.compute.internal.crt
    -rw------- 1 root root 1708 Jun 17 20:18 kubelet-ip-10-38-74-106.ap-southeast-2.compute.internal.key
    -rw------- 1 root root 6228 Jun 17 20:18 kubelet-ip-10-38-74-157.ap-southeast-2.compute.internal.crt
    -rw------- 1 root root 1704 Jun 17 20:18 kubelet-ip-10-38-74-157.ap-southeast-2.compute.internal.key
    -rw------- 1 root root 6227 Jun 17 20:18 kubelet-ip-10-38-74-34.ap-southeast-2.compute.internal.crt
    -rw------- 1 root root 1704 Jun 17 20:18 kubelet-ip-10-38-74-34.ap-southeast-2.compute.internal.key
    -rw------- 1 root root 6394 Jun 17 20:18 server.cert
    -rw------- 1 root root 1704 Jun 17 20:18 server.key
Note: The information that you need includes the kubelet certificates and keys. For example,
    -rw------- 1 root root 6228 Jun 17 20:18 kubelet-ip-10-38-74-106.ap-southeast-2.compute.internal.crt
    -rw------- 1 root root 1708 Jun 17 20:18 kubelet-ip-10-38-74-106.ap-southeast-2.compute.internal.key
    -rw------- 1 root root 6228 Jun 17 20:18 kubelet-ip-10-38-74-157.ap-southeast-2.compute.internal.crt
    -rw------- 1 root root 1704 Jun 17 20:18 kubelet-ip-10-38-74-157.ap-southeast-2.compute.internal.key
    -rw------- 1 root root 6227 Jun 17 20:18 kubelet-ip-10-38-74-34.ap-southeast-2.compute.internal.crt
    -rw------- 1 root root 1704 Jun 17 20:18 kubelet-ip-10-38-74-34.ap-southeast-2.compute.internal.key
-
Copy the kubelet certificate and key for each node name to the /etc/cfc/kubelet/ directory on the corresponding master node.
-
Update the kubeconfig file, /etc/cfc/kubelet/kubelet-config, on each master node so that it points to the correct certificate and key. For example,
    users:
    - name: kubelet
      user:
        client-certificate: /etc/cfc/kubelet/kubelet-ip-10-38-74-34.ap-southeast-2.compute.internal.crt
        client-key: /etc/cfc/kubelet/kubelet-ip-10-38-74-34.ap-southeast-2.compute.internal.key
-
Restart the kubelet service on the master node. Check the node status to make sure that the node is in Ready state.
-
After the kubelet on the master node joins the cluster, verify that your certificate is in the /etc/cfc/kubelet/ directory. The kubelet server certificate is automatically generated and is used to serve the HTTPS kubelet service that listens on port 10250.
    # ls -l /etc/cfc/kubelet/kubelet-serv*
    -rw------- 1 root root 1655 Jun 9 22:45 /etc/cfc/kubelet/kubelet-server-2020-06-09-22-45-38.pem
    lrwxrwxrwx 1 root root 55 Jun 9 22:45 /etc/cfc/kubelet/kubelet-server-current.pem -> /etc/cfc/kubelet/kubelet-server-2020-06-09-22-45-38.pem
-
Verify the status of your cluster.
    kubectl -n kube-system logs <pod name>
    kubectl -n kube-system exec -it <pod name> -c <container name> -- <command>
    kubectl -n kube-system port-forward pod/<pod name> <local port>:<pod port>
    helm list --tls
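The node status check from the steps above can also be scripted. The following is a minimal sketch, assuming the usual column layout of kubectl get nodes --no-headers (node name first, status second); the function name not_ready_nodes is hypothetical.

```shell
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin
# and print the name of every node whose STATUS column does not start with
# "Ready" (so "Ready,SchedulingDisabled" still counts as Ready).
# Returns 1 if at least one such node was found, 0 otherwise.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1; bad = 1 } END { exit bad + 0 }'
}
```

For example, kubectl get nodes --no-headers | not_ready_nodes prints nothing and returns 0 when every node reports Ready.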
You can now proceed to run upgrade-chart to continue the IBM Cloud Private upgrade.
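If a master node still does not reach Ready state, one thing worth checking is that every node name has a matching kubelet certificate and key. The following is a minimal sketch, assuming the kubelet-<node name>.crt and kubelet-<node name>.key naming shown in the directory listing above; the function name missing_kubelet_certs is hypothetical.

```shell
# Hypothetical helper: check that a kubelet certificate and key exist for
# each node name in the given directory.
# $1 = certificate directory, remaining arguments = node names.
# Prints one line per missing file; returns 1 if any file is missing.
missing_kubelet_certs() {
  cert_dir="$1"
  shift
  missing=0
  for node in "$@"; do
    for ext in crt key; do
      f="$cert_dir/kubelet-$node.$ext"
      if [ ! -f "$f" ]; then
        echo "missing: $f"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

For example, inside the installer container you might run missing_kubelet_certs /installer/cluster/cfc-certs/aws-kubernetes followed by your node names. On a master node, the same check can be pointed at /etc/cfc/kubelet with that node's name.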