Known issues and limitations

Review the known issues for version 2.1.0.2.

Kubernetes API Server vulnerability

IBM Cloud Private has a patch (icp-2.1.0.2-build508575) on IBM® Fix Central to address the Kubernetes security vulnerability, where the proxy request handling in the Kubernetes API Server can leave vulnerable TCP connections. For full details, see the Kubernetes kube-apiserver vulnerability issue. After you apply the patch, you do not need to redeploy either IBM Cloud Private or your Helm releases.

Prometheus does not work after an upgrade from IBM Cloud Private version 2.1.0.1 to 2.1.0.2

To resolve this issue, complete the following steps:

  1. Edit the monitoring-prometheus configMap.

    1. Open the configMap.

      kubectl -s 127.0.0.1:8888 edit cm -n kube-system monitoring-prometheus
      
    2. Replace the line that reads replacement: /api/v1/nodes/${1}:4194/proxy/metrics with replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor. The sketch after this procedure shows the line in context.

    3. Save the changes.
  2. Restart the Prometheus pod.

    1. Find the Prometheus pod.

        kubectl -s 127.0.0.1:8888 get pods -n kube-system -l app=monitoring-prometheus -l component=prometheus
      

      The output contains the ID for your Prometheus pod.

    2. Delete the Prometheus pod. When you delete a pod, the pod automatically restarts.

        kubectl -s 127.0.0.1:8888 delete pods <Prometheus_POD_ID>
      

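For context, the replacement value belongs to the relabeling rules of the cAdvisor scrape job in the Prometheus configuration. A minimal sketch of that section, assuming the standard kubernetes-cadvisor job layout (exact fields can vary by release):

relabel_configs:
- target_label: __address__
  replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
  regex: (.+)
  target_label: __metrics_path__
  replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
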
Virtual IP (VIP) addresses do not work in VMware clusters

In some cases, an assigned VIP might bind to all master nodes. When the VIP binds to all master nodes, it might not work properly. This binding issue usually occurs in VMware clusters that do not have VRRP enabled. To fix this issue, configure your VMware network to enable VRRP.

To configure your VMware network, complete the following steps:

  1. From the VMware network settings, select the vSwitch.
  2. In the security options, set the Forged transmits policy to Accept.

Alerting, logging, or monitoring pages display a 500 Internal Server Error

To resolve this issue, complete the following steps:

  1. Create an alias for the insecure kubectl API login.

    alias kc='kubectl -s 127.0.0.1:8888 -n kube-system'
    
  2. Edit the configuration map for Kibana.

    kc edit cm kibana-nginx-config
    

    Locate the following section:

     upstream kibana {
     server localhost:5602;
     }

    Change localhost to 127.0.0.1.
  3. Locate the Kibana pod and restart it.

     kc get pod | grep -i kibana
    
     kc delete pod <kibana-POD_ID>
    
  4. Edit the configuration map for Grafana.

    kc edit cm grafana-router-nginx-config
    

    Locate the following section:

    upstream grafana {
    server localhost:3000;
    }

    Change localhost to 127.0.0.1.
    
  5. Locate the Grafana pod and restart it.

    kc get pod | grep -i monitoring-grafana
    
    kc delete pod <monitoring-grafana-POD_ID>
    
  6. Edit the configuration map for the Alertmanager.

    kc edit cm alertmanager-router-nginx-config
    
    Locate the following section:

    upstream alertmanager {
    server localhost:9093;
    }

    Change localhost to 127.0.0.1.
    
  7. Locate the Alertmanager pod and restart it.

    kc get pod | grep -i monitoring-prometheus-alertmanager
    
    kc delete pod <monitoring-prometheus-alertmanager-POD_ID>
    

IPv6 is not supported

IBM Cloud Private cannot use IPv6 networks. Ensure that the IPv6 settings are commented out of the /etc/hosts file on each cluster node. See Configuring your cluster.
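For example, on many Linux distributions the /etc/hosts file contains IPv6 entries that resemble the following lines. Prefix each entry with # to comment it out (the exact entries vary by distribution):

# ::1         localhost ip6-localhost ip6-loopback
# fe00::0     ip6-localnet
# ff02::1     ip6-allnodes
# ff02::2     ip6-allrouters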

Cloud Automation Manager version 2.1.0.2 is not yet available

If you are running Cloud Automation Manager version 2.1.0.1 on IBM Cloud Private 2.1.0.1, do not upgrade to IBM Cloud Private Version 2.1.0.2.

Cloud Automation Manager version 2.1.0.1 is not supported on IBM Cloud Private version 2.1.0.2.

Helm CLI installation error

During the installation of the Helm CLI, you might see an error message that resembles the following output:

error copying from remote stream to local connection: readfrom tcp4 x.x.x.x>x.x.x.x write tcp4 x.x.x.x>x.x.x.x write: broken pipe
LAST DEPLOYED: Fri Feb 16 16:56:32 2018
NAMESPACE: default
STATUS: DEPLOYED

If your status is DEPLOYED, you can ignore this error message; the installation of your Helm CLI is not affected.
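You can verify the release status at any time with the Helm CLI. A minimal check (depending on your cluster configuration, you might also need flags such as --tls):

helm status <release_name>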

Docker 17.12-ce is not supported

Docker version 17.09-ce is the latest stable version of Docker CE that is supported by IBM Cloud Private. Instability issues have been reported and observed with Docker version 17.12-ce. To avoid these issues, use a supported version of Docker. For a list of supported versions, see Supported Docker Versions.
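To check which version of Docker a node is running, you can query the Docker daemon:

docker version --format '{{.Server.Version}}'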

Services are unresponsive or display Gateway Timeout errors

Sometimes a cluster node is online, but the services that run on that node are unresponsive or return 504 Gateway Timeout errors when you try to access them. These errors might be due to a known issue with Docker where an old containerd reference is used even after the containerd daemon was restarted. This defect causes the Docker daemon to go into an internal error loop that uses a high amount of CPU resources and logs a high number of errors. For more information about this error, see the Refresh containerd remotes on containerd restarted pull request against the Moby project.

To determine whether this defect causes the errors, SSH into the affected node, and run the journalctl -u kubelet -f command. Review the command output for error messages that include the following text: transport: dial unix /var/run/docker/containerd/<container_id>: connect: connection refused.

If you see that text, run the top command and confirm that dockerd uses a high percentage of the available CPU.

To work around this issue, use the host operating system's service manager to restart the docker service on the node, as in the example that follows. After some time, the services resume.
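For example, on a node that uses systemd (an assumption; adjust the command for your host operating system):

sudo systemctl restart docker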

You need to manually delete some pods that use GlusterFS for storage

If a pod that uses a GlusterFS PersistentVolume for storage is stuck in the Terminating state after you try to delete it, you must manually delete the pod. Run the following command:

kubectl -n <namespace> delete pods --grace-period=0 --force <pod_name>

Cannot log in to the management console with an LDAP user after you restart the leading master

If you cannot log in to the management console after you restart the leading master node in a high availability cluster, take the following actions:

  1. Log in to the management console with the cluster administrator credentials. The user name is admin, and the password is admin.
  2. Click Menu > Manage > Authentication (LDAP).
  3. Click Edit and then click Save. LDAP users can then log in to the management console.

Calico prefix limitation on Linux™ on Power® nodes

If you install IBM Cloud Private on PowerVM Linux LPARs and your virtual Ethernet devices use the ibmveth prefix, you must set the network adapter to use Calico networking. During installation, be sure to set a calico_ip_autodetection_method parameter value in the config.yaml file. The setting resembles the following text:

calico_ip_autodetection_method: interface=<device_name>

Where <device_name> is the name of your network adapter. You must specify the ibmveth0 interface on each node of the cluster, including worker nodes, as in the example that follows. Note: If you used PowerVC to deploy your cluster nodes, this issue does not affect you.
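For example, with the ibmveth0 interface, the config.yaml setting reads:

calico_ip_autodetection_method: interface=ibmveth0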

Alerts in Slack contain invalid links

If you integrated a Slack provider with Alertmanager, the links in the Slack messages are invalid. You must open the Alertmanager dashboard at https://<master_ip>:8443/alertmanager to view the alerts.

StatefulSets remain in Terminating state after a worker node shuts down

If the node where a StatefulSet pod is running shuts down, the pod enters a Terminating state. You must manually delete the pod that is stuck in the Terminating state to force it to be re-created on another node. Run the following command:

kubectl -n <namespace> delete pods --grace-period=0 --force <pod_name>

For more information about Kubernetes pod safety management, see Pod Safety, Consistency Guarantees, and Storage Implications in the Kubernetes community feature specs.

Kubelet fails to umount GlusterFS mount points

If you use GlusterFS to manage your PersistentVolumes and PersistentVolumeClaims (PVCs) and delete a Helm release, the PVCs are removed before they are unmounted from all kubelets. See the Kubelet failure to umount glusterfs mount points issue and the Fix bug:Kubelet failure to umount mount points pull request in the Kubernetes community.

The LDAP connection has some limits

You can define only one LDAP connection in IBM Cloud Private. After you add an LDAP connection, you can edit it, but you cannot remove it.

Syncing repositories might not update Helm chart contents

Synchronizing repositories takes several minutes to complete. While synchronization is in progress, you might see an error if you try to display the readme file. After synchronization completes, you can view the readme file and deploy the chart.

Some features are not available from the new management console

IBM Cloud Private 2.1.0.2 supports only the new management console. However, some options from the previous console are not yet available. To work around this limitation, use the kubectl CLI for these functions.

Containers fail to start

During installation, containers fail to start and a no space left on device error message is displayed. This issue is a known Docker engine problem that is caused by the leaking of cgroups. For more information about this issue, see https://github.com/moby/moby/issues/29638.

To work around this issue, you must restart the host.
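For example, on a Linux host:

sudo reboot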

The management console displays 502 Bad Gateway Error

The management console sometimes displays a 502 Bad Gateway Error after installation or after you reboot the master node. If you recently installed IBM Cloud Private, wait a few minutes and reload the page.

If you rebooted the master node, take the following steps:

  1. Obtain the IP addresses of the icp-ds pods. Run the following command:

     kubectl -s http://localhost:8888 get pods -o wide -n kube-system | grep "icp-ds"
    

    The output resembles the following text:

     icp-ds-0                                                  1/1       Running       0          1d        10.1.231.171   10.10.25.134
    

    In this example, 10.1.231.171 is the IP address of the pod.

    In high availability (HA) environments, an icp-ds pod exists for each master node.

  2. From the master node, ping the icp-ds pods. Check the IP address for each icp-ds pod by running the following command for each IP address:

     ping 10.1.231.171
    

    If the output resembles the following text, you must delete the pod:

     connect: Invalid argument
    
  3. Delete each pod that you cannot reach:

      kubectl -s http://localhost:8888 delete pods icp-ds-0 -n kube-system
    

    In this example, icp-ds-0 is the name of the unresponsive pod.

    In HA installations, you might have to delete the pod for each master node.

  4. Obtain the IP address of the replacement pod or pods. Run the following command:

     kubectl -s http://localhost:8888 get pods -o wide -n kube-system | grep "icp-ds"
    

    The output resembles the following text:

     icp-ds-0                                                  1/1       Running       0          1d        10.1.231.172   10.10.2
    
  5. Ping the pods again. Check the IP address for each icp-ds pod by running the following command for each IP address:

     ping 10.1.231.172
    

    If you can reach all icp-ds pods, you can access the IBM Cloud Private management console when the pods are in the available state.

Enable Ingress Controller to use a new annotation prefix

NGINX ingress controller version 0.9.0, which is used in IBM Cloud Private 2.1.0.2, introduces a new annotation prefix, nginx.ingress.kubernetes.io. IBM Cloud Private uses a flag to avoid breaking deployments that are already running.

To avoid breaking a running NGINX ingress controller, the --annotations-prefix=ingress.kubernetes.io flag is added to the NGINX ingress controller deployment. The IBM Cloud Private ingress controller includes this flag by default, which retains the previous annotation prefix.

If you want to use the new annotation prefix, update the ingress controller by removing the --annotations-prefix=ingress.kubernetes.io flag. Edit the daemon set with the following command:

For Linux® 64-bit:

kubectl -s 127.0.0.1:8888 edit ds nginx-ingress-lb-amd64 -n kube-system

For Linux® on Power® 64-bit LE:

kubectl -s 127.0.0.1:8888 edit ds nginx-ingress-lb-ppc64le -n kube-system

Then, remove the --annotations-prefix=ingress.kubernetes.io flag. Save and exit to apply the change. The ingress controller restarts to pick up the new configuration.
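After the flag is removed, ingress resources must use the new annotation prefix. A minimal illustrative example (the resource, service, and path names are hypothetical):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /example
        backend:
          serviceName: example-service
          servicePort: 80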

Failure to create a persistent volume claim (PVC) with a vSphere Cloud Provider

If you are using a vSphere Cloud Provider and your IBM Cloud Private nodes have a VM hardware version of 13 or later, the PVC might not bind with the persistent volume. This issue exists because the vSphere Cloud Provider cannot find the node by its UUID. For more information about the issue, see https://github.com/kubernetes/kubernetes/issues/58927.

To avoid this issue, the VM hardware version on the nodes must be earlier than version 13.