Known issues and limitations

Known issues and limitations related to accelerator deployment.

OpenShift Container Platform accelerator

  • When you add a storage or worker node, the new node is not created with the required memory.

    Impact: The new node does not meet the minimum OpenShift Container Platform requirement. Therefore, warnings appear in the OpenShift Container Platform console.

    Resolution/Workaround: Manually increase the memory from the IBM Cloud Pak® System user interface. Make sure that you are logged in to the IBM Cloud Pak System user interface before you use one of the following options to increase the memory:

    Option 1: Increase memory and CPU from the IBM Cloud Pak® System accelerator page

    1. From the Manage accelerator instances page, click the deployed instance.
    2. Change the cluster to Maintenance mode from the Actions list before you change the configurations.
    3. Click the Nodes tab.
    4. Click the ellipsis (three dots) icon next to the node record that you want to configure, and then click Configure.
    5. Specify the memory and CPU values in the corresponding window. Note: If you want to configure the virtual machine with more than 8 CPUs, you must stop the node first.
    6. Click Submit.
    7. Remove the cluster from the maintenance mode.

    Option 2: Increase memory and CPU from the Virtual Machines page

    1. Copy the host name of the newly created node that was created with less memory.
    2. Search for the host name on the Patterns > Virtual Machines page.
    3. Click Configure.
    4. In the new popup window, update the memory to the required size and click OK. This job takes at least 3 - 4 minutes to complete.
    5. Refresh the Virtual Machines page. The new memory is now displayed for that node.
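
    Optionally, you can confirm that OpenShift Container Platform now reports the increased capacity. The following is a sample check that is run from the Primary Helper; <node_name> is a placeholder for the host name of the node that you resized:

    # List the memory capacity that each node reports to the cluster
    oc get nodes -o custom-columns=NAME:.metadata.name,MEMORY:.status.capacity.memory

    # Inspect a single node in detail
    oc describe node <node_name> | grep -A 6 "Capacity"
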
  • When you remove a storage node after you deploy a Cloud Pak® accelerator, you might see a few alerts that are triggered on the OpenShift Container Platform web console.

    Impact: A few alerts appear on the OpenShift Container Platform web console, and they do not confirm whether the storage node was removed.

    Resolution/Workaround: The Cloud Pak System team is in consultation with Red Hat and the OpenShift Data Foundation team to address this issue.

  • When the project namespace is changed from default to another namespace, the cluster inventory and utilization are not displayed.

    Cause: The token is created and stored in the default project namespace. When you switch to a namespace other than default, the token cannot be retrieved.

    Resolution/Workaround: Log in to the primary helper and change the project namespace back to default with the oc project default command.

    The sample output is as follows:

    [root@cps-rxx-9-xx-xx-xx ~]# oc project default
    Now using project "default" on server "https://api.cps-rxx-9-xx-xx-xx.rtp.raleigh.ibm.com:6443".
    

    The oc status command displays the project namespace that you are in. The sample output is as follows:

    [root@cps-rxx-9-xx-xx-xx ~]# oc status
    In project default on server https://api.cps-rxx-9-xx-xx-xx.rtp.raleigh.ibm.com:6443
    
  • Before you add a worker node, check whether sufficient IP addresses are available in the selected IP group. If the number of worker nodes is greater than the number of free IP addresses, the deployment status changes to an error state.

  • Adding Multiple Worker Nodes Simultaneously
    Resolution/Workaround: Adding several worker nodes at the same time can sometimes lead to unexpected issues. To maintain system stability, it is recommended to add one worker node at a time.

  • Concurrent Add and Delete Operations
    Resolution/Workaround: Do not perform Add and Delete operations on worker nodes simultaneously. Wait for one operation to complete before you start another to avoid potential conflicts.

  • You must provide all the required details for both OpenShift Helper Node (PrimaryHelper > OpenShift Helper Node) and OpenShift Helper Node_1 (SecondaryHelper > OpenShift Helper Node_1); otherwise, the deployment fails with the following error:

    <OCP_version>: "Error: OpenShift <OCP_version> images not found in OpenShift mirror registry
    

    For example:

    4.4.6: "Error: OpenShift 4.4.6 images not found in OpenShift mirror registry
    

    This error applies only to the Platform System Manager (PSM) system management user interface and to all Cloud Pak accelerators that are on OpenShift Container Platform.
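
    To confirm that the release images for the required OpenShift version are present in the mirror registry, you can query the registry with the oc CLI. The following is a minimal sketch; the pull secret path, registry host name, repository, and version are placeholders that you must replace with values from your environment:

    # Query the mirror registry for the release images of the required OpenShift version
    oc adm release info \
      --registry-config=<path_to_pull_secret.json> \
      <mirror_registry_hostname>:5000/<repository>/openshift4:<OCP_version>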

  • Sometimes, the OpenShift Container Platform console cannot be accessed after an instance restart. To resolve the issue, complete the following steps:

    1. Run the oc get nodes command to check whether all nodes are in the Ready state.
    2. Complete the following steps to check the OpenShift Container Platform nodes for any failed services (a sample script that checks all nodes at once is shown after these steps):
      1. Connect to Primary Helper as virtuser.
      2. Run the ssh core@<node_hostname> -i /core_rsa command as sudo to connect to each CoreOS virtual machine (OpenShift Container Platform node). For example, failed units are displayed as follows:
       -bash-4.2# ssh core@master1 -i /core_rsa
       Warning: Permanently added '9.42.52.206' (ECDSA) to the list of known hosts.
       Red Hat Enterprise Linux CoreOS 44.81.202005250830-0
       Part of OpenShift 4.4, RHCOS is a Kubernetes native operating system
       managed by the Machine Config Operator (clusteroperator/machine-config).
       WARNING: Direct SSH access to machines is not recommended; instead,
       make configuration changes via machineconfig objects:
       https://docs.openshift.com/container-platform/4.4/architecture/architecture-rhcos.html
       [systemd]
       Failed Units: 8
       chronyd.service
       irqbalance.service
       polkit.service
       rpc-statd.service
       rpcbind.service
       sssd.service
       vgauthd.service
       vmtoolsd.service  
      
    3. Run the following commands to start the failed units on the node:
      sudo su -
      systemctl start <failed_unit>
      
      For example, systemctl start irqbalance.
    4. While you run the 'start' commands, also run the ps -eaf|grep openshift|wc -l command on each of the nodes.
    5. If the result shows fewer than three OpenShift processes running, then run the following commands and exit the node:
      sudo su -
      systemctl restart kubelet
      
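    To check every node for failed services in one pass from the Primary Helper, you can loop over the nodes as shown in the following sketch. The node host names are examples; replace them with the host names of your cluster nodes, and make sure that the /core_rsa key is available:

      # Check each OpenShift Container Platform node for failed systemd units
      for node in master1 master2 master3 worker1 worker2; do
        echo "=== ${node} ==="
        ssh -i /core_rsa core@${node} "sudo systemctl list-units --state=failed --no-legend"
      done
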
  • After a node restart from the user interface, the node shows a "Not Ready" state on the OpenShift Container Platform console or in the command-line interface (CLI). To address this issue, log in to the node and check whether it shows any failed units. Complete the following steps (on Primary Helper):

    1. Run the following commands:
      -bash-4.2# ssh core@<hostname> -i /core_rsa
      core@<hostname>~]# sudo su
      core@<hostname>~]# hostname
      <output>
      
    2. If the hostname output shows localhost, then restart the node with the following command:
      core@<hostname>~]# reboot
      
    3. Restart any failed units, if required, and then verify the node state (on Primary Helper) with the following command:
      #  oc get nodes 
      

    The node is now displayed in "Ready" state.

  • The following incorrect error message is displayed during accelerator deployment when neither the OpenShift image registry name nor the OpenShift pull secret is provided, or when the shared service instance is not running on the system:

    Script package OpenShift Helper Node on virtual machine PrimaryHelper failed execution
    

IBM Cloud Pak® for Applications accelerator

  • At times, even when the IBM Cloud Pak for Applications accelerator deployment is successful in the IBM Cloud Pak® System user interface, the OpenShift Container Platform dashboard might display a Mobile Foundation pod failure due to a DB2 disconnection. To resolve this issue, contact IBM Support.

IBM Cloud Pak® for Integration accelerator

  • If any IBM Cloud Pak for Integration capability installation fails while one or more capabilities are deployed successfully, then uninstall the failed capability and redeploy it. If the redeployment does not resolve the issue, see Operator installation hangs during upgrade in the IBM Cloud Platform Common Services documentation. If the steps that are mentioned in that troubleshooting section do not resolve your issue, contact IBM Support.

  • If you encounter random authentication issues while you log in to any of the IBM Cloud Pak for Integration capabilities, contact IBM Support.

  • A capability might fail with the following error message:

    Common Services must be Ready. Currently: iamstatus is: NotReady
    

    If any capability fails because of IBM Cloud Platform Common Services, try either of the following options:

    • Go through the workaround in the IBM Cloud Platform Common Services Knowledge Center. If the ibm-common-services-status changes to the Ready state after you apply the workaround for the cluster, then you can manually redeploy the failed components. For more information about the workaround, see Known issues in common services. A sample command to check the state of the common services is shown after these options.
    • If the workaround does not fix the issue, then deploy the accelerator and its capabilities again.
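
    As a quick check of the common services state before you redeploy, you can list the operator and pod status in the common services namespace. This is a sample check that assumes the default ibm-common-services namespace:

    # Check that the common services operators report a Succeeded phase
    oc get csv -n ibm-common-services

    # List pods that are not Running or Completed
    oc get pods -n ibm-common-services --no-headers | grep -vE 'Running|Completed'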

IBM Edge Application Manager accelerator

  • The IBM Edge Application Manager accelerator V4.0.0.0 deployment might fail on an offline system.

OpenShift Container Storage accelerator

  • If the primary node is down, OpenShift Container Storage pods go into a CrashLoopBackOff error state. This error occurs because the OpenShift Container Storage operator and cluster installation are on the primary helper and no failback mechanism is available.
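
    To confirm which pods are affected, you can list the pods that are not running in the storage namespace. This is a sample check that assumes the default openshift-storage namespace:

    # List OpenShift Container Storage pods that are in CrashLoopBackOff or other non-running states
    oc get pods -n openshift-storage --no-headers | grep -vE 'Running|Completed'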

IBM Cloud Pak for Multicloud Management accelerator

  • For IBM Cloud Pak for Multicloud Management, when you use an external NFS server, the deployment might fail due to insufficient memory.
  • Do not stop a cluster within 24 hours of deployment, or irrecoverable damage might occur. It might even result in the redeployment of the cluster. For more information, see the Red Hat customer portal.

IBM Cloud Pak for Security accelerator

  • You might see some warnings on the OpenShift Container Platform dashboard when the IBM Cloud Pak for Security accelerator is not in a healthy state. These warnings might be displayed due to certain configuration changes that are expected by IBM Cloud Pak for Security at the time of deployment. Contact the IBM Cloud Pak for Security team to get the warnings resolved.

Issues common for all accelerators

  • For Cloud Pak deployments that are done from the Provision accelerators page of the user interface, the Pattern Deployer type of Environment profile is not supported. In this case, deploy the specific accelerator by using the traditional (legacy) way of pattern deployment.
  • Despite a successful deployment of the IBM Cloud Pak for Multicloud Management 1.3.1.2 and {{site.data.keyword.auto}} 20.0.2.1 accelerators on an air-gapped environment, the Dashboard section of the deployed instance does not display a summary of nodes, pods, and PVCs. To resolve this issue, verify whether the OpenShift Container Platform API server is reachable from the system's PSM (a sample check is shown at the end of this list).
  • When you add worker nodes to a deployed instance, the History section might show a node failure due to insufficient IP addresses. Although a failure is shown, the worker nodes are added successfully to the deployed instance.
  • Kubernetes does not support snapshots, and taking a snapshot of an accelerator instance might result in errors or unpredictable behavior. Do not use snapshots for Cloud Native Storage virtual machines. For more information about configuring Kubernetes cluster virtual machines, see VMware Docs.
  • The etcd cluster is reported as "Unhealthy" after the deployment of OpenShift Container Platform. This is a known issue in Red Hat OpenShift V4.4.x. For more information, see https://access.redhat.com/solutions/5070671.
  • When worker nodes are added to a deployed Cloud Pak accelerator cluster, the instance might go into an ERROR state because of insufficient IP addresses. A known issue with IBM Cloud Pak System instances is that if one of the nodes encounters an error, then the whole pattern instance goes into an ERROR state. Sometimes, restarting the instance might clear the ERROR state.
  • For OpenShift Container Platform based Cloud Pak accelerator instances, the OpenShift Container Platform and application web UI URLs become inaccessible whenever two master nodes go down. This is because etcd requires a quorum of nodes to be up and running; for example, a three-node etcd cluster requires a minimum of two nodes to be up.
  • The following issues are seen when the PrimaryHelper goes down:
    • Because the Worker Scaling Policy runs on the PrimaryHelper node, the Manage > Operations > PrimaryHelper.WORKERSCALINGPOLICY page is blank.
    • When you select the Manage > Operations > PrimaryHelper.OPENSHIFT_4HA-Par > OpenShift > Get service account token option, the following error message is displayed:
      PrimaryHelper.11594690159618.OPENSHIFT_4HA-Part: cat: /ocpClusterInfo/.clusterinfo.json: No such file or directory
      
      This issue applies to the OpenShift Container Platform accelerator and to all Cloud Pak accelerators that are on OpenShift Container Platform.
  • When you add worker nodes post-deployment, the OpenShift Container Platform nodes are scaled up. There is no change to the Cloud Pak workload pod replicas.
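
    For the dashboard issue that is described earlier in this list, a simple way to verify that the OpenShift Container Platform API server is reachable from the system's PSM is to query its health endpoint. This is a minimal sketch; the API server URL is a placeholder that you must replace with the URL of your cluster:

    # Verify that the OpenShift Container Platform API server is reachable from the PSM
    curl -k https://api.<cluster_domain>:6443/healthz

    # A reachable, healthy API server returns: ok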
