Troubleshooting IBM Certified Container Software

The following table describes possible causes of, and troubleshooting tips for, the issues that occur most frequently when deploying Connect:Direct® for UNIX as IBM Certified Container Software.

The quickest way to determine the reason for a failed installation is to check the container logs, or CDStartup.log in the work directory on the mapped storage volume path.

Table 1. Troubleshooting deployment and functional issues

Deployment issues
Issue: Connect Direct not running. Exiting.
Possible reason: Installation of Connect:Direct failed with a non-zero return code.
Solution/workaround: Look for the failure in the container logs: kubectl/oc logs <cdu pod name>
Issue: File "/opt/cdfiles/<certificate name>" is not a PEM key certificate.
Possible reason: The certificate in use is not in the recommended PEM format.
Solution/workaround: Use a certificate in PEM format for installation. See Generating certificate in PEM format.
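A quick way to verify the format before installation is a small shell check. This is a sketch: the helper name and example path are illustrative, and it assumes the openssl CLI is available.

```shell
# is_pem_certificate FILE
# Succeeds if FILE contains a PEM-encoded certificate that openssl can parse.
is_pem_certificate() {
    grep -q "BEGIN CERTIFICATE" "$1" && openssl x509 -in "$1" -noout 2>/dev/null
}

# Example with a placeholder path:
# is_pem_certificate /opt/cdfiles/cdcert.pem && echo "PEM certificate OK"
```

A DER-encoded certificate fails this check; it can be converted first with `openssl x509 -inform DER -in cert.der -out cert.pem`.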
Issue: 'Permission denied' error when deploying Connect:Direct using certified container software with SELinux set to enforcing.
Possible reason: When the SELinux policy is set to enforcing, the host mounted path is not writable by the container.
Solution/workaround: Run the following command to attach the proper SELinux policy type label to the host directory:

chcon -Rt svirt_sandbox_file_t <host mounted path>

Issue: 'helm install' command fails to deploy Connect:Direct.
Possible reason: Mandatory parameters are missing from the helm install command.
Solution/workaround: Check that all mandatory parameters are provided when executing the helm install command. For more information, see Installing IBM Connect:Direct for Unix using Helm chart.
Issue: Pod recovery fails when the persistent volume is mapped to a host path.
Possible reason: The pod recovered on a different worker node, where the persistent volume's host path is not available.
Solution/workaround: NFS is the preferred storage volume. If an NFS server is not available, set pod/node affinity so that the pod is always scheduled on the worker node where the persistent volume can be accessed.
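When the persistent volume must live on one specific worker, the pod can be pinned to that node with node affinity. The fragment below is a sketch for the deployment's pod spec; the hostname value is a placeholder.

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                  - worker-node-1   # node that hosts the persistent volume's path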
Issue: Permission error occurs when trying to create a persistent volume or SecurityContextConstraints in an OpenShift environment.
Possible reason: A user without cluster-admin privileges was used to create the persistent volume or SecurityContextConstraints.
Solution/workaround: Only a cluster admin has privileges to create persistent volumes and SecurityContextConstraints. Create these resources as a cluster admin. For more information, see Creating security context constraints for Red Hat OpenShift Cluster.
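As an illustration, a minimal SecurityContextConstraints object applied by a cluster admin might look like the sketch below. The name and policy choices are placeholders, not the values required by the Connect:Direct chart.

```yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: cdu-scc                    # placeholder name
allowPrivilegedContainer: false
runAsUser:
  type: MustRunAsRange
seLinuxContext:
  type: MustRunAs
fsGroup:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
```

After creating it as a cluster admin with `oc apply -f cdu-scc.yaml`, the constraint can be granted to the deployment's service account with `oc adm policy add-scc-to-user cdu-scc -z <serviceaccount>`.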
Issue: Deployment fails with the following error:

SPCLI> 20200401 17:19:17 8 CDAI007E Secure+ configuration failed.
20200401 17:19:17 0 CDAI010I createExitStatusFile() entered.
20200401 17:19:17 0 CDAI010I createExitStatusFile() exited.
command terminated with exit code 137

Possible reason: The certificate file is not valid.
Solution/workaround: Verify that conventional Connect:Direct for UNIX can be installed with this certificate using the automated install procedure. Use the correct certificate file. For a chained certificate, check the chain sequence.
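One way to inspect the sequence of a chained certificate is to print each certificate's subject and issuer in file order with openssl. This is a sketch (the helper name and example path are illustrative); normally the entity certificate comes first, followed by intermediates, then the root.

```shell
# print_chain_order FILE
# Prints the subject and issuer of each certificate in a chained PEM file,
# in the order they appear in the file.
print_chain_order() {
    openssl crl2pkcs7 -nocrl -certfile "$1" | openssl pkcs7 -print_certs -noout
}

# Example with a placeholder path:
# print_chain_order /opt/cdfiles/chained_cert.pem
```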
Functional issues
Issue: Error encountered during Connect:Direct container node recovery.
Possible reasons:
  1. Configurations of the previous container were not mapped to the storage volume.
  2. The hostname provided does not match the hostname of the destroyed/removed container.
  3. Passwords are missing in cd_param_file.
Solution/workaround: Map the container's configurations to the storage volume so that the container can be recovered in the future. Provide the hostname of the removed/destroyed container at container recovery. Also, ensure cd_param_file contains all required passwords.
Issue: Error encountered in connecting to an installed Connect:Direct container node.
Possible reason: The Connect:Direct API port is not exposed to the host.
Solution/workaround: Map the Connect:Direct API port (default 1363) running inside the container to an available host port.
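One way to expose the API port to clients outside the cluster is a NodePort service. The sketch below uses placeholder names and labels; the selector must match the Connect:Direct pod's actual labels.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cdu-api             # placeholder name
spec:
  type: NodePort
  selector:
    app: cdu                # must match the Connect:Direct pod's labels
  ports:
    - name: api
      port: 1363            # default Connect:Direct API port
      targetPort: 1363
      nodePort: 31363       # any free port in the cluster's NodePort range
```

Clients would then connect to <worker node ip>:31363.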
Issue: Error encountered in performing a file transfer with other Connect:Direct nodes.
Possible reason: The Connect:Direct server port is not exposed to the host, or its entry is missing in the netmap.cfg file of the partner node.
Solution/workaround:
  • Map the server port (default 1364) of the Connect:Direct node running inside the container to an available host port.
  • Define the exposed port in the partner node's netmap.cfg file for a successful file transfer.
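On the partner node, the container node's netmap.cfg entry should point at the exposed host port. A sketch of such an entry follows, with placeholder node name, host, and port; the exact set of parameters in a real entry depends on your configuration.

```
container.cd.node:\
  :comm.info=<worker node or load balancer host>;<exposed server port>:
```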
Issue: Error encountered in a file transfer from the pod to a Connect:Direct Windows node.
Possible reason: Netmap checking is enabled on the Connect:Direct Windows node and is not allowing the file transfer.
Solution/workaround: Add the IPs of all the worker nodes to the netmap of the Connect:Direct Windows node using the alternate.comminfo parameter, for example:

Alternate Comminfo : <worker node1 ip>, <worker node2 ip>

Issue: Unable to add certificates for Secure+ using Connect:Direct Web Services.
Possible reason: This functionality is not present in the current version of Connect:Direct Web Services; it will be available in an upcoming release.
Solution/workaround: Import the certificates using the Secure+ CLI by attaching to the container.
Issue: After migrating Connect:Direct to a container environment, local users listed in userfile.cfg are not available inside the container.
Possible reason: The local users defined in userfile.cfg are not present inside the container environment.
Solution/workaround: When migrating from a conventional Connect:Direct environment to a container environment, add the local users defined in userfile.cfg inside the container using the useradd command.
Issue: In an OpenShift cluster, the following error occurs while executing helm commands:

Error: Kubernetes cluster unreachable: the server has asked for the client to provide credentials

Possible reason: The user is not logged in to the cluster.
Solution/workaround: Log in to the cluster: oc login -u <username> -p <password>