Troubleshooting
If you experience issues with your IBM Confidential Computing Containers for Red Hat OpenShift Container Platform (CCCO) deployment, the following information might help you identify and resolve the issues.
- Missing seeds for the volume
An encrypted volume depends on its seeds to access and preserve the data stored in that volume. If the seeds are deleted or lost, the system can no longer decrypt the existing data. As a result, the data in the encrypted persistent block volume cannot be recovered or retained.
To continue using the volume, you must format the persistent block volume and then use it again as a new volume. Formatting removes the inaccessible data and prepares the volume for reuse.
- Container creation failed
The container image pull failed because the workload could not unpack layers due to insufficient storage.
To resolve the issue, perform the following steps:
- Access the CCCO VM logs. For more information, see Enable debug logs and identify Kata/QEMU processes for Bare Metal.
- If the logs show No space left on device, increase the memory allocated to the pod by adding or updating the following annotation:
io.katacontainers.config.hypervisor.default_memory
For more information, see Configuring resources by using annotations.
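As a rough illustration of where the annotation goes, the following shell sketch prints a pod manifest with the memory annotation set. The function name render_pod_with_memory, the pod name, the image placeholder, and the runtime class argument are assumptions made for this example and do not come from the CCCO documentation. The value is assumed to be in MiB, as in upstream Kata Containers.

```shell
# Hypothetical helper: print a pod manifest that sets the Kata memory annotation.
# Arguments: pod name, memory in MiB (assumed unit), runtime class name (placeholder).
render_pod_with_memory() {
  pod_name="$1"
  memory_mib="$2"
  runtime_class="$3"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${pod_name}
  annotations:
    io.katacontainers.config.hypervisor.default_memory: "${memory_mib}"
spec:
  runtimeClassName: ${runtime_class}
  containers:
    - name: ${pod_name}
      image: <your-image>
EOF
}

# Example: render the manifest, review it, and only then apply it.
# render_pod_with_memory busybox 4096 <your-runtime-class> > pod.yaml
# oc apply -f pod.yaml
```

The annotation is part of the pod metadata, so it must be present when the pod is created. Delete and re-create an existing pod for a new value to take effect.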
- The contract is mandatory
If you create a workload on an IBM Hyper Protect Confidential Container virtual machine (VM) without passing a contract, the VM starts and then eventually shuts down.
- Contract format is yaml
The contract must follow the YAML format. If the format is invalid, the contract fails the validation check, and the IBM Hyper Protect Confidential Container VM eventually shuts down. For more information, see Verifying the contract.
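One common cause of invalid YAML is tab characters used for indentation, which YAML does not allow. The following shell sketch is a quick pre-check only, not a YAML parser. The function name find_tab_indentation and the file name contract.yaml are placeholders for this example.

```shell
# Hypothetical helper: report lines whose leading whitespace contains a tab.
# YAML does not allow tabs for indentation, so any hit is worth checking.
# This is only a quick pre-check: it can also flag a tab that starts the
# content of a block scalar, and it does not detect other YAML errors.
find_tab_indentation() {
  tab="$(printf '\t')"
  grep -n "^ *${tab}" "$1"
}

# Example (contract.yaml is a placeholder file name):
# if find_tab_indentation contract.yaml; then echo "tab indentation found"; fi
```

The function prints each matching line with its line number and returns a non-zero status when no tab indentation is found.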
- Contract schema
If the contract schema is incorrect, the contract validation fails when the IBM Hyper Protect Confidential Container VM is booting, and the VM shuts down. You can check for errors that are associated with the ccco-contract log identifier in the IBM Logging instance.
- Contract encryption key
When you use an encrypted contract, make sure that the Encrypted Multi-Persona Contract is encrypted with the correct encryption key. If the key is incorrect, the bootloader fails to decrypt the contract, and you might see decryption failure messages in the serial console of the IBM Confidential Computing Containers VM.
- Logging configuration failure
When the IBM Confidential Computing Containers VM boots, monitor the serial console to identify any errors that are logged by the bootloader or from the logging service. If you don’t see any logs reaching the Logging instance, it might be because your logging configuration failed. Failure to configure logging also leads to the VM shutting down.
- If the logging service is Syslog
Ensure that the cert and key are valid and that you can connect to the server by using OpenSSL.
If the logging configuration is correct, the output should be as follows:
depth=1 C = US, O = Logstash Test CA, CN = ca.example.org
verify return:1
depth=0 C = US, O = Rsyslog Test Server, CN = 192.168.122.153
verify return:1
- If the logging service is IBM Cloud Logs, check whether the logging hostname, port, and ingestion key that are provided in the contract are correct.
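For the Syslog case, the connection test can be sketched with openssl s_client. The function name check_syslog_tls, the host, the port, and the file names are placeholders for this example. The CA certificate, client certificate, and client key are assumed to be the same PEM files that are referenced in the logging configuration.

```shell
# Hypothetical helper: attempt a TLS handshake with the Syslog server by using
# the client certificate and key from the logging configuration.
# Arguments: host, port, CA certificate, client certificate, client key (PEM files).
check_syslog_tls() {
  host="$1"; port="$2"; ca="$3"; cert="$4"; key="$5"
  for f in "$ca" "$cert" "$key"; do
    if [ ! -r "$f" ]; then
      echo "cannot read file: $f" >&2
      return 2
    fi
  done
  # Redirecting stdin from /dev/null lets s_client exit after the handshake.
  openssl s_client -connect "${host}:${port}" \
    -CAfile "$ca" -cert "$cert" -key "$key" </dev/null
}

# Example (all values are placeholders):
# check_syslog_tls syslog.example.org 6514 ca.crt client.crt client-key.pem
```

A successful handshake prints the certificate chain with verify return:1 lines similar to the expected output shown earlier.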
- Issues with IBM Secure Execution for Linux
For information about troubleshooting issues with IBM Secure Execution for Linux, see Troubleshooting.
- Initdata is incorrect
When the initdata is incorrect, the IBM Hyper Protect Confidential Container fails to start. The logs cannot be sent to the external logging service and an error message is redirected to the console output.
Check the logs to investigate the issue.
- Invalid GZIP header
The provided initdata is not compressed, which causes the container creation to fail. Ensure that the initdata is properly compressed before use. For more information about creating compressed initdata using gzip, see Creating initdata file.
- Initdata annotation failed
The initdata file is compressed more than once. This results in NULL bytes in the file, which causes the TOML parser to fail. Ensure that the initdata file is compressed only once. For more information about creating compressed initdata using gzip, see Creating initdata file.
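Both initdata compression problems can be told apart by looking at the gzip magic bytes (1f 8b) before and after one round of decompression. The following shell sketch assumes that the check runs on the gzip file itself, before any further encoding such as base64. The function name initdata_gzip_state and the file names are placeholders for this example.

```shell
# Hypothetical helper: classify an initdata file by its gzip layering.
# Prints "not-gzip", "gzip-once", or "gzip-multiple".
initdata_gzip_state() {
  file="$1"
  # A gzip stream starts with the magic bytes 1f 8b.
  magic="$(head -c 2 "$file" | od -An -tx1 | tr -d ' \n')"
  if [ "$magic" != "1f8b" ]; then
    echo "not-gzip"
    return 0
  fi
  # Decompress once and inspect the first two bytes of the result.
  inner="$(gzip -dc "$file" 2>/dev/null | head -c 2 | od -An -tx1 | tr -d ' \n')"
  if [ "$inner" = "1f8b" ]; then
    echo "gzip-multiple"
  else
    echo "gzip-once"
  fi
}

# Example (initdata.toml.gz is a placeholder file name):
# initdata_gzip_state initdata.toml.gz
```

A result of not-gzip corresponds to the Invalid GZIP header case, and gzip-multiple corresponds to the Initdata annotation failed case.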
- Connection Refused to API Server
This error occurs when authentication is missing. The oc CLI cannot connect to the OpenShift API server, so the user on the bastion node cannot connect to the cluster.
You must run the following command to authenticate:
oc login -u kubeadmin -p "$(cat /root/ansible_workdir/auth/kubeadmin-password)" --insecure-skip-tls-verify
- Undefined variable 'storage' in Ansible playbook
This error occurs when the storage variable is not defined in the inventory or in the host_vars/hostname.yaml file.
- Verify that the storage variable is defined and that the volume path that it specifies is accessible to the user, for example:
storage:
  pool_path: /var/lib/libvirt/images/<user-name>
- Ensure that the path mentioned in host_vars/hostname.yaml is correct.
- Run the playbook and check whether the path is accessible.
- KataConfig Resource Not Found
The error indicates that the required Custom Resource Definitions (CRDs) are not yet available in the cluster. The KataConfig resource cannot be recognized until the CRDs are installed.
You must apply the OperatorGroup and Subscription before applying the KataConfig by running the following command:
oc apply -f Operatorgroup.yaml -f subscription.yaml
- Image manifest unknown
This error occurs when the specified image tag is incorrect or missing, or does not exist in the container registry.
You must verify and correct the image tag according to the following example:
PODVM_IMAGE_URI: "oci::icr.io/ibm_ccco/ibm-ccco-podvm-container-image:1.2.1::/image/ccco-1.2.1.qcow2"
- Unsupported image path format
This error occurs when the image path format is incorrect.
You must ensure that the image path follows the correct format according to the following example:
PODVM_IMAGE_URI: "oci::icr.io/ibm_ccco/ibm-ccco-podvm-container-image:1.2.1::/image/ccco-1.2.1.qcow2"
- Unauthorized access to image registry
This error occurs when the cluster cannot authenticate with the image registry.
You must perform the following steps to reauthenticate:
- Extract the current pull secret by running the following command:
oc get secrets pull-secret -n openshift-config -o template='{{index .data ".dockerconfigjson"}}' | base64 -d | jq
- Save the output in the config.json file.
- Update config.json with valid credentials:
"icr.io": { "auth": "<your-base64-auth>", "email": "<your-email>"}
- Update the pull secret in the cluster by running the following command:
oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson="config.json"
- Restart the deployment to apply the changes:
oc rollout restart deployment controller-manager -n openshift-sandboxed-containers-operator
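The auth value in a registry entry of config.json is the base64 encoding of the string username:password. The following shell sketch shows one way to produce it. The function name make_registry_auth and the example arguments are placeholders. Which user name and password or API key to use for icr.io depends on your registry account and is not specified here.

```shell
# Hypothetical helper: build the base64 "auth" value for a registry entry.
# Registry config files store auth as base64("<username>:<password>").
make_registry_auth() {
  # tr removes the line wrapping that some base64 implementations add.
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Example (both arguments are placeholders):
# make_registry_auth "<registry-user>" "<registry-password-or-api-key>"
```

Paste the printed string into the auth field in place of <your-base64-auth>.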
- Invalid Hypervisor section in Kata Configuration
This error occurs when the configuration.toml file on the worker node contains an unrecognized hypervisor section.
You must perform the following steps to resolve the error:
- Check the worker node IP:
oc get nodes -o wide
- Log in to the worker node:
ssh core@<worker-node-IP>
- Verify the kata-shim version:
/usr/bin/containerd-shim-kata-v2-tp --version
- Check the Red Hat OpenShift Container Platform version on the bastion:
oc version
- Upgrade Red Hat OpenShift Container Platform if necessary:
oc adm upgrade --allow-explicit-upgrade --force=true --to-image=quay.io/openshift-release-dev/ocp-release:<4.16.13>-s390x
Warning: Forced upgrades are not advisable. They are risky and should be used only as a last resort.
- Failed to create pod sandbox
This error occurs when pod sandbox creation times out. The timeout can have various causes, including Pod VM boot issues.
Check the logs to investigate the issue.
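A possible starting point for the investigation is the pod description and the recent events in its namespace. The following shell sketch wraps two standard oc commands. The function name pod_sandbox_diagnostics and the example pod and namespace names are placeholders, and the sketch does not replace the CCCO VM logs.

```shell
# Hypothetical helper: collect basic details about a pod whose sandbox failed to start.
# Arguments: pod name, namespace.
pod_sandbox_diagnostics() {
  pod="$1"
  ns="$2"
  # The Events section at the end of this output usually carries the sandbox error text.
  oc describe pod "$pod" -n "$ns"
  # Events in the namespace, sorted by last timestamp (oldest first).
  oc get events -n "$ns" --sort-by=.lastTimestamp
}

# Example (pod and namespace names are placeholders):
# pod_sandbox_diagnostics busybox default
```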
- YAML Parsing Error
This error occurs due to incorrect formatting in the busybox.yaml file.
You must validate the indentation and the structure of the busybox.yaml file.
- Invalid PEM certificate format
This error occurs when the encryption certificate contains extra characters or has formatting issues.
You must ensure that the encryption certificate contains only valid PEM blocks, and remove any extra characters or whitespace.
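Stray characters in a PEM file can be located with a strict line-by-line check. The following shell sketch flags every line that is outside a BEGIN CERTIFICATE and END CERTIFICATE pair, and every line inside a pair that contains anything other than base64 characters, including trailing spaces and blank lines. It checks structure only and does not verify that the certificate itself is valid. The function name check_pem_blocks and the file name are placeholders.

```shell
# Hypothetical helper: list lines that fall outside PEM certificate blocks
# or that contain non-base64 characters inside a block.
# Prints offending lines as "<line-number>: <text>" and returns 1 if any are
# found, if a block is left open, or if the file has no certificate block.
check_pem_blocks() {
  awk '
    BEGIN { status = 0 }
    /^-----BEGIN CERTIFICATE-----$/ { if (inside) bad(); inside = 1; seen = 1; next }
    /^-----END CERTIFICATE-----$/   { if (!inside) bad(); inside = 0; next }
    inside && /^[A-Za-z0-9+\/=]+$/  { next }
    { bad() }
    function bad() { printf "%d: %s\n", NR, $0; status = 1 }
    END { if (inside || !seen) status = 1; exit status }
  ' "$1"
}

# Example (encrypt.crt is a placeholder file name):
# check_pem_blocks encrypt.crt || echo "clean up the lines listed above"
```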
- Pod restarting frequently
This error occurs when a valid IBM Cloud Logging (ICL) instance is not available.
You must verify the availability of the logging endpoint ${PUBLIC_INGRESS_ENDPOINT} in ICL.
- Contract decryption failure
This error occurs when the encryption certificate used does not match the one used to encrypt the contract.
You must ensure the correct version of the encryption certificate is used.
- Unable to load certificate
This error occurs when the certificate file is corrupted, invalid, or in an unsupported format.
Ensure that you use the correct encryption certificate.
- Logging ingestion failure
This error occurs when iamApiKey is incorrect or when the logrouter hostname is invalid.
You must check and correct the information in the env.yaml file.
- Workload configuration error
This error occurs when the policy field in the workload.yaml file is base64-encoded, but the encoded string is corrupted or invalid. When decoded, it results in unreadable or malformed output.
Ensure that you encode the correct policy or use the correct encoded policy value.
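To see whether a policy value decodes cleanly, you can pass it through a base64 decoder and inspect the result. The following shell sketch assumes the GNU coreutils base64 command, which returns a non-zero status on invalid input. The function name decode_policy is a placeholder, and the sample value is a generic string, not a real policy.

```shell
# Hypothetical helper: decode a base64 string and print the decoded text.
# With GNU coreutils base64 (assumed), invalid input gives a non-zero status.
decode_policy() {
  printf '%s' "$1" | base64 -d
}

# Example: copy the policy value from workload.yaml and decode it.
# decode_policy "<policy-value-from-workload.yaml>"
```

If the command fails or the output is unreadable, regenerate the encoded value from the original policy text.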
- Contract signature verification failure
This error occurs when the signing key in the env.yaml file has been tampered with or does not match the key used to sign the contract.
You must check the signingKey field in env.yaml and ensure that it matches the private key used to sign the contract.
- Contract validation error
This error occurs when the envworkloadsignature has been altered or does not match the actual content of the env.yaml and workload.yaml files.
You must ensure that the env.yaml and workload.yaml files are not modified after signing, and verify that the envworkloadsignature is correct.
- Trace called before context set
This error occurs due to a misconfiguration in configuration.toml. In most cases, the image parameter is missing or incorrectly specified.
Verify that the configuration.toml file includes a valid image entry and correct any missing or incorrect values.
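A quick way to confirm that an active image entry exists is to search the file for an uncommented image key. The following shell sketch does only that: it does not check that the path exists or that the entry is in the right section. The function name find_image_entry and the file path are placeholders.

```shell
# Hypothetical helper: print active (uncommented) image entries from a Kata
# configuration file, with line numbers.
# Returns a non-zero status when no such entry exists.
find_image_entry() {
  grep -nE '^[[:space:]]*image[[:space:]]*=' "$1"
}

# Example (the path is a placeholder):
# find_image_entry <path-to>/configuration.toml || echo "no active image entry"
```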