Common problems with IBM z/OS Container Platform
Use this information to help diagnose common problems that are found when you use IBM® z/OS® Container Platform (zOSCP).
Source image is rejected
- If you try to pull an image and the source image is rejected, you might see an error message containing the text Running image <image name> is rejected by policy. For example:
    Error: Source image rejected: Running image docker://icr.io/zoscp/zos:latest is rejected by policy.
  Podman for IBM z/OS (Podman), cri-o for IBM z/OS (cri-o), and IBM z/OS for Skopeo (Skopeo) require a default trust policy to be defined to understand which images are acceptable to pull.
  To define a trust policy, the following command can be issued to trust images from the IBM Cloud Container Registry:
    podman image trust set -t accept icr.io
  For more information, see Pushing to and pulling from a container registry. For an example of how to establish trust in order to accept images from your internal registry, see Trusting external container image registries.
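After you define a trust policy, it can be useful to confirm that the registry actually appears in the policy file. The following is a minimal sketch, not a supported tool: the function name registry_in_policy is hypothetical, and it performs a plain text match rather than parsing the JSON. The podman image trust show command can also be used to list the current policy.

```shell
#!/bin/sh
# registry_in_policy FILE REGISTRY  (hypothetical helper name)
# Returns 0 if REGISTRY appears as a quoted string in the policy file,
# 1 otherwise. This is a text match only, so treat it as a quick check.
registry_in_policy() {
    policy_file=$1
    registry=$2
    # An unreadable or missing file counts as "no entry".
    [ -r "$policy_file" ] || return 1
    grep -F -q "\"$registry\"" "$policy_file"
}

# Example, assuming the policy file is /etc/containers/policy.json:
# registry_in_policy /etc/containers/policy.json icr.io && echo "icr.io has a policy entry"
```

If the check fails after you run the trust command, verify which user ID issued the command and which policy file it updated.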
x509 certificate signed by unknown authority
- If you try to securely connect to an image registry with Skopeo, without setting up the location of an x509 root CA certificate, you may encounter an error message containing the text certificate signed by unknown authority. For example:
    $ skopeo inspect docker://icr.io/zoscp/ibm-semeru-runtimes:certified-17-jdk-zos
    FATA[0000] Error parsing image name "docker://icr.io/zoscp/ibm-semeru-runtimes:certified-17-jdk-zos": pinging container registry icr.io: Get "https://icr.io/v2/": tls: failed to verify certificate: x509: certificate signed by unknown authority
  To resolve the problem, you need to set up TLS. For more information, see Set up TLS to securely connect to image registries.
JVMSHRC245E Error mapping shared class cache file
- If you see an Error mapping shared class cache file error when using Podman, for example:
    JVMSHRC245E Error mapping shared class cache file
    JVMSHRC336E Port layer error code = -155
    JVMSHRC337E Platform error message: EDC5132I Not enough memory.
    JVMSHRC840E Failed to start up the shared cache.
    JVMSHRC686I Failed to startup shared class cache. Continue without using it as -Xshareclasses:nonfatal is specified
  This error does not prevent Java™ from running, but the lack of a shared class cache might degrade runtime performance.
  You need to ensure that your system has the SMFLIMxx parmlib updates. This is a requirement for the ibm-semeru-runtimes:certified-17-jdk-zos container image in order to support caches mapped above the 2 GB address range. The maximum size of these caches is limited by the MAXSHARE value within the SMFLIMxx PARMLIB member. For more information, see Container image storage requirements.
EDC5133I No space left on device
- If you see a No space left on device error when trying to remove containers using Podman, for example:
    $ podman --log-level debug rm -a
    ...
    DEBU[0001] Using tmp dir /var/run/libpod
    ...
    Error: Unable to write container exited event: "write /var/run/libpod/events/events.log: EDC5132I No space left on device."
  It may mean that the events.log file in the tmp directory is causing the filesystem to be full. You can specify a new path under TFS for tmp_dir in the /etc/containers/containers.conf file for temporary files.
    # Directory for temporary files. Must be tmpfs (wiped after reboot)
    #
    #tmp_dir = "/var/run/libpod"
  To ensure that the new value for tmp_dir takes effect and is not overridden, you need to remove the db.sql file under the current graph root.
    $ podman info | grep graphRoot:
    graphRoot: /SYSTEM/var/lib/containers/storage
    $ rm /SYSTEM/var/lib/containers/storage/db.sql
  For more information, see Storage requirements.
- If you get an error similar to the following when using Podman:
    $ podman rm -af
    Error: cleaning up storage: removing container 2e777fc138d8ef84022bc62a44572150963a502a5775d3bf362d6a3d979eeb12 root filesystem: write /var/lib/podman/storage/ufs-layers/.tmp-layers.json1865200036: EDC5133I No space left on device.
  It may mean that the ZFS filesystem for Podman (mounted as /var/lib/podman) has become unstable because it has reached its capacity.
    $ cd /var/lib/podman
    $ df -Pkv
    Filesystem 1024-blocks Used Available Capacity Mounted on
    OMVSSPA.SVT.SA.VAR.LIB.PODMAN.ZFS 4193280 4193193 87 100% /J7D/var/lib/podman
    ZFS, Read/Write, Device:220, ACLS=Y
    File System Owner : J7D Automove=U Client=N
    Filetag : T=off codeset=0
    Aggregate Name : OMVSSPA.SVT.SA.VAR.LIB.PODMAN.ZFS
  To resolve this error, you need to increase the size of the ZFS using bpxwmigf. Avoid unmounting file systems that are used by zOSCP, because this can cause problems for zOSCP programs.
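Because both errors above come from a file system reaching its capacity, it may help to check the Capacity column before the file system fills. The following is a minimal sketch under stated assumptions: the function names capacity_from_df and warn_if_full and the 90% threshold are hypothetical, and the parsing assumes that the first file system line of the df -Pk output has the column layout shown in the example above.

```shell
#!/bin/sh
# capacity_from_df: read `df -Pk` output on stdin and print the Capacity
# percentage (without the % sign) from the first file system line.
capacity_from_df() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# warn_if_full PERCENT THRESHOLD: print a warning when PERCENT >= THRESHOLD.
warn_if_full() {
    if [ "$1" -ge "$2" ]; then
        echo "WARNING: file system is ${1}% full (threshold ${2}%)"
    fi
}

# Example, using the mount point from the output above and a
# hypothetical 90% threshold:
# warn_if_full "$(df -Pk /var/lib/podman | capacity_from_df)" 90
```

With the sample df output shown above, capacity_from_df prints 100, so the warning is issued for any threshold of 100 or lower.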
Container image exists in local storage but may be corrupted
- If you try to list images and you get:
    $ podman images
    ERRO[0000] Image ed6a9832ff9b exists in local storage but may be corrupted (remove the image to resolve the issue): layer not known
    ERRO[0000] retrieving label for image "f05ce1cae7ee45e60c7d8902dd337c1ead07fd69daa7d76ed58d56477916a937": you may need to remove the image to resolve the error: layer not known
  - You need to remove the failing images:
      $ podman rmi ed6a9832ff9b f05ce1cae7ee45e60c7d8902dd337c1ead07fd69daa7d76ed58d56477916a937
      WARN[0000] Failed to determine if an image is a parent: layer not known, ignoring the error
      WARN[0000] Failed to determine parent of image: layer not known, ignoring the error
      WARN[0000] Failed to determine if an image is a parent: layer not known, ignoring the error
      WARN[0000] Failed to determine parent of image: layer not known, ignoring the error
      Untagged: localhost/test:latest
      Deleted: ed6a9832ff9bf91e5988186dcd0f209dd9afa27cd5ff18eefe94d141c1806075
      Deleted: f05ce1cae7ee45e60c7d8902dd337c1ead07fd69daa7d76ed58d56477916a937
  - A middleware system programmer may have removed the IBM provided images in the shared space in the internal registry. If you need the local images to be re-built, speak to your middleware system programmer to pull the base images again.
Instructions fail to set permissions for the target directory when building a Containerfile
- When using Podman to build a Containerfile
with ADD/COPY instructions with the --chmod option specified, the instruction fails to set
permissions for the target directory.
The ADD/COPY instruction uses only the contents of the source directory to populate the target directory. If the target directory does not exist, it is created, but only the new contents of the target directory respect the permissions specified on the --chmod option. In the following example, the files copied with --chmod=777 receive the requested permissions, but neither of the target directories respects the --chmod option:
    host $ mkdir source-dir
    host $ touch source-dir/1.txt source-dir/2.txt
    host $ cat Containerfile
    FROM zos:latest
    COPY source-dir /target-dir
    COPY --chmod=777 source-dir /target-dir-777
    host $ podman build -f Containerfile -t test-chmod
    STEP 1/3: FROM zos:latest
    STEP 2/3: COPY source-dir /target-dir
    --> 4d4146942ed
    STEP 3/3: COPY --chmod=777 source-dir /target-dir-777
    COMMIT test-chmod
    --> 77964efafaa
    Successfully tagged localhost/test-chmod:latest
    77964efafaa5fcbf203ff32045266b88f0e800fba162f930df37dc20e958074b
    host $ podman run --rm -i -t --entrypoint=/bin/sh test-chmod
    $ ls -l
    total 152
    drwxr-xr-x 2 BPXROOT OMVS 8192 Sep 14 14:41 bin
    drwxr-xr-t 2 ZUSER1 TSOUSER 20480 Nov 27 14:47 dev
    drwxr-xr-x 2 ZUSER1 TSOUSER 8192 Nov 27 14:47 etc
    drwxrwxrwx 9 ZUSER1 TSOUSER 0 Nov 27 14:47 proc
    drwxr-xr-x 2 ZUSER1 TSOUSER 8192 Nov 27 14:47 run
    drwxr-xr-x 2 ZUSER1 TSOUSER 8192 Nov 27 14:41 target-dir
    drwxr-xr-x 2 ZUSER1 TSOUSER 8192 Nov 27 14:41 target-dir-777
    drwxrwxrwt 2 ZUSER1 TSOUSER 8192 Nov 27 14:47 tmp
    drwxr-xr-x 4 BPXROOT OMVS 8192 Sep 14 14:41 usr
    $ ls -al target-dir*
    target-dir:
    total 32
    drwxr-xr-x 2 ZUSER1 TSOUSER 8192 Nov 27 14:41 .
    drwxr-xr-x 5 ZUSER1 TSOUSER 8192 Nov 27 14:41 ..
    -rw------- 1 ZUSER1 TSOUSER 0 Nov 27 14:39 1.txt
    -rw------- 1 ZUSER1 TSOUSER 0 Nov 27 14:39 2.txt
    target-dir-777:
    total 32
    drwxr-xr-x 2 ZUSER1 TSOUSER 8192 Nov 27 14:41 .
    drwxr-xr-x 5 ZUSER1 TSOUSER 8192 Nov 27 14:41 ..
    -rwxrwxrwx 1 ZUSER1 TSOUSER 0 Nov 27 14:39 1.txt
    -rwxrwxrwx 1 ZUSER1 TSOUSER 0 Nov 27 14:39 2.txt
Errors when using Podman to pull an image
- If you try to pull an image from your internal registry without the correct authority, you might see the following output:
    ZUSER1:/u/user1 #>podman pull <internal-registry-location>/ibm-semeru-runtimes:certified-17-jdk-zos --tls-verify=false
    Trying to pull <internal-registry-location>/ibm-semeru-runtimes:certified-17-jdk-zos...
    Getting image source signatures
    Copying blob 2f2baac5a799 done
    Copying blob db4a75b7aa56 skipped: already exists
    Error: writing blob: adding layer with blob "sha256:2f2baac5a7999d01d89bfd6408cc832c68f9dcc703c2a51a77e604a6f6005c04": lsetxattr /u/user2/.local/share/containers/storage/ufs/9d31a5094916825ba740cc9766e2cf003012cfe305fa6c9c28678585811f916c/diff/usr/lpp/java/J8.0_64/bin/appletviewer: EDC5139I Operation not permitted.
  A middleware system programmer must pull the IBM provided images to a shared space in the internal registry. If you need an image to be pulled, speak to your middleware system programmer. For more information on user ID authorization, see User ID requirements.
EDC5111I Permission denied
- If you get an error similar to the following when using Podman:
    $ podman system info
    Error: creating runtime static files directory: mkdir /containers-storage-user-257: EDC5111I Permission denied
  It may be because the sample storage.conf refers to $TMPDIR and TMPDIR is not set. Update /etc/containers/storage.conf to replace $TMPDIR with /tmp, for example:
    rootless_storage_path = "/tmp/containers-storage-user-$UID"
  Or set TMPDIR=/tmp for Podman.
- If you get an error similar to the following when trusting an external container registry:
    $ podman image trust set --type accept icr.io
    Error: open /etc/containers/policy.json: EDC5111I Permission denied
  Ensure that you are trusting the external container registry using the IMGADMIN user ID. For more information, see Trusting external container image registries.
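The first error above is easier to understand by looking at how the storage path expands when TMPDIR is empty. The following is a minimal sketch, assuming that the sample setting has the form $TMPDIR/containers-storage-user-$UID. That form is an assumption inferred from the error message, and the function name storage_path is hypothetical.

```shell
#!/bin/sh
# storage_path TMPDIR_VALUE USER_ID  (hypothetical helper name)
# Mimics the expansion of "$TMPDIR/containers-storage-user-$UID"
# (assumed form of the sample setting) for a given TMPDIR value and
# numeric user ID.
storage_path() {
    printf '%s/containers-storage-user-%s\n' "$1" "$2"
}

storage_path "" 257       # TMPDIR unset or empty: the path lands in the root directory
storage_path "/tmp" 257   # TMPDIR=/tmp: the path lands under /tmp
```

With an empty TMPDIR, the result is /containers-storage-user-257, which matches the directory that Podman failed to create in the root directory. With TMPDIR=/tmp, the result is /tmp/containers-storage-user-257.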
Errors when using Podman to build an image
- When a new user is connected to the PODMAN group and tries to build an image using Podman, the user might see the following error on the command line:
    $ podman build -t javatest .
    STEP 1/1: FROM ibm-semeru-runtimes:certified-17-jdk-zos
    ERRO[0000] Unmounting /tmp/containers-storage-user-245/ufs/e5f9485bfbeb0cc7d96d928563a26f568a9c5aaa2c44b0ef5a31e3c44af01ead/merged: EDC5121I Invalid argument. (errno2=0xC943014A)
    Error: mounting new container: mounting build container "81038e283a9590e9a6297c70a260bf6063f8496de9d312c904e6aef9baba7530": creating ufs mount to /tmp/containers-storage-user-245/ufs/e5f9485bfbeb0cc7d96d928563a26f568a9c5aaa2c44b0ef5a31e3c44af01ead/merged, mount_data="lowerdir=/var/share/containers/storage/ufs/l/CBRLXMBXVD34SYZCW75U2QJQLU:/var/share/containers/storage/ufs/l/YG3I6MGZIU5RABD6IG3FOKJN72,upperdir=/tmp/containers-storage-user-245/ufs/e5f9485bfbeb0cc7d96d928563a26f568a9c5aaa2c44b0ef5a31e3c44af01ead/diff,workdir=/tmp/containers-storage-user-245/ufs/e5f9485bfbeb0cc7d96d928563a26f568a9c5aaa2c44b0ef5a31e3c44af01ead/work,metacopy=off,supercopy=on": EDC5139I Operation not permitted. (errno2=0x119B00B0)
  During this failure, message ICH408I appears in the z/OS system log stating that the user has 'INSUFFICIENT AUTHORITY TO MOUNTSETUID'.
  To resolve this error, the user that was recently added to the PODMAN group needs to log out and log in again.
Errors when compiling a Java application within a container
- If, when compiling a Java application within a container, you get the error code JVMJ9GC020E, this might be due to your address space limits inside the container:
    STEP 3/4: RUN mkdir /app && javac -encoding iso8859-1 -d /app src/hello/HelloWorld.java
    JVMJ9GC020E -Xms too large for heap
    JVMJ9VM015W Initialization error for library j9gc29(2): Failed to initialize
    Error: Could not create the Java Virtual Machine.
    Error: A fatal exception has occurred. Program will exit.
    Error: building at STEP "RUN mkdir /app && javac -encoding iso8859-1 -d /app src/hello/HelloWorld.java": while running runtime: exit status 1
  By default, the MAXASSIZE value is set in the BPXPRM00 member for zOSCP. To resolve this error, you need to increase the MAXASSIZE value. For more information, see MAXASSIZE in IBM z/OS documentation.
Encoding issues when using ssh
- If you encounter text encoding or conversion issues when using ssh, the following commands may help resolve them for subsequent commands:
    export _BPXK_AUTOCVT=ON
    chtag -tc1047 /proc/self/fd/0 /proc/self/fd/1 /proc/self/fd/2
Errors when using Podman in a CINET environment
If you use Podman in a CINET environment, you might see errors similar to the following:
-
    $ podman run <image>
    Error: OCI runtime error: runc: time="2024-02-08T16:44:43Z" level=fatal msg="nsexec-1[188]: failed to unshare remaining namespaces: EDC5121I Invalid argument. (errno2=0x12C206DA) "
    time="2024-02-08T16:44:43Z" level=fatal msg="nsexec-0[187]: failed to sync with stage-1: next state: EDC5137I Inappropriate I/O control operation. (errno2=0x05FC0119) "
    time="2024-02-08T16:44:43Z" level=error msg="runc create failed: unable to start container process: can't get final child's PID from pipe: EOF"
-
    $ podman run <image>
    WARN[0000] Failed to load cached network config: network zos_hybrid_network not found in CNI cache, falling back to loading network zos_hybrid_network from disk
    WARN[0000] 1 error occurred:
    * plugin type="zos-cni" failed (delete): cni plugin zos-cni failed: {"cniVersion": "0.4.0","code": 3,"msg": "Container unknown or does not exist.","details": "No DVIPA for Container found or unexpected internal error. Contact IBM, errno 157, errnojr 766c7307."}
    Error: plugin type="zos-cni" failed (add): cni plugin zos-cni failed: {"cniVersion": "0.4.0","code": 103,"msg": "Incorrect Network Configuration.","details": "VIPARANGE not defined for ZCONTAINER, errno 121, errnojr 766c7303."}
To resolve the issue, ensure _BPXK_SETIBMOPT_TRANSPORT is configured in /etc/containers/containers.conf. For more information, see Configuring the container runtime using a z/OSMF workflow.
For more information on z/OS UNIX Common INET (CINET) and zOSCP, see Considerations for z/OS UNIX Common INET (CINET).
Errors when starting an application within a container
- If a container application fails to bind to a given port with a Permission denied message, the port may be reserved somewhere else by the system. If you have an existing PORT statement for the requested port, or a PORTRANGE statement that includes the requested port, you need to add a PORT statement to allow applications within a container to be permitted to bind to that port. This can be accomplished by using the reserved jobname BCZ-CNTR on the PORT statement. For more information, see Network Support for IBM z/OS Container Platform.
- If common INET is configured in your BPXPRMxx PARMLIB member, also verify that the INADDRANYPORT/INADDRANYCOUNT range does not include the requested port.
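The INADDRANYPORT/INADDRANYCOUNT check in the last item is simple arithmetic. The following is a minimal sketch, assuming that the two values define a contiguous range that starts at INADDRANYPORT and contains INADDRANYCOUNT ports. The function name port_in_range and the values 4000, 2000, and 8080 are hypothetical examples, not recommended settings.

```shell
#!/bin/sh
# port_in_range PORT START COUNT  (hypothetical helper name)
# Returns 0 if PORT lies in the COUNT ports starting at START
# (that is, START through START + COUNT - 1), 1 otherwise.
port_in_range() {
    port=$1
    start=$2
    count=$3
    end=$((start + count - 1))
    [ "$port" -ge "$start" ] && [ "$port" -le "$end" ]
}

# Hypothetical values: INADDRANYPORT(4000) INADDRANYCOUNT(2000),
# requested port 8080. The range is 4000 through 5999.
if port_in_range 8080 4000 2000; then
    echo "port 8080 is inside the INADDRANYPORT range"
else
    echo "port 8080 is outside the INADDRANYPORT range"
fi
```

With these example values the range covers ports 4000 through 5999, so port 8080 is reported as outside the range.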
Avoid unmounting file systems that are used by zOSCP
The first time Podman is run, it starts a podman pause process in a new mount namespace. Podman runs in the private mount namespace that the podman pause process is associated with, and is isolated from mounts in the global mount namespace. Any changes that are made to the global mount namespace, for example mounting, are not visible to Podman container processes and the podman pause process.
However, unmounts are propagated to the podman pause mount namespace. Unmounting
file systems used by zOSCP (which include
user home directories) can cause problems for zOSCP programs. In many cases (such as increasing
the size of a ZFS), the unmount can be avoided by using bpxwmigf.
For example, if you want to increase the size of a home directory, you should use bpxwmigf to avoid the unmount. When bpxwmigf is used, the mount update is reflected in all mount namespaces, and no Podman actions are needed. When an unmount/mount sequence is used, the podman pause process must be removed with a Podman command before the unmount is done and restored after the mount is done. You can run podman system migrate to stop both the running containers and the podman pause process, which allows Podman to run in a new mount namespace to reflect the changes.
After you run podman system migrate, you should verify that there is no active Podman process. If there is, you can run podman system migrate again to stop it.
    $ ps -ef | grep podman
    ROOTLESS 913 1 - 22:48:06 ? 0:00 podman
You might see errors similar to the following:
-
    $ podman images
    Error: overriding network config directory: creating CNI config file : "open /u/rootless/.config/cni/net.d/10-zoscni.conflist: EDC5129I No such file or directory."
-
    $ podman system migrate
    Error: creating runtime static files directory: mkdir /u/k8sauto/.local: EDC5134I Function not implemented.
To recover, end the remaining podman and conmon processes to stop the podman pause process. This then allows Podman to run in a new mount namespace.
    $ ps -ef | grep -e podman -e conmon
    ROOTLESS 16777363 1 - 02:08:39 ? 0:00 podman
    ROOTLESS 16777365 1 - 02:08:39 ? 0:00 /usr/lpp/IBM/zoscp/bin/conmon --api-version 1 -c 71adcfa3f78a1a24aa861b0dc424d6575e8
    $ kill 16777363 16777365
kubeadmz join failure
- When a kubeadmz join command fails for a worker node, the command ends with this message:
    BCZ2219E kubeadmz failed to join the z/OS worker node to the Kubernetes cluster
  Check the ~/.kube/kubeadmz.log to determine the reason for failure.
  - If you see the following message in the log:
      Calling 'wnadm' directly is not supported, will not continue.
    You need to check that your session has the export _BPXK_AUTOCVT=ON environment variable set.
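Before you retry the kubeadmz join command, you can confirm that the variable is set in the current session. The following is a minimal sketch; the function name autocvt_is_on is hypothetical, and the check only inspects the environment of the shell that runs it.

```shell
#!/bin/sh
# autocvt_is_on: returns 0 when _BPXK_AUTOCVT is set to ON in this session.
# (Hypothetical helper name; it only reads the current environment.)
autocvt_is_on() {
    [ "${_BPXK_AUTOCVT:-}" = "ON" ]
}

if autocvt_is_on; then
    echo "_BPXK_AUTOCVT=ON is set"
else
    echo "_BPXK_AUTOCVT is not ON; run: export _BPXK_AUTOCVT=ON"
fi
```

Because the variable is session-specific, a value that is set in one terminal or ssh session is not visible in another unless it is exported from a profile that both sessions read.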