Development stream - Restrictions

The following applies to the latest 'Development stream', currently based on kernel 4.12.

'Development stream' - General Restrictions

  • Starting with kernel 3.5
    • the optional builtin tape IPL code has been removed. IPL from tape device using the zipl tool is still possible.
    • the CPU capability sysfs attribute file has been removed, but the information can still be retrieved by reading the /proc/sysinfo file.
    • Token Ring is no longer supported. Thus the exploiting networking drivers qeth and lcs no longer offer this support.
  • Starting with kernel 2.6.31, the Linux on System z 'Development stream' will be updated regularly to reflect upstream changes and contributions for Linux on System z.
  • The Linux on System z 'Development stream' has all the features of the 'October 2005 stream' since HBA API 2.0 and automatic port discovery were introduced (with the kernel 2.6.27 based 'Development stream').
  • Starting with the kernel 2.6.27 based 'Development stream', IBM supports only distributions compiled with march=z9-109, so that Linux on System z can benefit from the compiler enhancements exploiting z9 instructions. Therefore the below restrictions do not mention anything related to running Linux on earlier mainframes.
  • Starting with kernel 2.6.35, no more kernel kerntypes-patches (which are required for work with lcrash, from lkcdutils/LKCD) will be provided for the 'Development stream': Use crash instead of lcrash.
  • If a kernel compiled for z9 is installed on an earlier machine, the boot process is stopped with a disabled wait PSW.
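
As a minimal sketch of reading the CPU capability from /proc/sysinfo (mentioned in the kernel 3.5 item above), the following Python helper looks for a line labelled "Capability". The label and the layout of the file are assumptions and may differ between kernel versions; the function names are hypothetical.

```python
from typing import Optional


def parse_cpu_capability(sysinfo_text: str) -> Optional[int]:
    """Return the first 'Capability' value found in /proc/sysinfo text, or None.

    Assumption: the file contains a line such as 'Capability:  492'.
    """
    for line in sysinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match the exact label so that e.g. 'Secondary Capability' is skipped.
        if sep and key.strip() == "Capability":
            fields = value.split()
            if fields and fields[0].isdigit():
                return int(fields[0])
    return None


def read_cpu_capability(path: str = "/proc/sysinfo") -> Optional[int]:
    """Read the given file (default /proc/sysinfo) and parse the capability."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return parse_cpu_capability(handle.read())
```

On a Linux on System z instance, read_cpu_capability() could be called without arguments; on other platforms the file does not exist and the call raises an OSError.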

Furthermore:

  • There are known bugs in kernel 3.8 which can also show up on System z hardware.
  • IBM supports only 64-bit distributions based on this stream; 31-bit applications are supported on the 31-bit emulation layer.
  • IBM does not recommend the following network devices for distributions based on this stream:
    • CTCM all protocols:
      • CTCM protocols 0, 1, 2 - former CTC
      • CTCM protocol 4 - CTCMPC (since kernel 3.10)
    • NETIUCV network device -- Note that the IUCV-infrastructure will be supported for distributions based on this stream.

Kernel (general)

Tape driver

  • The tape driver assigns the device to the system while the device is online. If the tape device is not online it may be assigned to another system with which the tape is shared.
  • The 3480/90 tape driver is unable to detect manual operations on the tape device, in particular manual tape unloads, and these operations will lead to errors in reading and writing. The driver provides ioctl functions to control the device and these must be used, either through the API or by using the Linux mt utility.

Cryptographic Device Driver

  • If you have a PCICC only, or are attempting to use a CRT key on a system with PCIXCC only, you need to ensure that your data is PKCS-1.2 padded. If you have a PCICA, or are only using Mod-Expo keys with a PCIXCC card you do not need to ensure PKCS-1.2 padding.
  • z/VM hides PCICCs, PCIXCCs, and CEX2Cs from its guests if a PCICA or a CEX2A is also available.
  • Crypto Express3 Accelerator (CEX3A) and Crypto Express3 Coprocessor (CEX3C):
    • CEX3A is supported as CEX3A (since kernel 2.6.33)
    • CEX3C is supported as CEX3C and secure key is enabled (since kernel 2.6.33)
  • When using CCA version 4.0, the zcrypt device driver may not unload properly. You will see a 'device busy' report. If this occurs, stop the catcher.exe daemon and re-try the unload. After successfully unloading the device driver, re-start the catcher.exe daemon.
  • With kernel 3.9 and libica v.2.3.0 a new API function call is available to retrieve the crypto mechanisms that are supported on a given system. This includes crypto modes provided by CPACF (CP Assist for Cryptographic Functions) and CEX (Crypto Express) crypto cards (if installed and enabled).

Kernel parameter files

  • Kernel parameter files may not contain newline characters when using the LPAR Load From FTP IPL method or other IPL methods which load boot files without extra processing directly to memory.
    Note that this restriction does not apply to IPL methods where the IPL device is prepared by the zipl tool.
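
The newline restriction can be checked before a parameter file is used with such an IPL method. The sketch below is illustrative only: the function names are hypothetical, and the flattening step assumes that no parameter value contains quoted whitespace that must be preserved.

```python
from typing import List


def find_newlines(parm_text: str) -> List[int]:
    """Return the character offsets of all newline or carriage-return characters."""
    return [i for i, ch in enumerate(parm_text) if ch in "\r\n"]


def flatten_parmfile(parm_text: str) -> str:
    """Put all parameters on a single line without a trailing newline.

    Assumption: parameters are separated by whitespace and no value
    relies on embedded whitespace being kept verbatim.
    """
    return " ".join(parm_text.split())
```

A file for which find_newlines() returns an empty list satisfies the restriction; otherwise the output of flatten_parmfile() could be written back, without a trailing newline.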

Networking

Networking - OSA - Layer 2 / configured as slaves for bonding mode active-backup

Hardware driver bundles released in 2014 contain an OSA firmware update which may result in loss of network connectivity if the OSA-Express cards are used as slaves for a bonding interface configured in active-backup mode and the parameter fail_over_mac is either not specified or set to 0.

The new zEnterprise BC12 and EC12 systems released September 2013 are exposed to this problem when they are brought up to System Driver 15F - everything from Bundle 2C and above.
Affected are all possible OSA types OSA-Express5S / OSA-Express4S / OSA-Express3 on BC12 and EC12.

In September 2014, the OSA firmware fix MCL H49530.005 was released to customers in System Driver D15F Bundle 23a. The resulting OSD code levels C.90 (for OSA-Express4S and OSA-Express5S) and 0.B5 (for OSA-Express3) and above return to the old OSA behavior by default. Without this fix, a Linux workaround is required: configure bonding in active-backup mode together with the parameter fail_over_mac=1 (active), which requires distinct MAC addresses for the enslaved OSAs.

That means, specify

  • for SLES10 or SLES11:
    BONDING_MODULE_OPTS='... mode=active-backup fail_over_mac=active ...'
    in /etc/sysconfig/network/ifcfg-bond0
  • for RHEL5:
    alias bond0 bonding
    options bond0 ... mode=active-backup fail_over_mac=active ...
    in /etc/modprobe.conf
  • for RHEL6:
    BONDING_OPTS='... mode=active-backup fail_over_mac=active ...'
and use different MAC addresses for the OSA interfaces to be enslaved.
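
The two conditions of this workaround (fail_over_mac set to active, and distinct MAC addresses on the enslaved interfaces) can be checked mechanically. The following Python sketch is not part of any distribution tooling; the function names are hypothetical, and the option string is assumed to be a whitespace-separated list of key=value pairs as in the examples above.

```python
from typing import Dict, List


def parse_bonding_opts(opts: str) -> Dict[str, str]:
    """Split a 'key=value key=value' bonding option string into a dict."""
    parsed = {}
    for token in opts.split():
        key, sep, value = token.partition("=")
        if sep:
            parsed[key] = value
    return parsed


def check_active_backup(opts: str, slave_macs: List[str]) -> List[str]:
    """Return a list of problems with respect to the workaround described above."""
    problems = []
    parsed = parse_bonding_opts(opts)
    # Tolerate an underscore spelling of the mode name when comparing.
    mode = parsed.get("mode", "").replace("_", "-")
    if mode not in ("active-backup", "1"):
        return problems  # the workaround only concerns active-backup mode
    if parsed.get("fail_over_mac", "0") not in ("active", "1"):
        problems.append("fail_over_mac is not set to active (1)")
    normalized = [mac.lower() for mac in slave_macs]
    if len(set(normalized)) != len(normalized):
        problems.append("enslaved OSA interfaces share a MAC address")
    return problems
```

An empty result means that the option string and the MAC addresses are consistent with the workaround; it says nothing about the firmware level of the OSA cards.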

OSA, HiperSockets, QETH

  • Priority queueing can be unavailable for OSA devices, e.g. if configured for more than 160 TCP/IP stacks.
  • Multicasting has to be switched on in the kernel configuration.
  • The MTU range is 576 - 57344. However, depending on the medium and networking hardware settings, it may be restricted to 1492, 1500, 8992 or 9000. For HiperSockets the MTU range extends to 57344. This may be restricted by the frame size announced by the microcode.
    The maximum MTU size is limited by the maximum frame (buffer) size supported by the hardware. The frame size for OSA is fixed at 64 KB. For HiperSockets, the maximum frame size is defined during HiperSockets CHPID definition in the IOCDS. E.g. if the hardware configuration specifies the maximum frame size as 40 KB, the MTU can be configured up to 32 KB (frame size minus 8 KB) using ifconfig. Possible frame sizes are 16, 24, 40, and 64 KB.
  • The total memory usage for inbound data buffers per device in Linux is (number of buffers) * (Linux memory usage per buffer). The number of buffers allocated by Linux can be specified for each device via the sysfs attribute 'buffer_count'; the number of buffers must be between 8 and 128; the default is 64 for OSA devices and 128 for HiperSockets devices. Linux memory usage per buffer is equivalent to the respective frame size.
  • Two ports per CHPID (four ports per OSA card) are available starting with OSA-Express3 GbE SX and LX on IBM System z10, running Linux on System z in an LPAR or as a VM guest.
  • With the kernel 2.6.29 based 'Development stream', support for Enhanced Device Driver Packing (EDDP) has been removed (no performance benefits).
  • When using HiperSockets, running Linux on System z as a VM-guest, the appropriate z/VM TCP/IP PTF for APAR PK80882 is required. For traffic between Linux and z/OS the appropriate z/OS TCP/IP PTF for APAR PK83573 is required.
  • VEPA mode requires a switch that can support RR (reflective relay) mode.
  • An OSA-Express port name was formerly required to identify a shared OSA port. All operating system instances that shared the port had to use the same port name. This requirement no longer applies. Starting with kernel 4.4 the qeth attribute portname still exists, but is no longer used when setting the device online.
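
The MTU and buffer memory rules from the list above can be worked through numerically. This Python sketch only restates the arithmetic given in the text (maximum MTU is the frame size minus 8 KB; memory is the number of buffers times the frame size); the function names are hypothetical.

```python
KB = 1024
# Possible HiperSockets frame sizes in KB, as listed in the text.
HIPERSOCKETS_FRAME_SIZES_KB = (16, 24, 40, 64)


def max_hipersockets_mtu(frame_size_kb: int) -> int:
    """Maximum MTU in bytes for a HiperSockets frame size: frame size minus 8 KB."""
    if frame_size_kb not in HIPERSOCKETS_FRAME_SIZES_KB:
        raise ValueError(f"unsupported frame size: {frame_size_kb} KB")
    return (frame_size_kb - 8) * KB


def inbound_buffer_memory(buffer_count: int, frame_size_kb: int) -> int:
    """Inbound buffer memory in bytes: number of buffers times frame size."""
    if not 8 <= buffer_count <= 128:
        raise ValueError("buffer_count must be between 8 and 128")
    return buffer_count * frame_size_kb * KB
```

For example, a 40 KB frame size gives a maximum MTU of 32768 bytes, and an OSA device with the default of 64 buffers and the fixed 64 KB frame size uses 4194304 bytes (4 MB) for inbound buffers.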

Networking - OSA - Layer 3 mode only

The following applies only when running in Layer 3 mode (i.e. not in Layer 2 mode):

  • There may be circumstances that prevent ifconfig (or other commands) from setting an IP address on an OSA-Express network feature. The most common one is that another system in the network has set that IP address already. As a result, the IP address will be indicated by ifconfig as being set on the interface, but the address is not actually set on the feature. Since the design of the network stack in Linux does not allow feedback about IP address changes, there is no means of notifying the user of the problem other than to log a message. This message (usually containing text such as 'Registering IP address a.b.c.d failed') will appear in the kernel messages (displayable using 'dmesg'). For most distribution settings, this will also trigger a message in /var/log/messages. If you are not sure whether the IP address was set properly or experience a networking problem, you should always check these logs to see if an error was encountered when setting the address.
    This applies to both IPv4 and IPv6 addresses, and to IP takeover, VIPA, HiperSockets, and Proxy ARP.
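
Because the only feedback is a log message, the check can be scripted. The Python sketch below searches log text for the wording quoted above; the exact message text may vary between kernel versions, so the pattern is an assumption, and the function name is hypothetical.

```python
import re
from typing import List

# Assumption: the message contains 'Registering IP address <addr> failed',
# as quoted in the text; other kernel versions may word it differently.
_FAILED_REGISTRATION = re.compile(r"Registering IP address (\S+) failed")


def failed_ip_registrations(log_text: str) -> List[str]:
    """Return the addresses for which a failed registration was logged."""
    return _FAILED_REGISTRATION.findall(log_text)
```

The log text could come from the output of 'dmesg' or from /var/log/messages; a non-empty result indicates addresses that ifconfig shows as set although they are not set on the feature.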

Networking - HiperSockets - No Layer 2 / Layer 3 traffic

  • Two hosts can communicate with each other via HiperSockets only if:
    • both are using HiperSockets Layer 2 or
    • both are using HiperSockets Layer 3.

Large MTU sizes may fail due to "order-N allocation failed" problem

  • When using MTU sizes >8K on a network interface, the Linux TCP/IP stack may run into problems on heavily loaded systems because allocating memory for packets may fail due to memory fragmentation. As a symptom of this problem you will see messages of the form "order-N allocation failed" in the system log; in addition, network connections will drop packets, in extreme cases to the extent that the network is no longer usable.

    As a workaround, use MTU sizes of at most 8K (minus header size), even if the network hardware allows larger sizes (e.g. HiperSockets).
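
To judge whether a system is affected, the system log can be scanned for the symptom described above. This Python sketch counts messages of the form "order-N allocation failed" per order N; the pattern follows the wording in the text and is an assumption about the actual log format, and the function name is hypothetical.

```python
import re
from collections import Counter

# Assumption: the symptom appears literally as 'order-N allocation failed'.
_ORDER_FAILURE = re.compile(r"order-(\d+) allocation failed")


def count_allocation_failures(log_text: str) -> Counter:
    """Count allocation failure messages in the log text, keyed by order N."""
    return Counter(int(order) for order in _ORDER_FAILURE.findall(log_text))
```

A growing count on an interface configured with an MTU above 8K suggests applying the workaround.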

Storage

DASD

  • Detachment in VM of a device still open or mounted in Linux may trigger a limitation in the Linux kernel 2.6 common code and cause the system to hang or crash.
  • zIPL does not work with diag access to VM minidisks.
  • fdasd expects more than one entry in the config file of the -c option.
  • Note that running DASDFMT occupies one channel path which may affect your I/O performance.
  • IBM does not recommend native FBA DASD devices for distributions based on this stream except for the following scenarios:
    • access to virtual FBA devices (using FBA channel programs or DIAG access method)
    • FBA emulated SCSI devices (z/VM minidisk support).
  • HyperPAV requires DS8000 with HyperPAV LIC installed and z/VM 5.3, when running Linux on System z as a VM guest; base-PAV requires that the PAV feature is enabled on the storage server. If the prerequisites for HyperPAV and base-PAV are not there, the DASD driver works without using PAV.
  • In a scenario where PAV is used (base-PAV or HyperPAV) and all channel paths to a DASD disk are lost (e.g. due to cable pull), some or all PAV alias devices may not automatically recover even after the channel paths have become available again.
  • Changing the list of installed CHPIDs for a device which Linux has already sensed can result in Linux not using those CHPIDs correctly. It is recommended to shut down Linux after such a change has been made to the I/O configuration. Note that this does not apply to operations that do not change the ID of a channel path, i.e. configure on/off operations may still be performed while Linux is running.
  • Extended Address Volume (EAV) is available for IBM System Storage DS8000 since R4.0. When running Linux on System z as a z/VM-guest, using EAV requires z/VM 5.4 or z/VM 6.1 with the PTFs for APARs VM64709 (CP) and VM64711 (CMS).

zfcp

  • Please see http://www.ibm.com/systems/z/connectivity/products/fc.html for FCP channel limitations, and the book Device Drivers, Features, and Commands.
  • If access to a device is lost although it is currently mounted, a limitation of the kernel 2.6 code may be triggered which causes the kernel to hang or to crash. It is recommended to use a multi-pathing setup.
  • Error recovery by the Linux SCSI stack on a (virtual) FCP adapter may impact ongoing SCSI traffic to other devices attached to the same adapter. Therefore, it is recommended to use a unique (virtual) FCP adapter (unique device number, see 'lscss' in s390-tools ) to isolate a device from other SAN traffic. In particular, this applies to tape devices.
  • In case of non-recoverable errors (e.g., temporary adapter failure, low-level data loss on the fiber), zfcp uses the host return codes ("host_byte") provided by the Linux SCSI stack (see include/scsi/scsi.h ) to indicate these error conditions to upper layer drivers. In particular, low-level errors can be disruptive to the SCSI traffic of devices which do not allow retries (e.g., tape read/write commands). It is recommended that the upper layer driver either try to recover from these conditions or return an I/O error to the application. In case of a high frequency of host return codes, please check your SAN equipment (firmware etc.).
  • For the dump on panic feature for SCSI disks in a z9 LPAR, MCL G40954.002 in Driver D67L Bundle 18 is required. (z10 has no specific prerequisite)
  • FCP-attached IBM tape devices using the IBMtape device driver must be attached through a switch.
  • Support for "FCP hardware data router":
    • Requires z196 or z114
  • Support for "end-to-end data consistency checking (T10-DIF)":
    • Requires z196 or z114
    • Can only be used to the full extent with FCP-attached storage providing SCSI disks implementing T10-DIF
    • When using DIX with a DIF-enabled SCSI disk and buffered I/O, the file system must not overwrite page content after a writeback to disk has been scheduled
    • Accessing a block device or an XFS file system by means of direct I/O is known to work. Expect error messages about invalid checksums when using other access methods

Virtual Server

Collaborative Memory Management Stage II (CMM2)

  • requires Linux on System z running as a guest on z/VM 5.3 - which introduced Collaborative Memory Management Assist (CMMA) - with the PTFs for APARs VM64265 and VM64297, on an IBM System z9 or z10; no specific PTF needed for later z/VM.
  • Starting with the kernel 2.6.31 based 'Development stream' there is no longer an optional feature patch for CMM2 delivering the 'full' CMM2-functionality. (As the 'basic' cmm2-functionality is available in recent kernels, the cmma IPL-option is still there.)
    The Linux support for CMM2 is activated per IPL-option cmma=on (default is cmma=off).
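
Whether the cmma option was given at IPL can be read from the kernel command line. The following Python sketch evaluates a command-line string as found in /proc/cmdline; it recognizes only the spelling cmma=on from the text and assumes that the last occurrence of the option wins. The function name is hypothetical.

```python
def cmma_enabled(cmdline: str) -> bool:
    """Return True if the kernel command line requests cmma=on.

    Assumptions: only the literal value 'on' enables the feature, and
    the last 'cmma=' token on the command line takes effect.
    """
    enabled = False  # default according to the text: cmma=off
    for token in cmdline.split():
        if token.startswith("cmma="):
            enabled = token[len("cmma="):] == "on"
    return enabled
```

On a running system the argument could be the content of /proc/cmdline. The result reflects only the requested option, not whether the z/VM prerequisites listed above are met.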