IBM Support

SLES will not boot with intel_iommu=on kernel parameter - IBM Servers

Troubleshooting


Problem

On the affected IBM servers, SUSE Linux Enterprise Server 11 (SLES11) SP3 or SLES 12 hangs at boot when the intel_iommu=on kernel parameter is set in order to use TBOOT, KVM PCIe passthrough, or SR-IOV. The boot log shows dracut warnings about missing disks and a DMAR fault. The full messages are listed under Symptom.

Resolving The Problem

Source

RETAIN tip: H213410

Symptom

If the intel_iommu=on kernel parameter is added in order to utilize TBOOT, KVM PCIe passthrough, or SR-IOV in SUSE Linux Enterprise Server 11 (SLES11) SP3 or SLES 12, then the system will hang on boot showing messages similar to the following:

 

[ 221.984472] linux-z8jz dracut-initqueue[776]: Warning: Could not boot.

[ 222.037413] linux-z8jz dracut-initqueue[776]: Warning: /dev/disk/by-uuid/8377ed16-4f76-4c02-bcba-b119afdc9cbd does not exist

[ 222.037572] linux-z8jz dracut-initqueue[776]: Warning: /dev/disk/by-uuid/CC32-0D71 does not exist

Additionally, the boot log will show a DMAR fault:

[ 5.586106] linux-z8jz kernel: dmar: DMAR:[DMA Read] Request device [11:00.0] fault addr 76b64000

(where TBOOT = Trusted Boot, KVM = Kernel-based Virtual Machine, PCIe = Peripheral Component Interconnect Express, SR-IOV = Single Root I/O Virtualization)
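
When checking a saved boot log for this symptom, it can help to pull the faulting device and address out of the DMAR line. The following is a minimal Python sketch, not an IBM or SUSE tool; the names DmarFault and parse_dmar_fault are hypothetical, and the pattern is based only on the sample message above (the "DMA Write" alternative is an assumption).

```python
import re
from typing import NamedTuple, Optional


class DmarFault(NamedTuple):
    """Fields extracted from one DMAR fault line (hypothetical helper type)."""
    timestamp: float
    access: str      # "DMA Read" or "DMA Write"
    device: str      # PCI address as printed, e.g. "11:00.0"
    fault_addr: int  # fault address, parsed as hexadecimal


# Pattern modeled on the sample line in this tip; "DMA Write" is assumed.
_DMAR_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]"
    r".*DMAR:\[(?P<access>DMA (?:Read|Write))\]\s+"
    r"Request device \[(?P<dev>[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\]\s+"
    r"fault addr (?P<addr>[0-9a-fA-F]+)"
)


def parse_dmar_fault(line: str) -> Optional[DmarFault]:
    """Return the parsed fault, or None if the line is not a DMAR fault."""
    m = _DMAR_RE.search(line)
    if m is None:
        return None
    return DmarFault(
        timestamp=float(m.group("ts")),
        access=m.group("access"),
        device=m.group("dev"),
        fault_addr=int(m.group("addr"), 16),
    )
```

The device field can then be compared with the PCI address of the ServeRAID controller (for example, as reported by lspci) to confirm that the fault comes from the adapter described in this tip.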

Affected configurations

The system may be any of the following IBM servers:

  • Flex System x280 X6 Compute Node, type 4259, any model
  • Flex System x280 X6 Compute Node, type 7903, any model
  • Flex System x480 X6 Compute Node, type 4259, any model
  • Flex System x480 X6 Compute Node, type 7903, any model
  • Flex System x880 X6 Compute Node, type 4259, any model
  • Flex System x880 X6 Compute Node, type 7903, any model
  • System x3650 M4 HD, type 5460, any model
  • System x3750 M4, type 8752, any model
  • System x3850 X6, type 3837 (4-socket, 3-year warranty), any model
  • System x3850 X6, type 3839 (4-socket, 4-year warranty), any model
  • System x3950 X6, type 3837 (8-socket, 3-year warranty), any model
  • System x3950 X6, type 3839 (8-socket, 4-year warranty), any model

The system is configured with at least one of the following:

  • SUSE Linux Enterprise Server 11, Service Pack 3

The system is configured with one or more of the following IBM Options:

  • ServeRAID M1215 SAS/SATA Controller for IBM System x, Option part number 46C9114, any model
  • ServeRAID M5210 SAS/SATA Controller for IBM System x, Option part number 46C9110, any model
  • ServeRAID M5210e SAS/SATA Controller for IBM System x, Option part number 46C9117CTO, any model

The following system BIOS/UEFI levels are affected (build IDs): n2e108n, koe142c, vve142c, a8e112b

ServeRAID M1215 and M5200 series adapter firmware earlier than version 24.2.1-0045 is affected.

Note: This does not imply that the network operating system will work under all combinations of hardware and software.

Please see the compatibility page for more information:

http://www.ibm.com/systems/info/x86servers/serverproven/compat/us/

Solution

In SLES 11 SP3, updating the Unified Extensible Firmware Interface (UEFI) and ServeRAID firmware to the latest available levels resolves the issue.

In SLES 12, updating the UEFI and ServeRAID firmware is necessary but does not resolve the issue on its own. This behavior will be corrected in a future release of the SLES 12 kernel.

UEFI firmware should be updated to the following build versions or later:

x3650 M4 HD   VVE142E-1.80
x3750 M4      KOE142C-1.51
x3850 X6      A8E112B-1.00
x3950 X6      A8E112B-1.00v
x280 X6       N2E108N-1.00
x480 X6       N2E108N-1.00
x880 X6       N2E108N-1.00

ServeRAID firmware on all affected systems should be updated to version 24.2.1-0045 or later.
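
To check an installed ServeRAID firmware level against this minimum, the version strings can be compared field by field. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that every field of the version string (separated by "." or "-") is numeric and compares left to right, which this tip does not state.

```python
import re
from typing import Tuple

# Minimum fixed ServeRAID firmware level named in this tip.
MIN_SERVERAID_FW = "24.2.1-0045"


def parse_fw_version(version: str) -> Tuple[int, ...]:
    """Split a version such as '24.2.1-0045' into a tuple of integers.

    Assumption: all fields are numeric and separated by '.' or '-'.
    """
    return tuple(int(part) for part in re.split(r"[.-]", version.strip()))


def serveraid_fw_is_fixed(installed: str, minimum: str = MIN_SERVERAID_FW) -> bool:
    """Return True if the installed level is at or above the minimum level."""
    return parse_fw_version(installed) >= parse_fw_version(minimum)
```

For example, serveraid_fw_is_fixed("24.2.1-0044") evaluates to False under these assumptions, while the minimum level itself evaluates to True.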

This release is targeted for the fourth quarter of 2014.

The file is or will be available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and Operating system on IBM Support's Fix Central web page, at the following URL:

  http://www.ibm.com/support/fixcentral/

Workaround

Either update the firmware and kernel to the latest levels, or do not boot with the intel_iommu=on parameter. Booting without the parameter also means leaving TBOOT disabled for the time being.
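
To confirm whether a system is currently booted with the parameter, the kernel command line can be inspected. The following Python sketch is a hedged illustration with hypothetical helper names; it reads /proc/cmdline, which Linux exposes for the running kernel, and only looks for the exact token intel_iommu=on. It does not edit the boot loader configuration.

```python
IOMMU_TOKEN = "intel_iommu=on"


def has_intel_iommu_on(cmdline: str) -> bool:
    """Return True if the exact token intel_iommu=on is on the command line."""
    return IOMMU_TOKEN in cmdline.split()


def without_intel_iommu_on(cmdline: str) -> str:
    """Return the command line with every intel_iommu=on token removed."""
    return " ".join(tok for tok in cmdline.split() if tok != IOMMU_TOKEN)


def running_kernel_has_intel_iommu_on(path: str = "/proc/cmdline") -> bool:
    """Check the command line of the running kernel (Linux only)."""
    with open(path, "r", encoding="ascii", errors="replace") as f:
        return has_intel_iommu_on(f.read())
```

The string returned by without_intel_iommu_on is only a suggestion for what to place in the boot loader entry; how that entry is edited depends on the boot loader in use.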

Additional information

IBM's UEFI firmware provides a mechanism for creating Reserved Memory Region Reporting (RMRR) entries so that the operating system can reserve memory for specific devices. In this case, when a Transportable Memory Module (TMM) is not installed in the system, the ServeRAID controller uses system memory as its cache, which requires an RMRR entry in the Direct Memory Access Remapping (DMAR) table. An RMRR entry is expected to define the complete PCI path up to the device in question. However, IBM's UEFI/ServeRAID firmware does not define the complete path. Instead, it specifies only the endpoint, which is not what the operating system (OS) expects.
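
The difference between a complete path and an endpoint-only path can be illustrated with a small Python model. This is a sketch under stated assumptions, not the firmware's data layout or SUSE's actual fix: the type and function names are hypothetical, the root port at 00:03.0 is invented for the example, and the endpoint-only encoding shown (start bus set to the device's own bus, with a single path entry) is one plausible shape of the entry described above.

```python
from typing import List, NamedTuple, Tuple


class DeviceScope(NamedTuple):
    """Firmware-reported scope: a start bus plus (device, function) hops."""
    start_bus: int
    path: List[Tuple[int, int]]


class PciLocation(NamedTuple):
    """Where the OS actually finds the device, as hops from the root bus."""
    root_bus: int
    hops: List[Tuple[int, int, int]]  # (bus, device, function) per hop


def strict_match(scope: DeviceScope, dev: PciLocation) -> bool:
    """Match only if the scope lists every hop from the root bus down."""
    expected = [(d, f) for _bus, d, f in dev.hops]
    return scope.start_bus == dev.root_bus and scope.path == expected


def endpoint_only_match(scope: DeviceScope, dev: PciLocation) -> bool:
    """Lenient match: accept a single-hop scope that names just the endpoint."""
    if len(scope.path) != 1:
        return False
    bus, d, f = dev.hops[-1]
    return scope.start_bus == bus and scope.path[0] == (d, f)


# Hypothetical topology: controller at 11:00.0 behind a root port at 00:03.0.
raid = PciLocation(root_bus=0x00, hops=[(0x00, 0x03, 0), (0x11, 0x00, 0)])
complete_scope = DeviceScope(start_bus=0x00, path=[(0x03, 0), (0x00, 0)])
endpoint_scope = DeviceScope(start_bus=0x11, path=[(0x00, 0)])
```

In this model, strict_match accepts complete_scope but rejects endpoint_scope, so the reserved region is never associated with the controller and its DMA accesses fault. A lenient check such as endpoint_only_match shows how an OS-side workaround could tolerate the shorter entry.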

SUSE has created a fix in SLES 12 that works around this issue while IBM/Lenovo work to provide a fix for their firmware. SUSE's fix is expected to be released by the end of November 2014.

Document Location

Worldwide

Operating System

System x:SUSE Linux Enterprise Server 11

PureFlex System and Flex System:SUSE Linux Enterprise Server 11 x86-64

Lenovo x86 servers:SUSE Linux Enterprise Server 11

Lenovo x86 servers:SUSE Linux Enterprise Server 11 x86-64


Document Information

Modified date:
30 January 2019

UID

ibm1MIGR-5096499