IBM Support

EEH errors when a PCI device is reassigned from host to guest

Flashes (Alerts)


Abstract

On PowerPC systems, loading the vfio-pci driver with its default parameter (`disable_idle_d3=N`) enables PCI runtime power management, allowing devices to enter low-power states (D3hot or D3cold) when idle. However, on these systems, the transition to a low-power state might fail and trigger PCI bus errors. The Enhanced Error Handling (EEH) mechanism recovers the device, but frequent recovery events can affect device stability and availability.

Content

Linux Releases Affected

SUSE Linux Enterprise Server (SLES 16.0), all supported PPC platforms

IBM Systems Affected

All IBM PowerPC (pseries) systems that use PCI devices managed by vfio-pci.

Symptoms

When the vfio-pci driver is loaded with disable_idle_d3=N (default), the driver enables PCI runtime power management. During idle periods, the device transitions to a low-power state (D3hot or D3cold).
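One way to see whether runtime power management has suspended a device is to read its runtime PM status from sysfs. The following is a minimal sketch, not an IBM-provided tool: the helper name `show_runtime_status` and the address `0000:01:00.0` are hypothetical, and the sysfs root is a parameter only so that the function can be run against a test directory.

```shell
# Hypothetical helper: print the runtime PM status of one PCI device.
# $1 = PCI address (for example 0000:01:00.0, a placeholder)
# $2 = sysfs device root (defaults to /sys/bus/pci/devices)
show_runtime_status() {
    pci_addr="$1"
    sysfs_root="${2:-/sys/bus/pci/devices}"
    status_file="$sysfs_root/$pci_addr/power/runtime_status"
    if [ -r "$status_file" ]; then
        echo "$pci_addr: $(cat "$status_file")"
    else
        echo "$pci_addr: runtime status not available" >&2
        return 1
    fi
}
```

For example, `show_runtime_status 0000:01:00.0` reports a value such as `active` or `suspended` for that device.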

On PowerPC platforms, if a device transitions to D3cold at the host (L1) level, the guest (L2) or user-space driver might not be aware of the transition. If the guest attempts to access the device while it remains in D3cold, a PCI bus error occurs, which triggers an EEH event.

Although the EEH subsystem of the kernel recovers the device, repeated EEH recoveries might lead to degraded performance or intermittent device unavailability.
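To get a rough idea of how often EEH activity is being logged, the kernel log on the host can be searched for EEH messages. This is a sketch under the assumption that EEH-related kernel messages contain the string `EEH`; the helper name `count_eeh_lines` is hypothetical, and the count is of matching log lines, not of distinct recovery events.

```shell
# Hypothetical helper: count kernel log lines that mention EEH.
# Reads log text on standard input and prints the number of matching lines.
count_eeh_lines() {
    # grep -c exits non-zero when the count is 0, so ignore its exit status.
    grep -c 'EEH' || true
}

# Typical use on the host (reading the kernel log may require root):
#   dmesg | count_eeh_lines
```

A count that keeps growing while a guest is using the device suggests repeated recoveries of the kind described above.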

Workaround

To work around this issue, load the vfio-pci driver with the parameter disable_idle_d3=Y, which disables idle D3 state transitions.

Temporary setting

modprobe vfio-pci disable_idle_d3=Y
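Parameters passed to modprobe apply only when the module is actually loaded by that command, so it can be worth confirming the value that the running driver uses. The sketch below assumes that the parameter is exposed at the standard sysfs location for module parameters; the helper name `show_idle_d3_setting` is hypothetical.

```shell
# Hypothetical helper: report the current value of disable_idle_d3.
# $1 = parameter file (defaults to the usual sysfs path for vfio-pci)
show_idle_d3_setting() {
    param_file="${1:-/sys/module/vfio_pci/parameters/disable_idle_d3}"
    if [ -r "$param_file" ]; then
        echo "disable_idle_d3=$(cat "$param_file")"
    else
        echo "vfio-pci is not loaded or the parameter is not exposed" >&2
        return 1
    fi
}
```

If the reported value is still `N`, the module was probably loaded before the parameter was supplied and needs to be reloaded with the parameter set.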

Persistent setting

To persist this setting across restarts, add it to a modprobe configuration file by running the following command:

echo "options vfio-pci disable_idle_d3=Y" | sudo tee -a /etc/modprobe.d/vfio-pci.conf
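Because `tee -a` appends to the file, running the command more than once leaves duplicate lines. A quick check of the modprobe configuration directory shows what is actually configured. This is a minimal sketch; the helper name `find_idle_d3_options` is hypothetical, and the directory is a parameter only so that the function can be tested outside `/etc`.

```shell
# Hypothetical helper: list modprobe.d lines that set disable_idle_d3.
# $1 = configuration directory (defaults to /etc/modprobe.d)
find_idle_d3_options() {
    conf_dir="${1:-/etc/modprobe.d}"
    # -r: recurse, -s: suppress errors for unreadable files,
    # -H: always print the file name in front of each match
    grep -rsH 'disable_idle_d3' "$conf_dir"
}
```

Each output line has the form `file:matching line`, so conflicting or repeated settings in different files are easy to spot.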

Note

Disabling idle D3 power management might increase power consumption, but the action prevents device reset failures and EEH recoveries that are associated with unsupported power state changes on PowerPC platforms.

Fix outlook 

SUSE mirrored bug number: SUSE1251023

The fix for this issue will be included in a later release.

I/O device impacted

All PCI devices that use VFIO for user-space pass-through are affected. These devices include:

  • Network interface cards (NICs)
  • Storage controllers
  • Other hardware devices bound to vfio-pci for direct assignment to guests or user space
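To find out which devices on a host fall into these categories, the devices currently bound to vfio-pci can be listed from the driver's sysfs directory. This is a sketch, assuming the usual sysfs layout in which bound devices appear as entries named by PCI address; the helper name `list_vfio_devices` is hypothetical.

```shell
# Hypothetical helper: print the PCI addresses of devices bound to vfio-pci.
# $1 = driver directory (defaults to /sys/bus/pci/drivers/vfio-pci)
list_vfio_devices() {
    driver_dir="${1:-/sys/bus/pci/drivers/vfio-pci}"
    # Device entries are named like 0000:01:00.0; control files such as
    # bind and unbind do not match this pattern and are skipped.
    for entry in "$driver_dir"/*:*:*.*; do
        [ -e "$entry" ] || continue
        basename "$entry"
    done
}
```

An empty result means that no device is bound to vfio-pci, or that the driver is not loaded.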

Document Information

Modified date:
10 November 2025

UID

ibm17247831