Recovering a PCIe device
Use the zpcictl command or the recover sysfs attribute to handle a malfunctioning PCIe device if automatic recovery fails.
Before you begin
The following sample sequence of kernel messages indicates a successful recovery for an NVMe device:
zpci: 000e:00:00.0: Event 0x3a reports an error for PCI function 0x1004
nvme nvme0: frozen state error detected, reset controller
zpci: 000e:00:00.0: Initiating reset
nvme nvme0: restart after slot reset
zpci: 000e:00:00.0: The device is ready to resume operations
nvme nvme0: Shutdown timeout set to 10 seconds
nvme nvme0: 63/0/0 default/read/poll queues
Failed automatic recoveries end with error messages that call for operator intervention, as shown in the following example:
zpci: 000d:00:00.0: Automatic recovery failed after slot reset
zpci: 000d:00:00.0: Automatic recovery failed; operator intervention is required
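To check many kernel messages at once, the two outcomes shown above can be told apart by their final zpci message. The following Python sketch is an illustration only and is not part of zpcictl or any other tool: the function name classify_recovery and the two marker strings are assumptions taken from the sample messages above, and the exact message wording might differ between kernel versions.

```python
import re

# Matches zpci kernel messages such as
# "zpci: 000d:00:00.0: Automatic recovery failed; operator intervention is required".
# The pattern is an assumption derived from the sample messages above.
ZPCI_LINE = re.compile(
    r"zpci: (?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): (?P<msg>.*)"
)

# Marker texts copied from the sample messages (assumed stable wording).
RECOVERED = "The device is ready to resume operations"
FAILED = "operator intervention is required"


def classify_recovery(log_lines):
    """Return {function_address: "recovered" | "failed"} for zpci messages.

    Hypothetical helper: the last matching message for each PCI function
    address determines its status. Lines without a zpci marker are ignored.
    """
    status = {}
    for line in log_lines:
        match = ZPCI_LINE.search(line)
        if not match:
            continue
        addr, msg = match.group("addr"), match.group("msg")
        if RECOVERED in msg:
            status[addr] = "recovered"
        elif FAILED in msg:
            status[addr] = "failed"
    return status


if __name__ == "__main__":
    sample = [
        "zpci: 000e:00:00.0: Initiating reset",
        "zpci: 000e:00:00.0: The device is ready to resume operations",
        "zpci: 000d:00:00.0: Automatic recovery failed after slot reset",
        "zpci: 000d:00:00.0: Automatic recovery failed; operator intervention is required",
    ]
    for addr, state in classify_recovery(sample).items():
        print(addr, state)
```

The helper accepts any iterable of text lines, so it could be fed the saved output of dmesg or journalctl -k. Any function address that it reports as failed is a candidate for the manual recovery described in this topic.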