PCIe Bus Error Status 00001100
Under Linux, PCIe Bus Error messages can occur on systems with Active State Power Management (ASPM) enabled. The messages report corrected errors with error status/mask=00001100/00002000. You can avoid them by disabling ASPM with the kernel parameter pcie_aspm=off. This wiki article explains how to do that.
Problem Description
The kernel logs errors with error status/mask=00001100/00002000 to the /var/log/syslog log file via the PCIe Advanced Error Reporting (AER) function. According to the log entries, the errors themselves are corrected (severity=Corrected), but the repeated entries clutter the log file.
For example, a corresponding entry in /var/log/syslog looks like this in Debian:
Nov 9 11:20:38 debian kernel: [69300.887669] pcieport 0000:00:02.0: AER: Multiple Corrected error received: id=0010
Nov 9 11:20:38 debian kernel: [69300.887680] pcieport 0000:00:02.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0010(Transmitter ID)
Nov 9 11:20:38 debian kernel: [69300.889913] pcieport 0000:00:02.0: device [8086:6f04] error status/mask=00001100/00002000
Nov 9 11:20:38 debian kernel: [69300.892156] pcieport 0000:00:02.0: [ 8] RELAY_NUM Rollover
Nov 9 11:20:38 debian kernel: [69300.894370] pcieport 0000:00:02.0: [12] Replay Timer Timeout
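The status value 00001100 is a hexadecimal bit field, and the bracketed numbers in the last two log lines ([ 8] and [12]) are the positions of the bits that are set in it. The following is a minimal sketch of that arithmetic in POSIX shell. The function name decode_aer_bits is a hypothetical helper introduced here for illustration; it is not part of the kernel or of any packaged tool.

```shell
#!/bin/sh
# Print the positions of all bits set in a hexadecimal AER status or mask value.
# decode_aer_bits is a hypothetical helper name used only in this sketch.
decode_aer_bits() {
    value=$((0x$1))     # interpret the argument as a hexadecimal number
    bit=0
    bits=""
    while [ "$value" -gt 0 ]; do
        if [ $((value & 1)) -eq 1 ]; then
            bits="${bits:+$bits }$bit"   # append the position, space separated
        fi
        value=$((value >> 1))
        bit=$((bit + 1))
    done
    printf '%s\n' "$bits"
}

decode_aer_bits 00001100   # status from the log entry
decode_aer_bits 00002000   # mask from the log entry
```

For the status value this yields bits 8 and 12, which match the two error names the kernel prints. The mask value has only bit 13 set, which in the PCIe correctable error register layout corresponds to the Advisory Non-Fatal Error.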
Ubuntu displays the messages directly on the console.
Affected Systems
According to a bug report in Ubuntu, this problem mainly affects laptops.[1] In our opinion, the problem can potentially arise on any system with ASPM enabled. We have seen it on a server system with Ubuntu 16.04.2 LTS and also with Debian 9 (Stretch) (with Linux kernel 4.9.0-3-amd64) with ASPM enabled, especially if network cards (e.g. Intel XXV710, Intel X710, Intel X550) were installed.
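To see which devices currently have ASPM enabled, the link control line in the verbose lspci output can be inspected. The sketch below filters that output; list_aspm_enabled is a hypothetical helper name, and the matching assumes the "LnkCtl: ASPM ... Enabled" and "LnkCtl: ASPM Disabled" wording printed by common pciutils versions, which may differ on your system.

```shell
#!/bin/sh
# Read `lspci -vv` output from stdin and print the header line of every
# device whose link control register reports ASPM as enabled.
# list_aspm_enabled is a hypothetical helper; the patterns assume the
# "LnkCtl: ASPM ..." wording of common pciutils versions.
list_aspm_enabled() {
    awk '
        /^[0-9a-f]/ { dev = $0 }    # unindented line: start of a new device
        /LnkCtl:/ && /ASPM/ && !/ASPM Disabled/ { print dev }
    '
}

# Usage on a live system (root is needed to read the capability details):
#   sudo lspci -vv | list_aspm_enabled
```

An empty result suggests that no link currently uses ASPM. Any device listed here is a candidate for the messages described above.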
Solution
In our tests we were able to solve the problem by setting the kernel parameter pcie_aspm=off.
To make this setting permanent, follow these steps:
- Open the GRUB configuration file in an editor, e.g. in vi:
sudo vi /etc/default/grub
- Adjust the GRUB_CMDLINE_LINUX_DEFAULT variable as follows, save the file and close the editor:
GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"
- After the configuration change, update the GRUB boot loader with the following command:
sudo update-grub
- Finally, reboot the system so that the new kernel parameter takes effect.
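The steps above can also be scripted. The following is a minimal sketch, not a tested deployment script: add_aspm_off and has_aspm_off are hypothetical helper names, the edit assumes that GRUB_CMDLINE_LINUX_DEFAULT is a double-quoted value on a single line, and it relies on the -i option of GNU sed as shipped with Debian and Ubuntu.

```shell
#!/bin/sh
# Append pcie_aspm=off to GRUB_CMDLINE_LINUX_DEFAULT in the given file,
# unless a pcie_aspm= setting is already present on that line.
# Hypothetical helper; assumes a double-quoted, single-line value.
add_aspm_off() {
    file=$1
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*pcie_aspm=' "$file"; then
        return 0    # already configured, leave the file untouched
    fi
    sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 pcie_aspm=off"/' "$file"
}

# Succeed if the kernel command line passed as a string contains pcie_aspm=off.
has_aspm_off() {
    case " $1 " in
        *" pcie_aspm=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use as root (commented out so the sketch has no side effects):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_aspm_off /etc/default/grub
#   update-grub && reboot
# After the reboot:
#   has_aspm_off "$(cat /proc/cmdline)" && echo "pcie_aspm=off is active"
```

Checking /proc/cmdline after the reboot confirms that the running kernel was actually started with the parameter. Keeping a backup copy of /etc/default/grub makes it easy to undo the change.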
In addition to this solution, Ubuntu's bug report also lists the option of disabling PCIe Advanced Error Reporting.[1] However, we recommend that you keep PCIe Advanced Error Reporting enabled so that other potential PCIe errors are still logged. For the complete list of kernel boot parameters, see the kernel documentation.[2] Here are extracts of the information for pci=noaer and pcie_aspm=off:
pci=option[,option...]  [PCI] various PCI subsystem options:
        [...]
        noaer           [PCIE] If the PCIEAER kernel config parameter is
                        enabled, this kernel boot option can be used to
                        disable the use of PCIE advanced error reporting.
[...]
pcie_aspm=      [PCIE] Forcibly enable or disable PCIe Active State Power
                Management.
        off     Disable ASPM.
        force   Enable ASPM even on devices that claim not to support it.
                WARNING: Forcing ASPM on may cause system lockups.
References
- [1] AER: Corrected error received: id=00e0 (bugs.launchpad.net, bug report 1521173)
- [2] Kernel-Parameters (kernel.org/doc)
Further Information
- Active State Power Management (en.wikipedia.org)
- Active-State Power Management (redhat.com/documentation, RHEL 6 Documentation)
- The PCI Express Advanced Error Reporting Driver Guide HOWTO (kernel.org/doc)
Author: Werner Fischer
Werner Fischer, working in the Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.