Hardware error from APEI Generic Hardware Error Source
When hardware errors occur, operating systems can record details about them in log files with the help of the ACPI Platform Error Interfaces (APEI). Using the message "Hardware error from APEI Generic Hardware Error Source" as an example, this article shows how a network card error can be located in Linux.
Fundamentals and terminology

With the ACPI Platform Error Interfaces (APEI), the ACPI specification provides extensive options for error reporting.[1] Operating systems such as Linux, Windows or FreeBSD can use them to log information about hardware errors.[2]
Terms frequently used in this context are:
- ACPI: Advanced Configuration and Power Interface
- APEI: ACPI Platform Error Interfaces
- OSPM: OS-directed configuration and power management
Example
The following log entry from /var/log/syslog on an Ubuntu 18.04 system shows an error with a network card:
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 514
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: It has been corrected by h/w and requires no further action
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: event severity: corrected
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: Error 0, type: corrected
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: section_type: PCIe error
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: port_type: 0, PCIe end point
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: version: 0.2
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: command: 0x0406, status: 0x0010
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: device_id: 0000:43:00.0
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: slot: 0
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: secondary_bus: 0x00
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: vendor_id: 0x8086, device_id: 0x1563
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: class_code: 020000
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0000
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: Error 1, type: corrected
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: section_type: PCIe error
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: port_type: 0, PCIe end point
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: version: 0.2
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: command: 0x0406, status: 0x0010
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: device_id: 0000:43:00.1
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: slot: 0
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: secondary_bus: 0x00
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: vendor_id: 0x8086, device_id: 0x1563
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: class_code: 020000
[Do Mär 26 07:38:49 2020] {2}[Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0000
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.0: AER: aer_status: 0x00001000, aer_mask: 0x00000000
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.0: AER: [12] Timeout
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.0: AER: aer_layer=Data Link Layer, aer_agent=Transmitter ID
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.1: AER: aer_status: 0x00001000, aer_mask: 0x00000000
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.1: AER: [12] Timeout
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.1: AER: aer_layer=Data Link Layer, aer_agent=Transmitter ID
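The log entries contain the following key information:
- Hardware error from APEI Generic Hardware Error Source: 514 (the error was reported through APEI; 514 is the ID of the reporting error source)
- section_type: PCIe error (the error concerns a PCIe device)
- device_id: 0000:43:00.0 (PCI address of the affected device; the second error section names 0000:43:00.1)
- vendor_id: 0x8086, device_id: 0x1563 (PCI vendor ID and device ID of the affected device)
- ixgbe 0000:43:00.0 [...] aer_layer=Data Link Layer (the device is handled by the ixgbe driver, and the error was detected at the PCIe data link layer)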
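The aer_status value in the AER lines of the log is a bit mask, and the kernel names the set bit in the following line ([12] Timeout). As a minimal sketch, such a value can be decoded as follows. The bit names are an assumption based on the Correctable Error Status register of the PCI Express AER capability (the kernel abbreviates some of them), and the function name decode_correctable_status is hypothetical.

```python
# Bit positions of the PCIe Correctable Error Status register.
# Assumption: names follow the PCI Express AER capability definition;
# the kernel log above abbreviates bit 12 as "Timeout".
CORRECTABLE_ERROR_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_correctable_status(aer_status):
    """Return (bit, name) pairs for every bit set in aer_status."""
    result = []
    for bit in range(32):
        if aer_status & (1 << bit):
            name = CORRECTABLE_ERROR_BITS.get(bit, "unknown/reserved")
            result.append((bit, name))
    return result


# Value taken from the log above: aer_status: 0x00001000
print(decode_correctable_status(0x00001000))
```

For the value from the log, only bit 12 is set, which matches the [12] Timeout line printed by the kernel.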
The output of lspci -nn shows that a Dual-Port X550 network card is affected in this example:
lspci -nn | grep 1563
43:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10G X550T [8086:1563] (rev 01)
43:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10G X550T [8086:1563] (rev 01)
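The comparison between the IDs in the log and the lspci output can also be scripted. The following sketch takes the vendor_id, device_id and class_code values from the log and searches lspci -nn output for matching lines. It assumes the line layout shown above (class in the first pair of square brackets, vendor and device ID in the second) and that the first four digits of class_code correspond to the class shown by lspci, as the value 020000 in this log suggests. The function name find_devices is hypothetical.

```python
import re

# Output of "lspci -nn | grep 1563" as shown above.
LSPCI_OUTPUT = """\
43:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10G X550T [8086:1563] (rev 01)
43:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10G X550T [8086:1563] (rev 01)
"""

# Assumed layout: <slot> <class name> [<class>]: <name> [<vendor>:<device>] ...
LSPCI_LINE = re.compile(
    r"^(?P<slot>\S+) (?P<class_name>.+?) \[(?P<class>[0-9a-f]{4})\]: "
    r"(?P<name>.+) \[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)


def find_devices(lspci_text, vendor_id, device_id, class_code):
    """Return (slot, name) for lspci lines matching the IDs from the log.

    vendor_id and device_id are passed as printed in the log ("0x8086"),
    class_code as the six-digit value ("020000"). Only its first four
    digits are compared, because lspci -nn shows a four-digit class.
    """
    vendor = f"{int(vendor_id, 16):04x}"
    device = f"{int(device_id, 16):04x}"
    pci_class = class_code[:4].lower()
    matches = []
    for line in lspci_text.splitlines():
        m = LSPCI_LINE.match(line)
        if m and (m["vendor"], m["device"], m["class"]) == (vendor, device, pci_class):
            matches.append((m["slot"], m["name"]))
    return matches


# Values taken from the log above.
for slot, name in find_devices(LSPCI_OUTPUT, "0x8086", "0x1563", "020000"):
    print(slot, name)
```

Both functions of the card (43:00.0 and 43:00.1) match, which corresponds to the two error sections in the log.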
Possible causes
Three causes are possible in this example:
- A problem with the plug connection of the expansion card.
- A defect of the Dual-Port X550 network card itself.
- A problem with the mainboard.
We recommend the following troubleshooting steps in such cases:
- Remove and reinsert the affected expansion card.
- Replace the affected expansion card (the network card in this example).
- Replace the mainboard.
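Because the errors in this example were corrected by the hardware, it can help to check after each step whether the messages still appear. The following sketch counts the AER status lines per device in a piece of kernel log text. The sample text is a shortened excerpt of the log above. The function name count_aer_events is hypothetical, and how the log text is obtained (for example by reading /var/log/syslog) is left open.

```python
import re
from collections import Counter

# Shortened excerpt of the log above (AER lines only).
LOG_EXCERPT = """\
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.0: AER: aer_status: 0x00001000, aer_mask: 0x00000000
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.0: AER: [12] Timeout
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.1: AER: aer_status: 0x00001000, aer_mask: 0x00000000
[Do Mär 26 07:38:49 2020] ixgbe 0000:43:00.1: AER: [12] Timeout
"""

# In the log above, one "aer_status" line appears per device and event,
# so counting these lines approximates the number of events per device.
AER_STATUS_LINE = re.compile(
    r"(?P<driver>\S+) (?P<address>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AER: aer_status: 0x[0-9a-f]{8}"
)


def count_aer_events(log_text):
    """Count AER status lines per (driver, PCI address)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = AER_STATUS_LINE.search(line)
        if m:
            counts[(m["driver"], m["address"])] += 1
    return counts


for (driver, address), n in sorted(count_aer_events(LOG_EXCERPT).items()):
    print(f"{address} ({driver}): {n}")
```

Comparing the counts for the same period before and after reseating or replacing the card gives a rough indication of whether the step had an effect.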
More information
- Unified Extensible Firmware Interface Forum - Specifications (uefi.org)
- Some PCIe errors not surfaced through rasdaemon (bugs.launchpad.net)
- PCIe AER (Advanced Error Reporting) and ACS (Access Control Services) BIOS Settings for vGPUs that Support SR-IOV (enterprise-support.nvidia.com, 13.10.2021)
- virtualization function under OS on my H11/H12 series motherboard: Which BIOS items should I enable or disable under BIOS (www.supermicro.com/support/faqs)
References
- [1] Advanced Configuration and Power Interface (ACPI) Specification Version 6.3 (uefi.org, 01/2019), chapter 18 ACPI Platform Error Interfaces (APEI) (page 834 ff.)
- [2] APEI Generic Hardware Error Source support (github.com/torvalds/linux)
Author: Werner Fischer
Werner Fischer, working in the Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at conferences such as LinuxTag, OSMC, OSDC and LinuxCon, and an author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.
Translator: Alina Ranzinger
Alina has been working at Thomas-Krenn.AG since 2024. After training as a multilingual business assistant, she became an assistant in Product Management, where she is responsible for translating texts and for organising the department.


