ASUS RS500A-E10-RS12U with PCIe 4.0 NVMe SSD Hardware error from APEI Generic Hardware Error Source 514

From Thomas-Krenn-Wiki

The ASUS RS500A-E10-RS12U (Thomas-Krenn RA1112) logs "Hardware error from APEI Generic Hardware Error Source: 514" messages under Linux when PCIe 4.0 SSDs are used with older BIOS versions. We were able to confirm these errors with both KIOXIA CM6-V U.3 NVMe SSDs (3.2 TB model) and Intel D7-P5510 NVMe SSDs (3.84 TB model). An update to BIOS version 0401 (for AMD EPYC Milan CPUs) and, presumably, BIOS version 4301 (for AMD EPYC Rome CPUs) solves the problem.

Overview of the problem

History of BIOS updates (Milan and Rome) for ASUS RS500A-E10-RS12U.[1]

CPU                  Status                               KIOXIA CM6-V                                  Intel D7-P5510
AMD EPYC 7003 Milan  Problem confirmed with BIOS version  0107                                          0105
AMD EPYC 7003 Milan  Functional BIOS version              0401 (problem could no longer be reproduced)  0401 (expected to solve the problem permanently)
AMD EPYC 7002 Rome   Problem confirmed with BIOS version  (not tested)                                  4201
AMD EPYC 7002 Rome   Functional BIOS version              (not tested)                                  (not tested)

KIOXIA CM6-V with Milan

  • Hardware
    • RS500A-E10-RS12U with AMD EPYC 7443P (Milan)
    • KIOXIA CM6-V 3.2 TB with firmware 105 and 106 (both tested)
  • Software

Problem with BIOS 0107

The error messages appear with the following BIOS version:

  • BIOS: KRPA-U16-M Series, 0107 (Milan), 04/22/2021

Output of dmesg:

[156835.766481] pcieport 0000:80:03.3:    [ 8] Rollover               
[156835.766483] pcieport 0000:80:03.3:    [12] Timeout                
[156835.766485] pcieport 0000:80:03.3: AER: aer_layer=Data Link Layer, aer_agent=Transmitter ID
[156835.767390] pcieport 0000:80:03.4: AER: aer_status: 0x00001100, aer_mask: 0x00000000
[156835.768153] pcieport 0000:80:03.4:    [ 8] Rollover               
[156835.768155] pcieport 0000:80:03.4:    [12] Timeout                
[156835.768156] pcieport 0000:80:03.4: AER: aer_layer=Data Link Layer, aer_agent=Transmitter ID
[156839.590957] {51953}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 514
[156839.590959] {51953}[Hardware Error]: It has been corrected by h/w and requires no further action
[156839.590960] {51953}[Hardware Error]: event severity: corrected
[156839.590960] {51953}[Hardware Error]:  Error 0, type: corrected
[156839.590961] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590962] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590962] {51953}[Hardware Error]:   version: 0.2
[156839.590962] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590963] {51953}[Hardware Error]:   device_id: 0000:81:00.0
[156839.590963] {51953}[Hardware Error]:   slot: 0
[156839.590964] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590964] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590964] {51953}[Hardware Error]:   class_code: 010802
[156839.590965] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.590965] {51953}[Hardware Error]:  Error 1, type: corrected
[156839.590965] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590966] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590966] {51953}[Hardware Error]:   version: 0.2
[156839.590966] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590967] {51953}[Hardware Error]:   device_id: 0000:82:00.0
[156839.590967] {51953}[Hardware Error]:   slot: 0
[156839.590967] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590968] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590968] {51953}[Hardware Error]:   class_code: 010802
[156839.590968] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.590969] {51953}[Hardware Error]:  Error 2, type: corrected
[156839.590969] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590969] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590969] {51953}[Hardware Error]:   version: 0.2
[156839.590970] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590970] {51953}[Hardware Error]:   device_id: 0000:81:00.0
[156839.590970] {51953}[Hardware Error]:   slot: 0
[156839.590971] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590971] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590971] {51953}[Hardware Error]:   class_code: 010802
[156839.590971] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.590972] {51953}[Hardware Error]:  Error 3, type: corrected
[156839.590972] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590972] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590972] {51953}[Hardware Error]:   version: 0.2
[156839.590973] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590973] {51953}[Hardware Error]:   device_id: 0000:82:00.0
[156839.590973] {51953}[Hardware Error]:   slot: 0
[156839.590974] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590974] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590974] {51953}[Hardware Error]:   class_code: 010802
[156839.590975] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.590975] {51953}[Hardware Error]:  Error 4, type: corrected
[156839.590975] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590975] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590976] {51953}[Hardware Error]:   version: 0.2
[156839.590976] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590976] {51953}[Hardware Error]:   device_id: 0000:81:00.0
[156839.590976] {51953}[Hardware Error]:   slot: 0
[156839.590977] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590977] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590977] {51953}[Hardware Error]:   class_code: 010802
[156839.590978] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.590978] {51953}[Hardware Error]:  Error 5, type: corrected
[156839.590978] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590978] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590979] {51953}[Hardware Error]:   version: 0.2
[156839.590979] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590979] {51953}[Hardware Error]:   device_id: 0000:82:00.0
[156839.590980] {51953}[Hardware Error]:   slot: 0
[156839.590980] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590980] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590980] {51953}[Hardware Error]:   class_code: 010802
[156839.590981] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.590981] {51953}[Hardware Error]:  Error 6, type: corrected
[156839.590981] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590981] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590982] {51953}[Hardware Error]:   version: 0.2
[156839.590982] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590982] {51953}[Hardware Error]:   device_id: 0000:81:00.0
[156839.590983] {51953}[Hardware Error]:   slot: 0
[156839.590983] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590983] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590983] {51953}[Hardware Error]:   class_code: 010802
[156839.590984] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.590984] {51953}[Hardware Error]:  Error 7, type: corrected
[156839.590984] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590985] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590985] {51953}[Hardware Error]:   version: 0.2
[156839.590985] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590985] {51953}[Hardware Error]:   device_id: 0000:82:00.0
[156839.590986] {51953}[Hardware Error]:   slot: 0
[156839.590986] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590986] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590986] {51953}[Hardware Error]:   class_code: 010802
[156839.590987] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.590987] {51953}[Hardware Error]:  Error 8, type: corrected
[156839.590987] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590988] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590988] {51953}[Hardware Error]:   version: 0.2
[156839.590988] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590988] {51953}[Hardware Error]:   device_id: 0000:81:00.0
[156839.590989] {51953}[Hardware Error]:   slot: 0
[156839.590989] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590989] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590989] {51953}[Hardware Error]:   class_code: 010802
[156839.590990] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.590990] {51953}[Hardware Error]:  Error 9, type: corrected
[156839.590990] {51953}[Hardware Error]:   section_type: PCIe error
[156839.590991] {51953}[Hardware Error]:   port_type: 0, PCIe end point
[156839.590991] {51953}[Hardware Error]:   version: 0.2
[156839.590991] {51953}[Hardware Error]:   command: 0x0406, status: 0x0010
[156839.590991] {51953}[Hardware Error]:   device_id: 0000:82:00.0
[156839.590992] {51953}[Hardware Error]:   slot: 0
[156839.590992] {51953}[Hardware Error]:   secondary_bus: 0x00
[156839.590992] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007
[156839.590993] {51953}[Hardware Error]:   class_code: 010802
[156839.590993] {51953}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[156839.591399] nvme 0000:81:00.0: AER: aer_status: 0x00000081, aer_mask: 0x00000000
[156839.591754] nvme 0000:81:00.0:    [ 0] RxErr                  (First)
[156839.591755] nvme 0000:81:00.0:    [ 7] BadDLLP                
[156839.591756] nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[156839.592048] nvme 0000:82:00.0: AER: aer_status: 0x00000041, aer_mask: 0x00000000
[156839.592335] nvme 0000:82:00.0:    [ 0] RxErr                  (First)
[156839.592336] nvme 0000:82:00.0:    [ 6] BadTLP                 
[156839.592336] nvme 0000:82:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[156839.592624] nvme 0000:81:00.0: AER: aer_status: 0x00000081, aer_mask: 0x00000000
[156839.593221] nvme 0000:81:00.0:    [ 0] RxErr                  (First)
[156839.593221] nvme 0000:81:00.0:    [ 7] BadDLLP                
[156839.593222] nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[156839.593510] nvme 0000:82:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
[156839.593794] nvme 0000:82:00.0:    [ 0] RxErr                  (First)
[156839.593795] nvme 0000:82:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[156839.594077] nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
[156839.594578] nvme 0000:81:00.0:    [ 0] RxErr                  (First)
[156839.594578] nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[156839.594949] nvme 0000:82:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
[156839.595229] nvme 0000:82:00.0:    [ 0] RxErr                  (First)
[156839.595230] nvme 0000:82:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[156839.595512] nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
[156839.595792] nvme 0000:81:00.0:    [ 0] RxErr                  (First)
[156839.595793] nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[156839.596076] nvme 0000:82:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
[156839.596356] nvme 0000:82:00.0:    [ 0] RxErr                  (First)
[156839.596356] nvme 0000:82:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[156839.596640] nvme 0000:81:00.0: AER: aer_status: 0x00000081, aer_mask: 0x00000000
[156839.597218] nvme 0000:81:00.0:    [ 0] RxErr                  (First)
[156839.597219] nvme 0000:81:00.0:    [ 7] BadDLLP                
[156839.597219] nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[156839.597502] nvme 0000:82:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
[156839.597782] nvme 0000:82:00.0:    [ 0] RxErr                  (First)
[156839.597783] nvme 0000:82:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
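
The aer_status values in the AER lines are bit masks, and the bracketed numbers on the lines that follow them ([ 0] RxErr, [ 7] BadDLLP, and so on) are the positions of the set bits. The following minimal Python sketch decodes such a mask. The function name decode_aer_status is hypothetical, and the lookup table deliberately contains only the bit names that appear in the log excerpts of this article, so any other bit is reported generically.

```python
from typing import List

# Bit positions and names copied from the AER lines in this article's logs.
# This is intentionally not a complete list of all AER error bits.
CORRECTABLE_ERROR_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
}


def decode_aer_status(status: int) -> List[str]:
    """Return the names of all bits set in a 32-bit aer_status value."""
    names = []
    for bit in range(32):
        if status & (1 << bit):
            # Bits missing from the table are labelled instead of guessed.
            names.append(CORRECTABLE_ERROR_BITS.get(bit, f"bit {bit} (not in table)"))
    return names


if __name__ == "__main__":
    for value in (0x00001100, 0x00000081, 0x00000041):
        print(f"0x{value:08x}", decode_aer_status(value))
```

For example, 0x00001100 has bits 8 and 12 set, which matches the Rollover and Timeout lines reported for the pcieport devices.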

Solution with BIOS 0401

After updating the BIOS to the following version, the errors no longer appear in the dmesg output:

  • BIOS 0401
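
One way to compare the situation before and after a BIOS update is to count the PCIe error sections per device in a saved dmesg dump. The following Python sketch does this under the assumption that the log has been saved as plain text in the format shown in this article; the function name count_pcie_error_sections is hypothetical.

```python
import re
from collections import Counter

# Matches "device_id: 0000:81:00.0" lines inside a [Hardware Error] record.
# The "vendor_id: 0x1e0f, device_id: 0x0007" lines do not match, because
# "device_id" does not directly follow the "[Hardware Error]:" prefix there.
DEVICE_LINE = re.compile(
    r"\[Hardware Error\]:\s+device_id: "
    r"([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
)


def count_pcie_error_sections(log_text: str) -> Counter:
    """Count [Hardware Error] PCIe sections per PCI address."""
    return Counter(DEVICE_LINE.findall(log_text))


if __name__ == "__main__":
    sample = (
        "[156839.590963] {51953}[Hardware Error]:   device_id: 0000:81:00.0\n"
        "[156839.590964] {51953}[Hardware Error]:   vendor_id: 0x1e0f, device_id: 0x0007\n"
        "[156839.590967] {51953}[Hardware Error]:   device_id: 0000:82:00.0\n"
    )
    for address, count in count_pcie_error_sections(sample).most_common():
        print(address, count)
```

An empty result for a log collected after the update is consistent with the observation above, but it only covers the period the log spans.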

Intel D7-P5510 with Milan

Problem with BIOS 0105

test@test:~$ uname -a
Linux test 5.13.0-30-generic #33~20.04.1-Ubuntu SMP Mon Feb 7 14:25:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
test@test:~$ tail -n 100 /var/log/syslog
[...]
Feb 17 15:49:26 test systemd[1]: Started Session 1 of user test.
Feb 17 15:49:36 test kernel: [  111.046733] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 514
Feb 17 15:49:36 test kernel: [  111.046738] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
Feb 17 15:49:36 test kernel: [  111.046740] {1}[Hardware Error]: event severity: corrected
Feb 17 15:49:36 test kernel: [  111.046742] {1}[Hardware Error]:  Error 0, type: corrected
Feb 17 15:49:36 test kernel: [  111.046743] {1}[Hardware Error]:  fru_text: PcieError
Feb 17 15:49:36 test kernel: [  111.046745] {1}[Hardware Error]:   section_type: PCIe error
Feb 17 15:49:36 test kernel: [  111.046746] {1}[Hardware Error]:   port_type: 0, PCIe end point
Feb 17 15:49:36 test kernel: [  111.046748] {1}[Hardware Error]:   version: 0.2
Feb 17 15:49:36 test kernel: [  111.046749] {1}[Hardware Error]:   command: 0x0406, status: 0x0010
Feb 17 15:49:36 test kernel: [  111.046751] {1}[Hardware Error]:   device_id: 0000:41:00.0
Feb 17 15:49:36 test kernel: [  111.046753] {1}[Hardware Error]:   slot: 0
Feb 17 15:49:36 test kernel: [  111.046754] {1}[Hardware Error]:   secondary_bus: 0x00
Feb 17 15:49:36 test kernel: [  111.046755] {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x0b60
Feb 17 15:49:36 test kernel: [  111.046757] {1}[Hardware Error]:   class_code: 010802
Feb 17 15:49:36 test kernel: [  111.046758] {1}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
Feb 17 15:49:36 test kernel: [  111.048269] nvme 0000:41:00.0: AER: aer_status: 0x00002001, aer_mask: 0x00000000
Feb 17 15:49:36 test kernel: [  111.048311] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
Feb 17 15:49:36 test kernel: [  111.048314] nvme 0000:41:00.0:    [13] NonFatalErr
Feb 17 15:49:36 test kernel: [  111.048317] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
Feb 17 15:49:54 test systemd-timesyncd[1119]: Timed out waiting for reply from 91.189.91.157:123 (ntp.ubuntu.com).
Feb 17 15:50:04 test systemd-timesyncd[1119]: Timed out waiting for reply from 91.189.89.198:123 (ntp.ubuntu.com).
Feb 17 15:50:15 test systemd-timesyncd[1119]: Timed out waiting for reply from 91.189.89.199:123 (ntp.ubuntu.com).
Feb 17 15:50:25 test systemd-timesyncd[1119]: Timed out waiting for reply from 91.189.94.4:123 (ntp.ubuntu.com).
Feb 17 15:50:25 test kernel: [  160.457304] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 514
Feb 17 15:50:25 test kernel: [  160.457310] {2}[Hardware Error]: It has been corrected by h/w and requires no further action
Feb 17 15:50:25 test kernel: [  160.457311] {2}[Hardware Error]: event severity: corrected
Feb 17 15:50:25 test kernel: [  160.457314] {2}[Hardware Error]:  Error 0, type: corrected
Feb 17 15:50:25 test kernel: [  160.457316] {2}[Hardware Error]:   section_type: PCIe error
Feb 17 15:50:25 test kernel: [  160.457317] {2}[Hardware Error]:   port_type: 0, PCIe end point
Feb 17 15:50:25 test kernel: [  160.457318] {2}[Hardware Error]:   version: 0.2
Feb 17 15:50:25 test kernel: [  160.457319] {2}[Hardware Error]:   command: 0x0406, status: 0x0010
Feb 17 15:50:25 test kernel: [  160.457321] {2}[Hardware Error]:   device_id: 0000:41:00.0
Feb 17 15:50:25 test kernel: [  160.457323] {2}[Hardware Error]:   slot: 0
Feb 17 15:50:25 test kernel: [  160.457324] {2}[Hardware Error]:   secondary_bus: 0x00
Feb 17 15:50:25 test kernel: [  160.457325] {2}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x0b60
Feb 17 15:50:25 test kernel: [  160.457327] {2}[Hardware Error]:   class_code: 010802
Feb 17 15:50:25 test kernel: [  160.457328] {2}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
Feb 17 15:50:25 test kernel: [  160.457395] nvme 0000:41:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
Feb 17 15:50:25 test kernel: [  160.457458] nvme 0000:41:00.0:    [ 0] RxErr                  (First)
Feb 17 15:50:25 test kernel: [  160.457461] nvme 0000:41:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
test@test:~$ sudo nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     A0707428             WUS4BB096D7P3E4                          1         960.20  GB / 960.20  GB    512   B +  0 B   R1410002
/dev/nvme1n1     A07073DC             WUS4BB096D7P3E4                          1           1.09  GB / 960.20  GB    512   B +  0 B   R1410002
/dev/nvme2n1     BTAC111204C07P6CGN   INTEL SSDPF2KX076TZ                      1           7.68  TB /   7.68  TB    512   B +  0 B   JCV10100
test@test:~$ sudo dmidecode -t bios
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.3.0 present.
# SMBIOS implementations newer than version 3.2.0 are not
# fully supported by this version of dmidecode.

Handle 0x0000, DMI type 0, 26 bytes
BIOS Information
        Vendor: American Megatrends Inc.
        Version: 0105
        Release Date: 03/26/2021
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 16 MB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                BIOS ROM is socketed
                EDD is supported
                Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                CGA/mono video services are supported (int 10h)
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 1.5

Handle 0x004C, DMI type 13, 22 bytes
BIOS Language Information
        Language Description Format: Long
        Installable Languages: 1
                en|US|iso8859-1
        Currently Installed Language: en|US|iso8859-1

test@test:~$
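
The BIOS version reported by dmidecode can be compared against the versions for which this article confirmed the errors (0105 and 0107 for Milan, 4201 for Rome). The following Python sketch parses saved output of "dmidecode -t bios"; the names parse_bios_version and is_affected are hypothetical, and the check is a plain lookup against those three version strings.

```python
import re
from typing import Optional

# BIOS versions with which this article confirmed the errors.
AFFECTED_BIOS_VERSIONS = {"0105", "0107", "4201"}


def parse_bios_version(dmidecode_output: str) -> Optional[str]:
    """Return the value of the 'Version:' field, or None if it is missing."""
    match = re.search(r"^[ \t]*Version:[ \t]*(\S+)", dmidecode_output, re.MULTILINE)
    return match.group(1) if match else None


def is_affected(dmidecode_output: str) -> bool:
    """True if the reported BIOS version is one of the confirmed problem versions."""
    return parse_bios_version(dmidecode_output) in AFFECTED_BIOS_VERSIONS


if __name__ == "__main__":
    sample = "BIOS Information\n\tVendor: American Megatrends Inc.\n\tVersion: 0105\n"
    print(parse_bios_version(sample), is_affected(sample))
```

A version that is not in the set is not proof that the system is free of the problem; the set only reflects the versions tested here.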

Solution with BIOS 0401 suspected

As with the KIOXIA CM6-V, we suspect that an update to a newer BIOS version (0202 or 0401) will solve the problem. Tests for this are still pending (as of February 23, 2022). We will update this article as soon as the tests have been concluded.

Intel D7-P5510 with Rome

Problem with BIOS 4201

  • Hardware
    • RS500A-E10-RS12U with AMD EPYC 7702P (Rome)
      • BIOS: KRPA-U16 Series, 4201 (Rome), 03/26/2021
    • Intel D7-P5510 3.84 TB with firmware JCV10200
  • Software
    • Ubuntu 20.04 with HWE kernel version 5.11

Output of dmesg:

[   51.558131] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 514
[   51.558137] {2}[Hardware Error]: It has been corrected by h/w and requires no further action
[   51.558140] {2}[Hardware Error]: event severity: corrected
[   51.558142] {2}[Hardware Error]:  Error 0, type: corrected
[   51.558144] {2}[Hardware Error]:   section_type: PCIe error
[   51.558146] {2}[Hardware Error]:   port_type: 0, PCIe end point
[   51.558148] {2}[Hardware Error]:   version: 0.2
[   51.558149] {2}[Hardware Error]:   command: 0x0406, status: 0x0010
[   51.558152] {2}[Hardware Error]:   device_id: 0000:81:00.0
[   51.558154] {2}[Hardware Error]:   slot: 0
[   51.558156] {2}[Hardware Error]:   secondary_bus: 0x00
[   51.558157] {2}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x0b60
[   51.558159] {2}[Hardware Error]:   class_code: 010802
[   51.558161] {2}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[   51.558208] nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
[   51.558257] nvme 0000:81:00.0:    [ 0] RxErr                  (First)
[   51.558261] nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[  117.605908] {3}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 514
[  117.605913] {3}[Hardware Error]: It has been corrected by h/w and requires no further action
[  117.605915] {3}[Hardware Error]: event severity: corrected
[  117.605917] {3}[Hardware Error]:  Error 0, type: corrected
[  117.605919] {3}[Hardware Error]:   section_type: PCIe error
[  117.605920] {3}[Hardware Error]:   port_type: 0, PCIe end point
[  117.605922] {3}[Hardware Error]:   version: 0.2
[  117.605924] {3}[Hardware Error]:   command: 0x0406, status: 0x0010
[  117.605926] {3}[Hardware Error]:   device_id: 0000:81:00.0
[  117.605929] {3}[Hardware Error]:   slot: 0
[  117.605930] {3}[Hardware Error]:   secondary_bus: 0x00
[  117.605932] {3}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x0b60
[  117.605934] {3}[Hardware Error]:   class_code: 010802
[  117.605935] {3}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
[  117.605974] nvme 0000:81:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
[  117.606024] nvme 0000:81:00.0:    [ 0] RxErr                  (First)
[  117.606028] nvme 0000:81:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID
[...]
    $ sudo nvme list
    Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
    ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
    /dev/nvme0n1     PHAC113600473P8AGN   INTEL SSDPF2KX038TZ                      1           3.84  TB /   3.84  TB    512   B +  0 B   JCV10200
    /dev/nvme1n1     18161E7964B7         Micron_9200_MTFDHAL1T6TCU                1           1.60  TB /   1.60  TB    512   B +  0 B   101008P0
    $ uname -a
    Linux admin 5.11.0-41-generic #45~20.04.1-Ubuntu SMP Wed Nov 10 10:20:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
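
To see how often the error records occur, the header line of each record can be extracted together with its kernel timestamp. The following Python sketch works on saved dmesg or syslog text in the formats shown in this article; the names find_ghes_records and gaps_between_records are hypothetical.

```python
import re
from typing import List, Tuple

# Header line of an error record, for example:
# "[   51.558131] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 514"
RECORD_HEADER = re.compile(
    r"\[\s*(\d+\.\d+)\]\s*\{(\d+)\}\[Hardware Error\]: "
    r"Hardware error from APEI Generic Hardware Error Source: (\d+)"
)


def find_ghes_records(log_text: str) -> List[Tuple[float, int, int]]:
    """Return (kernel timestamp, record number, error source id) per record."""
    return [
        (float(timestamp), int(record), int(source))
        for timestamp, record, source in RECORD_HEADER.findall(log_text)
    ]


def gaps_between_records(records: List[Tuple[float, int, int]]) -> List[float]:
    """Return the time in seconds between consecutive records."""
    times = [timestamp for timestamp, _, _ in records]
    return [round(later - earlier, 6) for earlier, later in zip(times, times[1:])]
```

Applied to the excerpt above, this yields records {2} and {3}, both from error source 514, roughly 66 seconds apart.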

Solution with BIOS 4301 suspected

We suspect that an update to BIOS version 4301 (Rome) solves the problem. ASUS released version 4301 (Rome) on January 3, 2022, together with BIOS version 0401 (Milan). Since BIOS update 0401 resolved these problems with KIOXIA CM6-V SSDs on Milan-based systems, we assume that BIOS version 4301 (Rome) will also solve the problem here.

Since such configurations have not been requested so far, we have not yet performed any tests.

References

Author: Werner Fischer

Werner Fischer, working in the Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.


Translator: Alina Ranzinger

Alina has been working at Thomas-Krenn.AG since 2024. After training as a multilingual business assistant, she joined Product Management as an assistant, where she is responsible for translating texts and for the organisation of the department.

