MegaRAID 9341-4i Debian 11 DMAR DRHD handling fault status reg 3


After installing Debian 11 on a system with a MegaRAID 9341-4i RAID controller, messages such as "AVAGO EFI SAS Driver is Unhealthy" may appear during the boot process, and "DMAR: DRHD: handling fault status reg 3" may appear in the system log. This article explains how to solve the problem with the kernel parameters intel_iommu=on iommu=pt.

Description of problem

After installing Debian 11, the following problems appear on a system with a MegaRAID 9341-4i RAID controller:

  • Message during boot process:
    Avago EFI SAS Driver is Unhealthy
  • Message in the BIOS:
    L2/L3 Cache error was detected on the RAID controller. Please contact technical support to resolve this issue. Press 'X' to continue or else power off the system, replace the controller and reboot.
  • Error message in Syslog:
    DMAR: DRHD: handling fault status reg 3

The error messages in the syslog in detail:

[ 44.978734] megasas: 07.714.04.00-rc1
[ 44.979059] megaraid_sas 0000:02:00.0: BAR:0x1 BAR's base_addr(phys):0x0000000091300000 mapped virt_addr:0x00000000e2c49ffb
[ 44.979061] megaraid_sas 0000:02:00.0: FW now in Ready state
[ 44.979061] megaraid_sas 0000:02:00.0: 63 bit DMA mask and 32 bit consistent mask
[ 44.979181] megaraid_sas 0000:02:00.0: firmware supports msix : (96)
[ 44.979388] megaraid_sas 0000:02:00.0: requested/available msix 13/13
[ 44.979389] megaraid_sas 0000:02:00.0: current msix/online cpus : (13/12)
[ 44.979390] megaraid_sas 0000:02:00.0: RDPQ mode : (disabled)
[ 44.979391] megaraid_sas 0000:02:00.0: Current firmware supports maximum commands: 272 LDIO threshold: 237
[ 44.979425] megaraid_sas 0000:02:00.0: Configured max firmware commands: 271
[ 44.979686] megaraid_sas 0000:02:00.0: Performance mode :Latency
[ 44.979687] megaraid_sas 0000:02:00.0: FW supports sync cache : Yes
[ 44.979688] megaraid_sas 0000:02:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[ 45.235288] DMAR: DRHD: handling fault status reg 3
[ 45.235366] DMAR: [DMA Write] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 05] PTE Write access is not set
[ 45.236414] DMAR: DRHD: handling fault status reg 3
[ 45.236492] DMAR: [DMA Read] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 06] PTE Read access is not set
[ 46.289908] DMAR: DRHD: handling fault status reg 3
[ 46.289986] DMAR: [DMA Read] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 06] PTE Read access is not set
[ 47.353019] DMAR: DRHD: handling fault status reg 3
[ 50.542825] dmar_fault: 8 callbacks suppressed ...
[ 291.968323] dmar_fault: 5 callbacks suppressed
[ 291.968327] DMAR: DRHD: handling fault status reg 3
[ 291.973328] DMAR: [DMA Read] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 06] PTE Read access is not set
[ 293.031635] DMAR: DRHD: handling fault status reg 3
[ 293.034716] DMAR: [DMA Read] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 06] PTE Read access is not set
[ 294.094631] DMAR: DRHD: handling fault status reg 3
[ 294.098574] DMAR: [DMA Read] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 06] PTE Read access is not set
[ 295.157716] DMAR: DRHD: handling fault status reg 3
[ 296.988718] megaraid_sas 0000:02:00.0: Init cmd return status FAILED for SCSI host 9
[ 296.994525] megaraid_sas 0000:02:00.0: Failed from megasas_init_fw 6460
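
To see at a glance which device triggers the faults and how often, the DMAR lines of such a log can be counted. The following is a minimal sketch in Python, not part of the original analysis; the function name summarize_dmar_faults and the regular expression are assumptions written to match the line format shown above. In practice the input text would come from the kernel log (for example the output of dmesg).

```python
import re
from collections import Counter

# Matches fault lines in the format shown in the log above, for example:
# DMAR: [DMA Read] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 06] ...
FAULT_RE = re.compile(
    r"DMAR: \[(?P<op>DMA (?:Read|Write))\] Request device \[(?P<dev>[0-9a-fA-F:.]+)\]"
    r".*?fault addr (?P<addr>[0-9a-fA-F]+) \[fault reason (?P<reason>\d+)\]"
)


def summarize_dmar_faults(log_text):
    """Count DMAR faults per (device, operation, fault reason).

    Hypothetical helper for illustration; lines that do not match
    the pattern are ignored.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            counts[(match["dev"], match["op"], match["reason"])] += 1
    return counts


# Three fault lines copied from the log above.
sample = """\
[ 45.235366] DMAR: [DMA Write] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 05] PTE Write access is not set
[ 45.236492] DMAR: [DMA Read] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 06] PTE Read access is not set
[ 46.289986] DMAR: [DMA Read] Request device [02:00.0] PASID ffffffff fault addr 3ffb0000 [fault reason 06] PTE Read access is not set
"""

for (dev, op, reason), count in sorted(summarize_dmar_faults(sample).items()):
    print(f"{dev} {op} reason {reason}: {count}")
```

In the log above, every fault comes from device 02:00.0, which is the PCI address of the megaraid_sas controller.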

Affected systems

The problem appeared with the following hardware/software.

Hardware:

  • Supermicro mainboard X11SCH-LN4F
  • MegaRAID 9341-4i RAID controller

Software:

  • Debian 11 with Linux kernel 5.10.0-11 and megaraid_sas module version 07.714.04.00-rc1 (as shipped with Debian). Even the latest Broadcom module (07.719.04.00) does not resolve the problem.

With Debian 10, Linux kernel 4.19.0-18 and megaraid_sas module version 07.706.03.00-rc1, the problem does not appear.

Cause

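The error message DMAR: DRHD: handling fault status reg 3 indicates a problem involving the IOMMU. In the log above, the DMA read and write requests of device 02:00.0 (the RAID controller) are rejected because the corresponding page table entries do not grant read or write access (fault reasons 05 and 06). The terms have the following meanings:[1][2]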

  • DMAR = DMA Remapping Reporting
  • DRHD = DMA Remapping Hardware Unit Definition

Solution

To solve the issue, the Intel IOMMU functions must be activated in the Linux kernel and the IOMMU must be set to pass-through mode.

To do this, set the following kernel parameters (via /etc/default/grub):

  • intel_iommu=on iommu=pt
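
The article only names the file /etc/default/grub. On Debian, kernel parameters are usually appended to the GRUB_CMDLINE_LINUX_DEFAULT variable in that file and take effect after running update-grub and rebooting; both of these details are assumptions here. The following Python sketch shows the edit on the file's text and a check against the contents of /proc/cmdline after the reboot. The function names add_kernel_params and missing_params are hypothetical, and the sketch only handles the double-quoted form of the variable.

```python
import re

# Parameters named in this article.
PARAMS = ("intel_iommu=on", "iommu=pt")


def add_kernel_params(grub_text, params=PARAMS, var="GRUB_CMDLINE_LINUX_DEFAULT"):
    """Return grub_text with params appended to var.

    Parameters that are already present are not added again.
    Assumes the variable is written as VAR="..." on a single line.
    """
    pattern = re.compile(
        rf'^{re.escape(var)}="(?P<value>[^"]*)"[ \t]*$', re.MULTILINE
    )
    match = pattern.search(grub_text)
    if match is None:
        raise ValueError(f"{var} not found in double-quoted form")
    words = match["value"].split()
    words += [p for p in params if p not in words]
    joined = " ".join(words)
    new_line = f'{var}="{joined}"'
    return grub_text[:match.start()] + new_line + grub_text[match.end():]


def missing_params(cmdline_text, params=PARAMS):
    """Return the params that are absent from a kernel command line string."""
    active = cmdline_text.split()
    return [p for p in params if p not in active]


# Example on an in-memory copy; the real file is /etc/default/grub.
example = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet"\n'
print(add_kernel_params(example))
```

Working on the text instead of the file itself keeps the sketch free of side effects; writing the result back to /etc/default/grub requires root privileges, and a backup copy of the file is advisable beforehand. After the reboot, passing the contents of /proc/cmdline to missing_params should return an empty list if both parameters are active.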

References


Author: Werner Fischer

Werner Fischer, working in the Knowledge Transfer team at Thomas-Krenn, completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at many conferences like LinuxTag, OSMC, OSDC, LinuxCon, and author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.


Translator: Alina Ranzinger

Alina has been working at Thomas-Krenn.AG since 2024. After training as a multilingual business assistant, she joined Product Management as an assistant, where she is responsible for translating texts and for the organisation of the department.

