On 19/07/2019 20:31, Roman Shaposhnik wrote:
> Hi!
>
> we're using Xen on Advantech ARK-2250 Embedded Box PC:
> https://www.elmark.com.pl/web/uploaded/karty_produktow/advantech/ark-2250l/ark-2250l_instrukcja-uzytkownika.pdf
>
> After upgrading to Xen 4.12.0 from 4.11.0 we now have to utilize iommu=no-igfx
> workaround as per:
> https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#iommu
>
> Without the workaround the screen appears to be garbled with colored
> static noise and the following message keeps showing up:
> (XEN) printk: 26235 messages suppressed.
> (XEN) [VT-D]DMAR:[DMA Read] Request device [0000:00:02.0] fault addr
> 8e43c000, iommu reg = ffff82c00021d000
> (XEN) printk: 26303 messages suppressed.
> (XEN) [VT-D]DMAR:[DMA Read] Request device [0000:00:02.0] fault addr
> 8e2c6000, iommu reg = ffff82c00021d000
>
> Once iommu=no-igfx is applied the box can boot fine.
>
> At the end of this email, you can see a full log of the box booting
> all the way into Dom0 with iommu=no-igfx applied. I am also attaching
> similar log without no-igfx
>
> Please let me know if you need any more information to help us diagnose this.

This will be a consequence of trying to remove various pieces of
stupidity with the preexisting IOMMU logic, in an attempt to unify the
PV and PVH paths.

As for the symptoms you're seeing, that is because the GPU is not being
given access to the RAM stolen for graphics purposes.

Picking the log apart:

(XEN) EFI RAM map:
(XEN)  0000000000000000 - 0000000000058000 (usable)
(XEN)  0000000000058000 - 0000000000059000 (reserved)
(XEN)  0000000000059000 - 000000000009f000 (usable)
(XEN)  000000000009f000 - 00000000000a0000 (reserved)
(XEN)  0000000000100000 - 000000008648a000 (usable)
(XEN)  000000008648a000 - 000000008648b000 (ACPI NVS)
(XEN)  000000008648b000 - 00000000864b5000 (reserved)
(XEN)  00000000864b5000 - 000000008c224000 (usable)
(XEN)  000000008c224000 - 000000008c528000 (reserved)
(XEN)  000000008c528000 - 000000008c736000 (usable)
(XEN)  000000008c736000 - 000000008cea7000 (ACPI NVS)
(XEN)  000000008cea7000 - 000000008d2ff000 (reserved)
(XEN)  000000008d2ff000 - 000000008d300000 (usable)
(XEN)  000000008d300000 - 000000008d400000 (reserved)
(XEN)  00000000e0000000 - 00000000f0000000 (reserved)
(XEN)  00000000fe000000 - 00000000fe011000 (reserved)
(XEN)  00000000fec00000 - 00000000fec01000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ff000000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 000000016e000000 (usable)
...
(XEN) Enabling APIC mode: Flat. Using 1 I/O APICs
(XEN) [VT-D] RMRR address range 8d800000..8fffffff not in reserved memory; need "iommu_inclusive_mapping=1"?
(XEN) Switched to APIC driver x2apic_cluster.
...
(XEN) [VT-D]DMAR:[DMA Read] Request device [0000:00:02.0] fault addr 8e480000, iommu reg = ffff82c00021d000
(XEN) [VT-D]DMAR: reason 06 - PTE Read access is not set
(XEN) [VT-D]INTR-REMAP: Request device [0000:f0:1f.0] fault index 0, iommu reg = ffff82c00021f000
(XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear

The RMRR identified sits in a hole in the e820, and it is the range
which is causing the IOMMU faults (all of the reported fault addresses
fall inside 8d800000..8fffffff). Clearly it isn't being set up
correctly.

First of all, can you check what effect booting with
iommu_inclusive_mapping=1 has please? While at it, iommu=debug would
also be helpful.

Back to the log. Strictly speaking, this is a violation of the VT-d
spec.
Section 8.4 Reserved Memory Region Reporting Structure says:

  "BIOS must report the RMRR reported memory addresses as reserved (or
  as EFI runtime) in the system memory map returned through methods
  such as INT15, EFI GetMemoryMap etc."

However, Xen's logic here is very broken, and in need of fixing.

For that message, it only checks whether the first and last addresses
are reserved, not the entire region, which will give it plenty of false
negatives.

For RMRRs themselves, system firmware is well known for abiding by the
spec [citation needed], but an RMRR must be honoured, because their
entire purpose is to state "this device won't function without access
to this area".

An RMRR in a hole, while a violation of the spec, is obviously fine to
use in practice, so we should just accept it and stop complaining.

OTOH, RMRRs which hit other memory (particularly RAM) need more care,
and we probably want to forcibly override the e820 to mark them
reserved. Nothing good will come from trusting the e820 over the DMAR
table here, seeing as there is clearly an error somewhere in the
firmware-provided information.

However, I'm struggling to locate anywhere which actually walks dom0's
RMRR list and inserts the entries into the IOMMU. Anyone got any hints?

~Andrew
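
To make the first/last-address point above concrete, here is a minimal
sketch in Python (not Xen code). The map, RMRR range and fault addresses
are copied from the log above; `type_at`, `endpoints_reserved`,
`types_covering` and the `HYPOTHETICAL` map are names invented for
illustration, and `endpoints_reserved` only mimics the behaviour
described above rather than reproducing Xen's actual implementation.

```python
# E820 entries from the boot log: (start, end_exclusive, type).
E820 = [
    (0x000000000, 0x000058000, "usable"),
    (0x000058000, 0x000059000, "reserved"),
    (0x000059000, 0x00009F000, "usable"),
    (0x00009F000, 0x0000A0000, "reserved"),
    (0x000100000, 0x08648A000, "usable"),
    (0x08648A000, 0x08648B000, "ACPI NVS"),
    (0x08648B000, 0x0864B5000, "reserved"),
    (0x0864B5000, 0x08C224000, "usable"),
    (0x08C224000, 0x08C528000, "reserved"),
    (0x08C528000, 0x08C736000, "usable"),
    (0x08C736000, 0x08CEA7000, "ACPI NVS"),
    (0x08CEA7000, 0x08D2FF000, "reserved"),
    (0x08D2FF000, 0x08D300000, "usable"),
    (0x08D300000, 0x08D400000, "reserved"),
    (0x0E0000000, 0x0F0000000, "reserved"),
    (0x0FE000000, 0x0FE011000, "reserved"),
    (0x0FEC00000, 0x0FEC01000, "reserved"),
    (0x0FEE00000, 0x0FEE01000, "reserved"),
    (0x0FF000000, 0x100000000, "reserved"),
    (0x100000000, 0x16E000000, "usable"),
]

# RMRR range and faulting addresses, also from the log.
RMRR_START, RMRR_LAST = 0x8D800000, 0x8FFFFFFF
FAULTS = [0x8E43C000, 0x8E2C6000, 0x8E480000]


def type_at(e820, addr):
    """Type of the entry containing addr, or "hole" if none does."""
    for lo, hi, kind in e820:
        if lo <= addr < hi:
            return kind
    return "hole"


def endpoints_reserved(e820, start, last):
    """Endpoint-only check: looks at the first and last address only."""
    return (type_at(e820, start) == "reserved"
            and type_at(e820, last) == "reserved")


def types_covering(e820, start, last):
    """Every type overlapping [start, last], with "hole" for gaps.

    Assumes the entries do not overlap one another.
    """
    end = last + 1
    cursor = start
    found = set()
    for lo, hi, kind in sorted(e820):
        if hi <= cursor or lo >= end:
            continue  # entry lies entirely outside the remaining range
        if lo > cursor:
            found.add("hole")  # gap between cursor and this entry
        found.add(kind)
        cursor = hi
        if cursor >= end:
            break
    if cursor < end:
        found.add("hole")  # tail of the range not described by any entry
    return found


# Hypothetical map: reserved at both ends, RAM in the middle.
HYPOTHETICAL = [
    (0x1000, 0x2000, "reserved"),
    (0x2000, 0x3000, "usable"),
    (0x3000, 0x4000, "reserved"),
]

print(sorted(types_covering(E820, RMRR_START, RMRR_LAST)))
print(all(RMRR_START <= a <= RMRR_LAST for a in FAULTS))
print(endpoints_reserved(HYPOTHETICAL, 0x1000, 0x3FFF))
print(sorted(types_covering(HYPOTHETICAL, 0x1000, 0x3FFF)))
```

For the map in the log, the whole RMRR falls in a hole and every
reported fault address lies inside it. The hypothetical map shows the
case the endpoint-only check misses: both ends are reserved, so it is
satisfied, even though the range also covers usable RAM.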