iommu.lists.linux-foundation.org archive mirror
* [PATCH] Revert "iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions"
@ 2020-09-23  2:26 Baoquan He
  2020-09-23  2:32 ` Baoquan He
  2020-09-24  9:08 ` Joerg Roedel
  0 siblings, 2 replies; 4+ messages in thread
From: Baoquan He @ 2020-09-23  2:26 UTC
  To: joro, ahuang12; +Cc: iommu, linux-kernel

A kdump kernel boot regression was reported on an HPE system.
Bisection points to commit 387caf0b759ac43 ("iommu/amd: Treat per-device
exclusion ranges as r/w unity-mapped regions") as the culprit. Reverting it
fixes the failure.

With the commit applied, the kdump kernel always prints the error message
below, and the AMD IOMMU then cannot function normally during kdump kernel
boot.

  ~~~~~~~~~
  AMD-Vi: [Firmware Bug]: IVRS invalid checksum

Why commit 387caf0b759ac43 causes this has not been determined yet.
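
For context, the message above relates to the ACPI table checksum rule: all
bytes of a table, including its checksum byte, must sum to zero modulo 256.
The following is a minimal user-space sketch of that rule, not the kernel's
own code. The function names are hypothetical, and the only layout fact
assumed is that the checksum byte sits at offset 9 of the standard ACPI
table header.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the checksum byte in the standard ACPI table header. */
#define ACPI_CHECKSUM_OFFSET 9

/*
 * Return the 8-bit sum of all bytes in the table.
 * A valid ACPI table yields 0; anything else is a checksum mismatch.
 */
static uint8_t acpi_table_sum(const uint8_t *table, size_t len)
{
	uint8_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum = (uint8_t)(sum + table[i]);	/* wraps modulo 256 */

	return sum;
}

/*
 * Hypothetical helper for experiments: rewrite the checksum byte so
 * that the whole table sums to zero. Returns -1 if the buffer is too
 * short to contain a checksum byte, 0 otherwise.
 */
static int acpi_table_fix_checksum(uint8_t *table, size_t len)
{
	if (len <= ACPI_CHECKSUM_OFFSET)
		return -1;

	table[ACPI_CHECKSUM_OFFSET] = 0;
	table[ACPI_CHECKSUM_OFFSET] = (uint8_t)(0u - acpi_table_sum(table, len));
	return 0;
}
```

A non-zero sum only shows that the bytes being read do not match the stored
checksum. It does not by itself explain why the check fails in the kdump
kernel.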

The commit log links to a discussion thread. In that thread, Adrian said
the fix is for a system with an already broken BIOS, and Joerg suggested
two options. In the end option 2) was taken. Maybe option 1) would have
been the right approach?

  1) Bail out and disable the IOMMU as the BIOS screwed up
  2) Treat per-device exclusion ranges just as r/w unity-mapped
     regions.
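
The difference between keeping a single exclusion range and option 2) can be
illustrated with a toy model. This is only a sketch with hypothetical names
(toy_iommu, toy_set_exclusion, toy_add_unity), not the driver code. It
assumes one exclusion range slot per IOMMU, as in the exclusion_start and
exclusion_length members touched by this patch, and shows that a second
IVMD entry overwrites the first, while unity-mapped regions are kept side by
side.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_UNITY 8

struct toy_range {
	uint64_t start;
	uint64_t length;
};

/* Toy stand-in for per-IOMMU state: exactly one exclusion range slot. */
struct toy_iommu {
	struct toy_range exclusion;
	struct toy_range unity[TOY_MAX_UNITY];
	size_t nr_unity;
};

/* Exclusion range handling: each new entry replaces the previous one. */
static void toy_set_exclusion(struct toy_iommu *iommu, struct toy_range r)
{
	iommu->exclusion = r;
}

/*
 * Option 2): every entry is recorded as its own unity-mapped region.
 * Returns -1 when the toy table is full, 0 otherwise.
 */
static int toy_add_unity(struct toy_iommu *iommu, struct toy_range r)
{
	if (iommu->nr_unity >= TOY_MAX_UNITY)
		return -1;

	iommu->unity[iommu->nr_unity++] = r;
	return 0;
}
```

With two entries fed through toy_set_exclusion(), only the second survives.
With toy_add_unity(), both remain available.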

https://lists.linuxfoundation.org/pipermail/iommu/2019-November/040117.html
Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd/init.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9aa1eae26634..bbe7ceae5949 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1109,17 +1109,22 @@ static int __init add_early_maps(void)
  */
 static void __init set_device_exclusion_range(u16 devid, struct ivmd_header *m)
 {
+	struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
+
 	if (!(m->flags & IVMD_FLAG_EXCL_RANGE))
 		return;
 
-	/*
-	 * Treat per-device exclusion ranges as r/w unity-mapped regions
-	 * since some buggy BIOSes might lead to the overwritten exclusion
-	 * range (exclusion_start and exclusion_length members). This
-	 * happens when there are multiple exclusion ranges (IVMD entries)
-	 * defined in ACPI table.
-	 */
-	m->flags = (IVMD_FLAG_IW | IVMD_FLAG_IR | IVMD_FLAG_UNITY_MAP);
+	if (iommu) {
+		/*
+		 * We only can configure exclusion ranges per IOMMU, not
+		 * per device. But we can enable the exclusion range per
+		 * device. This is done here
+		 */
+		set_dev_entry_bit(devid, DEV_ENTRY_EX);
+		iommu->exclusion_start = m->range_start;
+		iommu->exclusion_length = m->range_length;
+	}
+
 }
 
 /*
-- 
2.17.2

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


* Re: [PATCH] Revert "iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions"
  2020-09-23  2:26 [PATCH] Revert "iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions" Baoquan He
@ 2020-09-23  2:32 ` Baoquan He
  2020-09-23 14:29   ` [External] " Adrian Huang12
  2020-09-24  9:08 ` Joerg Roedel
  1 sibling, 1 reply; 4+ messages in thread
From: Baoquan He @ 2020-09-23  2:32 UTC
  To: joro, ahuang12; +Cc: iommu, linux-kernel

I forgot to CC Jerry; adding him now.

On 09/23/20 at 10:26am, Baoquan He wrote:
> A kdump kernel boot regression was reported on an HPE system.
> Bisection points to commit 387caf0b759ac43 ("iommu/amd: Treat per-device
> exclusion ranges as r/w unity-mapped regions") as the culprit. Reverting it
> fixes the failure.
> 
> With the commit applied, the kdump kernel always prints the error message
> below, and the AMD IOMMU then cannot function normally during kdump kernel
> boot.
> 
>   ~~~~~~~~~
>   AMD-Vi: [Firmware Bug]: IVRS invalid checksum
> 
> Why commit 387caf0b759ac43 causes this has not been determined yet.

Hi Joerg, Adrian

We have only one machine that reproduces the issue; it is a gen10-01
from HPE. If any logs or other information are needed, please let me know
and I can attach them here.

Thanks
Baoquan



* RE: [External]  Re: [PATCH] Revert "iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions"
  2020-09-23  2:32 ` Baoquan He
@ 2020-09-23 14:29   ` Adrian Huang12
  0 siblings, 0 replies; 4+ messages in thread
From: Adrian Huang12 @ 2020-09-23 14:29 UTC
  To: Baoquan He, joro; +Cc: iommu, linux-kernel

Hi Baoquan,

> -----Original Message-----
> From: Baoquan He <bhe@redhat.com>
> Sent: Wednesday, September 23, 2020 10:33 AM
> To: joro@8bytes.org; Adrian Huang12 <ahuang12@lenovo.com>
> Cc: iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> jsnitsel@redhat.com
> Subject: [External] Re: [PATCH] Revert "iommu/amd: Treat per-device exclusion
> ranges as r/w unity-mapped regions"
> 
> I forgot to CC Jerry; adding him now.
> 
> On 09/23/20 at 10:26am, Baoquan He wrote:
> > A kdump kernel boot regression was reported on an HPE system.
> > Bisection points to commit 387caf0b759ac43 ("iommu/amd: Treat per-device
> > exclusion ranges as r/w unity-mapped regions") as the culprit. Reverting
> > it fixes the failure.
> >
> > With the commit applied, the kdump kernel always prints the error message
> > below, and the AMD IOMMU then cannot function normally during kdump
> > kernel boot.
> >
> >   ~~~~~~~~~
> >   AMD-Vi: [Firmware Bug]: IVRS invalid checksum
> >
> > Why commit 387caf0b759ac43 causes this has not been determined yet.
> 
> Hi Joerg, Adrian
> 
> We have only one machine that reproduces the issue; it is a gen10-01
> from HPE. If any logs or other information are needed, please let me know
> and I can attach them here.

Could you please provide the following information?
1. The boot logs of both the system kernel and the kdump kernel, captured
   with the kernel parameter 'amd_iommu_dump' appended.
2. The ACPI tables: run '# acpidump > acpi-table' and send the resulting
   file 'acpi-table'.

-- Adrian
 


* Re: [PATCH] Revert "iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions"
  2020-09-23  2:26 [PATCH] Revert "iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions" Baoquan He
  2020-09-23  2:32 ` Baoquan He
@ 2020-09-24  9:08 ` Joerg Roedel
  1 sibling, 0 replies; 4+ messages in thread
From: Joerg Roedel @ 2020-09-24  9:08 UTC
  To: Baoquan He; +Cc: iommu, ahuang12, linux-kernel

On Wed, Sep 23, 2020 at 10:26:55AM +0800, Baoquan He wrote:
> A kdump kernel boot regression was reported on an HPE system.
> Bisection points to commit 387caf0b759ac43 ("iommu/amd: Treat per-device
> exclusion ranges as r/w unity-mapped regions") as the culprit. Reverting it
> fixes the failure.
> 
> With the commit applied, the kdump kernel always prints the error message
> below, and the AMD IOMMU then cannot function normally during kdump kernel
> boot.
> 
>   ~~~~~~~~~
>   AMD-Vi: [Firmware Bug]: IVRS invalid checksum
> 
> Why commit 387caf0b759ac43 causes this has not been determined yet.

I think this should be debugged further. In future IOMMUs the exclusion
range feature will no longer be available (the MMIO fields get re-used for
SNP), so starting to use exclusion ranges again is not going to work.

Regards,

	Joerg
