linux-kernel.vger.kernel.org archive mirror
* [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel
@ 2017-06-16  8:15 Joerg Roedel
  2017-06-23  8:57 ` Baoquan He
  0 siblings, 1 reply; 7+ messages in thread
From: Joerg Roedel @ 2017-06-16  8:15 UTC (permalink / raw)
  To: iommu; +Cc: linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

When booting into a kdump kernel, suppress IO_PAGE_FAULTs by
default for all devices. But allow the faults again when a
domain is assigned to a device.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c       | 3 ++-
 drivers/iommu/amd_iommu_init.c  | 9 +++++++++
 drivers/iommu/amd_iommu_types.h | 1 +
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 80efa72..623ab53 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2050,7 +2050,8 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)
 		flags    |= tmp;
 	}
 
-	flags &= ~(0xffffUL);
+
+	flags &= ~(DTE_FLAG_SA | 0xffffULL);
 	flags |= domain->id;
 
 	amd_iommu_dev_table[devid].data[1]  = flags;
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5a11328..d9f5ddd 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -29,6 +29,7 @@
 #include <linux/export.h>
 #include <linux/iommu.h>
 #include <linux/kmemleak.h>
+#include <linux/crash_dump.h>
 #include <asm/pci-direct.h>
 #include <asm/iommu.h>
 #include <asm/gart.h>
@@ -1898,6 +1899,14 @@ static void init_device_table_dma(void)
 	for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
 		set_dev_entry_bit(devid, DEV_ENTRY_VALID);
 		set_dev_entry_bit(devid, DEV_ENTRY_TRANSLATION);
+		/*
+		 * In kdump kernels in-flight DMA from the old kernel might
+		 * cause IO_PAGE_FAULTs. There are no reports that a kdump
+		 * actually failed because of that, so just disable fault
+		 * reporting in the hardware to get rid of the messages
+		 */
+		if (is_kdump_kernel())
+			set_dev_entry_bit(devid, DEV_ENTRY_NO_PAGE_FAULT);
 	}
 }
 
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 4de8f41..4cad9b3 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -322,6 +322,7 @@
 #define IOMMU_PTE_IW (1ULL << 62)
 
 #define DTE_FLAG_IOTLB	(1ULL << 32)
+#define DTE_FLAG_SA	(1ULL << 34)
 #define DTE_FLAG_GV	(1ULL << 55)
 #define DTE_FLAG_MASK	(0x3ffULL << 32)
 #define DTE_GLX_SHIFT	(56)
-- 
2.7.4


* Re: [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel
  2017-06-16  8:15 [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel Joerg Roedel
@ 2017-06-23  8:57 ` Baoquan He
  2017-06-23 11:43   ` Baoquan He
  0 siblings, 1 reply; 7+ messages in thread
From: Baoquan He @ 2017-06-23  8:57 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, Joerg Roedel, linux-kernel

Hi Joerg,

On 06/16/17 at 10:15am, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> When booting into a kdump kernel, suppress IO_PAGE_FAULTs by
> default for all devices. But allow the faults again when a
> domain is assigned to a device.

I have two customer-reported bugs at hand, saying that their systems hang
with the AMD IOMMU enabled. I borrowed one of the systems and found that it
hangs so early that nobody could tell what had happened. One time it printed
several lines of boot messages, and I saw that it is an AMD IOMMU system;
adding amd_iommu=off makes the system boot normally.

And with the kdump fix patchset for the AMD IOMMU applied, the kdump kernel
boots fine. So suppressing the fault messages may not be enough.

Thanks
Baoquan



* Re: [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel
  2017-06-23  8:57 ` Baoquan He
@ 2017-06-23 11:43   ` Baoquan He
  2017-06-26 10:07     ` Joerg Roedel
  0 siblings, 1 reply; 7+ messages in thread
From: Baoquan He @ 2017-06-23 11:43 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, Joerg Roedel, linux-kernel

Hi Joerg,

On 06/23/17 at 04:57pm, Baoquan He wrote:
> Hi Joerg,
> 
> On 06/16/17 at 10:15am, Joerg Roedel wrote:
> > From: Joerg Roedel <jroedel@suse.de>
> > 
> > When booting into a kdump kernel, suppress IO_PAGE_FAULTs by
> > default for all devices. But allow the faults again when a
> > domain is assigned to a device.
> 
> I have two customer-reported bugs at hand, saying that their systems hang
> with the AMD IOMMU enabled. I borrowed one of the systems and found that it
> hangs so early that nobody could tell what had happened. One time it printed
> several lines of boot messages, and I saw that it is an AMD IOMMU system;
> adding amd_iommu=off makes the system boot normally.
> 
> And with the kdump fix patchset for the AMD IOMMU applied, the kdump kernel
> boots fine. So suppressing the fault messages may not be enough.

Do you think it is worth continuing my kdump fix patchset for the AMD IOMMU?
My last post seems to have been in January this year. I know you are very
busy fixing bugs and reviewing tons of patches. Without your guidance and
review, I absolutely can't make it, so I would like to hear your suggestions
and ideas.

I have focused on KASLR issues recently, and most of them have now been
fixed. My boss and I discussed the next plan. If you have another plan, I
can update our team on the upstream status.

Thanks
Baoquan




* Re: [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel
  2017-06-23 11:43   ` Baoquan He
@ 2017-06-26 10:07     ` Joerg Roedel
  2017-06-26 10:25       ` Baoquan He
  2017-07-20 13:17       ` Baoquan He
  0 siblings, 2 replies; 7+ messages in thread
From: Joerg Roedel @ 2017-06-26 10:07 UTC (permalink / raw)
  To: Baoquan He; +Cc: Joerg Roedel, iommu, linux-kernel

Hi Baoquan,

On Fri, Jun 23, 2017 at 07:43:10PM +0800, Baoquan He wrote:
> Do you think it is worth continuing my kdump fix patchset for the AMD IOMMU?
> My last post seems to have been in January this year. I know you are very
> busy fixing bugs and reviewing tons of patches. Without your guidance and
> review, I absolutely can't make it, so I would like to hear your suggestions
> and ideas.
> 
> I have focused on KASLR issues recently, and most of them have now been
> fixed. My boss and I discussed the next plan. If you have another plan, I
> can update our team on the upstream status.

Sorry for my silence on the patches, I have not yet found the time to
look deeply into them. I am still interested in them, so how about you
do a rebase/repost after the next merge window and then I will take the
time for a more in depth review and discussion?


Thanks,

	Joerg


* Re: [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel
  2017-06-26 10:07     ` Joerg Roedel
@ 2017-06-26 10:25       ` Baoquan He
  2017-07-20 13:17       ` Baoquan He
  1 sibling, 0 replies; 7+ messages in thread
From: Baoquan He @ 2017-06-26 10:25 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Joerg Roedel, iommu, linux-kernel

On 06/26/17 at 12:07pm, Joerg Roedel wrote:
> Hi Baoquan,
> 
> On Fri, Jun 23, 2017 at 07:43:10PM +0800, Baoquan He wrote:
> > Do you think it is worth continuing my kdump fix patchset for the AMD IOMMU?
> > My last post seems to have been in January this year. I know you are very
> > busy fixing bugs and reviewing tons of patches. Without your guidance and
> > review, I absolutely can't make it, so I would like to hear your suggestions
> > and ideas.
> > 
> > I have focused on KASLR issues recently, and most of them have now been
> > fixed. My boss and I discussed the next plan. If you have another plan, I
> > can update our team on the upstream status.
> 
> Sorry for my silence on the patches, I have not yet found the time to
> look deeply into them. I am still interested in them, so how about you
> do a rebase/repost after the next merge window and then I will take the
> time for a more in depth review and discussion?

Thanks a lot for the reply. Totally understood: there are so many patches
to review, and so many different IOMMU types.

Sure, let me do it right away. I may need a little time to recall the
details and go through the spec and code again. It won't take too long.

Thanks
Baoquan


* Re: [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel
  2017-06-26 10:07     ` Joerg Roedel
  2017-06-26 10:25       ` Baoquan He
@ 2017-07-20 13:17       ` Baoquan He
  2017-07-20 13:27         ` Joerg Roedel
  1 sibling, 1 reply; 7+ messages in thread
From: Baoquan He @ 2017-07-20 13:17 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Joerg Roedel, iommu, linux-kernel

Hi Joerg,

On 06/26/17 at 12:07pm, Joerg Roedel wrote:
> Hi Baoquan,
> 
> On Fri, Jun 23, 2017 at 07:43:10PM +0800, Baoquan He wrote:
> > Do you think it is worth continuing my kdump fix patchset for the AMD IOMMU?
> > My last post seems to have been in January this year. I know you are very
> > busy fixing bugs and reviewing tons of patches. Without your guidance and
> > review, I absolutely can't make it, so I would like to hear your suggestions
> > and ideas.
> > 
> > I have focused on KASLR issues recently, and most of them have now been
> > fixed. My boss and I discussed the next plan. If you have another plan, I
> > can update our team on the upstream status.
> 
> Sorry for my silence on the patches, I have not yet found the time to
> look deeply into them. I am still interested in them, so how about you
> do a rebase/repost after the next merge window and then I will take the
> time for a more in depth review and discussion?

I have rebased the AMD IOMMU kdump fix patches onto the latest upstream
kernel. Can I send them to you to have a look, or should I just send them
to the iommu and lkml mailing lists?

Thanks
Baoquan


* Re: [PATCH] iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel
  2017-07-20 13:17       ` Baoquan He
@ 2017-07-20 13:27         ` Joerg Roedel
  0 siblings, 0 replies; 7+ messages in thread
From: Joerg Roedel @ 2017-07-20 13:27 UTC (permalink / raw)
  To: Baoquan He; +Cc: Joerg Roedel, iommu, linux-kernel

Hi Baoquan,

On Thu, Jul 20, 2017 at 09:17:43PM +0800, Baoquan He wrote:
> I have rebased the AMD IOMMU kdump fix patches onto the latest upstream
> kernel. Can I send them to you to have a look, or should I just send them
> to the iommu and lkml mailing lists?

Please send them to me and Cc iommu and lkml.


Thanks,

	Joerg

