* IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
@ 2013-01-31 18:33 Shuah Khan
  2013-02-01 13:00 ` Joerg Roedel
  0 siblings, 1 reply; 11+ messages in thread
From: Shuah Khan @ 2013-01-31 18:33 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: LKML, stable, iommu, shuahkhan

Joerg,

I am seeing IO_PAGE_FAULTs on an AMD system running releases prior to 3.7.
I focused my debug and testing on 3.4. I am hoping to find a solution
for this problem in 3.4. I don't see any IO_PAGE_FAULTs with 3.7 and
later releases on this system.

On this system BIOS specifies Unity mapped (direct mapped) exclusion
ranges in IVMDs for several devices. These regions are in use during
BIOS hand-off to kernel and continue to be used during kernel boot and
run-time.

Access to these ranges continues to work with no errors until the AMD IOMMU
driver disables and re-enables the IOMMU in enable_iommus(). The faults
don't persist; they appear between the enable_iommus() call and the point
where amd_iommu_init() finishes printing the "AMD-Vi: Lazy IO/TLB flushing
enabled" message.

Read requests from device 02:00.2 and a write request from device 03:00.0
to these unity mapped regions fail. The reason appears to be that the
domain ID is 0.

Domain gets assigned in amd_iommu_init_dma_ops() and unity maps are
handled. I don't see enable_iommus() doing anything to these unity
mapped exclusion ranges. So I am assuming that is not the issue,
however, could domain ids get flushed? More like, why do these faults
show up in this window? These are direct mapped, so there is no need for
any translations.

Please see below for IVMD dump and IO_PAGE_FAULT analysis.

Dump of these ranges from dmesg:

[    5.322280] AMD-Vi: IVMD_TYPE_ALL             devid_start: 00:00.0 devid_end: 04:00.3 range_start: 00000000000f0000 range_end: 0000000000100000 flags: 7
[    5.322367] AMD-Vi: IVMD_TYPE_ALL             devid_start: 00:00.0 devid_end: 04:00.3 range_start: 00000000bff70000 range_end: 00000000bfff0000 flags: 7
[    5.322454] AMD-Vi: IVMD_TYPE_ALL             devid_start: 00:00.0 devid_end: 04:00.3 range_start: 00000000000e8000 range_end: 00000000000e9000 flags: 7
[    5.322540] AMD-Vi: IVMD_TYPE_ALL             devid_start: 00:00.0 devid_end: 04:00.3 range_start: 00000000bdffe000 range_end: 00000000be000000 flags: 7
[    5.322627] AMD-Vi: IVMD_TYPE_ALL             devid_start: 00:00.0 devid_end: 04:00.3 range_start: 00000000bdff9000 range_end: 00000000bdffd000 flags: 7
[    5.322714] AMD-Vi: IVMD_TYPE_ALL             devid_start: 00:00.0 devid_end: 04:00.3 range_start: 00000000bdfe9000 range_end: 00000000bdff9000 flags: 7


Now to the IO_PAGE_FAULT analysis. My observations are in double quotes ("").

[   15.281594] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.2 domain=0x0000 address=0x00000000bdffe000 flags=0x0050]
[   15.281861] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.2 domain=0x0000 address=0x00000000bdff9080 flags=0x0050]
[   15.281990] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.2 domain=0x0000 address=0x00000000bdff9100 flags=0x0050]
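
To confirm that the three faulting addresses above really fall inside the unity mapped ranges from the IVMD dump, a small standalone check can be sketched as follows. This is only an illustration, not driver code: the names unity_range, ivmd_ranges and find_unity_range() are hypothetical, and range_end is assumed to be exclusive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One unity mapped range as printed in the IVMD dump (hypothetical type). */
struct unity_range {
	uint64_t start;
	uint64_t end;	/* assumption: exclusive upper bound */
};

/* The six ranges from the dmesg output, in the order they were logged. */
static const struct unity_range ivmd_ranges[] = {
	{ 0x00000000000f0000ULL, 0x0000000000100000ULL },
	{ 0x00000000bff70000ULL, 0x00000000bfff0000ULL },
	{ 0x00000000000e8000ULL, 0x00000000000e9000ULL },
	{ 0x00000000bdffe000ULL, 0x00000000be000000ULL },
	{ 0x00000000bdff9000ULL, 0x00000000bdffd000ULL },
	{ 0x00000000bdfe9000ULL, 0x00000000bdff9000ULL },
};

static_assert(sizeof(ivmd_ranges) / sizeof(ivmd_ranges[0]) == 6,
	      "six IVMD ranges were logged");

/* Return the index of the range containing addr, or -1 if none does. */
static int find_unity_range(uint64_t addr)
{
	size_t i;

	for (i = 0; i < sizeof(ivmd_ranges) / sizeof(ivmd_ranges[0]); i++) {
		if (addr >= ivmd_ranges[i].start && addr < ivmd_ranges[i].end)
			return (int)i;
	}
	return -1;
}
```

Under these assumptions, 0xbdffe000 lands in the fourth logged range and 0xbdff9080 and 0xbdff9100 land in the fifth, so all three faults hit memory that the BIOS declared as unity mapped.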

Domain ID is zero - "PASID not valid"
flags=0x0050 - "Bits PE and PR are set in the Event."
TR: translation TR=0
  "TR is 0 that means it is a transaction request"
RZ: reserved bit RZ=0
  "Since PR is set RZ is meaningful, I/O page fault is due to an invalid
   level encoding"
PE: permission indicator PE=1
  "Device doesn't have permission for this transaction"
RW: read-write RW=0
  "RW is meaningful since PR=1, TR=0, and I=0. It is a Read transaction"
PR: Present PR=1
  "PR = 1 means transaction is to a page marked present"
I: interrupt I=0
  "transaction is a memory request"
US: user-supervisor US=0
  "Supervisor privileges were asserted."
NX: no execute NX=0
  "0 upstream transaction lacks a PASID TLP prefix. Domain ID is zero."
GN: guest/nested GN=0
  "Transaction used a nested address (GPA)."

[   15.281733] AMD-Vi: Event logged [IO_PAGE_FAULT device=03:00.0 domain=0x0000 address=0x00000000bdff9160 flags=0x0070]

Domain ID is zero - "PASID is not valid"
flags=0x0070 - "Bits PE, RW, and PR are set in the Event."
TR: translation TR=0
  "TR is 0 that means it is a transaction request"
RZ: reserved bit RZ=0
  "Since PR is set RZ is meaningful, I/O page fault is due to an invalid
   level encoding"
PE: permission indicator PE=1
  "Device doesn't have permission for this transaction"
RW: read-write RW=1
  "RW is meaningful since PR=1, TR=0, and I=0. It is a Write transaction"
PR: Present PR=1
  "PR = 1 means transaction is to a page marked present"
I: interrupt I=0
  "transaction is a memory request"
US: user-supervisor US=0
  "Supervisor privileges were asserted."
NX: no execute NX=0
  "0 upstream transaction lacks a PASID TLP prefix. Domain ID is zero."
GN: guest/nested GN=0
  "Transaction used a nested address (GPA)."

Thanks,
-- Shuah
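
The manual decoding of flags=0x0050 and flags=0x0070 in the message above can be reproduced with a short standalone program. This is a sketch, not driver code. The bit positions for PR, RW and PE follow from the two decoded values; the positions of the remaining bits are an assumption based on the field order used in the analysis and should be checked against the AMD IOMMU specification. All identifiers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Bits of the "flags" value printed with each IO_PAGE_FAULT event.
 * PR, RW and PE are consistent with the logged values; the other
 * positions are assumed, not confirmed by the log.
 */
enum fault_flag {
	FAULT_GN = 1 << 0,	/* guest/nested */
	FAULT_NX = 1 << 1,	/* no execute */
	FAULT_US = 1 << 2,	/* user-supervisor */
	FAULT_I  = 1 << 3,	/* interrupt */
	FAULT_PR = 1 << 4,	/* present */
	FAULT_RW = 1 << 5,	/* read-write */
	FAULT_PE = 1 << 6,	/* permission indicator */
	FAULT_RZ = 1 << 7,	/* reserved bit / level encoding */
	FAULT_TR = 1 << 8,	/* translation */
};

/* Compile-time check against the two values seen in the log. */
static_assert((FAULT_PR | FAULT_PE) == 0x0050,
	      "read fault logged for 02:00.2");
static_assert((FAULT_PR | FAULT_RW | FAULT_PE) == 0x0070,
	      "write fault logged for 03:00.0");

static int fault_is_write(uint16_t flags)
{
	return (flags & FAULT_RW) != 0;
}

static int fault_page_present(uint16_t flags)
{
	return (flags & FAULT_PR) != 0;
}

static int fault_permission_denied(uint16_t flags)
{
	return (flags & FAULT_PE) != 0;
}

/* Print the names of all bits that are set in flags. */
static void print_fault_flags(FILE *out, uint16_t flags)
{
	static const struct {
		unsigned int mask;
		const char *name;
	} names[] = {
		{ FAULT_TR, "TR" }, { FAULT_RZ, "RZ" }, { FAULT_PE, "PE" },
		{ FAULT_RW, "RW" }, { FAULT_PR, "PR" }, { FAULT_I,  "I"  },
		{ FAULT_US, "US" }, { FAULT_NX, "NX" }, { FAULT_GN, "GN" },
	};
	size_t i;

	fprintf(out, "flags=0x%04x:", (unsigned int)flags);
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (flags & names[i].mask)
			fprintf(out, " %s", names[i].name);
	}
	fputc('\n', out);
}
```

Calling print_fault_flags(stdout, 0x0050) lists PE and PR, and 0x0070 additionally lists RW, which matches the read versus write distinction between the faults from 02:00.2 and 03:00.0.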



* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-01-31 18:33 IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4 Shuah Khan
@ 2013-02-01 13:00 ` Joerg Roedel
  2013-02-01 18:31   ` Shuah Khan
  0 siblings, 1 reply; 11+ messages in thread
From: Joerg Roedel @ 2013-02-01 13:00 UTC (permalink / raw)
  To: Shuah Khan; +Cc: LKML, stable, iommu, shuahkhan

Hi Shuah,

On Thu, Jan 31, 2013 at 11:33:30AM -0700, Shuah Khan wrote:
> Access to these ranges continues to work with no errors until AMD IOMMU
> driver disables and re-enables IOMMU in enable_iommus(). These faults
> don't persist and appear between the enable_iommus() call and before
> amd_iommu_init() gets done printing "AMD-Vi: Lazy IO/TLB flushing
> enabled" message.

Hmm, okay. I had a look into the v3.4 sources. This looks like a race
condition. The IOMMUs are enabled in amd_iommu_init_hardware() but the
unity-mapped regions are created later in amd_iommu_init_dma_ops(). This
leaves a small window where the page-faults happen that you see.

But I am not sure why this doesn't hit on 3.7 and above. The race is
still there. Anyway, definitely something that needs to be fixed.


	Joerg




* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-02-01 13:00 ` Joerg Roedel
@ 2013-02-01 18:31   ` Shuah Khan
  2013-02-05 13:31     ` Joerg Roedel
  0 siblings, 1 reply; 11+ messages in thread
From: Shuah Khan @ 2013-02-01 18:31 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: LKML, stable, iommu, shuahkhan

[-- Attachment #1: Type: text/plain, Size: 2606 bytes --]

On Fri, 2013-02-01 at 14:00 +0100, Joerg Roedel wrote:
> Hi Shuah,
> 
> On Thu, Jan 31, 2013 at 11:33:30AM -0700, Shuah Khan wrote:
> > Access to these ranges continues to work with no errors until AMD IOMMU
> > driver disables and re-enables IOMMU in enable_iommus(). These faults
> > don't persist and appear between the enable_iommus() call and before
> > amd_iommu_init() gets done printing "AMD-Vi: Lazy IO/TLB flushing
> > enabled" message.
> 
> Hmm, okay. I had a look into the v3.4 sources. This looks like a race
> condition. The IOMMUs are enabled in amd_iommu_init_hardware() but the
> unity-mapped regions are created later in amd_iommu_init_dma_ops(). This
> leaves a small window where the page-faults happen that you see.
> 
> But I am not sure why this doesn't hit on 3.7 and above. The race is
> still there. Anyway, definitly something that needs to be fixed.
> 

Hi Joerg,

Yes, 3.7 has the same window of opportunity for this race condition,
however I couldn't figure out why it doesn't happen on 3.7. On 3.7 the
window between amd_iommu_init_hardware() and amd_iommu_init_dma_ops()
might actually be wider than the window in 3.4.

I think understanding why it doesn't happen on 3.7 is probably key. On
3.6, I experimented with back-porting your Split device table
initialization patch (33f28c59e18d83fd2aeef258d211be66b9b80eb3) from 3.7
and the patch that moved iommu_init from subsys_initcall() to
arch_initcall() and that solved the problem on 3.6. I am attaching those
patches. I can't easily back-port either one of those to 3.4 though.

That experiment made me think that this problem has something to do with
when device_table gets initialized vs. dma_ops are initialized. However,
there is no change to when unity mapped regions are created in 3.4 and
3.7.

If you look at 3.4 initialization sequence closely, you will notice that
init_device_table() gets called before init_iommu_all() and
init_memory_definitions() get done.

Another big difference is that in 3.4 init_device_table() sets the
DEV_ENTRY_VALID and DEV_ENTRY_TRANSLATION bits much earlier than 3.7 does;
in 3.7 these bits are set in init_device_table_dma(), which is called much later.
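
Why the order matters can be illustrated with a toy model. It is a deliberate simplification and not the kernel's data structures: struct toy_dev, toy_block_dma(), toy_add_unity_map() and toy_dma_faults() are hypothetical names, a single flag stands in for the VALID and TRANSLATION bits, and a device with the flag clear is assumed to pass DMA through untranslated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_MAPS 8

struct toy_map {
	uint64_t start;
	uint64_t end;	/* exclusive */
};

/* Toy stand-in for one device table entry plus its page table. */
struct toy_dev {
	bool translate;	/* models DEV_ENTRY_VALID + DEV_ENTRY_TRANSLATION */
	struct toy_map maps[TOY_MAX_MAPS];
	int nr_maps;
};

/* Models setting the bits: from now on every DMA is translated. */
static void toy_block_dma(struct toy_dev *d)
{
	d->translate = true;
}

/* Models creating a unity mapping: addresses in the range map to themselves. */
static void toy_add_unity_map(struct toy_dev *d, uint64_t start, uint64_t end)
{
	assert(d->nr_maps < TOY_MAX_MAPS);
	d->maps[d->nr_maps].start = start;
	d->maps[d->nr_maps].end = end;
	d->nr_maps++;
}

/* Would a DMA to addr fault in this toy model? */
static bool toy_dma_faults(const struct toy_dev *d, uint64_t addr)
{
	int i;

	if (!d->translate)
		return false;	/* assumed: untranslated pass-through */

	for (i = 0; i < d->nr_maps; i++) {
		if (addr >= d->maps[i].start && addr < d->maps[i].end)
			return false;
	}
	return true;	/* translated, but no mapping covers addr */
}
```

In this model, calling toy_block_dma() before toy_add_unity_map() leaves a window in which a DMA to the unity range faults, while adding the mapping first never produces a fault for that range.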

init_unity_mappings_for_device() has a strong dependency on pci
sub-system having been initialized. Is it possible to move it up closer
to amd_iommu_init_hardware()?

I have a system on which I can reproduce the problem easily, and I have
tried making a few changes to the initialization sequence, with no results.
Any thoughts on what other changes I should be looking at to solve the
problem, besides the ones I already tried?

Thanks,
-- Shuah






[-- Attachment #2: 0001-iommu-amd-delay-dma-init-right-before-dma_ops-are-in.patch --]
[-- Type: text/x-patch, Size: 2475 bytes --]

From a91c02486e5ecb332c4e63d2ec35262c573c6631 Mon Sep 17 00:00:00 2001
From: Shuah Khan <shuahkhan@gmail.com>
Date: Fri, 21 Dec 2012 16:28:03 -0700
Subject: [PATCH] iommu/amd: delay dma init right before dma_ops are
 initialized - backport


Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
---
 drivers/iommu/amd_iommu_init.c |   45 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 18a89b7..33a9a6e 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1308,7 +1308,7 @@ static int __init init_memory_definitions(struct acpi_table_header *table)
  * Init the device table to not allow DMA access for devices and
  * suppress all page faults
  */
-static void init_device_table(void)
+static void init_device_table_dma(void)
 {
 	u32 devid;
 
@@ -1318,6 +1318,32 @@ static void init_device_table(void)
 	}
 }
 
+/*
+Dont' need - probably
+static void __init uninit_device_table_dma(void)
+{
+	u32 devid;
+
+	for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
+		amd_iommu_dev_table[devid].data[0] = 0ULL;
+		amd_iommu_dev_table[devid].data[1] = 0ULL;
+	}
+}
+*/
+
+static void init_device_table(void)
+{
+/*
+Dont' need - probably
+
+	u32 devid;
+
+	for (devid = 0; devid <= amd_iommu_last_bdf; ++devid)
+		set_dev_entry_bit(devid, DEV_ENTRY_IRQ_TBL_EN);
+*/
+	return;
+}
+
 static void iommu_init_flags(struct amd_iommu *iommu)
 {
 	iommu->acpi_flags & IVHD_FLAG_HT_TUN_EN_MASK ?
@@ -1494,6 +1520,16 @@ static void __init free_on_init_error(void)
 #endif
 }
 
+static void __init free_dma_resources(void)
+{
+	amd_iommu_uninit_devices();
+
+	free_pages((unsigned long)amd_iommu_pd_alloc_bitmap,
+		   get_order(MAX_DOMAIN_ID/8));
+
+	free_unity_maps();
+}
+
 /*
  * This is the hardware init function for AMD IOMMU in the system.
  * This function is called either from amd_iommu_init or from the interrupt
@@ -1657,8 +1693,14 @@ static bool detect_ivrs(void)
 
 static int amd_iommu_init_dma(void)
 {
+	struct amd_iommu *iommu;
 	int ret;
 
+	init_device_table_dma();
+
+	for_each_iommu(iommu)
+		iommu_flush_all_caches(iommu);
+
 	if (iommu_pass_through)
 		ret = amd_iommu_init_passthrough();
 	else
@@ -1762,6 +1804,7 @@ static int __init amd_iommu_init(void)
 
 	ret = iommu_go_to_state(IOMMU_INITIALIZED);
 	if (ret) {
+		free_dma_resources();
 		disable_iommus();
 		free_on_init_error();
 	}
-- 
1.7.9.5


[-- Attachment #3: iommu-moving-initialization-earlier.patch --]
[-- Type: text/x-patch, Size: 373 bytes --]

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ddbdaca..1065a1a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -861,7 +861,7 @@ static int __init iommu_init(void)
 
 	return 0;
 }
-subsys_initcall(iommu_init);
+arch_initcall(iommu_init);
 
 int iommu_domain_get_attr(struct iommu_domain *domain,
 			  enum iommu_attr attr, void *data)


* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-02-01 18:31   ` Shuah Khan
@ 2013-02-05 13:31     ` Joerg Roedel
  2013-02-05 13:57       ` Shuah Khan
  0 siblings, 1 reply; 11+ messages in thread
From: Joerg Roedel @ 2013-02-05 13:31 UTC (permalink / raw)
  To: Shuah Khan; +Cc: LKML, stable, iommu, shuahkhan

Hi Shuah,

On Fri, Feb 01, 2013 at 11:31:59AM -0700, Shuah Khan wrote:
> Yes, 3.7 has the same window of opportunity for this race condition,
> however I couldn't figure out why it doesn't happen on 3.7. On 3.7 the
> window between amd_iommu_init_hardware() and amd_iommu_init_dma_ops()
> might actually be wider than the window in 3.4.

I think this is highly timing related. IOMMU initialization may have
been moved by a few milliseconds between the kernel versions which might
cause the warnings to appear or disappear. I don't think it has much
value to dive deeper into the differences between the initialization
sequences.

As is sometimes the case with such issues, there is a simple fix and a more
complex one. I'll try to come up with a simple fix for the next merge
window and implement the clean and more complex one for the one after that.


	Joerg




* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-02-05 13:31     ` Joerg Roedel
@ 2013-02-05 13:57       ` Shuah Khan
  2013-02-06 12:12         ` Joerg Roedel
  0 siblings, 1 reply; 11+ messages in thread
From: Shuah Khan @ 2013-02-05 13:57 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Shuah Khan, LKML, stable, iommu

On Tue, Feb 5, 2013 at 6:31 AM, Joerg Roedel <joro@8bytes.org> wrote:
> Hi Shuah,
>
> On Fri, Feb 01, 2013 at 11:31:59AM -0700, Shuah Khan wrote:
>> Yes, 3.7 has the same window of opportunity for this race condition,
>> however I couldn't figure out why it doesn't happen on 3.7. On 3.7 the
>> window between amd_iommu_init_hardware() and amd_iommu_init_dma_ops()
>> might actually be wider than the window in 3.4.
>
> I think this is highly timing related. IOMMU initialization may have
> been moved by a few milliseconds between the kernel versions which might
> cause the warnings to appear or disappear. I don't think it has much
> value to dive deeper into the differences between the initialization
> sequences.
>
> As somethimes with such issues there is a simple and a more complex fix
> for that. I'll try to come up with a simple fix for the next merge
> window and implement the clean and more complex one for the next one.
>

Hi Joerg,

Thanks much. I will hang on to this test system for testing your fix.

-- Shuah


* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-02-05 13:57       ` Shuah Khan
@ 2013-02-06 12:12         ` Joerg Roedel
  2013-02-07  2:40           ` Shuah Khan
  0 siblings, 1 reply; 11+ messages in thread
From: Joerg Roedel @ 2013-02-06 12:12 UTC (permalink / raw)
  To: Shuah Khan; +Cc: Shuah Khan, LKML, stable, iommu

On Tue, Feb 05, 2013 at 06:57:21AM -0700, Shuah Khan wrote:
> Thanks much. I will hang on to this test system for testing your fix.

Okay, here is the simple fix for v3.8-rc6. I guess it is not
straightforward to port it to v3.4, but it should be doable.

From 2ecf57c85e67e0243b36b787d0490c0b47202ba8 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <joro@8bytes.org>
Date: Wed, 6 Feb 2013 12:55:23 +0100
Subject: [PATCH] iommu/amd: Initialize device table after dma_ops

When dma_ops are initialized the unity mappings are
created. The init_device_table_dma() function makes sure DMA
from all devices is blocked by default. This opens a short
window in time where DMA to unity mapped regions is blocked
by the IOMMU. Make sure this does not happen by initializing
the device table after dma_ops.

Signed-off-by: Joerg Roedel <joro@8bytes.org>
---
 drivers/iommu/amd_iommu_init.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index faf10ba..b6ecddb 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1876,11 +1876,6 @@ static int amd_iommu_init_dma(void)
 	struct amd_iommu *iommu;
 	int ret;
 
-	init_device_table_dma();
-
-	for_each_iommu(iommu)
-		iommu_flush_all_caches(iommu);
-
 	if (iommu_pass_through)
 		ret = amd_iommu_init_passthrough();
 	else
@@ -1889,6 +1884,11 @@ static int amd_iommu_init_dma(void)
 	if (ret)
 		return ret;
 
+	init_device_table_dma();
+
+	for_each_iommu(iommu)
+		iommu_flush_all_caches(iommu);
+
 	amd_iommu_init_api();
 
 	amd_iommu_init_notifier();
-- 
1.7.9.5




* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-02-06 12:12         ` Joerg Roedel
@ 2013-02-07  2:40           ` Shuah Khan
  2013-02-11 19:49             ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: Shuah Khan @ 2013-02-07  2:40 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: LKML, stable, iommu, Greg KH, shuahkhan

On Wed, 2013-02-06 at 13:12 +0100, Joerg Roedel wrote:
> On Tue, Feb 05, 2013 at 06:57:21AM -0700, Shuah Khan wrote:
> > Thanks much. I will hang on to this test system for testing your fix.
> 
> Okay, here is the simple fix for v3.8-rc6. I guess it is not
> straighforward to port it to v3.4, but it should be doable.
> 
> From 2ecf57c85e67e0243b36b787d0490c0b47202ba8 Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joro@8bytes.org>
> Date: Wed, 6 Feb 2013 12:55:23 +0100
> Subject: [PATCH] iommu/amd: Initialize device table after dma_ops
> 
> When dma_ops are initialized the unity mappings are
> created. The init_device_table_dma() function makes sure DMA
> from all devices is blocked by default. This opens a short
> window in time where DMA to unity mapped regions is blocked
> by the IOMMU. Make sure this does not happen by initializing
> the device table after dma_ops.
> 
> Signed-off-by: Joerg Roedel <joro@8bytes.org>

Joerg,

I tested your patch on 3.8. I was able to reproduce the problem and then
apply your patch to verify that the problem is fixed. This patch applies
cleanly to 3.7.6, however I could not reproduce the problem on 3.7.6
without the patch. But the window exists on 3.7 as well. Your patch can
be applied to 3.7.6 as is.

I back-ported the patch to 3.4 and 3.0 and tested. I am sending those
patches after this email.

On 3.4.29 and 3.0.62 I was able to reproduce the problem and then
applied the back-ported patch to verify that the problem is fixed.

Thanks again for the fix.

-- Shuah



* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-02-07  2:40           ` Shuah Khan
@ 2013-02-11 19:49             ` Greg KH
  2013-02-11 20:17               ` Shuah Khan
  2013-02-11 20:57               ` Shuah Khan
  0 siblings, 2 replies; 11+ messages in thread
From: Greg KH @ 2013-02-11 19:49 UTC (permalink / raw)
  To: Shuah Khan; +Cc: Joerg Roedel, LKML, stable, iommu, shuahkhan

On Wed, Feb 06, 2013 at 07:40:50PM -0700, Shuah Khan wrote:
> On Wed, 2013-02-06 at 13:12 +0100, Joerg Roedel wrote:
> > On Tue, Feb 05, 2013 at 06:57:21AM -0700, Shuah Khan wrote:
> > > Thanks much. I will hang on to this test system for testing your fix.
> > 
> > Okay, here is the simple fix for v3.8-rc6. I guess it is not
> > straighforward to port it to v3.4, but it should be doable.
> > 
> > From 2ecf57c85e67e0243b36b787d0490c0b47202ba8 Mon Sep 17 00:00:00 2001
> > From: Joerg Roedel <joro@8bytes.org>
> > Date: Wed, 6 Feb 2013 12:55:23 +0100
> > Subject: [PATCH] iommu/amd: Initialize device table after dma_ops
> > 
> > When dma_ops are initialized the unity mappings are
> > created. The init_device_table_dma() function makes sure DMA
> > from all devices is blocked by default. This opens a short
> > window in time where DMA to unity mapped regions is blocked
> > by the IOMMU. Make sure this does not happen by initializing
> > the device table after dma_ops.
> > 
> > Signed-off-by: Joerg Roedel <joro@8bytes.org>
> 
> Joerg,
> 
> I tested your patch on 3.8. I was able to reproduce the problem and then
> apply your patch to verify that the problem is fixed. This patch applies
> cleanly to 3.7.6, however I could not reproduce the problem on 3.7.6
> without the patch. But the window exists on 3.7 as well. Your patch can
> be applied to 3.7.6 as is.
> 
> I back-ported the patch to 3.4 and 3.0 and tested. I am sending those
> patches after this email.
> 
> On 3.4.29 and 3.0.62 I was able to reproduce the problem and then
> applied the back-ported patch to verify that the problem is fixed.
> 
> Thanks again for the fix.

I'm lost here, why isn't this patch in Linus's tree already?  You seem
to be sending backports for something that isn't there to backport yet.

Please resend these patches to stable@vger.kernel.org, with the upstream
git commit id, when they are in Linus's tree, there's nothing I can do
with them for now, sorry.

greg k-h


* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-02-11 19:49             ` Greg KH
@ 2013-02-11 20:17               ` Shuah Khan
  2013-02-11 20:57               ` Shuah Khan
  1 sibling, 0 replies; 11+ messages in thread
From: Shuah Khan @ 2013-02-11 20:17 UTC (permalink / raw)
  To: Greg KH; +Cc: Shuah Khan, Joerg Roedel, LKML, stable, iommu

On Mon, Feb 11, 2013 at 12:49 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Wed, Feb 06, 2013 at 07:40:50PM -0700, Shuah Khan wrote:
>> On Wed, 2013-02-06 at 13:12 +0100, Joerg Roedel wrote:
>> > On Tue, Feb 05, 2013 at 06:57:21AM -0700, Shuah Khan wrote:
>> > > Thanks much. I will hang on to this test system for testing your fix.
>> >
>> > Okay, here is the simple fix for v3.8-rc6. I guess it is not
>> > straighforward to port it to v3.4, but it should be doable.
>> >
>> > From 2ecf57c85e67e0243b36b787d0490c0b47202ba8 Mon Sep 17 00:00:00 2001
>> > From: Joerg Roedel <joro@8bytes.org>
>> > Date: Wed, 6 Feb 2013 12:55:23 +0100
>> > Subject: [PATCH] iommu/amd: Initialize device table after dma_ops
>> >
>> > When dma_ops are initialized the unity mappings are
>> > created. The init_device_table_dma() function makes sure DMA
>> > from all devices is blocked by default. This opens a short
>> > window in time where DMA to unity mapped regions is blocked
>> > by the IOMMU. Make sure this does not happen by initializing
>> > the device table after dma_ops.
>> >
>> > Signed-off-by: Joerg Roedel <joro@8bytes.org>
>>
>> Joerg,
>>
>> I tested your patch on 3.8. I was able to reproduce the problem and then
>> apply your patch to verify that the problem is fixed. This patch applies
>> cleanly to 3.7.6, however I could not reproduce the problem on 3.7.6
>> without the patch. But the window exists on 3.7 as well. Your patch can
>> be applied to 3.7.6 as is.
>>
>> I back-ported the patch to 3.4 and 3.0 and tested. I am sending those
>> patches after this email.
>>
>> On 3.4.29 and 3.0.62 I was able to reproduce the problem and then
>> applied the back-ported patch to verify that the problem is fixed.
>>
>> Thanks again for the fix.
>
> I'm lost here, why isn't this patch in Linus's tree already?  You seem
> to be sending backports for something that isn't there to backport yet.
>
> Please resend these patches to stable@vger.kernel.org, with the upstream
> git commit id, when they are in Linus's tree, there's nothing I can do
> with them for now, sorry.
>
> greg k-h

Greg,

I will resend the patch when I have the commit id from Linus's tree.

Thanks,
-- Shuah


* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-02-11 19:49             ` Greg KH
  2013-02-11 20:17               ` Shuah Khan
@ 2013-02-11 20:57               ` Shuah Khan
  2013-02-11 22:18                 ` Joerg Roedel
  1 sibling, 1 reply; 11+ messages in thread
From: Shuah Khan @ 2013-02-11 20:57 UTC (permalink / raw)
  To: Greg KH; +Cc: Joerg Roedel, LKML, stable, iommu

On Mon, 2013-02-11 at 11:49 -0800, Greg KH wrote:
> On Wed, Feb 06, 2013 at 07:40:50PM -0700, Shuah Khan wrote:
> > On Wed, 2013-02-06 at 13:12 +0100, Joerg Roedel wrote:
> > > On Tue, Feb 05, 2013 at 06:57:21AM -0700, Shuah Khan wrote:
> > > > Thanks much. I will hang on to this test system for testing your fix.
> > > 
> > > Okay, here is the simple fix for v3.8-rc6. I guess it is not
> > > straighforward to port it to v3.4, but it should be doable.
> > > 
> > > From 2ecf57c85e67e0243b36b787d0490c0b47202ba8 Mon Sep 17 00:00:00 2001
> > > From: Joerg Roedel <joro@8bytes.org>
> > > Date: Wed, 6 Feb 2013 12:55:23 +0100
> > > Subject: [PATCH] iommu/amd: Initialize device table after dma_ops
> > > 
> > > When dma_ops are initialized the unity mappings are
> > > created. The init_device_table_dma() function makes sure DMA
> > > from all devices is blocked by default. This opens a short
> > > window in time where DMA to unity mapped regions is blocked
> > > by the IOMMU. Make sure this does not happen by initializing
> > > the device table after dma_ops.
> > > 
> > > Signed-off-by: Joerg Roedel <joro@8bytes.org>
> > 
> > Joerg,
> > 
> > I tested your patch on 3.8. I was able to reproduce the problem and then
> > apply your patch to verify that the problem is fixed. This patch applies
> > cleanly to 3.7.6, however I could not reproduce the problem on 3.7.6
> > without the patch. But the window exists on 3.7 as well. Your patch can
> > be applied to 3.7.6 as is.
> > 
> > I back-ported the patch to 3.4 and 3.0 and tested. I am sending those
> > patches after this email.
> > 
> > On 3.4.29 and 3.0.62 I was able to reproduce the problem and then
> > applied the back-ported patch to verify that the problem is fixed.
> > 
> > Thanks again for the fix.
> 
> I'm lost here, why isn't this patch in Linus's tree already?  You seem
> to be sending backports for something that isn't there to backport yet.
> 
> Please resend these patches to stable@vger.kernel.org, with the upstream
> git commit id, when they are in Linus's tree, there's nothing I can do
> with them for now, sorry.
> 
> greg k-h

I was hoping Joerg's patch would make it into Linus's tree by now. I
tested the original patch and did the back-port to 3.4 and 3.0 at the
same time, before I lose the test system.

No worries. I will resend the patch when I have the commit id. Sorry for
the confusion.

Thanks,
-- Shuah



* Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4
  2013-02-11 20:57               ` Shuah Khan
@ 2013-02-11 22:18                 ` Joerg Roedel
  0 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2013-02-11 22:18 UTC (permalink / raw)
  To: Shuah Khan; +Cc: Greg KH, LKML, stable, iommu

On Mon, Feb 11, 2013 at 01:57:03PM -0700, Shuah Khan wrote:
> I was hoping Joerg's patch would make it into Linus's tree by now. I
> tested the original patch and did the back-port to 3.4 and 3.0 at the
> same time, before I loose the test system.

I will send the patch with the next merge window. It already has the
stable-tag and should be noticed when it hits Linus' tree. Since Shuah's
system is the only one I know of which actually implements IVMD entries
in the IVRS table the problem seemed not to be important enough to send
another pull-request for v3.8.

Regards,

	Joerg


