linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu
@ 2017-07-21  8:58 Baoquan He
  2017-07-21  8:58 ` [PATCH v8 01/13] iommu/amd: Detect pre enabled translation Baoquan He
                   ` (12 more replies)
  0 siblings, 13 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:58 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

When kernel panicked and jump into the kdump kernel, DMA started by the
1st kernel is not stopped, this is called on-flight DMA. In the current
code it will disable iommu and build new translation table and attach
device to it. This will cause:

 1. IO_PAGE_FAULT warning message can be seen.
 2. transfer data to or from incorrect areas of memory.

Sometime it causes the dump failure or kernel hang.

The principle of the fix is to defer the assignment of device to domain to
device driver initializtion stage. A new call-back is_attach_deferred() is
added to iommu-ops, will check whether we need defer the domain attach/detach
in iommu-core code. If defer is needed, just return directly from amd iommu
attach/detach function. The attachment will be done in device driver
initializaiton stage when calling get_domain().

Change history:
v8:v7:
    Rebase patchset v7 on the latest v4.13-rc1.

    - And re-enable printing IO_PAGE_FAULT message in kdump kernel.
    - Only disable iommu if amd_iommu=off is specified in kdump kernel.


v6->v7:
    Two main changes are made according to Joerg's suggestion:
    - Add is_attach_deferred call-back to iommu-ops. With this domain
      can be deferred to device driver init cleanly.

    - Allocate memory below 4G for dev table if translation pre-enabled.
      AMD engineer pointed out that it's unsafe to update the device-table
      while iommu is enabled. device-table pointer update is split up into
      two 32bit writes in the IOMMU hardware. So updating it while the IOMMU
      is enabled could have some nasty side effects.

v5->v6:
    According to Joerg's comments made several below main changes:
    - Add sanity check when copy old dev tables.

    - If a device is set up with guest translations (DTE.GV=1), then don't
      copy that information but move the device over to an empty guest-cr3
      table and handle the faults in the PPR log (which just answer them
      with INVALID).

v5:
    bnx2 NIC can't reset itself during driver init. Post patch to reset
    it during driver init. IO_PAGE_FAULT can't be seen anymore.

    Below is link of v5 post.
    https://lists.linuxfoundation.org/pipermail/iommu/2016-September/018527.html


Baoquan He (12):
  iommu/amd: Detect pre enabled translation
  iommu/amd: add several helper functions
  Revert "iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel"
  iommu/amd: Define bit fields for DTE particularly
  iommu/amd: Add function copy_dev_tables()
  iommu/amd: copy old trans table from old kernel
  iommu/amd: Do sanity check for irq remap of old dev table entry
  iommu: Add is_attach_deferred call-back to iommu-ops
  iommu/amd: Use is_attach_deferred call-back
  iommu/amd: Allocate memory below 4G for dev table if translation
    pre-enabled
  iommu/amd: Don't copy GCR3 table root pointer
  iommu/amd: Clear out the GV flag when handle deferred domain attach

root (1):
  iommu/amd: Disable iommu only if amd_iommu=off is specified

 drivers/iommu/amd_iommu.c       |  81 ++++++++-------
 drivers/iommu/amd_iommu_init.c  | 212 ++++++++++++++++++++++++++++++++++++----
 drivers/iommu/amd_iommu_proto.h |   2 +
 drivers/iommu/amd_iommu_types.h |  56 ++++++++++-
 drivers/iommu/amd_iommu_v2.c    |  18 +++-
 drivers/iommu/iommu.c           |   8 ++
 include/linux/iommu.h           |   1 +
 7 files changed, 315 insertions(+), 63 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v8 01/13] iommu/amd: Detect pre enabled translation
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
@ 2017-07-21  8:58 ` Baoquan He
  2017-07-27 15:04   ` Joerg Roedel
  2017-07-21  8:59 ` [PATCH v8 02/13] iommu/amd: add several helper functions Baoquan He
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:58 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

Add functions to check whether translation is already enabled in IOMMU.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu_init.c  | 24 ++++++++++++++++++++++++
 drivers/iommu/amd_iommu_proto.h |  1 +
 drivers/iommu/amd_iommu_types.h |  4 ++++
 3 files changed, 29 insertions(+)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5cc597b383c7..e39857ce6481 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -258,6 +258,25 @@ static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
 static void init_device_table_dma(void);
 
+bool translation_pre_enabled(struct amd_iommu *iommu)
+{
+	return (iommu->flags & AMD_IOMMU_FLAG_TRANS_PRE_ENABLED);
+}
+
+static void clear_translation_pre_enabled(struct amd_iommu *iommu)
+{
+	iommu->flags &= ~AMD_IOMMU_FLAG_TRANS_PRE_ENABLED;
+}
+
+static void init_translation_status(struct amd_iommu *iommu)
+{
+	u32 ctrl;
+
+	ctrl = readl(iommu->mmio_base + MMIO_CONTROL_OFFSET);
+	if (ctrl & (1<<CONTROL_IOMMU_EN))
+		iommu->flags |= AMD_IOMMU_FLAG_TRANS_PRE_ENABLED;
+}
+
 static inline void update_last_devid(u16 devid)
 {
 	if (devid > amd_iommu_last_bdf)
@@ -1399,6 +1418,11 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
 
 	iommu->int_enabled = false;
 
+	init_translation_status(iommu);
+
+	if (translation_pre_enabled(iommu))
+		pr_warn("Translation is already enabled - trying to copy translation structures\n");
+
 	ret = init_iommu_from_acpi(iommu, h);
 	if (ret)
 		return ret;
diff --git a/drivers/iommu/amd_iommu_proto.h b/drivers/iommu/amd_iommu_proto.h
index 466260f8a1df..a9666d2005bb 100644
--- a/drivers/iommu/amd_iommu_proto.h
+++ b/drivers/iommu/amd_iommu_proto.h
@@ -87,4 +87,5 @@ static inline bool iommu_feature(struct amd_iommu *iommu, u64 f)
 	return !!(iommu->features & f);
 }
 
+extern bool translation_pre_enabled(struct amd_iommu *iommu);
 #endif /* _ASM_X86_AMD_IOMMU_PROTO_H  */
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 294a409e283b..d15966b62b33 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -417,6 +417,7 @@ extern struct kmem_cache *amd_iommu_irq_cache;
 #define APERTURE_PAGE_INDEX(a)	(((a) >> 21) & 0x3fULL)
 
 
+
 /*
  * This struct is used to pass information about
  * incoming PPR faults around.
@@ -435,6 +436,8 @@ struct iommu_domain;
 struct irq_domain;
 struct amd_irte_ops;
 
+#define AMD_IOMMU_FLAG_TRANS_PRE_ENABLED      (1 << 0)
+
 /*
  * This structure contains generic data for  IOMMU protection domains
  * independent of their use.
@@ -569,6 +572,7 @@ struct amd_iommu {
 	struct amd_irte_ops *irte_ops;
 #endif
 
+	u32 flags;
 	volatile u64 __aligned(8) cmd_sem;
 };
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 02/13] iommu/amd: add several helper functions
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
  2017-07-21  8:58 ` [PATCH v8 01/13] iommu/amd: Detect pre enabled translation Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-27 15:06   ` Joerg Roedel
  2017-07-21  8:59 ` [PATCH v8 03/13] Revert "iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel" Baoquan He
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

Move single iommu enabling codes into a wrapper function early_enable_iommu().
This can make later kdump change easier.

And also add iommu_disable_command_buffer and iommu_disable_event_buffer
for later usage.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu_init.c | 42 +++++++++++++++++++++++++++++++-----------
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index e39857ce6481..4ca6e3257d92 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -634,6 +634,14 @@ static void iommu_enable_command_buffer(struct amd_iommu *iommu)
 	amd_iommu_reset_cmd_buffer(iommu);
 }
 
+/*
+ * This function disables the command buffer
+ */
+static void iommu_disable_command_buffer(struct amd_iommu *iommu)
+{
+	iommu_feature_disable(iommu, CONTROL_CMDBUF_EN);
+}
+
 static void __init free_command_buffer(struct amd_iommu *iommu)
 {
 	free_pages((unsigned long)iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
@@ -666,6 +674,14 @@ static void iommu_enable_event_buffer(struct amd_iommu *iommu)
 	iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN);
 }
 
+/*
+ * This function disables the command buffer
+ */
+static void iommu_disable_event_buffer(struct amd_iommu *iommu)
+{
+	iommu_feature_disable(iommu, CONTROL_EVT_LOG_EN);
+}
+
 static void __init free_event_buffer(struct amd_iommu *iommu)
 {
 	free_pages((unsigned long)iommu->evt_buf, get_order(EVT_BUFFER_SIZE));
@@ -2046,6 +2062,19 @@ static void iommu_enable_ga(struct amd_iommu *iommu)
 #endif
 }
 
+static void early_enable_iommu(struct amd_iommu *iommu)
+{
+	iommu_disable(iommu);
+	iommu_init_flags(iommu);
+	iommu_set_device_table(iommu);
+	iommu_enable_command_buffer(iommu);
+	iommu_enable_event_buffer(iommu);
+	iommu_set_exclusion_range(iommu);
+	iommu_enable_ga(iommu);
+	iommu_enable(iommu);
+	iommu_flush_all_caches(iommu);
+}
+
 /*
  * This function finally enables all IOMMUs found in the system after
  * they have been initialized
@@ -2054,17 +2083,8 @@ static void early_enable_iommus(void)
 {
 	struct amd_iommu *iommu;
 
-	for_each_iommu(iommu) {
-		iommu_disable(iommu);
-		iommu_init_flags(iommu);
-		iommu_set_device_table(iommu);
-		iommu_enable_command_buffer(iommu);
-		iommu_enable_event_buffer(iommu);
-		iommu_set_exclusion_range(iommu);
-		iommu_enable_ga(iommu);
-		iommu_enable(iommu);
-		iommu_flush_all_caches(iommu);
-	}
+	for_each_iommu(iommu)
+		early_enable_iommu(iommu);
 
 #ifdef CONFIG_IRQ_REMAP
 	if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 03/13] Revert "iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel"
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
  2017-07-21  8:58 ` [PATCH v8 01/13] iommu/amd: Detect pre enabled translation Baoquan He
  2017-07-21  8:59 ` [PATCH v8 02/13] iommu/amd: add several helper functions Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-21  8:59 ` [PATCH v8 04/13] iommu/amd: Define bit fields for DTE particularly Baoquan He
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

This reverts commit 54bd63570484167cb13edf81e31fff107b879981.

We still need the IO_PAGE_FAULT message to warn error after the
issue of on-flight dma in kdump kernel is fixed.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu.c       | 3 +--
 drivers/iommu/amd_iommu_init.c  | 9 ---------
 drivers/iommu/amd_iommu_types.h | 1 -
 3 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 688e77576e5a..e8a6d8109564 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2086,8 +2086,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)
 		flags    |= tmp;
 	}
 
-
-	flags &= ~(DTE_FLAG_SA | 0xffffULL);
+	flags &= ~(0xffffUL);
 	flags |= domain->id;
 
 	amd_iommu_dev_table[devid].data[1]  = flags;
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 4ca6e3257d92..f6da5fe03b31 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -29,7 +29,6 @@
 #include <linux/export.h>
 #include <linux/iommu.h>
 #include <linux/kmemleak.h>
-#include <linux/crash_dump.h>
 #include <asm/pci-direct.h>
 #include <asm/iommu.h>
 #include <asm/gart.h>
@@ -1942,14 +1941,6 @@ static void init_device_table_dma(void)
 	for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
 		set_dev_entry_bit(devid, DEV_ENTRY_VALID);
 		set_dev_entry_bit(devid, DEV_ENTRY_TRANSLATION);
-		/*
-		 * In kdump kernels in-flight DMA from the old kernel might
-		 * cause IO_PAGE_FAULTs. There are no reports that a kdump
-		 * actually failed because of that, so just disable fault
-		 * reporting in the hardware to get rid of the messages
-		 */
-		if (is_kdump_kernel())
-			set_dev_entry_bit(devid, DEV_ENTRY_NO_PAGE_FAULT);
 	}
 }
 
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index d15966b62b33..608e81ca5e92 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -322,7 +322,6 @@
 #define IOMMU_PTE_IW (1ULL << 62)
 
 #define DTE_FLAG_IOTLB	(1ULL << 32)
-#define DTE_FLAG_SA	(1ULL << 34)
 #define DTE_FLAG_GV	(1ULL << 55)
 #define DTE_FLAG_MASK	(0x3ffULL << 32)
 #define DTE_GLX_SHIFT	(56)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 04/13] iommu/amd: Define bit fields for DTE particularly
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (2 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 03/13] Revert "iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel" Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-21  8:59 ` [PATCH v8 05/13] iommu/amd: Add function copy_dev_tables() Baoquan He
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

In AMD-Vi spec several bits of IO PTE fields and DTE fields are similar
so that both of them can share the same MACRO definition. However
defining them respectively can make code more read-able. Do it now.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu.c       |  8 ++++----
 drivers/iommu/amd_iommu_types.h | 18 ++++++++++++++----
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index e8a6d8109564..e5a03f259986 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1537,9 +1537,9 @@ static int iommu_map_page(struct protection_domain *dom,
 
 	if (count > 1) {
 		__pte = PAGE_SIZE_PTE(phys_addr, page_size);
-		__pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_P | IOMMU_PTE_FC;
+		__pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_PR | IOMMU_PTE_FC;
 	} else
-		__pte = phys_addr | IOMMU_PTE_P | IOMMU_PTE_FC;
+		__pte = phys_addr | IOMMU_PTE_PR | IOMMU_PTE_FC;
 
 	if (prot & IOMMU_PROT_IR)
 		__pte |= IOMMU_PTE_IR;
@@ -2053,7 +2053,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)
 
 	pte_root |= (domain->mode & DEV_ENTRY_MODE_MASK)
 		    << DEV_ENTRY_MODE_SHIFT;
-	pte_root |= IOMMU_PTE_IR | IOMMU_PTE_IW | IOMMU_PTE_P | IOMMU_PTE_TV;
+	pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
 	flags = amd_iommu_dev_table[devid].data[1];
 
@@ -2096,7 +2096,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)
 static void clear_dte_entry(u16 devid)
 {
 	/* remove entry from the device table seen by the hardware */
-	amd_iommu_dev_table[devid].data[0]  = IOMMU_PTE_P | IOMMU_PTE_TV;
+	amd_iommu_dev_table[devid].data[0]  = DTE_FLAG_V | DTE_FLAG_TV;
 	amd_iommu_dev_table[devid].data[1] &= DTE_FLAG_MASK;
 
 	amd_iommu_apply_erratum_63(devid);
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 608e81ca5e92..ea36af19b5b9 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -265,7 +265,7 @@
 #define PM_LEVEL_INDEX(x, a)	(((a) >> PM_LEVEL_SHIFT((x))) & 0x1ffULL)
 #define PM_LEVEL_ENC(x)		(((x) << 9) & 0xe00ULL)
 #define PM_LEVEL_PDE(x, a)	((a) | PM_LEVEL_ENC((x)) | \
-				 IOMMU_PTE_P | IOMMU_PTE_IR | IOMMU_PTE_IW)
+				 IOMMU_PTE_PR | IOMMU_PTE_IR | IOMMU_PTE_IW)
 #define PM_PTE_LEVEL(pte)	(((pte) >> 9) & 0x7ULL)
 
 #define PM_MAP_4k		0
@@ -314,13 +314,23 @@
 #define PTE_LEVEL_PAGE_SIZE(level)			\
 	(1ULL << (12 + (9 * (level))))
 
-#define IOMMU_PTE_P  (1ULL << 0)
-#define IOMMU_PTE_TV (1ULL << 1)
+/*
+ * Bit value definition for I/O PTE fields
+ */
+#define IOMMU_PTE_PR (1ULL << 0)
 #define IOMMU_PTE_U  (1ULL << 59)
 #define IOMMU_PTE_FC (1ULL << 60)
 #define IOMMU_PTE_IR (1ULL << 61)
 #define IOMMU_PTE_IW (1ULL << 62)
 
+/*
+ * Bit value definition for DTE fields
+ */
+#define DTE_FLAG_V  (1ULL << 0)
+#define DTE_FLAG_TV (1ULL << 1)
+#define DTE_FLAG_IR (1ULL << 61)
+#define DTE_FLAG_IW (1ULL << 62)
+
 #define DTE_FLAG_IOTLB	(1ULL << 32)
 #define DTE_FLAG_GV	(1ULL << 55)
 #define DTE_FLAG_MASK	(0x3ffULL << 32)
@@ -342,7 +352,7 @@
 #define GCR3_VALID		0x01ULL
 
 #define IOMMU_PAGE_MASK (((1ULL << 52) - 1) & ~0xfffULL)
-#define IOMMU_PTE_PRESENT(pte) ((pte) & IOMMU_PTE_P)
+#define IOMMU_PTE_PRESENT(pte) ((pte) & IOMMU_PTE_PR)
 #define IOMMU_PTE_PAGE(pte) (phys_to_virt((pte) & IOMMU_PAGE_MASK))
 #define IOMMU_PTE_MODE(pte) (((pte) >> 9) & 0x07)
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 05/13] iommu/amd: Add function copy_dev_tables()
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (3 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 04/13] iommu/amd: Define bit fields for DTE particularly Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-27 15:29   ` Joerg Roedel
  2017-07-21  8:59 ` [PATCH v8 06/13] iommu/amd: copy old trans table from old kernel Baoquan He
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

Add function copy_dev_tables to copy the old DEV table entries of the panicked
kernel to the new allocated DEV table. Since all iommus share the same DTE table
the copy only need be done one time. Besides, we also need to:

  - Check whether all IOMMUs actually use the same device table with the same size

  - Verify that the size of the old device table is the expected size.

  - Reserve the old domain id occupied in 1st kernel to avoid touching the old
    io-page tables. Then on-flight DMA can continue looking it up.

And also define MACRO DEV_DOMID_MASK to replace magic number 0xffffULL, it can be
reused in copy_dev_tables().

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu.c       |  2 +-
 drivers/iommu/amd_iommu_init.c  | 55 +++++++++++++++++++++++++++++++++++++++++
 drivers/iommu/amd_iommu_types.h |  1 +
 3 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index e5a03f259986..4d00f1bda900 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2086,7 +2086,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)
 		flags    |= tmp;
 	}
 
-	flags &= ~(0xffffUL);
+	flags &= ~DEV_DOMID_MASK;
 	flags |= domain->id;
 
 	amd_iommu_dev_table[devid].data[1]  = flags;
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index f6da5fe03b31..c58f091ce232 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -842,6 +842,61 @@ static int get_dev_entry_bit(u16 devid, u8 bit)
 }
 
 
+static int copy_dev_tables(void)
+{
+	struct dev_table_entry *old_devtb = NULL;
+	u32 lo, hi, devid, old_devtb_size;
+	phys_addr_t old_devtb_phys;
+	u64 entry, last_entry = 0;
+	struct amd_iommu *iommu;
+	u16 dom_id, dte_v;
+	static int copied;
+
+	for_each_iommu(iommu) {
+		if (!translation_pre_enabled(iommu)) {
+			pr_err("IOMMU:%d is not pre-enabled!/n",
+				iommu->index);
+			return -1;
+		}
+
+		/* All IOMMUs should use the same device table with the same size */
+		lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET);
+		hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4);
+		entry = (((u64) hi) << 32) + lo;
+		if (last_entry && last_entry != entry) {
+			pr_err("IOMMU:%d should use the same dev table as others!/n",
+				iommu->index);
+			return -1;
+		}
+		last_entry = entry;
+
+		old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12;
+		if (old_devtb_size != dev_table_size) {
+			pr_err("The device table size of IOMMU:%d is not expected!/n",
+				iommu->index);
+			return -1;
+		}
+
+		old_devtb_phys = entry & PAGE_MASK;
+		old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
+		if (!old_devtb)
+			return -1;
+
+		if (copied)
+			continue;
+		for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
+			amd_iommu_dev_table[devid] = old_devtb[devid];
+			dom_id = old_devtb[devid].data[1] & DEV_DOMID_MASK;
+			dte_v = old_devtb[devid].data[0] & DTE_FLAG_V;
+			if (dte_v && dom_id)
+				__set_bit(dom_id, amd_iommu_pd_alloc_bitmap);
+		}
+		memunmap(old_devtb);
+		copied = 1;
+	}
+	return 0;
+}
+
 void amd_iommu_apply_erratum_63(u16 devid)
 {
 	int sysmgt;
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index ea36af19b5b9..1c06bcc06f5c 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -336,6 +336,7 @@
 #define DTE_FLAG_MASK	(0x3ffULL << 32)
 #define DTE_GLX_SHIFT	(56)
 #define DTE_GLX_MASK	(3)
+#define DEV_DOMID_MASK	0xffffULL
 
 #define DTE_GCR3_VAL_A(x)	(((x) >> 12) & 0x00007ULL)
 #define DTE_GCR3_VAL_B(x)	(((x) >> 15) & 0x0ffffULL)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 06/13] iommu/amd: copy old trans table from old kernel
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (4 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 05/13] iommu/amd: Add function copy_dev_tables() Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-27 15:38   ` Joerg Roedel
  2017-07-21  8:59 ` [PATCH v8 07/13] iommu/amd: Do sanity check for irq remap of old dev table entry Baoquan He
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

Here several things need be done:
- If iommu is pre-enabled in a normal kernel, just disable it and print
  warning.

- If failed to copy dev table of old kernel, continue to proceed as
  it does in normal kernel.

- Disable and Re-enable event/cmd buffer,  install the copied DTE table
  to reg, and detect and enable guest vapic.

- Flush all caches

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu_init.c | 51 ++++++++++++++++++++++++++++++++++++------
 1 file changed, 44 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index c58f091ce232..aa9b5918d11f 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -37,6 +37,7 @@
 #include <asm/io_apic.h>
 #include <asm/irq_remapping.h>
 
+#include <linux/crash_dump.h>
 #include "amd_iommu_proto.h"
 #include "amd_iommu_types.h"
 #include "irq_remapping.h"
@@ -1489,9 +1490,12 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
 	iommu->int_enabled = false;
 
 	init_translation_status(iommu);
-
-	if (translation_pre_enabled(iommu))
-		pr_warn("Translation is already enabled - trying to copy translation structures\n");
+	if (translation_pre_enabled(iommu) && !is_kdump_kernel()) {
+		iommu_disable(iommu);
+		clear_translation_pre_enabled(iommu);
+		pr_warn("Translation was enabled for IOMMU:%d but we are not in kdump mode\n",
+			iommu->index);
+	}
 
 	ret = init_iommu_from_acpi(iommu, h);
 	if (ret)
@@ -1986,8 +1990,7 @@ static int __init init_memory_definitions(struct acpi_table_header *table)
 }
 
 /*
- * Init the device table to not allow DMA access for devices and
- * suppress all page faults
+ * Init the device table to not allow DMA access for devices
  */
 static void init_device_table_dma(void)
 {
@@ -2128,9 +2131,43 @@ static void early_enable_iommu(struct amd_iommu *iommu)
 static void early_enable_iommus(void)
 {
 	struct amd_iommu *iommu;
+	bool is_pre_enabled = false;
 
-	for_each_iommu(iommu)
-		early_enable_iommu(iommu);
+	for_each_iommu(iommu) {
+		if (translation_pre_enabled(iommu)) {
+			is_pre_enabled = true;
+			break;
+		}
+	}
+
+	if (!is_pre_enabled) {
+		for_each_iommu(iommu)
+			early_enable_iommu(iommu);
+	} else {
+		pr_warn("Translation is already enabled - trying to copy translation structures\n");
+		if (copy_dev_tables()) {
+			pr_err("Failed to copy DEV table from previous kernel.\n");
+			/*
+			 * If failed to copy dev tables from old kernel, continue to proceed
+			 * as it does in normal kernel.
+			 */
+			for_each_iommu(iommu) {
+				clear_translation_pre_enabled(iommu);
+				early_enable_iommu(iommu);
+			}
+		} else {
+			pr_info("Copied DEV table from previous kernel.\n");
+			for_each_iommu(iommu) {
+				iommu_disable_command_buffer(iommu);
+				iommu_disable_event_buffer(iommu);
+				iommu_enable_command_buffer(iommu);
+				iommu_enable_event_buffer(iommu);
+				iommu_enable_ga(iommu);
+				iommu_set_device_table(iommu);
+				iommu_flush_all_caches(iommu);
+			}
+		}
+	}
 
 #ifdef CONFIG_IRQ_REMAP
 	if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 07/13] iommu/amd: Do sanity check for irq remap of old dev table entry
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (5 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 06/13] iommu/amd: copy old trans table from old kernel Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-21  8:59 ` [PATCH v8 08/13] iommu: Add is_attach_deferred call-back to iommu-ops Baoquan He
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

Firstly split the dev table entry copy into address translation part and
irq remapping part. Because these two parts could be enabled
independently.

Secondly check if IntCtl and IntTabLen are 10b and 1000b if they are
set.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu.c       |  5 -----
 drivers/iommu/amd_iommu_init.c  | 25 ++++++++++++++++++++++---
 drivers/iommu/amd_iommu_types.h |  8 ++++++++
 3 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4d00f1bda900..5508f57a3e4f 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3776,11 +3776,6 @@ EXPORT_SYMBOL(amd_iommu_device_info);
 
 static struct irq_chip amd_ir_chip;
 
-#define DTE_IRQ_PHYS_ADDR_MASK	(((1ULL << 45)-1) << 6)
-#define DTE_IRQ_REMAP_INTCTL    (2ULL << 60)
-#define DTE_IRQ_TABLE_LEN       (8ULL << 1)
-#define DTE_IRQ_REMAP_ENABLE    1ULL
-
 static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
 {
 	u64 dte;
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index aa9b5918d11f..052fa4a977d8 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -845,12 +845,12 @@ static int get_dev_entry_bit(u16 devid, u8 bit)
 
 static int copy_dev_tables(void)
 {
+	u64 int_ctl, int_tab_len, entry, last_entry = 0;
 	struct dev_table_entry *old_devtb = NULL;
 	u32 lo, hi, devid, old_devtb_size;
 	phys_addr_t old_devtb_phys;
-	u64 entry, last_entry = 0;
 	struct amd_iommu *iommu;
-	u16 dom_id, dte_v;
+	u16 dom_id, dte_v, irq_v;
 	static int copied;
 
 	for_each_iommu(iommu) {
@@ -889,8 +889,27 @@ static int copy_dev_tables(void)
 			amd_iommu_dev_table[devid] = old_devtb[devid];
 			dom_id = old_devtb[devid].data[1] & DEV_DOMID_MASK;
 			dte_v = old_devtb[devid].data[0] & DTE_FLAG_V;
-			if (dte_v && dom_id)
+			if (dte_v && dom_id) {
+				amd_iommu_dev_table[devid].data[0]
+					= old_devtb[devid].data[0];
+				amd_iommu_dev_table[devid].data[1]
+					= old_devtb[devid].data[1];
 				__set_bit(dom_id, amd_iommu_pd_alloc_bitmap);
+			}
+
+			irq_v = old_devtb[devid].data[2] & DTE_IRQ_REMAP_ENABLE;
+			int_ctl = old_devtb[devid].data[2] & DTE_IRQ_REMAP_INTCTL_MASK;
+			int_tab_len = old_devtb[devid].data[2] & DTE_IRQ_TABLE_LEN_MASK;
+			if (irq_v && (int_ctl || int_tab_len)) {
+				if ((int_ctl != DTE_IRQ_REMAP_INTCTL) ||
+					 (int_tab_len != DTE_IRQ_TABLE_LEN)) {
+					pr_err("Wrong old irq remapping flag: %#x\n", devid);
+					return -1;
+				}
+
+				amd_iommu_dev_table[devid].data[2]
+					= old_devtb[devid].data[2];
+			}
 		}
 		memunmap(old_devtb);
 		copied = 1;
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 1c06bcc06f5c..7149fa52063f 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -250,6 +250,14 @@
 
 #define GA_GUEST_NR		0x1
 
+/* Bit value definition for dte irq remapping fields*/
+#define DTE_IRQ_PHYS_ADDR_MASK	(((1ULL << 45)-1) << 6)
+#define DTE_IRQ_REMAP_INTCTL_MASK	(0x3ULL << 60)
+#define DTE_IRQ_TABLE_LEN_MASK	(0xfULL << 1)
+#define DTE_IRQ_REMAP_INTCTL    (2ULL << 60)
+#define DTE_IRQ_TABLE_LEN       (8ULL << 1)
+#define DTE_IRQ_REMAP_ENABLE    1ULL
+
 #define PAGE_MODE_NONE    0x00
 #define PAGE_MODE_1_LEVEL 0x01
 #define PAGE_MODE_2_LEVEL 0x02
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 08/13] iommu: Add is_attach_deferred call-back to iommu-ops
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (6 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 07/13] iommu/amd: Do sanity check for irq remap of old dev table entry Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-21  8:59 ` [PATCH v8 09/13] iommu/amd: Use is_attach_deferred call-back Baoquan He
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

This new call-back will be used to check if the domain attach need be
deferred for now. If yes, the domain attach/detach will return directly.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/iommu.c | 8 ++++++++
 include/linux/iommu.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3f6ea160afed..86581b115b92 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1283,6 +1283,10 @@ static int __iommu_attach_device(struct iommu_domain *domain,
 				 struct device *dev)
 {
 	int ret;
+	if ((domain->ops->is_attach_deferred != NULL) &&
+	    domain->ops->is_attach_deferred(domain, dev))
+		return 0;
+
 	if (unlikely(domain->ops->attach_dev == NULL))
 		return -ENODEV;
 
@@ -1324,6 +1328,10 @@ EXPORT_SYMBOL_GPL(iommu_attach_device);
 static void __iommu_detach_device(struct iommu_domain *domain,
 				  struct device *dev)
 {
+	if ((domain->ops->is_attach_deferred != NULL) &&
+	    domain->ops->is_attach_deferred(domain, dev))
+		return;
+
 	if (unlikely(domain->ops->detach_dev == NULL))
 		return;
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2cb54adc4a33..63983c9e6c3a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -225,6 +225,7 @@ struct iommu_ops {
 	u32 (*domain_get_windows)(struct iommu_domain *domain);
 
 	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
+	bool (*is_attach_deferred)(struct iommu_domain *domain, struct device *dev);
 
 	unsigned long pgsize_bitmap;
 };
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 09/13] iommu/amd: Use is_attach_deferred call-back
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (7 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 08/13] iommu: Add is_attach_deferred call-back to iommu-ops Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-27 15:53   ` Joerg Roedel
  2017-07-21  8:59 ` [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled Baoquan He
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

Implement call-back is_attach_deferred and use it to defer the
domain attach from iommu driver init to device driver init when
iommu is pre-enabled in kdump kernel.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 5508f57a3e4f..4ff413b34b51 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -121,6 +121,7 @@ struct iommu_dev_data {
 					     PPR completions */
 	u32 errata;			  /* Bitmap for errata to apply */
 	bool use_vapic;			  /* Enable device to use vapic mode */
+	bool defer_attach;
 
 	struct ratelimit_state rs;	  /* Ratelimit IOPF messages */
 };
@@ -371,12 +372,17 @@ static u16 get_alias(struct device *dev)
 static struct iommu_dev_data *find_dev_data(u16 devid)
 {
 	struct iommu_dev_data *dev_data;
+	struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
 
 	dev_data = search_dev_data(devid);
 
-	if (dev_data == NULL)
+	if (dev_data == NULL) {
 		dev_data = alloc_dev_data(devid);
 
+		if (translation_pre_enabled(iommu))
+			dev_data->defer_attach = true;
+	}
+
 	return dev_data;
 }
 
@@ -2477,11 +2483,18 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
 static struct protection_domain *get_domain(struct device *dev)
 {
 	struct protection_domain *domain;
+	struct iommu_domain *io_domain;
 
 	if (!check_device(dev))
 		return ERR_PTR(-EINVAL);
 
 	domain = get_dev_data(dev)->domain;
+	if (domain == NULL && get_dev_data(dev)->defer_attach) {
+		get_dev_data(dev)->defer_attach = false;
+		io_domain = iommu_get_domain_for_dev(dev);
+		domain = to_pdomain(io_domain);
+		attach_device(dev, domain);
+	}
 	if (!dma_ops_domain(domain))
 		return ERR_PTR(-EBUSY);
 
@@ -3372,6 +3385,13 @@ static void amd_iommu_apply_resv_region(struct device *dev,
 	WARN_ON_ONCE(reserve_iova(&dma_dom->iovad, start, end) == NULL);
 }
 
+static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
+					 struct device *dev)
+{
+	struct iommu_dev_data *dev_data = dev->archdata.iommu;
+	return dev_data->defer_attach;
+}
+
 const struct iommu_ops amd_iommu_ops = {
 	.capable = amd_iommu_capable,
 	.domain_alloc = amd_iommu_domain_alloc,
@@ -3388,6 +3408,7 @@ const struct iommu_ops amd_iommu_ops = {
 	.get_resv_regions = amd_iommu_get_resv_regions,
 	.put_resv_regions = amd_iommu_put_resv_regions,
 	.apply_resv_region = amd_iommu_apply_resv_region,
+	.is_attach_deferred = amd_iommu_is_attach_deferred,
 	.pgsize_bitmap	= AMD_IOMMU_PGSIZES,
 };
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (8 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 09/13] iommu/amd: Use is_attach_deferred call-back Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-27 15:55   ` Joerg Roedel
  2017-07-21  8:59 ` [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer Baoquan He
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

AMD pointed out it's unsafe to update the device-table while iommu
is enabled. It turns out that device-table pointer update is split
up into two 32bit writes in the IOMMU hardware. So updating it while
the IOMMU is enabled could have some nasty side effects.

The only way to work around this is to allocate the device-table below
4GB if translation is pre-enabled in kdump kernel. If allocation failed,
still use the old one.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu_init.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 052fa4a977d8..d7c301d0d672 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2149,11 +2149,23 @@ static void early_enable_iommu(struct amd_iommu *iommu)
  */
 static void early_enable_iommus(void)
 {
+	struct dev_table_entry *dev_tbl;
 	struct amd_iommu *iommu;
 	bool is_pre_enabled = false;
 
 	for_each_iommu(iommu) {
 		if (translation_pre_enabled(iommu)) {
+			gfp_t gfp_flag = GFP_KERNEL | __GFP_ZERO | GFP_DMA32;;
+
+			dev_tbl = (void *)__get_free_pages(gfp_flag,
+						get_order(dev_table_size));
+			if (dev_tbl != NULL) {
+				memcpy(dev_tbl, amd_iommu_dev_table, dev_table_size);
+				free_pages((unsigned long)amd_iommu_dev_table,
+						get_order(dev_table_size));
+				amd_iommu_dev_table = dev_tbl;
+			}
+
 			is_pre_enabled = true;
 			break;
 		}
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (9 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-23 22:33   ` kbuild test robot
  2017-07-27 15:57   ` Joerg Roedel
  2017-07-21  8:59 ` [PATCH v8 12/13] iommu/amd: Clear out the GV flag when handle deferred domain attach Baoquan He
  2017-07-21  8:59 ` [PATCH v8 13/13] iommu/amd: Disable iommu only if amd_iommu=off is specified Baoquan He
  12 siblings, 2 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

When iommu is pre_enabled in kdump kernel, if a device is set up with
guest translations (DTE.GV=1), then don't copy GCR3 table root pointer
but move the device over to an empty guest-cr3 table and handle the
faults in the PPR log (which answer them with INVALID). After all these
PPR faults are recoverable for the device and we should not allow the
device to change old-kernels data when we don't have to.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu.c       | 28 +++-------------------------
 drivers/iommu/amd_iommu_init.c  | 11 +++++++++++
 drivers/iommu/amd_iommu_proto.h |  1 +
 drivers/iommu/amd_iommu_types.h | 24 ++++++++++++++++++++++++
 drivers/iommu/amd_iommu_v2.c    | 18 +++++++++++++++++-
 5 files changed, 56 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 4ff413b34b51..46d077784da0 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -103,30 +103,6 @@ int amd_iommu_max_glx_val = -1;
 static const struct dma_map_ops amd_iommu_dma_ops;
 
 /*
- * This struct contains device specific data for the IOMMU
- */
-struct iommu_dev_data {
-	struct list_head list;		  /* For domain->dev_list */
-	struct list_head dev_data_list;	  /* For global dev_data_list */
-	struct protection_domain *domain; /* Domain the device is bound to */
-	u16 devid;			  /* PCI Device ID */
-	u16 alias;			  /* Alias Device ID */
-	bool iommu_v2;			  /* Device can make use of IOMMUv2 */
-	bool passthrough;		  /* Device is identity mapped */
-	struct {
-		bool enabled;
-		int qdep;
-	} ats;				  /* ATS state */
-	bool pri_tlp;			  /* PASID TLB required for
-					     PPR completions */
-	u32 errata;			  /* Bitmap for errata to apply */
-	bool use_vapic;			  /* Enable device to use vapic mode */
-	bool defer_attach;
-
-	struct ratelimit_state rs;	  /* Ratelimit IOPF messages */
-};
-
-/*
  * general struct to manage commands send to an IOMMU
  */
 struct iommu_cmd {
@@ -386,10 +362,11 @@ static struct iommu_dev_data *find_dev_data(u16 devid)
 	return dev_data;
 }
 
-static struct iommu_dev_data *get_dev_data(struct device *dev)
+struct iommu_dev_data *get_dev_data(struct device *dev)
 {
 	return dev->archdata.iommu;
 }
+EXPORT_SYMBOL(get_dev_data);
 
 /*
 * Find or create an IOMMU group for a acpihid device.
@@ -2540,6 +2517,7 @@ static int dir2prot(enum dma_data_direction direction)
 	else
 		return 0;
 }
+
 /*
  * This function contains common code for mapping of a physically
  * contiguous memory region into DMA address space. It is used by all
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index d7c301d0d672..8b4bac978062 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -209,6 +209,7 @@ u16 *amd_iommu_alias_table;
  * for a specific device. It is also indexed by the PCI device id.
  */
 struct amd_iommu **amd_iommu_rlookup_table;
+EXPORT_SYMBOL(amd_iommu_rlookup_table);
 
 /*
  * This table is used to find the irq remapping table for a given device id
@@ -262,6 +263,7 @@ bool translation_pre_enabled(struct amd_iommu *iommu)
 {
 	return (iommu->flags & AMD_IOMMU_FLAG_TRANS_PRE_ENABLED);
 }
+EXPORT_SYMBOL(translation_pre_enabled);
 
 static void clear_translation_pre_enabled(struct amd_iommu *iommu)
 {
@@ -852,6 +854,7 @@ static int copy_dev_tables(void)
 	struct amd_iommu *iommu;
 	u16 dom_id, dte_v, irq_v;
 	static int copied;
+	u64 tmp;
 
 	for_each_iommu(iommu) {
 		if (!translation_pre_enabled(iommu)) {
@@ -895,6 +898,14 @@ static int copy_dev_tables(void)
 				amd_iommu_dev_table[devid].data[1]
 					= old_devtb[devid].data[1];
 				__set_bit(dom_id, amd_iommu_pd_alloc_bitmap);
+				/* If gcr3 table existed, mask it out */
+				if (old_devtb[devid].data[0] & DTE_FLAG_GV) {
+					tmp = DTE_GCR3_VAL_B(~0ULL) << DTE_GCR3_SHIFT_B;
+					tmp |= DTE_GCR3_VAL_C(~0ULL) << DTE_GCR3_SHIFT_C;
+					amd_iommu_dev_table[devid].data[1] &= ~tmp;
+					tmp = DTE_GCR3_VAL_A(~0ULL) << DTE_GCR3_SHIFT_A;
+					amd_iommu_dev_table[devid].data[0] &= ~tmp;
+				}
 			}
 
 			irq_v = old_devtb[devid].data[2] & DTE_IRQ_REMAP_ENABLE;
diff --git a/drivers/iommu/amd_iommu_proto.h b/drivers/iommu/amd_iommu_proto.h
index a9666d2005bb..90e62e9b01c5 100644
--- a/drivers/iommu/amd_iommu_proto.h
+++ b/drivers/iommu/amd_iommu_proto.h
@@ -88,4 +88,5 @@ static inline bool iommu_feature(struct amd_iommu *iommu, u64 f)
 }
 
 extern bool translation_pre_enabled(struct amd_iommu *iommu);
+extern struct iommu_dev_data *get_dev_data(struct device *dev);
 #endif /* _ASM_X86_AMD_IOMMU_PROTO_H  */
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 7149fa52063f..32c68f2fc1dc 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -619,6 +619,30 @@ struct devid_map {
 	bool cmd_line;
 };
 
+/*
+ * This struct contains device specific data for the IOMMU
+ */
+struct iommu_dev_data {
+	struct list_head list;		  /* For domain->dev_list */
+	struct list_head dev_data_list;	  /* For global dev_data_list */
+	struct protection_domain *domain; /* Domain the device is bound to */
+	u16 devid;			  /* PCI Device ID */
+	u16 alias;			  /* Alias Device ID */
+	bool iommu_v2;			  /* Device can make use of IOMMUv2 */
+	bool passthrough;		  /* Device is identity mapped */
+	struct {
+		bool enabled;
+		int qdep;
+	} ats;				  /* ATS state */
+	bool pri_tlp;			  /* PASID TLB required for
+					     PPR completions */
+	u32 errata;			  /* Bitmap for errata to apply */
+	bool use_vapic;			  /* Enable device to use vapic mode */
+	bool defer_attach;
+
+	struct ratelimit_state rs;        /* Ratelimit IOPF messages */
+};
+
 /* Map HPET and IOAPIC ids to the devid used by the IOMMU */
 extern struct list_head ioapic_map;
 extern struct list_head hpet_map;
diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 6629c472eafd..0245d414a7b3 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -562,13 +562,29 @@ static int ppr_notifier(struct notifier_block *nb, unsigned long e, void *data)
 	unsigned long flags;
 	struct fault *fault;
 	bool finish;
-	u16 tag;
+	u16 tag, devid;
 	int ret;
+	struct iommu_dev_data *dev_data;
+	struct pci_dev *pdev = NULL;
 
 	iommu_fault = data;
 	tag         = iommu_fault->tag & 0x1ff;
 	finish      = (iommu_fault->tag >> 9) & 1;
 
+	devid = iommu_fault->device_id;
+	pdev = pci_get_bus_and_slot(PCI_BUS_NUM(devid), devid & 0xff);
+	if (!pdev)
+		return -ENODEV;
+	dev_data = get_dev_data(&pdev->dev);
+
+	/* In kdump kernel pci dev is not initialized yet -> send INVALID */
+	if (translation_pre_enabled(amd_iommu_rlookup_table[devid])
+		&& dev_data->defer_attach) {
+		amd_iommu_complete_ppr(pdev, iommu_fault->pasid,
+				       PPR_INVALID, tag);
+		goto out;
+	}
+
 	ret = NOTIFY_DONE;
 	dev_state = get_device_state(iommu_fault->device_id);
 	if (dev_state == NULL)
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 12/13] iommu/amd: Clear out the GV flag when handle deferred domain attach
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (10 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-21  8:59 ` [PATCH v8 13/13] iommu/amd: Disable iommu only if amd_iommu=off is specified Baoquan He
  12 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, Baoquan He

When handle deferred domain attach, we need check if the domain is
v2. If not, should try to clear out the GV flag which could be
copied from the old device table entry.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 46d077784da0..98aaccecbb76 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2085,6 +2085,11 @@ static void clear_dte_entry(u16 devid)
 	amd_iommu_apply_erratum_63(devid);
 }
 
+static void clear_dte_flag_gv(u16 devid)
+{
+	amd_iommu_dev_table[devid].data[0] &= (~DTE_FLAG_GV);
+}
+
 static void do_attach(struct iommu_dev_data *dev_data,
 		      struct protection_domain *domain)
 {
@@ -2459,6 +2464,7 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
  */
 static struct protection_domain *get_domain(struct device *dev)
 {
+	struct iommu_dev_data *dev_data = get_dev_data(dev);
 	struct protection_domain *domain;
 	struct iommu_domain *io_domain;
 
@@ -2466,11 +2472,21 @@ static struct protection_domain *get_domain(struct device *dev)
 		return ERR_PTR(-EINVAL);
 
 	domain = get_dev_data(dev)->domain;
-	if (domain == NULL && get_dev_data(dev)->defer_attach) {
+	if (domain == NULL && dev_data->defer_attach) {
+		u16 alias = amd_iommu_alias_table[dev_data->devid];
 		get_dev_data(dev)->defer_attach = false;
 		io_domain = iommu_get_domain_for_dev(dev);
 		domain = to_pdomain(io_domain);
 		attach_device(dev, domain);
+		/*
+		 * If the deferred attached domain is not v2, should clear out
+		 * the old GV flag.
+		 */
+		if (!(domain->flags & PD_IOMMUV2_MASK)) {
+			clear_dte_flag_gv(dev_data->devid);
+			if (alias != dev_data->devid)
+				clear_dte_flag_gv(dev_data->devid);
+		}
 	}
 	if (!dma_ops_domain(domain))
 		return ERR_PTR(-EBUSY);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v8 13/13] iommu/amd: Disable iommu only if amd_iommu=off is specified
  2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
                   ` (11 preceding siblings ...)
  2017-07-21  8:59 ` [PATCH v8 12/13] iommu/amd: Clear out the GV flag when handle deferred domain attach Baoquan He
@ 2017-07-21  8:59 ` Baoquan He
  2017-07-27 15:58   ` Joerg Roedel
  12 siblings, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-21  8:59 UTC (permalink / raw)
  To: jroedel; +Cc: iommu, linux-kernel, root, Baoquan He

From: root <root@amd-dinar-02.lab.bos.redhat.com>

It's ok to disable iommu in normal kernel. While there's no need
to disable it in kdump kernel after the on-flight dma issue has
heen fixed.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu_init.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 8b4bac978062..880f693c809b 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2494,7 +2494,8 @@ static int __init early_amd_iommu_init(void)
 		goto out;
 
 	/* Disable any previously enabled IOMMUs */
-	disable_iommus();
+	if (amd_iommu_disabled)
+		disable_iommus();
 
 	if (amd_iommu_irq_remap)
 		amd_iommu_irq_remap = check_ioapic_information();
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer
  2017-07-21  8:59 ` [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer Baoquan He
@ 2017-07-23 22:33   ` kbuild test robot
  2017-07-23 23:53     ` Baoquan He
  2017-07-27 15:57   ` Joerg Roedel
  1 sibling, 1 reply; 41+ messages in thread
From: kbuild test robot @ 2017-07-23 22:33 UTC (permalink / raw)
  To: Baoquan He; +Cc: kbuild-all, jroedel, iommu, linux-kernel, Baoquan He

[-- Attachment #1: Type: text/plain, Size: 7065 bytes --]

Hi Baoquan,

[auto build test WARNING on iommu/next]
[also build test WARNING on v4.13-rc1]
[cannot apply to next-20170721]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Baoquan-He/Fix-the-on-flight-DMA-issue-on-system-with-amd-iommu/20170724-060048
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: x86_64-randconfig-x005-201730 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   drivers//iommu/amd_iommu_v2.c: In function 'ppr_notifier':
>> drivers//iommu/amd_iommu_v2.c:566:6: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized]
     int ret;
         ^~~

vim +/ret +566 drivers//iommu/amd_iommu_v2.c

028eeacc Joerg Roedel    2011-11-24  556  
028eeacc Joerg Roedel    2011-11-24  557  static int ppr_notifier(struct notifier_block *nb, unsigned long e, void *data)
028eeacc Joerg Roedel    2011-11-24  558  {
028eeacc Joerg Roedel    2011-11-24  559  	struct amd_iommu_fault *iommu_fault;
028eeacc Joerg Roedel    2011-11-24  560  	struct pasid_state *pasid_state;
028eeacc Joerg Roedel    2011-11-24  561  	struct device_state *dev_state;
028eeacc Joerg Roedel    2011-11-24  562  	unsigned long flags;
028eeacc Joerg Roedel    2011-11-24  563  	struct fault *fault;
028eeacc Joerg Roedel    2011-11-24  564  	bool finish;
49f3c23d Baoquan He      2017-07-21  565  	u16 tag, devid;
028eeacc Joerg Roedel    2011-11-24 @566  	int ret;
49f3c23d Baoquan He      2017-07-21  567  	struct iommu_dev_data *dev_data;
49f3c23d Baoquan He      2017-07-21  568  	struct pci_dev *pdev = NULL;
028eeacc Joerg Roedel    2011-11-24  569  
028eeacc Joerg Roedel    2011-11-24  570  	iommu_fault = data;
028eeacc Joerg Roedel    2011-11-24  571  	tag         = iommu_fault->tag & 0x1ff;
028eeacc Joerg Roedel    2011-11-24  572  	finish      = (iommu_fault->tag >> 9) & 1;
028eeacc Joerg Roedel    2011-11-24  573  
49f3c23d Baoquan He      2017-07-21  574  	devid = iommu_fault->device_id;
49f3c23d Baoquan He      2017-07-21  575  	pdev = pci_get_bus_and_slot(PCI_BUS_NUM(devid), devid & 0xff);
49f3c23d Baoquan He      2017-07-21  576  	if (!pdev)
49f3c23d Baoquan He      2017-07-21  577  		return -ENODEV;
49f3c23d Baoquan He      2017-07-21  578  	dev_data = get_dev_data(&pdev->dev);
49f3c23d Baoquan He      2017-07-21  579  
49f3c23d Baoquan He      2017-07-21  580  	/* In kdump kernel pci dev is not initialized yet -> send INVALID */
49f3c23d Baoquan He      2017-07-21  581  	if (translation_pre_enabled(amd_iommu_rlookup_table[devid])
49f3c23d Baoquan He      2017-07-21  582  		&& dev_data->defer_attach) {
49f3c23d Baoquan He      2017-07-21  583  		amd_iommu_complete_ppr(pdev, iommu_fault->pasid,
49f3c23d Baoquan He      2017-07-21  584  				       PPR_INVALID, tag);
49f3c23d Baoquan He      2017-07-21  585  		goto out;
49f3c23d Baoquan He      2017-07-21  586  	}
49f3c23d Baoquan He      2017-07-21  587  
028eeacc Joerg Roedel    2011-11-24  588  	ret = NOTIFY_DONE;
028eeacc Joerg Roedel    2011-11-24  589  	dev_state = get_device_state(iommu_fault->device_id);
028eeacc Joerg Roedel    2011-11-24  590  	if (dev_state == NULL)
028eeacc Joerg Roedel    2011-11-24  591  		goto out;
028eeacc Joerg Roedel    2011-11-24  592  
028eeacc Joerg Roedel    2011-11-24  593  	pasid_state = get_pasid_state(dev_state, iommu_fault->pasid);
53d340ef Joerg Roedel    2014-07-08  594  	if (pasid_state == NULL || pasid_state->invalid) {
028eeacc Joerg Roedel    2011-11-24  595  		/* We know the device but not the PASID -> send INVALID */
028eeacc Joerg Roedel    2011-11-24  596  		amd_iommu_complete_ppr(dev_state->pdev, iommu_fault->pasid,
028eeacc Joerg Roedel    2011-11-24  597  				       PPR_INVALID, tag);
028eeacc Joerg Roedel    2011-11-24  598  		goto out_drop_state;
028eeacc Joerg Roedel    2011-11-24  599  	}
028eeacc Joerg Roedel    2011-11-24  600  
028eeacc Joerg Roedel    2011-11-24  601  	spin_lock_irqsave(&pasid_state->lock, flags);
028eeacc Joerg Roedel    2011-11-24  602  	atomic_inc(&pasid_state->pri[tag].inflight);
028eeacc Joerg Roedel    2011-11-24  603  	if (finish)
028eeacc Joerg Roedel    2011-11-24  604  		pasid_state->pri[tag].finish = true;
028eeacc Joerg Roedel    2011-11-24  605  	spin_unlock_irqrestore(&pasid_state->lock, flags);
028eeacc Joerg Roedel    2011-11-24  606  
028eeacc Joerg Roedel    2011-11-24  607  	fault = kzalloc(sizeof(*fault), GFP_ATOMIC);
028eeacc Joerg Roedel    2011-11-24  608  	if (fault == NULL) {
028eeacc Joerg Roedel    2011-11-24  609  		/* We are OOM - send success and let the device re-fault */
028eeacc Joerg Roedel    2011-11-24  610  		finish_pri_tag(dev_state, pasid_state, tag);
028eeacc Joerg Roedel    2011-11-24  611  		goto out_drop_state;
028eeacc Joerg Roedel    2011-11-24  612  	}
028eeacc Joerg Roedel    2011-11-24  613  
028eeacc Joerg Roedel    2011-11-24  614  	fault->dev_state = dev_state;
028eeacc Joerg Roedel    2011-11-24  615  	fault->address   = iommu_fault->address;
028eeacc Joerg Roedel    2011-11-24  616  	fault->state     = pasid_state;
028eeacc Joerg Roedel    2011-11-24  617  	fault->tag       = tag;
028eeacc Joerg Roedel    2011-11-24  618  	fault->finish    = finish;
b00675b8 Alexey Skidanov 2014-07-08  619  	fault->pasid     = iommu_fault->pasid;
028eeacc Joerg Roedel    2011-11-24  620  	fault->flags     = iommu_fault->flags;
028eeacc Joerg Roedel    2011-11-24  621  	INIT_WORK(&fault->work, do_fault);
028eeacc Joerg Roedel    2011-11-24  622  
028eeacc Joerg Roedel    2011-11-24  623  	queue_work(iommu_wq, &fault->work);
028eeacc Joerg Roedel    2011-11-24  624  
028eeacc Joerg Roedel    2011-11-24  625  	ret = NOTIFY_OK;
028eeacc Joerg Roedel    2011-11-24  626  
028eeacc Joerg Roedel    2011-11-24  627  out_drop_state:
dc88db7e Joerg Roedel    2014-07-08  628  
dc88db7e Joerg Roedel    2014-07-08  629  	if (ret != NOTIFY_OK && pasid_state)
dc88db7e Joerg Roedel    2014-07-08  630  		put_pasid_state(pasid_state);
dc88db7e Joerg Roedel    2014-07-08  631  
028eeacc Joerg Roedel    2011-11-24  632  	put_device_state(dev_state);
028eeacc Joerg Roedel    2011-11-24  633  
028eeacc Joerg Roedel    2011-11-24  634  out:
028eeacc Joerg Roedel    2011-11-24  635  	return ret;
028eeacc Joerg Roedel    2011-11-24  636  }
028eeacc Joerg Roedel    2011-11-24  637  

:::::: The code at line 566 was first introduced by commit
:::::: 028eeacc412a8bebf6711e58629b0cab56a9ba87 iommu/amd: Implement IO page-fault handler

:::::: TO: Joerg Roedel <joerg.roedel@amd.com>
:::::: CC: Joerg Roedel <joerg.roedel@amd.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25833 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer
  2017-07-23 22:33   ` kbuild test robot
@ 2017-07-23 23:53     ` Baoquan He
  0 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-23 23:53 UTC (permalink / raw)
  To: kbuild test robot; +Cc: kbuild-all, jroedel, iommu, linux-kernel

On 07/24/17 at 06:33am, kbuild test robot wrote:
> Hi Baoquan,
> 
> [auto build test WARNING on iommu/next]
> [also build test WARNING on v4.13-rc1]
> [cannot apply to next-20170721]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Baoquan-He/Fix-the-on-flight-DMA-issue-on-system-with-amd-iommu/20170724-060048
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
> config: x86_64-randconfig-x005-201730 (attached as .config)
> compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=x86_64 
> 
> Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
> http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings
> 
> All warnings (new ones prefixed by >>):
> 
>    drivers//iommu/amd_iommu_v2.c: In function 'ppr_notifier':
> >> drivers//iommu/amd_iommu_v2.c:566:6: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized]
>      int ret;
>          ^~~
> 
> vim +/ret +566 drivers//iommu/amd_iommu_v2.c

Thanks, it should return NOTIFY_DONE anyway when ppr faults is handled
in kdump kernel since the GCR3 table root pointer has been made NULL
intentionally.

I will add this into patch 11/13 when repost need be done.


>From 742d8a51d8832e12884800840c4ebe802767d808 Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@redhat.com>
Date: Mon, 24 Jul 2017 07:48:10 +0800
Subject: [PATCH] iommu/amd: The ppr faults handled in kdump kernel should
 return NOTIFY_DONE

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 drivers/iommu/amd_iommu_v2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 0245d414a7b3..e705fac89cb4 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -578,6 +578,7 @@ static int ppr_notifier(struct notifier_block *nb, unsigned long e, void *data)
 	dev_data = get_dev_data(&pdev->dev);
 
 	/* In kdump kernel pci dev is not initialized yet -> send INVALID */
+	ret = NOTIFY_DONE;
 	if (translation_pre_enabled(amd_iommu_rlookup_table[devid])
 		&& dev_data->defer_attach) {
 		amd_iommu_complete_ppr(pdev, iommu_fault->pasid,
@@ -585,7 +586,6 @@ static int ppr_notifier(struct notifier_block *nb, unsigned long e, void *data)
 		goto out;
 	}
 
-	ret = NOTIFY_DONE;
 	dev_state = get_device_state(iommu_fault->device_id);
 	if (dev_state == NULL)
 		goto out;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 01/13] iommu/amd: Detect pre enabled translation
  2017-07-21  8:58 ` [PATCH v8 01/13] iommu/amd: Detect pre enabled translation Baoquan He
@ 2017-07-27 15:04   ` Joerg Roedel
  2017-07-28  2:31     ` Baoquan He
  0 siblings, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-27 15:04 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

On Fri, Jul 21, 2017 at 04:58:59PM +0800, Baoquan He wrote:
> diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
> index 294a409e283b..d15966b62b33 100644
> --- a/drivers/iommu/amd_iommu_types.h
> +++ b/drivers/iommu/amd_iommu_types.h
> @@ -417,6 +417,7 @@ extern struct kmem_cache *amd_iommu_irq_cache;
>  #define APERTURE_PAGE_INDEX(a)	(((a) >> 21) & 0x3fULL)
>  
>  
> +

Forgot to remove that from the diff?

>  /*
>   * This struct is used to pass information about
>   * incoming PPR faults around.
> @@ -435,6 +436,8 @@ struct iommu_domain;
>  struct irq_domain;
>  struct amd_irte_ops;
>  
> +#define AMD_IOMMU_FLAG_TRANS_PRE_ENABLED      (1 << 0)
> +
>  /*
>   * This structure contains generic data for  IOMMU protection domains
>   * independent of their use.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 02/13] iommu/amd: add several helper functions
  2017-07-21  8:59 ` [PATCH v8 02/13] iommu/amd: add several helper functions Baoquan He
@ 2017-07-27 15:06   ` Joerg Roedel
  2017-07-28  2:36     ` Baoquan He
  2017-07-31 10:01     ` Baoquan He
  0 siblings, 2 replies; 41+ messages in thread
From: Joerg Roedel @ 2017-07-27 15:06 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

On Fri, Jul 21, 2017 at 04:59:00PM +0800, Baoquan He wrote:
> Move single iommu enabling codes into a wrapper function early_enable_iommu().
> This can make later kdump change easier.
> 
> And also add iommu_disable_command_buffer and iommu_disable_event_buffer
> for later usage.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  drivers/iommu/amd_iommu_init.c | 42 +++++++++++++++++++++++++++++++-----------
>  1 file changed, 31 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index e39857ce6481..4ca6e3257d92 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -634,6 +634,14 @@ static void iommu_enable_command_buffer(struct amd_iommu *iommu)
>  	amd_iommu_reset_cmd_buffer(iommu);
>  }
>  
> +/*
> + * This function disables the command buffer
> + */
> +static void iommu_disable_command_buffer(struct amd_iommu *iommu)
> +{
> +	iommu_feature_disable(iommu, CONTROL_CMDBUF_EN);
> +}
> +
>  static void __init free_command_buffer(struct amd_iommu *iommu)
>  {
>  	free_pages((unsigned long)iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
> @@ -666,6 +674,14 @@ static void iommu_enable_event_buffer(struct amd_iommu *iommu)
>  	iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN);
>  }
>  
> +/*
> + * This function disables the command buffer

s/command buffer/event log/

> + */
> +static void iommu_disable_event_buffer(struct amd_iommu *iommu)

Please also use event_log here.

> +{
> +	iommu_feature_disable(iommu, CONTROL_EVT_LOG_EN);
> +}
> +

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 05/13] iommu/amd: Add function copy_dev_tables()
  2017-07-21  8:59 ` [PATCH v8 05/13] iommu/amd: Add function copy_dev_tables() Baoquan He
@ 2017-07-27 15:29   ` Joerg Roedel
  2017-07-28  3:59     ` Baoquan He
  0 siblings, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-27 15:29 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

On Fri, Jul 21, 2017 at 04:59:03PM +0800, Baoquan He wrote:
> Add function copy_dev_tables to copy the old DEV table entries of the panicked

Since there is only one (for now), you can name the function in
singular: copy_dev_table() or copy_device_table().

> kernel to the new allocated DEV table. Since all iommus share the same DTE table
> the copy only need be done one time. Besides, we also need to:
> 
>   - Check whether all IOMMUs actually use the same device table with the same size
> 
>   - Verify that the size of the old device table is the expected size.
> 
>   - Reserve the old domain id occupied in 1st kernel to avoid touching the old
>     io-page tables. Then on-flight DMA can continue looking it up.
> 
> And also define MACRO DEV_DOMID_MASK to replace magic number 0xffffULL, it can be
> reused in copy_dev_tables().
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  drivers/iommu/amd_iommu.c       |  2 +-
>  drivers/iommu/amd_iommu_init.c  | 55 +++++++++++++++++++++++++++++++++++++++++
>  drivers/iommu/amd_iommu_types.h |  1 +
>  3 files changed, 57 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index e5a03f259986..4d00f1bda900 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2086,7 +2086,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)
>  		flags    |= tmp;
>  	}
>  
> -	flags &= ~(0xffffUL);
> +	flags &= ~DEV_DOMID_MASK;
>  	flags |= domain->id;
>  
>  	amd_iommu_dev_table[devid].data[1]  = flags;
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index f6da5fe03b31..c58f091ce232 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -842,6 +842,61 @@ static int get_dev_entry_bit(u16 devid, u8 bit)
>  }
>  
>  
> +static int copy_dev_tables(void)
> +{
> +	struct dev_table_entry *old_devtb = NULL;
> +	u32 lo, hi, devid, old_devtb_size;
> +	phys_addr_t old_devtb_phys;
> +	u64 entry, last_entry = 0;
> +	struct amd_iommu *iommu;
> +	u16 dom_id, dte_v;
> +	static int copied;
> +
> +	for_each_iommu(iommu) {
> +		if (!translation_pre_enabled(iommu)) {
> +			pr_err("IOMMU:%d is not pre-enabled!/n",
> +				iommu->index);
> +			return -1;
> +		}
> +
> +		/* All IOMMUs should use the same device table with the same size */
> +		lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET);
> +		hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4);
> +		entry = (((u64) hi) << 32) + lo;
> +		if (last_entry && last_entry != entry) {
> +			pr_err("IOMMU:%d should use the same dev table as others!/n",
> +				iommu->index);
> +			return -1;
> +		}
> +		last_entry = entry;
> +
> +		old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12;
> +		if (old_devtb_size != dev_table_size) {
> +			pr_err("The device table size of IOMMU:%d is not expected!/n",
> +				iommu->index);
> +			return -1;
> +		}

I think the loop can end here. There is only one table to copy, so you
don't need to copy it for every iommu. Just do the checks in the loop
and the copy once after the loop.

> +
> +		old_devtb_phys = entry & PAGE_MASK;
> +		old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
> +		if (!old_devtb)
> +			return -1;
> +
> +		if (copied)
> +			continue;
> +		for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
> +			amd_iommu_dev_table[devid] = old_devtb[devid];
> +			dom_id = old_devtb[devid].data[1] & DEV_DOMID_MASK;
> +			dte_v = old_devtb[devid].data[0] & DTE_FLAG_V;
> +			if (dte_v && dom_id)
> +				__set_bit(dom_id, amd_iommu_pd_alloc_bitmap);
> +		}

Please only copy the DTE when DTE.V is set in the old table. Also
don't copy any of the IOMMUv2 enablement bits from the old DTE. PRI
faults are recoverable by design and a device shouldn't fail on a
negative answer from the IOMMU.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 06/13] iommu/amd: copy old trans table from old kernel
  2017-07-21  8:59 ` [PATCH v8 06/13] iommu/amd: copy old trans table from old kernel Baoquan He
@ 2017-07-27 15:38   ` Joerg Roedel
  2017-07-28  6:52     ` Baoquan He
  0 siblings, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-27 15:38 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

On Fri, Jul 21, 2017 at 04:59:04PM +0800, Baoquan He wrote:
> @@ -2128,9 +2131,43 @@ static void early_enable_iommu(struct amd_iommu *iommu)
>  static void early_enable_iommus(void)
>  {
>  	struct amd_iommu *iommu;
> +	bool is_pre_enabled = false;
>  
> -	for_each_iommu(iommu)
> -		early_enable_iommu(iommu);
> +	for_each_iommu(iommu) {
> +		if (translation_pre_enabled(iommu)) {
> +			is_pre_enabled = true;
> +			break;
> +		}
> +	}

is_pre_enabled should only be true when _all_ iommus are pre-enabled. If
only one is found disabled just disable the others and continue without
copying the device table.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 09/13] iommu/amd: Use is_attach_deferred call-back
  2017-07-21  8:59 ` [PATCH v8 09/13] iommu/amd: Use is_attach_deferred call-back Baoquan He
@ 2017-07-27 15:53   ` Joerg Roedel
  2017-07-28  4:02     ` Baoquan He
  0 siblings, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-27 15:53 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

On Fri, Jul 21, 2017 at 04:59:07PM +0800, Baoquan He wrote:
> +static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
> +					 struct device *dev)
> +{
> +	struct iommu_dev_data *dev_data = dev->archdata.iommu;
> +	return dev_data->defer_attach;
> +}

If the redundant newline from patch 1 needs a new home, put it here :-)

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-21  8:59 ` [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled Baoquan He
@ 2017-07-27 15:55   ` Joerg Roedel
  2017-07-28  9:06     ` Baoquan He
  0 siblings, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-27 15:55 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

On Fri, Jul 21, 2017 at 04:59:08PM +0800, Baoquan He wrote:
> AMD pointed out it's unsafe to update the device-table while iommu
> is enabled. It turns out that device-table pointer update is split
> up into two 32bit writes in the IOMMU hardware. So updating it while
> the IOMMU is enabled could have some nasty side effects.
> 
> The only way to work around this is to allocate the device-table below
> 4GB if translation is pre-enabled in kdump kernel. If allocation failed,
> still use the old one.

Not only for the kdump kernel. The old device table must also be below
4GB so that its pointer can be updated with a 32bit write.

If the old table is above 4GB you still need the second write to zero
the upper parts of the pointer in hardware.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer
  2017-07-21  8:59 ` [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer Baoquan He
  2017-07-23 22:33   ` kbuild test robot
@ 2017-07-27 15:57   ` Joerg Roedel
  2017-07-28  9:07     ` Baoquan He
  1 sibling, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-27 15:57 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

On Fri, Jul 21, 2017 at 04:59:09PM +0800, Baoquan He wrote:
> When iommu is pre_enabled in kdump kernel, if a device is set up with
> guest translations (DTE.GV=1), then don't copy GCR3 table root pointer
> but move the device over to an empty guest-cr3 table and handle the
> faults in the PPR log (which answer them with INVALID). After all these
> PPR faults are recoverable for the device and we should not allow the
> device to change old-kernels data when we don't have to.

Okay, forget my previous comment about disabling IOMMUv2 features while
copying. Its done in this patch.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 13/13] iommu/amd: Disable iommu only if amd_iommu=off is specified
  2017-07-21  8:59 ` [PATCH v8 13/13] iommu/amd: Disable iommu only if amd_iommu=off is specified Baoquan He
@ 2017-07-27 15:58   ` Joerg Roedel
  2017-07-28  9:08     ` Baoquan He
  0 siblings, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-27 15:58 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel, root

On Fri, Jul 21, 2017 at 04:59:11PM +0800, Baoquan He wrote:
> From: root <root@amd-dinar-02.lab.bos.redhat.com>

You probaly need to reset the author on this one.

> 
> It's ok to disable iommu in normal kernel. While there's no need
> to disable it in kdump kernel after the on-flight dma issue has
> heen fixed.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  drivers/iommu/amd_iommu_init.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 8b4bac978062..880f693c809b 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -2494,7 +2494,8 @@ static int __init early_amd_iommu_init(void)
>  		goto out;
>  
>  	/* Disable any previously enabled IOMMUs */
> -	disable_iommus();
> +	if (amd_iommu_disabled)
> +		disable_iommus();
>  
>  	if (amd_iommu_irq_remap)
>  		amd_iommu_irq_remap = check_ioapic_information();
> -- 
> 2.5.5

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 01/13] iommu/amd: Detect pre enabled translation
  2017-07-27 15:04   ` Joerg Roedel
@ 2017-07-28  2:31     ` Baoquan He
  0 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-28  2:31 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On 07/27/17 at 05:04pm, Joerg Roedel wrote:
> On Fri, Jul 21, 2017 at 04:58:59PM +0800, Baoquan He wrote:
> > diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
> > index 294a409e283b..d15966b62b33 100644
> > --- a/drivers/iommu/amd_iommu_types.h
> > +++ b/drivers/iommu/amd_iommu_types.h
> > @@ -417,6 +417,7 @@ extern struct kmem_cache *amd_iommu_irq_cache;
> >  #define APERTURE_PAGE_INDEX(a)	(((a) >> 21) & 0x3fULL)
> >  
> >  
> > +
> 
> Forgot to remove that from the diff?

Many thanks for your reviewing and great suggestions, Joerg!

Will withdraw this change in this patch.

> 
> >  /*
> >   * This struct is used to pass information about
> >   * incoming PPR faults around.
> > @@ -435,6 +436,8 @@ struct iommu_domain;
> >  struct irq_domain;
> >  struct amd_irte_ops;
> >  
> > +#define AMD_IOMMU_FLAG_TRANS_PRE_ENABLED      (1 << 0)
> > +
> >  /*
> >   * This structure contains generic data for  IOMMU protection domains
> >   * independent of their use.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 02/13] iommu/amd: add several helper functions
  2017-07-27 15:06   ` Joerg Roedel
@ 2017-07-28  2:36     ` Baoquan He
  2017-07-31 10:01     ` Baoquan He
  1 sibling, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-28  2:36 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On 07/27/17 at 05:06pm, Joerg Roedel wrote:
> On Fri, Jul 21, 2017 at 04:59:00PM +0800, Baoquan He wrote:
> > Move single iommu enabling codes into a wrapper function early_enable_iommu().
> > This can make later kdump change easier.
> > 
> > And also add iommu_disable_command_buffer and iommu_disable_event_buffer
> > for later usage.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  drivers/iommu/amd_iommu_init.c | 42 +++++++++++++++++++++++++++++++-----------
> >  1 file changed, 31 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> > index e39857ce6481..4ca6e3257d92 100644
> > --- a/drivers/iommu/amd_iommu_init.c
> > +++ b/drivers/iommu/amd_iommu_init.c
> > @@ -634,6 +634,14 @@ static void iommu_enable_command_buffer(struct amd_iommu *iommu)
> >  	amd_iommu_reset_cmd_buffer(iommu);
> >  }
> >  
> > +/*
> > + * This function disables the command buffer
> > + */
> > +static void iommu_disable_command_buffer(struct amd_iommu *iommu)
> > +{
> > +	iommu_feature_disable(iommu, CONTROL_CMDBUF_EN);
> > +}
> > +
> >  static void __init free_command_buffer(struct amd_iommu *iommu)
> >  {
> >  	free_pages((unsigned long)iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
> > @@ -666,6 +674,14 @@ static void iommu_enable_event_buffer(struct amd_iommu *iommu)
> >  	iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN);
> >  }
> >  
> > +/*
> > + * This function disables the command buffer
> 
> s/command buffer/event log/

Forgot changing it after copying from command buffer code.

> 
> > + */
> > +static void iommu_disable_event_buffer(struct amd_iommu *iommu)
> 
> Please also use event_log here.

Sure, will change. Thanks.

> 
> > +{
> > +	iommu_feature_disable(iommu, CONTROL_EVT_LOG_EN);
> > +}
> > +

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 05/13] iommu/amd: Add function copy_dev_tables()
  2017-07-27 15:29   ` Joerg Roedel
@ 2017-07-28  3:59     ` Baoquan He
  0 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-28  3:59 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On 07/27/17 at 05:29pm, Joerg Roedel wrote:
> On Fri, Jul 21, 2017 at 04:59:03PM +0800, Baoquan He wrote:
> > Add function copy_dev_tables to copy the old DEV table entries of the panicked
> 
> Since there is only one (for now), you can name the function in
> singular: copy_dev_table() or copy_device_table().

Make sense, will change it to copy_device_table(). Thanks.

> 
> > kernel to the new allocated DEV table. Since all iommus share the same DTE table
> > the copy only need be done one time. Besides, we also need to:
> > 
> >   - Check whether all IOMMUs actually use the same device table with the same size
> > 
> >   - Verify that the size of the old device table is the expected size.
> > 
> >   - Reserve the old domain id occupied in 1st kernel to avoid touching the old
> >     io-page tables. Then on-flight DMA can continue looking it up.
> > 
> > And also define MACRO DEV_DOMID_MASK to replace magic number 0xffffULL, it can be
> > reused in copy_dev_tables().
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  drivers/iommu/amd_iommu.c       |  2 +-
> >  drivers/iommu/amd_iommu_init.c  | 55 +++++++++++++++++++++++++++++++++++++++++
> >  drivers/iommu/amd_iommu_types.h |  1 +
> >  3 files changed, 57 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> > index e5a03f259986..4d00f1bda900 100644
> > --- a/drivers/iommu/amd_iommu.c
> > +++ b/drivers/iommu/amd_iommu.c
> > @@ -2086,7 +2086,7 @@ static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)
> >  		flags    |= tmp;
> >  	}
> >  
> > -	flags &= ~(0xffffUL);
> > +	flags &= ~DEV_DOMID_MASK;
> >  	flags |= domain->id;
> >  
> >  	amd_iommu_dev_table[devid].data[1]  = flags;
> > diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> > index f6da5fe03b31..c58f091ce232 100644
> > --- a/drivers/iommu/amd_iommu_init.c
> > +++ b/drivers/iommu/amd_iommu_init.c
> > @@ -842,6 +842,61 @@ static int get_dev_entry_bit(u16 devid, u8 bit)
> >  }
> >  
> >  
> > +static int copy_dev_tables(void)
> > +{
> > +	struct dev_table_entry *old_devtb = NULL;
> > +	u32 lo, hi, devid, old_devtb_size;
> > +	phys_addr_t old_devtb_phys;
> > +	u64 entry, last_entry = 0;
> > +	struct amd_iommu *iommu;
> > +	u16 dom_id, dte_v;
> > +	static int copied;
> > +
> > +	for_each_iommu(iommu) {
> > +		if (!translation_pre_enabled(iommu)) {
> > +			pr_err("IOMMU:%d is not pre-enabled!/n",
> > +				iommu->index);
> > +			return -1;
> > +		}
> > +
> > +		/* All IOMMUs should use the same device table with the same size */
> > +		lo = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET);
> > +		hi = readl(iommu->mmio_base + MMIO_DEV_TABLE_OFFSET + 4);
> > +		entry = (((u64) hi) << 32) + lo;
> > +		if (last_entry && last_entry != entry) {
> > +			pr_err("IOMMU:%d should use the same dev table as others!/n",
> > +				iommu->index);
> > +			return -1;
> > +		}
> > +		last_entry = entry;
> > +
> > +		old_devtb_size = ((entry & ~PAGE_MASK) + 1) << 12;
> > +		if (old_devtb_size != dev_table_size) {
> > +			pr_err("The device table size of IOMMU:%d is not expected!/n",
> > +				iommu->index);
> > +			return -1;
> > +		}
> 
> I think the loop can end here. There is only one table to copy, so you
> don't need to copy it for every iommu. Just do the checks in the loop
> and the copy once after the loop.

Ok, will change. I worried that device table addr might not be stored
into iommu register correctly. I am fine we only do the check only at
the first time.

> 
> > +
> > +		old_devtb_phys = entry & PAGE_MASK;
> > +		old_devtb = memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
> > +		if (!old_devtb)
> > +			return -1;
> > +
> > +		if (copied)
> > +			continue;
> > +		for (devid = 0; devid <= amd_iommu_last_bdf; ++devid) {
> > +			amd_iommu_dev_table[devid] = old_devtb[devid];
> > +			dom_id = old_devtb[devid].data[1] & DEV_DOMID_MASK;
> > +			dte_v = old_devtb[devid].data[0] & DTE_FLAG_V;
> > +			if (dte_v && dom_id)
> > +				__set_bit(dom_id, amd_iommu_pd_alloc_bitmap);
> > +		}
> 
> Please only copy the DTE when DTE.V is set in the old table. Also

Sorry, the sanity check about DTE.V has been done in patch 7/13. I
should rewrite the subject and git log for patch 7/13.

> don't copy any of the IOMMUv2 enablement bits from the old DTE. PRI
> faults are recoverable by design and a device shouldn't fail on a
> negative answer from the IOMMU.

Yes, this is done in patch 11/13. Saw your comment in patch 11/13.

> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 09/13] iommu/amd: Use is_attach_deferred call-back
  2017-07-27 15:53   ` Joerg Roedel
@ 2017-07-28  4:02     ` Baoquan He
  0 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-28  4:02 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On 07/27/17 at 05:53pm, Joerg Roedel wrote:
> On Fri, Jul 21, 2017 at 04:59:07PM +0800, Baoquan He wrote:
> > +static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
> > +					 struct device *dev)
> > +{
> > +	struct iommu_dev_data *dev_data = dev->archdata.iommu;
> > +	return dev_data->defer_attach;
> > +}
> 
> If the redundant newline from patch 1 needs a new home, put it here :-)

Got it, will add a newline at this place. Thanks.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 06/13] iommu/amd: copy old trans table from old kernel
  2017-07-27 15:38   ` Joerg Roedel
@ 2017-07-28  6:52     ` Baoquan He
  0 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-28  6:52 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On 07/27/17 at 05:38pm, Joerg Roedel wrote:
> On Fri, Jul 21, 2017 at 04:59:04PM +0800, Baoquan He wrote:
> > @@ -2128,9 +2131,43 @@ static void early_enable_iommu(struct amd_iommu *iommu)
> >  static void early_enable_iommus(void)
> >  {
> >  	struct amd_iommu *iommu;
> > +	bool is_pre_enabled = false;
> >  
> > -	for_each_iommu(iommu)
> > -		early_enable_iommu(iommu);
> > +	for_each_iommu(iommu) {
> > +		if (translation_pre_enabled(iommu)) {
> > +			is_pre_enabled = true;
> > +			break;
> > +		}
> > +	}
> 
> is_pre_enabled should only be true when _all_ iommus are pre-enabled. If
> only one is found disabled just disable the others and continue without
> copying the device table.

OK, will change as you suggested.

> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-27 15:55   ` Joerg Roedel
@ 2017-07-28  9:06     ` Baoquan He
  2017-07-28 11:14       ` Joerg Roedel
  2017-07-31 10:15       ` Baoquan He
  0 siblings, 2 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-28  9:06 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

Hi Joerg,

On 07/27/17 at 05:55pm, Joerg Roedel wrote:
> On Fri, Jul 21, 2017 at 04:59:08PM +0800, Baoquan He wrote:
> > AMD pointed out it's unsafe to update the device-table while iommu
> > is enabled. It turns out that device-table pointer update is split
> > up into two 32bit writes in the IOMMU hardware. So updating it while
> > the IOMMU is enabled could have some nasty side effects.
> > 
> > The only way to work around this is to allocate the device-table below
> > 4GB if translation is pre-enabled in kdump kernel. If allocation failed,
> > still use the old one.
> 
> Not only for the kdump kernel. The old device table must also be below
> 4GB so that its pointer can be updated with a 32bit write.
> 
> If the old table is above 4GB you still need the second write to zero
> the upper parts of the pointer in hardware.

Do you mean the allocation of amd_iommu_dev_table in
early_amd_iommu_init() also need be addressed for 1st kernel? Seems we
don't make sure that for 1st kernel, like adding GFP_DMA32 flag when
allocate amd_iommu_dev_table in amd_iommu_dev_table
early_amd_iommu_init().

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer
  2017-07-27 15:57   ` Joerg Roedel
@ 2017-07-28  9:07     ` Baoquan He
  0 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-28  9:07 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On 07/27/17 at 05:57pm, Joerg Roedel wrote:
> On Fri, Jul 21, 2017 at 04:59:09PM +0800, Baoquan He wrote:
> > When iommu is pre_enabled in kdump kernel, if a device is set up with
> > guest translations (DTE.GV=1), then don't copy GCR3 table root pointer
> > but move the device over to an empty guest-cr3 table and handle the
> > faults in the PPR log (which answer them with INVALID). After all these
> > PPR faults are recoverable for the device and we should not allow the
> > device to change old-kernels data when we don't have to.
> 
> Okay, forget my previous comment about disabling IOMMUv2 features while
> copying. Its done in this patch.

Yeah, as you said. Thanks.

> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 13/13] iommu/amd: Disable iommu only if amd_iommu=off is specified
  2017-07-27 15:58   ` Joerg Roedel
@ 2017-07-28  9:08     ` Baoquan He
  0 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-28  9:08 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel, root

On 07/27/17 at 05:58pm, Joerg Roedel wrote:
> On Fri, Jul 21, 2017 at 04:59:11PM +0800, Baoquan He wrote:
> > From: root <root@amd-dinar-02.lab.bos.redhat.com>
> 
> You probaly need to reset the author on this one.

Oops, sorry. I made this patch on a testing machine. Will correct it.

Thanks a lot for all these comments and great suggestions.

> 
> > 
> > It's ok to disable iommu in normal kernel. While there's no need
> > to disable it in kdump kernel after the on-flight dma issue has
> > heen fixed.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  drivers/iommu/amd_iommu_init.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> > index 8b4bac978062..880f693c809b 100644
> > --- a/drivers/iommu/amd_iommu_init.c
> > +++ b/drivers/iommu/amd_iommu_init.c
> > @@ -2494,7 +2494,8 @@ static int __init early_amd_iommu_init(void)
> >  		goto out;
> >  
> >  	/* Disable any previously enabled IOMMUs */
> > -	disable_iommus();
> > +	if (amd_iommu_disabled)
> > +		disable_iommus();
> >  
> >  	if (amd_iommu_irq_remap)
> >  		amd_iommu_irq_remap = check_ioapic_information();
> > -- 
> > 2.5.5

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-28  9:06     ` Baoquan He
@ 2017-07-28 11:14       ` Joerg Roedel
  2017-07-28 11:15         ` Baoquan He
  2017-07-31 10:15       ` Baoquan He
  1 sibling, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-28 11:14 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

On Fri, Jul 28, 2017 at 05:06:19PM +0800, Baoquan He wrote:
> Do you mean the allocation of amd_iommu_dev_table in
> early_amd_iommu_init() also need be addressed for 1st kernel? Seems we
> don't make sure that for 1st kernel, like adding GFP_DMA32 flag when
> allocate amd_iommu_dev_table in amd_iommu_dev_table
> early_amd_iommu_init().

Yes, exactly, the first device table also needs to be allocated with
GFP_DMA32 so that it ends up below 4GB.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-28 11:14       ` Joerg Roedel
@ 2017-07-28 11:15         ` Baoquan He
  2017-07-28 11:18           ` Joerg Roedel
  0 siblings, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-28 11:15 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On 07/28/17 at 01:14pm, Joerg Roedel wrote:
> On Fri, Jul 28, 2017 at 05:06:19PM +0800, Baoquan He wrote:
> > Do you mean the allocation of amd_iommu_dev_table in
> > early_amd_iommu_init() also need be addressed for 1st kernel? Seems we
> > don't make sure that for 1st kernel, like adding GFP_DMA32 flag when
> > allocate amd_iommu_dev_table in amd_iommu_dev_table
> > early_amd_iommu_init().
> 
> Yes, exactly, the first device table also needs to be allocated with
> GFP_DMA32 so that it ends up below 4GB.

Got it, will do. Thanks!

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-28 11:15         ` Baoquan He
@ 2017-07-28 11:18           ` Joerg Roedel
  2017-07-28 11:26             ` Baoquan He
  0 siblings, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-28 11:18 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

On Fri, Jul 28, 2017 at 07:15:53PM +0800, Baoquan He wrote:
> On 07/28/17 at 01:14pm, Joerg Roedel wrote:
> > Yes, exactly, the first device table also needs to be allocated with
> > GFP_DMA32 so that it ends up below 4GB.
> 
> Got it, will do. Thanks!

Oh, and you also need to check in the kdump kernel whether the old
device-table is really below 4GB and abort the copy otherwise.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-28 11:18           ` Joerg Roedel
@ 2017-07-28 11:26             ` Baoquan He
  0 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-28 11:26 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On 07/28/17 at 01:18pm, Joerg Roedel wrote:
> On Fri, Jul 28, 2017 at 07:15:53PM +0800, Baoquan He wrote:
> > On 07/28/17 at 01:14pm, Joerg Roedel wrote:
> > > Yes, exactly, the first device table also needs to be allocated with
> > > GFP_DMA32 so that it ends up below 4GB.
> > 
> > Got it, will do. Thanks!
> 
> Oh, and you also need to check in the kdump kernel whether the old
> device-table is really below 4GB and abort the copy otherwise.

Ah, indeed. Will change it. Thanks.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 02/13] iommu/amd: add several helper functions
  2017-07-27 15:06   ` Joerg Roedel
  2017-07-28  2:36     ` Baoquan He
@ 2017-07-31 10:01     ` Baoquan He
  2017-07-31 10:07       ` Joerg Roedel
  1 sibling, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-31 10:01 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

Hi Joerg,

On 07/27/17 at 05:06pm, Joerg Roedel wrote:
> On Fri, Jul 21, 2017 at 04:59:00PM +0800, Baoquan He wrote:
> > Move single iommu enabling codes into a wrapper function early_enable_iommu().
> > This can make later kdump change easier.
> > 
> > And also add iommu_disable_command_buffer and iommu_disable_event_buffer
> > for later usage.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  drivers/iommu/amd_iommu_init.c | 42 +++++++++++++++++++++++++++++++-----------
> >  1 file changed, 31 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> > index e39857ce6481..4ca6e3257d92 100644
> > --- a/drivers/iommu/amd_iommu_init.c
> > +++ b/drivers/iommu/amd_iommu_init.c
> > @@ -634,6 +634,14 @@ static void iommu_enable_command_buffer(struct amd_iommu *iommu)
> >  	amd_iommu_reset_cmd_buffer(iommu);
> >  }
> >  
> > +/*
> > + * This function disables the command buffer
> > + */
> > +static void iommu_disable_command_buffer(struct amd_iommu *iommu)
> > +{
> > +	iommu_feature_disable(iommu, CONTROL_CMDBUF_EN);
> > +}
> > +
> >  static void __init free_command_buffer(struct amd_iommu *iommu)
> >  {
> >  	free_pages((unsigned long)iommu->cmd_buf, get_order(CMD_BUFFER_SIZE));
> > @@ -666,6 +674,14 @@ static void iommu_enable_event_buffer(struct amd_iommu *iommu)
> >  	iommu_feature_enable(iommu, CONTROL_EVT_LOG_EN);
> >  }
> >  
> > +/*
> > + * This function disables the command buffer
> 
> s/command buffer/event log/
> 
> > + */
> > +static void iommu_disable_event_buffer(struct amd_iommu *iommu)
> 
> Please also use event_log here.

I found the event log related handling functions all take
xxx_event_buffer names, like:
alloc_event_buffer()
iommu_enable_event_buffer()
free_event_buffer()

So for consistency, I plan to still use iommu_disable_event_buffer().
Maybe later we can change them in a clean up patch independently.

Thanks
Baoquan

> 
> > +{
> > +	iommu_feature_disable(iommu, CONTROL_EVT_LOG_EN);
> > +}
> > +

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 02/13] iommu/amd: add several helper functions
  2017-07-31 10:01     ` Baoquan He
@ 2017-07-31 10:07       ` Joerg Roedel
  0 siblings, 0 replies; 41+ messages in thread
From: Joerg Roedel @ 2017-07-31 10:07 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

Hi Baoquan,

On Mon, Jul 31, 2017 at 06:01:11PM +0800, Baoquan He wrote:
> I found the event log related handling functions all take
> xxx_event_buffer names, like:
> alloc_event_buffer()
> iommu_enable_event_buffer()
> free_event_buffer()
> 
> So for consistency, I plan to still use iommu_disable_event_buffer().
> Maybe later we can change them in a clean up patch independently.

Yeah, makes sense. Go ahead with it.



	Joerg

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-28  9:06     ` Baoquan He
  2017-07-28 11:14       ` Joerg Roedel
@ 2017-07-31 10:15       ` Baoquan He
  2017-07-31 10:21         ` Joerg Roedel
  1 sibling, 1 reply; 41+ messages in thread
From: Baoquan He @ 2017-07-31 10:15 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

Hi Joerg,

On 07/28/17 at 05:06pm, Baoquan He wrote:
> Hi Joerg,
> 
> On 07/27/17 at 05:55pm, Joerg Roedel wrote:
> > On Fri, Jul 21, 2017 at 04:59:08PM +0800, Baoquan He wrote:
> > > AMD pointed out it's unsafe to update the device-table while iommu
> > > is enabled. It turns out that device-table pointer update is split
> > > up into two 32bit writes in the IOMMU hardware. So updating it while
> > > the IOMMU is enabled could have some nasty side effects.
> > > 
> > > The only way to work around this is to allocate the device-table below
> > > 4GB if translation is pre-enabled in kdump kernel. If allocation failed,
> > > still use the old one.
> > 
> > Not only for the kdump kernel. The old device table must also be below
> > 4GB so that its pointer can be updated with a 32bit write.
> > 
> > If the old table is above 4GB you still need the second write to zero
> > the upper parts of the pointer in hardware.
> 
> Do you mean the allocation of amd_iommu_dev_table in
> early_amd_iommu_init() also need be addressed for 1st kernel? Seems we
> don't make sure that for 1st kernel, like adding GFP_DMA32 flag when
> allocate amd_iommu_dev_table in amd_iommu_dev_table
> early_amd_iommu_init().

I plan to add GFP_DMA32 when allocate amd_iommu_dev_table in
early_amd_iommu_init() as below. Then in kdump kernel we don't need to
worry if the old amd_iommu_dev_table could be above 4G, right? And might
not need to check if it's above 4G, right?

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 781a138..85d6445 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2436,7 +2436,8 @@ static int __init early_amd_iommu_init(void)
 
 	/* Device table - directly used by all IOMMUs */
 	ret = -ENOMEM;
-	amd_iommu_dev_table = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+	amd_iommu_dev_table = (void *)__get_free_pages(
+				      GFP_KERNEL | __GFP_ZERO | GFP_DMA32,
 				      get_order(dev_table_size));
 	if (amd_iommu_dev_table == NULL)
 		goto out;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-31 10:15       ` Baoquan He
@ 2017-07-31 10:21         ` Joerg Roedel
  2017-07-31 10:29           ` Baoquan He
  0 siblings, 1 reply; 41+ messages in thread
From: Joerg Roedel @ 2017-07-31 10:21 UTC (permalink / raw)
  To: Baoquan He; +Cc: iommu, linux-kernel

Hi Baoquan,

On Mon, Jul 31, 2017 at 06:15:30PM +0800, Baoquan He wrote:
> I plan to add GFP_DMA32 when allocate amd_iommu_dev_table in
> early_amd_iommu_init() as below. Then in kdump kernel we don't need to
> worry if the old amd_iommu_dev_table could be above 4G, right? And might
> not need to check if it's above 4G, right?
> 
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 781a138..85d6445 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -2436,7 +2436,8 @@ static int __init early_amd_iommu_init(void)
>  
>  	/* Device table - directly used by all IOMMUs */
>  	ret = -ENOMEM;
> -	amd_iommu_dev_table = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
> +	amd_iommu_dev_table = (void *)__get_free_pages(
> +				      GFP_KERNEL | __GFP_ZERO | GFP_DMA32,
>  				      get_order(dev_table_size));
>  	if (amd_iommu_dev_table == NULL)
>  		goto out;

Yeah, adding GFP_DMA32 is right. But you still need to check it in the
kdump path. Not checking it would mean you trust the old kernel, but
since it paniced there is no reason to put any trust in what happened
before.



	Joerg

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled
  2017-07-31 10:21         ` Joerg Roedel
@ 2017-07-31 10:29           ` Baoquan He
  0 siblings, 0 replies; 41+ messages in thread
From: Baoquan He @ 2017-07-31 10:29 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, linux-kernel

On 07/31/17 at 12:21pm, Joerg Roedel wrote:
> Hi Baoquan,
> 
> On Mon, Jul 31, 2017 at 06:15:30PM +0800, Baoquan He wrote:
> > I plan to add GFP_DMA32 when allocate amd_iommu_dev_table in
> > early_amd_iommu_init() as below. Then in kdump kernel we don't need to
> > worry if the old amd_iommu_dev_table could be above 4G, right? And might
> > not need to check if it's above 4G, right?
> > 
> > diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> > index 781a138..85d6445 100644
> > --- a/drivers/iommu/amd_iommu_init.c
> > +++ b/drivers/iommu/amd_iommu_init.c
> > @@ -2436,7 +2436,8 @@ static int __init early_amd_iommu_init(void)
> >  
> >  	/* Device table - directly used by all IOMMUs */
> >  	ret = -ENOMEM;
> > -	amd_iommu_dev_table = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
> > +	amd_iommu_dev_table = (void *)__get_free_pages(
> > +				      GFP_KERNEL | __GFP_ZERO | GFP_DMA32,
> >  				      get_order(dev_table_size));
> >  	if (amd_iommu_dev_table == NULL)
> >  		goto out;
> 
> Yeah, adding GFP_DMA32 is right. But you still need to check it in the
> kdump path. Not checking it would mean you trust the old kernel, but
> since it paniced there is no reason to put any trust in what happened
> before.

You are right, it could be touched accidentally. It must be checked.
Thanks a lot for your answer!

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2017-07-31 10:29 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-21  8:58 [PATCH v8 00/13] Fix the on-flight DMA issue on system with amd iommu Baoquan He
2017-07-21  8:58 ` [PATCH v8 01/13] iommu/amd: Detect pre enabled translation Baoquan He
2017-07-27 15:04   ` Joerg Roedel
2017-07-28  2:31     ` Baoquan He
2017-07-21  8:59 ` [PATCH v8 02/13] iommu/amd: add several helper functions Baoquan He
2017-07-27 15:06   ` Joerg Roedel
2017-07-28  2:36     ` Baoquan He
2017-07-31 10:01     ` Baoquan He
2017-07-31 10:07       ` Joerg Roedel
2017-07-21  8:59 ` [PATCH v8 03/13] Revert "iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel" Baoquan He
2017-07-21  8:59 ` [PATCH v8 04/13] iommu/amd: Define bit fields for DTE particularly Baoquan He
2017-07-21  8:59 ` [PATCH v8 05/13] iommu/amd: Add function copy_dev_tables() Baoquan He
2017-07-27 15:29   ` Joerg Roedel
2017-07-28  3:59     ` Baoquan He
2017-07-21  8:59 ` [PATCH v8 06/13] iommu/amd: copy old trans table from old kernel Baoquan He
2017-07-27 15:38   ` Joerg Roedel
2017-07-28  6:52     ` Baoquan He
2017-07-21  8:59 ` [PATCH v8 07/13] iommu/amd: Do sanity check for irq remap of old dev table entry Baoquan He
2017-07-21  8:59 ` [PATCH v8 08/13] iommu: Add is_attach_deferred call-back to iommu-ops Baoquan He
2017-07-21  8:59 ` [PATCH v8 09/13] iommu/amd: Use is_attach_deferred call-back Baoquan He
2017-07-27 15:53   ` Joerg Roedel
2017-07-28  4:02     ` Baoquan He
2017-07-21  8:59 ` [PATCH v8 10/13] iommu/amd: Allocate memory below 4G for dev table if translation pre-enabled Baoquan He
2017-07-27 15:55   ` Joerg Roedel
2017-07-28  9:06     ` Baoquan He
2017-07-28 11:14       ` Joerg Roedel
2017-07-28 11:15         ` Baoquan He
2017-07-28 11:18           ` Joerg Roedel
2017-07-28 11:26             ` Baoquan He
2017-07-31 10:15       ` Baoquan He
2017-07-31 10:21         ` Joerg Roedel
2017-07-31 10:29           ` Baoquan He
2017-07-21  8:59 ` [PATCH v8 11/13] iommu/amd: Don't copy GCR3 table root pointer Baoquan He
2017-07-23 22:33   ` kbuild test robot
2017-07-23 23:53     ` Baoquan He
2017-07-27 15:57   ` Joerg Roedel
2017-07-28  9:07     ` Baoquan He
2017-07-21  8:59 ` [PATCH v8 12/13] iommu/amd: Clear out the GV flag when handle deferred domain attach Baoquan He
2017-07-21  8:59 ` [PATCH v8 13/13] iommu/amd: Disable iommu only if amd_iommu=off is specified Baoquan He
2017-07-27 15:58   ` Joerg Roedel
2017-07-28  9:08     ` Baoquan He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).