linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] iommu/amd: Always allow to map huge pages
@ 2018-11-09 11:07 Joerg Roedel
  2018-11-09 11:07 ` [PATCH 1/7] iommu/amd: Collect page-table pages in freelist Joerg Roedel
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Joerg Roedel @ 2018-11-09 11:07 UTC (permalink / raw)
  To: Alex Williamson, Joerg Roedel; +Cc: iommu, kvm, linux-kernel, jroedel

Hi,

the AMD IOMMU driver had an issue for a long time where it
didn't allow to map a huge-page when smaller mappings
existed at that address range before. The VFIO driver even
had a workaround for that behavior.

These patches fix the issue and remove the workaround from
the VFIO driver.

Please review.

Thanks,

	Joerg

Joerg Roedel (7):
  iommu/amd: Collect page-table pages in freelist
  iommu/amd: Introduce free_sub_pt() function
  iommu/amd: Ignore page-mode 7 in free_sub_pt()
  iommu/amd: Allow downgrading page-sizes in alloc_pte()
  iommu/amd: Restart loop if cmpxchg64 succeeded in alloc_pte()
  iommu/amd: Allow to upgrade page-size
  vfio/type1: Remove map_try_harder() code path

 drivers/iommu/amd_iommu.c       | 204 +++++++++++++++++++++-----------
 drivers/iommu/amd_iommu_types.h |   1 +
 drivers/vfio/vfio_iommu_type1.c |  33 +-----
 3 files changed, 137 insertions(+), 101 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/7] iommu/amd: Collect page-table pages in freelist
  2018-11-09 11:07 [PATCH 0/7] iommu/amd: Always allow to map huge pages Joerg Roedel
@ 2018-11-09 11:07 ` Joerg Roedel
  2018-11-09 11:07 ` [PATCH 2/7] iommu/amd: Introduce free_sub_pt() function Joerg Roedel
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2018-11-09 11:07 UTC (permalink / raw)
  To: Alex Williamson, Joerg Roedel; +Cc: iommu, kvm, linux-kernel, jroedel

From: Joerg Roedel <jroedel@suse.de>

Collect all pages that belong to a page-table in a list and
free them after the tree has been traversed. This allows to
implement safer page-table updates in subsequent patches.
Also move the functions for page-table freeing a bit upwards
in the file so that they are usable from the iommu_map() path.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 144 ++++++++++++++++++++++----------------
 1 file changed, 83 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 1167ff0416cf..2655bd91af93 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1317,6 +1317,89 @@ static void domain_flush_devices(struct protection_domain *domain)
  *
  ****************************************************************************/
 
+static void free_page_list(struct page *freelist)
+{
+	while (freelist != NULL) {
+		unsigned long p = (unsigned long)page_address(freelist);
+		freelist = freelist->freelist;
+		free_page(p);
+	}
+}
+
+static struct page *free_pt_page(unsigned long pt, struct page *freelist)
+{
+	struct page *p = virt_to_page((void *)pt);
+
+	p->freelist = freelist;
+
+	return p;
+}
+
+#define DEFINE_FREE_PT_FN(LVL, FN)						\
+static struct page *free_pt_##LVL (unsigned long __pt, struct page *freelist)	\
+{										\
+	unsigned long p;							\
+	u64 *pt;								\
+	int i;									\
+										\
+	pt = (u64 *)__pt;							\
+										\
+	for (i = 0; i < 512; ++i) {						\
+		/* PTE present? */						\
+		if (!IOMMU_PTE_PRESENT(pt[i]))					\
+			continue;						\
+										\
+		/* Large PTE? */						\
+		if (PM_PTE_LEVEL(pt[i]) == 0 ||					\
+		    PM_PTE_LEVEL(pt[i]) == 7)					\
+			continue;						\
+										\
+		p = (unsigned long)IOMMU_PTE_PAGE(pt[i]);			\
+		freelist = FN(p, freelist);					\
+	}									\
+										\
+	return free_pt_page((unsigned long)pt, freelist);			\
+}
+
+DEFINE_FREE_PT_FN(l2, free_pt_page)
+DEFINE_FREE_PT_FN(l3, free_pt_l2)
+DEFINE_FREE_PT_FN(l4, free_pt_l3)
+DEFINE_FREE_PT_FN(l5, free_pt_l4)
+DEFINE_FREE_PT_FN(l6, free_pt_l5)
+
+static void free_pagetable(struct protection_domain *domain)
+{
+	unsigned long root = (unsigned long)domain->pt_root;
+	struct page *freelist = NULL;
+
+	switch (domain->mode) {
+	case PAGE_MODE_NONE:
+		break;
+	case PAGE_MODE_1_LEVEL:
+		freelist = free_pt_page(root, freelist);
+		break;
+	case PAGE_MODE_2_LEVEL:
+		freelist = free_pt_l2(root, freelist);
+		break;
+	case PAGE_MODE_3_LEVEL:
+		freelist = free_pt_l3(root, freelist);
+		break;
+	case PAGE_MODE_4_LEVEL:
+		freelist = free_pt_l4(root, freelist);
+		break;
+	case PAGE_MODE_5_LEVEL:
+		freelist = free_pt_l5(root, freelist);
+		break;
+	case PAGE_MODE_6_LEVEL:
+		freelist = free_pt_l6(root, freelist);
+		break;
+	default:
+		BUG();
+	}
+
+	free_page_list(freelist);
+}
+
 /*
  * This function is used to add another level to an IO page table. Adding
  * another level increases the size of the address space by 9 bits to a size up
@@ -1638,67 +1721,6 @@ static void domain_id_free(int id)
 	spin_unlock(&pd_bitmap_lock);
 }
 
-#define DEFINE_FREE_PT_FN(LVL, FN)				\
-static void free_pt_##LVL (unsigned long __pt)			\
-{								\
-	unsigned long p;					\
-	u64 *pt;						\
-	int i;							\
-								\
-	pt = (u64 *)__pt;					\
-								\
-	for (i = 0; i < 512; ++i) {				\
-		/* PTE present? */				\
-		if (!IOMMU_PTE_PRESENT(pt[i]))			\
-			continue;				\
-								\
-		/* Large PTE? */				\
-		if (PM_PTE_LEVEL(pt[i]) == 0 ||			\
-		    PM_PTE_LEVEL(pt[i]) == 7)			\
-			continue;				\
-								\
-		p = (unsigned long)IOMMU_PTE_PAGE(pt[i]);	\
-		FN(p);						\
-	}							\
-	free_page((unsigned long)pt);				\
-}
-
-DEFINE_FREE_PT_FN(l2, free_page)
-DEFINE_FREE_PT_FN(l3, free_pt_l2)
-DEFINE_FREE_PT_FN(l4, free_pt_l3)
-DEFINE_FREE_PT_FN(l5, free_pt_l4)
-DEFINE_FREE_PT_FN(l6, free_pt_l5)
-
-static void free_pagetable(struct protection_domain *domain)
-{
-	unsigned long root = (unsigned long)domain->pt_root;
-
-	switch (domain->mode) {
-	case PAGE_MODE_NONE:
-		break;
-	case PAGE_MODE_1_LEVEL:
-		free_page(root);
-		break;
-	case PAGE_MODE_2_LEVEL:
-		free_pt_l2(root);
-		break;
-	case PAGE_MODE_3_LEVEL:
-		free_pt_l3(root);
-		break;
-	case PAGE_MODE_4_LEVEL:
-		free_pt_l4(root);
-		break;
-	case PAGE_MODE_5_LEVEL:
-		free_pt_l5(root);
-		break;
-	case PAGE_MODE_6_LEVEL:
-		free_pt_l6(root);
-		break;
-	default:
-		BUG();
-	}
-}
-
 static void free_gcr3_tbl_level1(u64 *tbl)
 {
 	u64 *ptr;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/7] iommu/amd: Introduce free_sub_pt() function
  2018-11-09 11:07 [PATCH 0/7] iommu/amd: Always allow to map huge pages Joerg Roedel
  2018-11-09 11:07 ` [PATCH 1/7] iommu/amd: Collect page-table pages in freelist Joerg Roedel
@ 2018-11-09 11:07 ` Joerg Roedel
  2018-11-09 11:07 ` [PATCH 3/7] iommu/amd: Ignore page-mode 7 in free_sub_pt() Joerg Roedel
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2018-11-09 11:07 UTC (permalink / raw)
  To: Alex Williamson, Joerg Roedel; +Cc: iommu, kvm, linux-kernel, jroedel

From: Joerg Roedel <jroedel@suse.de>

The function is a more generic version of free_pagetable()
and will be used to free only specific sub-trees of a
page-table.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2655bd91af93..1186571f77e1 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1367,12 +1367,10 @@ DEFINE_FREE_PT_FN(l4, free_pt_l3)
 DEFINE_FREE_PT_FN(l5, free_pt_l4)
 DEFINE_FREE_PT_FN(l6, free_pt_l5)
 
-static void free_pagetable(struct protection_domain *domain)
+static struct page *free_sub_pt(unsigned long root, int mode,
+				struct page *freelist)
 {
-	unsigned long root = (unsigned long)domain->pt_root;
-	struct page *freelist = NULL;
-
-	switch (domain->mode) {
+	switch (mode) {
 	case PAGE_MODE_NONE:
 		break;
 	case PAGE_MODE_1_LEVEL:
@@ -1397,6 +1395,16 @@ static void free_pagetable(struct protection_domain *domain)
 		BUG();
 	}
 
+	return freelist;
+}
+
+static void free_pagetable(struct protection_domain *domain)
+{
+	unsigned long root = (unsigned long)domain->pt_root;
+	struct page *freelist = NULL;
+
+	free_sub_pt(root, domain->mode, freelist);
+
 	free_page_list(freelist);
 }
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 3/7] iommu/amd: Ignore page-mode 7 in free_sub_pt()
  2018-11-09 11:07 [PATCH 0/7] iommu/amd: Always allow to map huge pages Joerg Roedel
  2018-11-09 11:07 ` [PATCH 1/7] iommu/amd: Collect page-table pages in freelist Joerg Roedel
  2018-11-09 11:07 ` [PATCH 2/7] iommu/amd: Introduce free_sub_pt() function Joerg Roedel
@ 2018-11-09 11:07 ` Joerg Roedel
  2018-11-09 11:07 ` [PATCH 4/7] iommu/amd: Allow downgrading page-sizes in alloc_pte() Joerg Roedel
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2018-11-09 11:07 UTC (permalink / raw)
  To: Alex Williamson, Joerg Roedel; +Cc: iommu, kvm, linux-kernel, jroedel

From: Joerg Roedel <jroedel@suse.de>

The page-mode 7 is a special one as it marks a final PTE to
a page with an intermediary size.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c       | 4 ++++
 drivers/iommu/amd_iommu_types.h | 1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 1186571f77e1..49b5d3115e56 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1372,6 +1372,7 @@ static struct page *free_sub_pt(unsigned long root, int mode,
 {
 	switch (mode) {
 	case PAGE_MODE_NONE:
+	case PAGE_MODE_7_LEVEL:
 		break;
 	case PAGE_MODE_1_LEVEL:
 		freelist = free_pt_page(root, freelist);
@@ -1403,6 +1404,9 @@ static void free_pagetable(struct protection_domain *domain)
 	unsigned long root = (unsigned long)domain->pt_root;
 	struct page *freelist = NULL;
 
+	BUG_ON(domain->mode < PAGE_MODE_NONE ||
+	       domain->mode > PAGE_MODE_6_LEVEL);
+
 	free_sub_pt(root, domain->mode, freelist);
 
 	free_page_list(freelist);
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index e2b342e65a7b..eae0741f72dc 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -269,6 +269,7 @@
 #define PAGE_MODE_4_LEVEL 0x04
 #define PAGE_MODE_5_LEVEL 0x05
 #define PAGE_MODE_6_LEVEL 0x06
+#define PAGE_MODE_7_LEVEL 0x07
 
 #define PM_LEVEL_SHIFT(x)	(12 + ((x) * 9))
 #define PM_LEVEL_SIZE(x)	(((x) < 6) ? \
-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 4/7] iommu/amd: Allow downgrading page-sizes in alloc_pte()
  2018-11-09 11:07 [PATCH 0/7] iommu/amd: Always allow to map huge pages Joerg Roedel
                   ` (2 preceding siblings ...)
  2018-11-09 11:07 ` [PATCH 3/7] iommu/amd: Ignore page-mode 7 in free_sub_pt() Joerg Roedel
@ 2018-11-09 11:07 ` Joerg Roedel
  2018-11-09 11:07 ` [PATCH 5/7] iommu/amd: Restart loop if cmpxchg64 succeeded " Joerg Roedel
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2018-11-09 11:07 UTC (permalink / raw)
  To: Alex Williamson, Joerg Roedel; +Cc: iommu, kvm, linux-kernel, jroedel

From: Joerg Roedel <jroedel@suse.de>

Before this patch it was not possible the downgrade a
mapping established with page-mode 7 to a mapping using
smaller page-sizes, because the pte_level != level check
prevented that.

Treat page-mode 7 like a non-present mapping and allow to
overwrite it in alloc_pte().

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 49b5d3115e56..6a88ba9321d1 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1460,10 +1460,13 @@ static u64 *alloc_pte(struct protection_domain *domain,
 
 	while (level > end_lvl) {
 		u64 __pte, __npte;
+		int pte_level;
 
-		__pte = *pte;
+		__pte     = *pte;
+		pte_level = PM_PTE_LEVEL(__pte);
 
-		if (!IOMMU_PTE_PRESENT(__pte)) {
+		if (!IOMMU_PTE_PRESENT(__pte) ||
+		    pte_level == PAGE_MODE_7_LEVEL) {
 			page = (u64 *)get_zeroed_page(gfp);
 			if (!page)
 				return NULL;
@@ -1475,10 +1478,13 @@ static u64 *alloc_pte(struct protection_domain *domain,
 				free_page((unsigned long)page);
 				continue;
 			}
+
+			if (pte_level == PAGE_MODE_7_LEVEL)
+				domain->updated = true;
 		}
 
 		/* No level skipping support yet */
-		if (PM_PTE_LEVEL(*pte) != level)
+		if (pte_level != level)
 			return NULL;
 
 		level -= 1;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 5/7] iommu/amd: Restart loop if cmpxchg64 succeeded in alloc_pte()
  2018-11-09 11:07 [PATCH 0/7] iommu/amd: Always allow to map huge pages Joerg Roedel
                   ` (3 preceding siblings ...)
  2018-11-09 11:07 ` [PATCH 4/7] iommu/amd: Allow downgrading page-sizes in alloc_pte() Joerg Roedel
@ 2018-11-09 11:07 ` Joerg Roedel
  2018-11-09 11:07 ` [PATCH 6/7] iommu/amd: Allow to upgrade page-size Joerg Roedel
  2018-11-09 11:07 ` [PATCH 7/7] vfio/type1: Remove map_try_harder() code path Joerg Roedel
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2018-11-09 11:07 UTC (permalink / raw)
  To: Alex Williamson, Joerg Roedel; +Cc: iommu, kvm, linux-kernel, jroedel

From: Joerg Roedel <jroedel@suse.de>

This makes sure that __pte always contains the correct value
when the pointer to the next page-table level is derived.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 6a88ba9321d1..c539b2a59019 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1474,13 +1474,12 @@ static u64 *alloc_pte(struct protection_domain *domain,
 			__npte = PM_LEVEL_PDE(level, iommu_virt_to_phys(page));
 
 			/* pte could have been changed somewhere. */
-			if (cmpxchg64(pte, __pte, __npte) != __pte) {
+			if (cmpxchg64(pte, __pte, __npte) != __pte)
 				free_page((unsigned long)page);
-				continue;
-			}
-
-			if (pte_level == PAGE_MODE_7_LEVEL)
+			else if (pte_level == PAGE_MODE_7_LEVEL)
 				domain->updated = true;
+
+			continue;
 		}
 
 		/* No level skipping support yet */
@@ -1489,7 +1488,7 @@ static u64 *alloc_pte(struct protection_domain *domain,
 
 		level -= 1;
 
-		pte = IOMMU_PTE_PAGE(*pte);
+		pte = IOMMU_PTE_PAGE(__pte);
 
 		if (pte_page && level == end_lvl)
 			*pte_page = pte;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 6/7] iommu/amd: Allow to upgrade page-size
  2018-11-09 11:07 [PATCH 0/7] iommu/amd: Always allow to map huge pages Joerg Roedel
                   ` (4 preceding siblings ...)
  2018-11-09 11:07 ` [PATCH 5/7] iommu/amd: Restart loop if cmpxchg64 succeeded " Joerg Roedel
@ 2018-11-09 11:07 ` Joerg Roedel
  2018-11-09 11:07 ` [PATCH 7/7] vfio/type1: Remove map_try_harder() code path Joerg Roedel
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2018-11-09 11:07 UTC (permalink / raw)
  To: Alex Williamson, Joerg Roedel; +Cc: iommu, kvm, linux-kernel, jroedel

From: Joerg Roedel <jroedel@suse.de>

Before this patch the iommu_map_page() function failed when
it tried to map a huge-page where smaller mappings existed
before.

With this change the page-table pages of the old mappings
are teared down, so that the huge-page can be mapped.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index c539b2a59019..71797b3d67e5 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1557,6 +1557,25 @@ static u64 *fetch_pte(struct protection_domain *domain,
 	return pte;
 }
 
+static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
+{
+	unsigned long pt;
+	int mode;
+
+	while (cmpxchg64(pte, pteval, 0) != pteval) {
+		pr_warn("AMD-Vi: IOMMU pte changed since we read it\n");
+		pteval = *pte;
+	}
+
+	if (!IOMMU_PTE_PRESENT(pteval))
+		return freelist;
+
+	pt   = (unsigned long)IOMMU_PTE_PAGE(pteval);
+	mode = IOMMU_PTE_MODE(pteval);
+
+	return free_sub_pt(pt, mode, freelist);
+}
+
 /*
  * Generic mapping functions. It maps a physical address into a DMA
  * address space. It allocates the page table pages if necessary.
@@ -1571,6 +1590,7 @@ static int iommu_map_page(struct protection_domain *dom,
 			  int prot,
 			  gfp_t gfp)
 {
+	struct page *freelist = NULL;
 	u64 __pte, *pte;
 	int i, count;
 
@@ -1587,8 +1607,10 @@ static int iommu_map_page(struct protection_domain *dom,
 		return -ENOMEM;
 
 	for (i = 0; i < count; ++i)
-		if (IOMMU_PTE_PRESENT(pte[i]))
-			return -EBUSY;
+		freelist = free_clear_pte(&pte[i], pte[i], freelist);
+
+	if (freelist != NULL)
+		dom->updated = true;
 
 	if (count > 1) {
 		__pte = PAGE_SIZE_PTE(__sme_set(phys_addr), page_size);
@@ -1606,6 +1628,9 @@ static int iommu_map_page(struct protection_domain *dom,
 
 	update_domain(dom);
 
+	/* Everything flushed out, free pages now */
+	free_page_list(freelist);
+
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 7/7] vfio/type1: Remove map_try_harder() code path
  2018-11-09 11:07 [PATCH 0/7] iommu/amd: Always allow to map huge pages Joerg Roedel
                   ` (5 preceding siblings ...)
  2018-11-09 11:07 ` [PATCH 6/7] iommu/amd: Allow to upgrade page-size Joerg Roedel
@ 2018-11-09 11:07 ` Joerg Roedel
  2018-11-09 16:23   ` Alex Williamson
  6 siblings, 1 reply; 10+ messages in thread
From: Joerg Roedel @ 2018-11-09 11:07 UTC (permalink / raw)
  To: Alex Williamson, Joerg Roedel; +Cc: iommu, kvm, linux-kernel, jroedel

From: Joerg Roedel <jroedel@suse.de>

The AMD IOMMU driver can now map a huge-page where smaller
mappings existed before, so this code-path is no longer
triggered.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/vfio/vfio_iommu_type1.c | 33 ++-------------------------------
 1 file changed, 2 insertions(+), 31 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d9fd3188615d..7651cfb14836 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -978,32 +978,6 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
 	return ret;
 }
 
-/*
- * Turns out AMD IOMMU has a page table bug where it won't map large pages
- * to a region that previously mapped smaller pages.  This should be fixed
- * soon, so this is just a temporary workaround to break mappings down into
- * PAGE_SIZE.  Better to map smaller pages than nothing.
- */
-static int map_try_harder(struct vfio_domain *domain, dma_addr_t iova,
-			  unsigned long pfn, long npage, int prot)
-{
-	long i;
-	int ret = 0;
-
-	for (i = 0; i < npage; i++, pfn++, iova += PAGE_SIZE) {
-		ret = iommu_map(domain->domain, iova,
-				(phys_addr_t)pfn << PAGE_SHIFT,
-				PAGE_SIZE, prot | domain->prot);
-		if (ret)
-			break;
-	}
-
-	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE)
-		iommu_unmap(domain->domain, iova, PAGE_SIZE);
-
-	return ret;
-}
-
 static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
 			  unsigned long pfn, long npage, int prot)
 {
@@ -1013,11 +987,8 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
 	list_for_each_entry(d, &iommu->domain_list, next) {
 		ret = iommu_map(d->domain, iova, (phys_addr_t)pfn << PAGE_SHIFT,
 				npage << PAGE_SHIFT, prot | d->prot);
-		if (ret) {
-			if (ret != -EBUSY ||
-			    map_try_harder(d, iova, pfn, npage, prot))
-				goto unwind;
-		}
+		if (ret)
+			goto unwind;
 
 		cond_resched();
 	}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 7/7] vfio/type1: Remove map_try_harder() code path
  2018-11-09 11:07 ` [PATCH 7/7] vfio/type1: Remove map_try_harder() code path Joerg Roedel
@ 2018-11-09 16:23   ` Alex Williamson
  2018-11-15 15:55     ` Joerg Roedel
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Williamson @ 2018-11-09 16:23 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: iommu, kvm, linux-kernel, jroedel

On Fri,  9 Nov 2018 12:07:12 +0100
Joerg Roedel <joro@8bytes.org> wrote:

> From: Joerg Roedel <jroedel@suse.de>
> 
> The AMD IOMMU driver can now map a huge-page where smaller
> mappings existed before, so this code-path is no longer
> triggered.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 33 ++-------------------------------
>  1 file changed, 2 insertions(+), 31 deletions(-)

Cool, glad to see this finally fixed.  My "should be fixed soon"
comment turned out to be a little optimistic with the fix finally
coming 5 years later.  We could of course keep this code as it really
doesn't harm anything, but I'm in favor trying to remove it if we think
it's dead now.  In order to expedite into one pull:

Acked-by: Alex Williamson <alex.williamson@redhat.com>

Thanks,
Alex
 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index d9fd3188615d..7651cfb14836 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -978,32 +978,6 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>  	return ret;
>  }
>  
> -/*
> - * Turns out AMD IOMMU has a page table bug where it won't map large pages
> - * to a region that previously mapped smaller pages.  This should be fixed
> - * soon, so this is just a temporary workaround to break mappings down into
> - * PAGE_SIZE.  Better to map smaller pages than nothing.
> - */
> -static int map_try_harder(struct vfio_domain *domain, dma_addr_t iova,
> -			  unsigned long pfn, long npage, int prot)
> -{
> -	long i;
> -	int ret = 0;
> -
> -	for (i = 0; i < npage; i++, pfn++, iova += PAGE_SIZE) {
> -		ret = iommu_map(domain->domain, iova,
> -				(phys_addr_t)pfn << PAGE_SHIFT,
> -				PAGE_SIZE, prot | domain->prot);
> -		if (ret)
> -			break;
> -	}
> -
> -	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE)
> -		iommu_unmap(domain->domain, iova, PAGE_SIZE);
> -
> -	return ret;
> -}
> -
>  static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
>  			  unsigned long pfn, long npage, int prot)
>  {
> @@ -1013,11 +987,8 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
>  	list_for_each_entry(d, &iommu->domain_list, next) {
>  		ret = iommu_map(d->domain, iova, (phys_addr_t)pfn << PAGE_SHIFT,
>  				npage << PAGE_SHIFT, prot | d->prot);
> -		if (ret) {
> -			if (ret != -EBUSY ||
> -			    map_try_harder(d, iova, pfn, npage, prot))
> -				goto unwind;
> -		}
> +		if (ret)
> +			goto unwind;
>  
>  		cond_resched();
>  	}


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 7/7] vfio/type1: Remove map_try_harder() code path
  2018-11-09 16:23   ` Alex Williamson
@ 2018-11-15 15:55     ` Joerg Roedel
  0 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2018-11-15 15:55 UTC (permalink / raw)
  To: Alex Williamson; +Cc: iommu, kvm, linux-kernel, jroedel

Hi Alex,

On Fri, Nov 09, 2018 at 09:23:29AM -0700, Alex Williamson wrote:
> Cool, glad to see this finally fixed.  My "should be fixed soon"
> comment turned out to be a little optimistic with the fix finally
> coming 5 years later.  We could of course keep this code as it really
> doesn't harm anything, but I'm in favor trying to remove it if we think
> it's dead now.

Yeah, it took a while to fix that :) And we can easily revert this patch
if it turns out someone else is also relying on the workaround.

> In order to expedite into one pull:
> 
> Acked-by: Alex Williamson <alex.williamson@redhat.com>

Thanks a lot. I've queued these patches into my tree now.


Regards,

	Joerg

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-11-15 15:55 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-09 11:07 [PATCH 0/7] iommu/amd: Always allow to map huge pages Joerg Roedel
2018-11-09 11:07 ` [PATCH 1/7] iommu/amd: Collect page-table pages in freelist Joerg Roedel
2018-11-09 11:07 ` [PATCH 2/7] iommu/amd: Introduce free_sub_pt() function Joerg Roedel
2018-11-09 11:07 ` [PATCH 3/7] iommu/amd: Ignore page-mode 7 in free_sub_pt() Joerg Roedel
2018-11-09 11:07 ` [PATCH 4/7] iommu/amd: Allow downgrading page-sizes in alloc_pte() Joerg Roedel
2018-11-09 11:07 ` [PATCH 5/7] iommu/amd: Restart loop if cmpxchg64 succeeded " Joerg Roedel
2018-11-09 11:07 ` [PATCH 6/7] iommu/amd: Allow to upgrade page-size Joerg Roedel
2018-11-09 11:07 ` [PATCH 7/7] vfio/type1: Remove map_try_harder() code path Joerg Roedel
2018-11-09 16:23   ` Alex Williamson
2018-11-15 15:55     ` Joerg Roedel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).