linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/9] AMD IOMMU Updates for v4.1
@ 2015-04-01 12:58 Joerg Roedel
  2015-04-01 12:58 ` [PATCH 1/9] iommu/amd: Use BUS_NOTIFY_REMOVED_DEVICE Joerg Roedel
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

Hi,

here are a few fixes and enhancements for the AMD IOMMU
driver for the next merge window. They are tested on
different versions of AMD IOMMUs. Please review.

Thanks,

	Joerg

Joerg Roedel (9):
  iommu/amd: Use BUS_NOTIFY_REMOVED_DEVICE
  iommu/amd: Ignore BUS_NOTIFY_UNBOUND_DRIVER event
  iommu/amd: Don't allocate with __GFP_ZERO in alloc_coherent
  iommu/amd: Add support for contiguous dma allocator
  iommu/amd: Return the pte page-size in fetch_pte
  iommu/amd: Optimize iommu_unmap_page for new fetch_pte interface
  iommu/amd: Optimize alloc_new_range for new fetch_pte interface
  iommu/amd: Optimize amd_iommu_iova_to_phys for new fetch_pte interface
  iommu/amd: Correctly encode huge pages in iommu page tables

 drivers/iommu/amd_iommu.c       | 166 +++++++++++++++++++---------------------
 drivers/iommu/amd_iommu_types.h |   6 ++
 2 files changed, 83 insertions(+), 89 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/9] iommu/amd: Use BUS_NOTIFY_REMOVED_DEVICE
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
@ 2015-04-01 12:58 ` Joerg Roedel
  2015-04-01 12:58 ` [PATCH 2/9] iommu/amd: Ignore BUS_NOTIFY_UNBOUND_DRIVER event Joerg Roedel
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Use the new device-notifier event instead of the old
BUS_NOTIFY_DEL_DEVICE to make sure the device driver has
had a chance to uninit the device before all of its
mappings are torn down.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 48882c1..8a1dea4 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2467,7 +2467,7 @@ static int device_change_notifier(struct notifier_block *nb,
 		dev->archdata.dma_ops = &amd_iommu_dma_ops;
 
 		break;
-	case BUS_NOTIFY_DEL_DEVICE:
+	case BUS_NOTIFY_REMOVED_DEVICE:
 
 		iommu_uninit_device(dev);
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/9] iommu/amd: Ignore BUS_NOTIFY_UNBOUND_DRIVER event
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
  2015-04-01 12:58 ` [PATCH 1/9] iommu/amd: Use BUS_NOTIFY_REMOVED_DEVICE Joerg Roedel
@ 2015-04-01 12:58 ` Joerg Roedel
  2015-04-01 12:58 ` [PATCH 3/9] iommu/amd: Don't allocate with __GFP_ZERO in alloc_coherent Joerg Roedel
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Detaching a device from its domain at this event is
problematic for several reasons:

	* The device might be in an alias group, and
	  detaching it will also detach all other devices
	  in the group. This removes valid DMA mappings
	  from the other devices, causing io-page-faults
	  and making those devices fail.

	* Devices might have unity mappings specified by the
	  IVRS table. These mappings are required for the
	  device even when no device driver is attached.
	  Detaching the device from its domain in driver
	  unbind will also remove these unity mappings.

This patch removes the handling of the BUS_NOTIFY_UNBOUND_DRIVER
event to prevent these issues and to better align the
driver's behavior with that of the VT-d driver.
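
For illustration, the resulting notifier behavior (together
with the BUS_NOTIFY_REMOVED_DEVICE change from the previous
patch) can be sketched in plain C. The enum and handler below
are simplified stand-ins for this toy model, not the kernel's
real notifier API:

```c
#include <stdbool.h>

/* Simplified stand-ins for the bus notifier actions involved. */
enum bus_action {
	BUS_NOTIFY_ADD_DEVICE,
	BUS_NOTIFY_UNBOUND_DRIVER,
	BUS_NOTIFY_REMOVED_DEVICE,
};

struct toy_device {
	bool initialized;
	bool has_mappings;	/* e.g. unity mappings from the IVRS table */
};

/* Mirrors the shape of device_change_notifier() after both patches. */
static void toy_notifier(struct toy_device *dev, enum bus_action action)
{
	switch (action) {
	case BUS_NOTIFY_ADD_DEVICE:
		dev->initialized  = true;
		dev->has_mappings = true;
		break;
	case BUS_NOTIFY_REMOVED_DEVICE:
		/* Device is fully gone; mappings may now be torn down. */
		dev->initialized  = false;
		dev->has_mappings = false;
		break;
	case BUS_NOTIFY_UNBOUND_DRIVER:
		/* Ignored: unity mappings must survive a driver unbind. */
		break;
	}
}
```

The key property is that a driver unbind alone no longer
removes the device's mappings; only actual device removal does.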

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 8a1dea4..994cc7d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2422,16 +2422,6 @@ static int device_change_notifier(struct notifier_block *nb,
 	dev_data = get_dev_data(dev);
 
 	switch (action) {
-	case BUS_NOTIFY_UNBOUND_DRIVER:
-
-		domain = domain_for_device(dev);
-
-		if (!domain)
-			goto out;
-		if (dev_data->passthrough)
-			break;
-		detach_device(dev);
-		break;
 	case BUS_NOTIFY_ADD_DEVICE:
 
 		iommu_init_device(dev);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/9] iommu/amd: Don't allocate with __GFP_ZERO in alloc_coherent
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
  2015-04-01 12:58 ` [PATCH 1/9] iommu/amd: Use BUS_NOTIFY_REMOVED_DEVICE Joerg Roedel
  2015-04-01 12:58 ` [PATCH 2/9] iommu/amd: Ignore BUS_NOTIFY_UNBOUND_DRIVER event Joerg Roedel
@ 2015-04-01 12:58 ` Joerg Roedel
  2015-04-01 12:58 ` [PATCH 4/9] iommu/amd: Add support for contiguous dma allocator Joerg Roedel
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Don't explicitly add __GFP_ZERO to the allocator flags.
Leave this up to the caller.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 994cc7d..c2e6f13 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2931,7 +2931,6 @@ static void *alloc_coherent(struct device *dev, size_t size,
 
 	dma_mask  = dev->coherent_dma_mask;
 	flag     &= ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32);
-	flag     |= __GFP_ZERO;
 
 	virt_addr = (void *)__get_free_pages(flag, get_order(size));
 	if (!virt_addr)
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/9] iommu/amd: Add support for contiguous dma allocator
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
                   ` (2 preceding siblings ...)
  2015-04-01 12:58 ` [PATCH 3/9] iommu/amd: Don't allocate with __GFP_ZERO in alloc_coherent Joerg Roedel
@ 2015-04-01 12:58 ` Joerg Roedel
  2015-04-01 12:58 ` [PATCH 5/9] iommu/amd: Return the pte page-size in fetch_pte Joerg Roedel
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Add code to allocate memory from the contiguous memory
allocator to support coherent allocations larger than 8MB.
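
The fallback order this implements -- try the page allocator
first, fall back to the contiguous allocator only for requests
it cannot satisfy -- can be sketched in userspace C. The stub
allocators below are illustrative only; the real code uses
alloc_pages() and dma_alloc_from_contiguous():

```c
#include <stddef.h>

#define TOY_MAX_ORDER 11	/* buddy limit: 2^11 * 4KB = 8MB */

/* Stub: fails above MAX_ORDER, like the page allocator. */
static void *toy_alloc_pages(int order)
{
	static char buddy_mem[4096];
	return order <= TOY_MAX_ORDER ? buddy_mem : NULL;
}

/* Stub for the contiguous (CMA) allocator: handles larger requests. */
static void *toy_alloc_contiguous(int order)
{
	static char cma_mem[4096];
	return cma_mem;
}

/* Mirrors the control flow of the patched alloc_coherent(). */
static void *toy_alloc_coherent(int order, int can_sleep)
{
	void *page = toy_alloc_pages(order);

	if (!page) {
		if (!can_sleep)		/* CMA may sleep; bail if atomic */
			return NULL;
		page = toy_alloc_contiguous(order);
	}
	return page;
}
```

This matches the __GFP_WAIT check in the patch: an atomic
caller cannot use the contiguous allocator and simply fails.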

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 44 ++++++++++++++++++++++++++++----------------
 1 file changed, 28 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index c2e6f13..49ecf00 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -33,6 +33,7 @@
 #include <linux/export.h>
 #include <linux/irq.h>
 #include <linux/msi.h>
+#include <linux/dma-contiguous.h>
 #include <asm/irq_remapping.h>
 #include <asm/io_apic.h>
 #include <asm/apic.h>
@@ -2913,37 +2914,42 @@ static void *alloc_coherent(struct device *dev, size_t size,
 			    dma_addr_t *dma_addr, gfp_t flag,
 			    struct dma_attrs *attrs)
 {
-	unsigned long flags;
-	void *virt_addr;
-	struct protection_domain *domain;
-	phys_addr_t paddr;
 	u64 dma_mask = dev->coherent_dma_mask;
+	struct protection_domain *domain;
+	unsigned long flags;
+	struct page *page;
 
 	INC_STATS_COUNTER(cnt_alloc_coherent);
 
 	domain = get_domain(dev);
 	if (PTR_ERR(domain) == -EINVAL) {
-		virt_addr = (void *)__get_free_pages(flag, get_order(size));
-		*dma_addr = __pa(virt_addr);
-		return virt_addr;
+		page = alloc_pages(flag, get_order(size));
+		*dma_addr = page_to_phys(page);
+		return page_address(page);
 	} else if (IS_ERR(domain))
 		return NULL;
 
+	size	  = PAGE_ALIGN(size);
 	dma_mask  = dev->coherent_dma_mask;
 	flag     &= ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32);
 
-	virt_addr = (void *)__get_free_pages(flag, get_order(size));
-	if (!virt_addr)
-		return NULL;
+	page = alloc_pages(flag | __GFP_NOWARN,  get_order(size));
+	if (!page) {
+		if (!(flag & __GFP_WAIT))
+			return NULL;
 
-	paddr = virt_to_phys(virt_addr);
+		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
+						 get_order(size));
+		if (!page)
+			return NULL;
+	}
 
 	if (!dma_mask)
 		dma_mask = *dev->dma_mask;
 
 	spin_lock_irqsave(&domain->lock, flags);
 
-	*dma_addr = __map_single(dev, domain->priv, paddr,
+	*dma_addr = __map_single(dev, domain->priv, page_to_phys(page),
 				 size, DMA_BIDIRECTIONAL, true, dma_mask);
 
 	if (*dma_addr == DMA_ERROR_CODE) {
@@ -2955,11 +2961,12 @@ static void *alloc_coherent(struct device *dev, size_t size,
 
 	spin_unlock_irqrestore(&domain->lock, flags);
 
-	return virt_addr;
+	return page_address(page);
 
 out_free:
 
-	free_pages((unsigned long)virt_addr, get_order(size));
+	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
+		__free_pages(page, get_order(size));
 
 	return NULL;
 }
@@ -2971,11 +2978,15 @@ static void free_coherent(struct device *dev, size_t size,
 			  void *virt_addr, dma_addr_t dma_addr,
 			  struct dma_attrs *attrs)
 {
-	unsigned long flags;
 	struct protection_domain *domain;
+	unsigned long flags;
+	struct page *page;
 
 	INC_STATS_COUNTER(cnt_free_coherent);
 
+	page = virt_to_page(virt_addr);
+	size = PAGE_ALIGN(size);
+
 	domain = get_domain(dev);
 	if (IS_ERR(domain))
 		goto free_mem;
@@ -2989,7 +3000,8 @@ static void free_coherent(struct device *dev, size_t size,
 	spin_unlock_irqrestore(&domain->lock, flags);
 
 free_mem:
-	free_pages((unsigned long)virt_addr, get_order(size));
+	if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT))
+		__free_pages(page, get_order(size));
 }
 
 /*
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/9] iommu/amd: Return the pte page-size in fetch_pte
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
                   ` (3 preceding siblings ...)
  2015-04-01 12:58 ` [PATCH 4/9] iommu/amd: Add support for contiguous dma allocator Joerg Roedel
@ 2015-04-01 12:58 ` Joerg Roedel
  2015-04-01 12:58 ` [PATCH 6/9] iommu/amd: Optimize iommu_unmap_page for new fetch_pte interface Joerg Roedel
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Extend the fetch_pte function to also return the page-size
that is mapped by the returned pte.
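
The per-level default page-size reported through the new
out-parameter follows directly from the page-table geometry:
9 address bits per level on top of the 12-bit page offset.
A quick userspace check of the PTE_LEVEL_PAGE_SIZE macro this
patch adds:

```c
#include <stdint.h>

/* Same definition as the one added to amd_iommu_types.h below. */
#define PTE_LEVEL_PAGE_SIZE(level)			\
	(1ULL << (12 + (9 * (level))))
```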

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c       | 52 ++++++++++++++++++++++++-----------------
 drivers/iommu/amd_iommu_types.h |  6 +++++
 2 files changed, 36 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 49ecf00..24ef9e6 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1322,7 +1322,9 @@ static u64 *alloc_pte(struct protection_domain *domain,
  * This function checks if there is a PTE for a given dma address. If
  * there is one, it returns the pointer to it.
  */
-static u64 *fetch_pte(struct protection_domain *domain, unsigned long address)
+static u64 *fetch_pte(struct protection_domain *domain,
+		      unsigned long address,
+		      unsigned long *page_size)
 {
 	int level;
 	u64 *pte;
@@ -1330,8 +1332,9 @@ static u64 *fetch_pte(struct protection_domain *domain, unsigned long address)
 	if (address > PM_LEVEL_SIZE(domain->mode))
 		return NULL;
 
-	level   =  domain->mode - 1;
-	pte     = &domain->pt_root[PM_LEVEL_INDEX(level, address)];
+	level	   =  domain->mode - 1;
+	pte	   = &domain->pt_root[PM_LEVEL_INDEX(level, address)];
+	*page_size =  PTE_LEVEL_PAGE_SIZE(level);
 
 	while (level > 0) {
 
@@ -1340,19 +1343,9 @@ static u64 *fetch_pte(struct protection_domain *domain, unsigned long address)
 			return NULL;
 
 		/* Large PTE */
-		if (PM_PTE_LEVEL(*pte) == 0x07) {
-			unsigned long pte_mask, __pte;
-
-			/*
-			 * If we have a series of large PTEs, make
-			 * sure to return a pointer to the first one.
-			 */
-			pte_mask = PTE_PAGE_SIZE(*pte);
-			pte_mask = ~((PAGE_SIZE_PTE_COUNT(pte_mask) << 3) - 1);
-			__pte    = ((unsigned long)pte) & pte_mask;
-
-			return (u64 *)__pte;
-		}
+		if (PM_PTE_LEVEL(*pte) == 7 ||
+		    PM_PTE_LEVEL(*pte) == 0)
+			break;
 
 		/* No level skipping support yet */
 		if (PM_PTE_LEVEL(*pte) != level)
@@ -1361,8 +1354,21 @@ static u64 *fetch_pte(struct protection_domain *domain, unsigned long address)
 		level -= 1;
 
 		/* Walk to the next level */
-		pte = IOMMU_PTE_PAGE(*pte);
-		pte = &pte[PM_LEVEL_INDEX(level, address)];
+		pte	   = IOMMU_PTE_PAGE(*pte);
+		pte	   = &pte[PM_LEVEL_INDEX(level, address)];
+		*page_size = PTE_LEVEL_PAGE_SIZE(level);
+	}
+
+	if (PM_PTE_LEVEL(*pte) == 0x07) {
+		unsigned long pte_mask;
+
+		/*
+		 * If we have a series of large PTEs, make
+		 * sure to return a pointer to the first one.
+		 */
+		*page_size = pte_mask = PTE_PAGE_SIZE(*pte);
+		pte_mask   = ~((PAGE_SIZE_PTE_COUNT(pte_mask) << 3) - 1);
+		pte        = (u64 *)(((unsigned long)pte) & pte_mask);
 	}
 
 	return pte;
@@ -1423,6 +1429,7 @@ static unsigned long iommu_unmap_page(struct protection_domain *dom,
 				      unsigned long page_size)
 {
 	unsigned long long unmap_size, unmapped;
+	unsigned long pte_pgsize;
 	u64 *pte;
 
 	BUG_ON(!is_power_of_2(page_size));
@@ -1431,7 +1438,7 @@ static unsigned long iommu_unmap_page(struct protection_domain *dom,
 
 	while (unmapped < page_size) {
 
-		pte = fetch_pte(dom, bus_addr);
+		pte = fetch_pte(dom, bus_addr, &pte_pgsize);
 
 		if (!pte) {
 			/*
@@ -1674,7 +1681,8 @@ static int alloc_new_range(struct dma_ops_domain *dma_dom,
 	for (i = dma_dom->aperture[index]->offset;
 	     i < dma_dom->aperture_size;
 	     i += PAGE_SIZE) {
-		u64 *pte = fetch_pte(&dma_dom->domain, i);
+		unsigned long pte_pgsize;
+		u64 *pte = fetch_pte(&dma_dom->domain, i, &pte_pgsize);
 		if (!pte || !IOMMU_PTE_PRESENT(*pte))
 			continue;
 
@@ -3382,14 +3390,14 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 					  dma_addr_t iova)
 {
 	struct protection_domain *domain = dom->priv;
-	unsigned long offset_mask;
+	unsigned long offset_mask, pte_pgsize;
 	phys_addr_t paddr;
 	u64 *pte, __pte;
 
 	if (domain->mode == PAGE_MODE_NONE)
 		return iova;
 
-	pte = fetch_pte(domain, iova);
+	pte = fetch_pte(domain, iova, &pte_pgsize);
 
 	if (!pte || !IOMMU_PTE_PRESENT(*pte))
 		return 0;
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index c4fffb7..60e87d2 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -282,6 +282,12 @@
 #define PTE_PAGE_SIZE(pte) \
 	(1ULL << (1 + ffz(((pte) | 0xfffULL))))
 
+/*
+ * Takes a page-table level and returns the default page-size for this level
+ */
+#define PTE_LEVEL_PAGE_SIZE(level)			\
+	(1ULL << (12 + (9 * (level))))
+
 #define IOMMU_PTE_P  (1ULL << 0)
 #define IOMMU_PTE_TV (1ULL << 1)
 #define IOMMU_PTE_U  (1ULL << 59)
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 6/9] iommu/amd: Optimize iommu_unmap_page for new fetch_pte interface
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
                   ` (4 preceding siblings ...)
  2015-04-01 12:58 ` [PATCH 5/9] iommu/amd: Return the pte page-size in fetch_pte Joerg Roedel
@ 2015-04-01 12:58 ` Joerg Roedel
  2015-04-01 12:58 ` [PATCH 7/9] iommu/amd: Optimize alloc_new_range " Joerg Roedel
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Now that fetch_pte returns the page-size of the pte, this
function can be optimized a lot.
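
The shape of the simplified loop can be sketched in userspace C
with a toy fetch function. The stand-in below pretends every
address is backed by a 2MB PTE; the real fetch_pte() walks the
page table and reports the actual size:

```c
#include <stdint.h>

/*
 * Toy stand-in for fetch_pte(): every address is covered by a
 * 2MB PTE, so *page_size always comes back as 2MB.
 */
static uint64_t *toy_fetch_pte(unsigned long addr, unsigned long *page_size)
{
	static uint64_t pte = 1;
	*page_size = 2UL * 1024 * 1024;
	return &pte;
}

/*
 * Mirrors the simplified iommu_unmap_page() loop: no more
 * per-case size guessing -- the walk itself reports how far
 * to advance on each iteration.
 */
static unsigned long toy_unmap(unsigned long bus_addr, unsigned long size)
{
	unsigned long unmapped = 0, unmap_size;

	while (unmapped < size) {
		uint64_t *pte = toy_fetch_pte(bus_addr, &unmap_size);

		if (pte)
			*pte = 0;  /* real code clears all replicated PTEs */

		bus_addr += unmap_size;
		unmapped += unmap_size;
	}
	return unmapped;
}
```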

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 32 ++++++++------------------------
 1 file changed, 8 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 24ef9e6..c9ee444 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1428,8 +1428,8 @@ static unsigned long iommu_unmap_page(struct protection_domain *dom,
 				      unsigned long bus_addr,
 				      unsigned long page_size)
 {
-	unsigned long long unmap_size, unmapped;
-	unsigned long pte_pgsize;
+	unsigned long long unmapped;
+	unsigned long unmap_size;
 	u64 *pte;
 
 	BUG_ON(!is_power_of_2(page_size));
@@ -1438,28 +1438,12 @@ static unsigned long iommu_unmap_page(struct protection_domain *dom,
 
 	while (unmapped < page_size) {
 
-		pte = fetch_pte(dom, bus_addr, &pte_pgsize);
-
-		if (!pte) {
-			/*
-			 * No PTE for this address
-			 * move forward in 4kb steps
-			 */
-			unmap_size = PAGE_SIZE;
-		} else if (PM_PTE_LEVEL(*pte) == 0) {
-			/* 4kb PTE found for this address */
-			unmap_size = PAGE_SIZE;
-			*pte       = 0ULL;
-		} else {
-			int count, i;
-
-			/* Large PTE found which maps this address */
-			unmap_size = PTE_PAGE_SIZE(*pte);
-
-			/* Only unmap from the first pte in the page */
-			if ((unmap_size - 1) & bus_addr)
-				break;
-			count      = PAGE_SIZE_PTE_COUNT(unmap_size);
+		pte = fetch_pte(dom, bus_addr, &unmap_size);
+
+		if (pte) {
+			int i, count;
+
+			count = PAGE_SIZE_PTE_COUNT(unmap_size);
 			for (i = 0; i < count; i++)
 				pte[i] = 0ULL;
 		}
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 7/9] iommu/amd: Optimize alloc_new_range for new fetch_pte interface
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
                   ` (5 preceding siblings ...)
  2015-04-01 12:58 ` [PATCH 6/9] iommu/amd: Optimize iommu_unmap_page for new fetch_pte interface Joerg Roedel
@ 2015-04-01 12:58 ` Joerg Roedel
  2015-04-01 12:58 ` [PATCH 8/9] iommu/amd: Optimize amd_iommu_iova_to_phys " Joerg Roedel
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Now that fetch_pte returns the page-size of the pte, the
call in this function can also be optimized a little bit.
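
The arithmetic behind the change: a present PTE covers
pte_pgsize bytes, i.e. pte_pgsize >> 12 4KB pages, so one
reservation call (and one loop iteration) replaces the
previous per-4KB-page stepping. A minimal check:

```c
/* Number of 4KB dma_ops pages a PTE of the given size reserves. */
static unsigned long pages_per_pte(unsigned long pte_pgsize)
{
	return pte_pgsize >> 12;
}
```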

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index c9ee444..f97441b 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1591,7 +1591,7 @@ static int alloc_new_range(struct dma_ops_domain *dma_dom,
 {
 	int index = dma_dom->aperture_size >> APERTURE_RANGE_SHIFT;
 	struct amd_iommu *iommu;
-	unsigned long i, old_size;
+	unsigned long i, old_size, pte_pgsize;
 
 #ifdef CONFIG_IOMMU_STRESS
 	populate = false;
@@ -1664,13 +1664,13 @@ static int alloc_new_range(struct dma_ops_domain *dma_dom,
 	 */
 	for (i = dma_dom->aperture[index]->offset;
 	     i < dma_dom->aperture_size;
-	     i += PAGE_SIZE) {
-		unsigned long pte_pgsize;
+	     i += pte_pgsize) {
 		u64 *pte = fetch_pte(&dma_dom->domain, i, &pte_pgsize);
 		if (!pte || !IOMMU_PTE_PRESENT(*pte))
 			continue;
 
-		dma_ops_reserve_addresses(dma_dom, i >> PAGE_SHIFT, 1);
+		dma_ops_reserve_addresses(dma_dom, i >> PAGE_SHIFT,
+					  pte_pgsize >> 12);
 	}
 
 	update_domain(&dma_dom->domain);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 8/9] iommu/amd: Optimize amd_iommu_iova_to_phys for new fetch_pte interface
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
                   ` (6 preceding siblings ...)
  2015-04-01 12:58 ` [PATCH 7/9] iommu/amd: Optimize alloc_new_range " Joerg Roedel
@ 2015-04-01 12:58 ` Joerg Roedel
  2015-04-01 12:58 ` [PATCH 9/9] iommu/amd: Correctly encode huge pages in iommu page tables Joerg Roedel
  2015-04-01 19:11 ` [PATCH 0/9] AMD IOMMU Updates for v4.1 Suravee Suthikulanit
  9 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Now that fetch_pte returns the page-size of the pte, this
function can be optimized too.
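
With the page-size already known from the walk, the physical
address is simply the PTE's address bits plus the IOVA's offset
within that page. A userspace sketch of the computation, with
PM_ADDR_MASK reproduced from the driver's headers:

```c
#include <stdint.h>

#define PM_ADDR_MASK 0x000ffffffffff000ULL	/* PTE address bits 51:12 */

/* Mirrors the patched amd_iommu_iova_to_phys() computation. */
static uint64_t toy_iova_to_phys(uint64_t pte, uint64_t iova,
				 unsigned long pte_pgsize)
{
	uint64_t offset_mask = pte_pgsize - 1;
	uint64_t __pte       = pte & PM_ADDR_MASK;

	return (__pte & ~offset_mask) | (iova & offset_mask);
}
```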

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index f97441b..7a00e5d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3375,7 +3375,6 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 {
 	struct protection_domain *domain = dom->priv;
 	unsigned long offset_mask, pte_pgsize;
-	phys_addr_t paddr;
 	u64 *pte, __pte;
 
 	if (domain->mode == PAGE_MODE_NONE)
@@ -3386,15 +3385,10 @@ static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
 	if (!pte || !IOMMU_PTE_PRESENT(*pte))
 		return 0;
 
-	if (PM_PTE_LEVEL(*pte) == 0)
-		offset_mask = PAGE_SIZE - 1;
-	else
-		offset_mask = PTE_PAGE_SIZE(*pte) - 1;
-
-	__pte = *pte & PM_ADDR_MASK;
-	paddr = (__pte & ~offset_mask) | (iova & offset_mask);
+	offset_mask = pte_pgsize - 1;
+	__pte	    = *pte & PM_ADDR_MASK;
 
-	return paddr;
+	return (__pte & ~offset_mask) | (iova & offset_mask);
 }
 
 static bool amd_iommu_capable(enum iommu_cap cap)
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 9/9] iommu/amd: Correctly encode huge pages in iommu page tables
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
                   ` (7 preceding siblings ...)
  2015-04-01 12:58 ` [PATCH 8/9] iommu/amd: Optimize amd_iommu_iova_to_phys " Joerg Roedel
@ 2015-04-01 12:58 ` Joerg Roedel
  2015-04-01 19:11 ` [PATCH 0/9] AMD IOMMU Updates for v4.1 Suravee Suthikulanit
  9 siblings, 0 replies; 11+ messages in thread
From: Joerg Roedel @ 2015-04-01 12:58 UTC (permalink / raw)
  To: iommu; +Cc: Suravee.Suthikulpanit, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

When the default page-size for a given level is mapped, the
level encoding must be 0 rather than 7. This fixes an issue
seen on IOMMUv2 hardware, where this encoding is enforced.
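
The size encoding involved here: a large PTE stores its size in
the run of set low address bits, and PTE_PAGE_SIZE() (from
amd_iommu_types.h) recovers it from the first zero bit above
bit 11. A userspace check with a hand-built 2MB PTE, using a
portable builtin in place of the kernel's ffz():

```c
#include <stdint.h>

/* ffz(): index of the first zero bit, as in asm/bitops.h. */
static int ffz64(uint64_t x)
{
	return __builtin_ctzll(~x);
}

/* PTE_PAGE_SIZE() from amd_iommu_types.h, decoded in userspace. */
static uint64_t pte_page_size(uint64_t pte)
{
	return 1ULL << (1 + ffz64(pte | 0xfffULL));
}
```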

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 7a00e5d..aa710b0 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1390,13 +1390,14 @@ static int iommu_map_page(struct protection_domain *dom,
 	u64 __pte, *pte;
 	int i, count;
 
+	BUG_ON(!IS_ALIGNED(bus_addr, page_size));
+	BUG_ON(!IS_ALIGNED(phys_addr, page_size));
+
 	if (!(prot & IOMMU_PROT_MASK))
 		return -EINVAL;
 
-	bus_addr  = PAGE_ALIGN(bus_addr);
-	phys_addr = PAGE_ALIGN(phys_addr);
-	count     = PAGE_SIZE_PTE_COUNT(page_size);
-	pte       = alloc_pte(dom, bus_addr, page_size, NULL, GFP_KERNEL);
+	count = PAGE_SIZE_PTE_COUNT(page_size);
+	pte   = alloc_pte(dom, bus_addr, page_size, NULL, GFP_KERNEL);
 
 	if (!pte)
 		return -ENOMEM;
@@ -1405,7 +1406,7 @@ static int iommu_map_page(struct protection_domain *dom,
 		if (IOMMU_PTE_PRESENT(pte[i]))
 			return -EBUSY;
 
-	if (page_size > PAGE_SIZE) {
+	if (count > 1) {
 		__pte = PAGE_SIZE_PTE(phys_addr, page_size);
 		__pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_P | IOMMU_PTE_FC;
 	} else
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/9] AMD IOMMU Updates for v4.1
  2015-04-01 12:58 [PATCH 0/9] AMD IOMMU Updates for v4.1 Joerg Roedel
                   ` (8 preceding siblings ...)
  2015-04-01 12:58 ` [PATCH 9/9] iommu/amd: Correctly encode huge pages in iommu page tables Joerg Roedel
@ 2015-04-01 19:11 ` Suravee Suthikulanit
  9 siblings, 0 replies; 11+ messages in thread
From: Suravee Suthikulanit @ 2015-04-01 19:11 UTC (permalink / raw)
  To: Joerg Roedel, iommu; +Cc: linux-kernel

On 4/1/2015 7:58 AM, Joerg Roedel wrote:
> Hi,
>
> here are a few fixes and enhancements for the AMD IOMMU
> driver for the next merge window. They are tested on
> different versions of AMD IOMMUs. Please review.
>
> Thanks,
>
> 	Joerg
>
> Joerg Roedel (9):
>    iommu/amd: Use BUS_NOTIFY_REMOVED_DEVICE
>    iommu/amd: Ignore BUS_NOTIFY_UNBOUND_DRIVER event
>    iommu/amd: Don't allocate with __GFP_ZERO in alloc_coherent
>    iommu/amd: Add support for contiguous dma allocator
>    iommu/amd: Return the pte page-size in fetch_pte
>    iommu/amd: Optimize iommu_unmap_page for new fetch_pte interface
>    iommu/amd: Optimize alloc_new_range for new fetch_pte interface
>    iommu/amd: Optimize amd_iommu_iova_to_phys for new fetch_pte interface
>    iommu/amd: Correctly encode huge pages in iommu page tables
>
>   drivers/iommu/amd_iommu.c       | 166 +++++++++++++++++++---------------------
>   drivers/iommu/amd_iommu_types.h |   6 ++
>   2 files changed, 83 insertions(+), 89 deletions(-)
>

Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>

Thanks,

Suravee


^ permalink raw reply	[flat|nested] 11+ messages in thread

