* [PATCH 0/4] iommu/amd, vfio: Reduce IOTLB Flush Rate
@ 2017-10-13 14:40 ` Joerg Roedel
  0 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2017-10-13 14:40 UTC (permalink / raw)
  To: iommu, Alex Williamson
  Cc: Suravee Suthikulpanit, kvm, linux-kernel, Joerg Roedel

Hi,

these patches implement the new IOTLB flush interface in the
AMD IOMMU driver. For it to take effect, changes in VFIO are
also necessary, because VFIO unpins the pages after every
successful iommu_unmap() call. Each of those unpins requires
an IOTLB flush first, so no flushes would actually be saved.

So I implemented gather code in VFIO that collects up to 32
unmapped ranges and unpins them together after a single
IOTLB flush. This significantly reduces the number of IOTLB
flushes in my tests.
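
The usage pattern on the VFIO side then is roughly the
following (a simplified sketch only; the real code is in
patches 3 and 4):

	size_t unmapped;

	/* unmap without flushing the IOTLB on every call ... */
	unmapped = iommu_unmap_fast(domain, iova, size);

	/* ... queue the unmapped range for invalidation ... */
	iommu_tlb_range_add(domain, iova, unmapped);

	/* ... and sync once before the pages are unpinned */
	iommu_tlb_sync(domain);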

Please review.

Thanks,

	Joerg

Joerg Roedel (4):
  iommu/amd: Finish TLB flush in amd_iommu_unmap()
  iommu/amd: Implement IOMMU-API TLB flush interface
  vfio/type1: Make use of iommu_unmap_fast()
  vfio/type1: Gather TLB-syncs and pages to unpin

 drivers/iommu/amd_iommu.c       |  12 +++-
 drivers/vfio/vfio_iommu_type1.c | 128 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 128 insertions(+), 12 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] iommu/amd: Finish TLB flush in amd_iommu_unmap()
@ 2017-10-13 14:40   ` Joerg Roedel
  0 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2017-10-13 14:40 UTC (permalink / raw)
  To: iommu, Alex Williamson
  Cc: Suravee Suthikulpanit, kvm, linux-kernel, Joerg Roedel, stable

From: Joerg Roedel <jroedel@suse.de>

The function only sends the flush command to the IOMMU(s),
but does not wait for its completion before returning. Fix
that.

Fixes: 601367d76bd1 ('x86/amd-iommu: Remove iommu_flush_domain function')
Cc: stable@vger.kernel.org # >= 2.6.33
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index cb7c531..3a702c4 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3046,6 +3046,7 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
 	mutex_unlock(&domain->api_lock);
 
 	domain_flush_tlb_pde(domain);
+	domain_flush_complete(domain);
 
 	return unmap_size;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/4] iommu/amd: Implement IOMMU-API TLB flush interface
@ 2017-10-13 14:40   ` Joerg Roedel
  0 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2017-10-13 14:40 UTC (permalink / raw)
  To: iommu, Alex Williamson
  Cc: Suravee Suthikulpanit, kvm, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Make use of the new IOTLB flush-interface in the IOMMU-API.
We don't implement the iotlb_range_add() call-back for now,
as it would put too much pressure on the command buffer.
Instead, we do a full TLB flush in the iotlb_sync()
call-back.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/amd_iommu.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 3a702c4..8804264 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3032,6 +3032,14 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 	return ret;
 }
 
+static void amd_iommu_flush_iotlb_all(struct iommu_domain *dom)
+{
+	struct protection_domain *domain = to_pdomain(dom);
+
+	domain_flush_tlb_pde(domain);
+	domain_flush_complete(domain);
+}
+
 static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
 			   size_t page_size)
 {
@@ -3045,9 +3053,6 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
 	unmap_size = iommu_unmap_page(domain, iova, page_size);
 	mutex_unlock(&domain->api_lock);
 
-	domain_flush_tlb_pde(domain);
-	domain_flush_complete(domain);
-
 	return unmap_size;
 }
 
@@ -3174,6 +3179,8 @@ const struct iommu_ops amd_iommu_ops = {
 	.map = amd_iommu_map,
 	.unmap = amd_iommu_unmap,
 	.map_sg = default_iommu_map_sg,
+	.flush_iotlb_all = amd_iommu_flush_iotlb_all,
+	.iotlb_sync = amd_iommu_flush_iotlb_all,
 	.iova_to_phys = amd_iommu_iova_to_phys,
 	.add_device = amd_iommu_add_device,
 	.remove_device = amd_iommu_remove_device,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/4] vfio/type1: Make use of iommu_unmap_fast()
  2017-10-13 14:40 ` Joerg Roedel
@ 2017-10-13 14:40 ` Joerg Roedel
  -1 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2017-10-13 14:40 UTC (permalink / raw)
  To: iommu, Alex Williamson
  Cc: Suravee Suthikulpanit, kvm, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

Switch from iommu_unmap() to iommu_unmap_fast() and add the
necessary calls to the IOTLB invalidation routines.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/vfio/vfio_iommu_type1.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 92155cc..2b1e81f 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -672,10 +672,13 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 				      struct vfio_domain, next);
 
 	list_for_each_entry_continue(d, &iommu->domain_list, next) {
-		iommu_unmap(d->domain, dma->iova, dma->size);
+		iommu_unmap_fast(d->domain, dma->iova, dma->size);
+		iommu_tlb_range_add(d->domain, dma->iova, dma->size);
 		cond_resched();
 	}
 
+	iommu_tlb_sync(domain->domain);
+
 	while (iova < end) {
 		size_t unmapped, len;
 		phys_addr_t phys, next;
@@ -698,10 +701,13 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 				break;
 		}
 
-		unmapped = iommu_unmap(domain->domain, iova, len);
+		unmapped = iommu_unmap_fast(domain->domain, iova, len);
 		if (WARN_ON(!unmapped))
 			break;
 
+		iommu_tlb_range_add(domain->domain, iova, unmapped);
+		iommu_tlb_sync(domain->domain);
+
 		unlocked += vfio_unpin_pages_remote(dma, iova,
 						    phys >> PAGE_SHIFT,
 						    unmapped >> PAGE_SHIFT,
@@ -885,7 +891,9 @@ static int map_try_harder(struct vfio_domain *domain, dma_addr_t iova,
 	}
 
 	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE)
-		iommu_unmap(domain->domain, iova, PAGE_SIZE);
+		iommu_unmap_fast(domain->domain, iova, PAGE_SIZE);
+
+	iommu_tlb_sync(domain->domain);
 
 	return ret;
 }
@@ -912,7 +920,9 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova,
 
 unwind:
 	list_for_each_entry_continue_reverse(d, &iommu->domain_list, next)
-		iommu_unmap(d->domain, iova, npage << PAGE_SHIFT);
+		iommu_unmap_fast(d->domain, iova, npage << PAGE_SHIFT);
+
+	iommu_tlb_sync(d->domain);
 
 	return ret;
 }
@@ -1136,12 +1146,14 @@ static void vfio_test_domain_fgsp(struct vfio_domain *domain)
 	ret = iommu_map(domain->domain, 0, page_to_phys(pages), PAGE_SIZE * 2,
 			IOMMU_READ | IOMMU_WRITE | domain->prot);
 	if (!ret) {
-		size_t unmapped = iommu_unmap(domain->domain, 0, PAGE_SIZE);
+		size_t unmapped = iommu_unmap_fast(domain->domain, 0, PAGE_SIZE);
 
 		if (unmapped == PAGE_SIZE)
-			iommu_unmap(domain->domain, PAGE_SIZE, PAGE_SIZE);
+			iommu_unmap_fast(domain->domain, PAGE_SIZE, PAGE_SIZE);
 		else
 			domain->fgsp = true;
+
+		iommu_tlb_sync(domain->domain);
 	}
 
 	__free_pages(pages, order);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/4] vfio/type1: Gather TLB-syncs and pages to unpin
@ 2017-10-13 14:40   ` Joerg Roedel
  0 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2017-10-13 14:40 UTC (permalink / raw)
  To: iommu, Alex Williamson
  Cc: Suravee Suthikulpanit, kvm, linux-kernel, Joerg Roedel

From: Joerg Roedel <jroedel@suse.de>

After every unmap VFIO unpins the pages that were mapped by
the IOMMU. This requires an IOTLB flush after every unmap
and puts a high load on the IOMMU hardware and the device
TLBs.

Gather up to 32 ranges to flush and unpin and do the IOTLB
flush once for all these ranges. This significantly reduces
the number of IOTLB flushes in the unmapping path.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/vfio/vfio_iommu_type1.c | 106 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 101 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2b1e81f..86fc1da 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -107,6 +107,92 @@ struct vfio_pfn {
 
 static int put_pfn(unsigned long pfn, int prot);
 
+static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
+				    unsigned long pfn, long npage,
+				    bool do_accounting);
+
+#define GATHER_ENTRIES	32
+
+/*
+ * Gather TLB flushes before unpinning pages
+ */
+struct vfio_gather_entry {
+	dma_addr_t iova;
+	phys_addr_t phys;
+	size_t size;
+};
+
+struct vfio_gather {
+	unsigned fill;
+	struct vfio_gather_entry entries[GATHER_ENTRIES];
+};
+
+/*
+ * The vfio_gather* functions below keep track of flushing the IOMMU TLB
+ * and unpinning the pages. It is safe to call them with gather == NULL,
+ * in which case they fall back to flushing the TLB and unpinning the
+ * pages on every call.
+ */
+static long vfio_gather_flush(struct iommu_domain *domain,
+			      struct vfio_dma *dma,
+			      struct vfio_gather *gather)
+{
+	long unlocked = 0;
+	unsigned i;
+
+	if (!gather)
+		goto out;
+
+	/* First flush unmapped TLB entries */
+	iommu_tlb_sync(domain);
+
+	for (i = 0; i < gather->fill; i++) {
+		dma_addr_t iova = gather->entries[i].iova;
+		phys_addr_t phys = gather->entries[i].phys;
+		size_t size = gather->entries[i].size;
+
+		unlocked += vfio_unpin_pages_remote(dma, iova,
+						    phys >> PAGE_SHIFT,
+						    size >> PAGE_SHIFT,
+						    false);
+	}
+
+	gather->fill = 0;
+
+out:
+	return unlocked;
+}
+
+static long vfio_gather_add(struct iommu_domain *domain,
+			    struct vfio_dma *dma,
+			    struct vfio_gather *gather,
+			    dma_addr_t iova, phys_addr_t phys, size_t size)
+{
+	long unlocked = 0;
+
+	if (gather) {
+		unsigned index;
+
+		if (gather->fill == GATHER_ENTRIES)
+			unlocked = vfio_gather_flush(domain, dma, gather);
+
+		index = gather->fill++;
+
+		gather->entries[index].iova = iova;
+		gather->entries[index].phys = phys;
+		gather->entries[index].size = size;
+	} else {
+		iommu_tlb_sync(domain);
+
+		unlocked = vfio_unpin_pages_remote(dma, iova,
+						   phys >> PAGE_SHIFT,
+						   size >> PAGE_SHIFT,
+						   false);
+	}
+
+	return unlocked;
+}
+
 /*
  * This code handles mapping and unmapping of user data buffers
  * into DMA'ble space using the IOMMU
@@ -653,6 +739,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 {
 	dma_addr_t iova = dma->iova, end = dma->iova + dma->size;
 	struct vfio_domain *domain, *d;
+	struct vfio_gather *gather;
 	long unlocked = 0;
 
 	if (!dma->size)
@@ -662,6 +749,12 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 		return 0;
 
 	/*
+	 * No need to check return value - It is safe to continue with a
+	 * NULL pointer.
+	 */
+	gather = kzalloc(sizeof(*gather), GFP_KERNEL);
+
+	/*
 	 * We use the IOMMU to track the physical addresses, otherwise we'd
 	 * need a much more complicated tracking system.  Unfortunately that
 	 * means we need to use one of the iommu domains to figure out the
@@ -706,17 +799,20 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 			break;
 
 		iommu_tlb_range_add(domain->domain, iova, unmapped);
-		iommu_tlb_sync(domain->domain);
 
-		unlocked += vfio_unpin_pages_remote(dma, iova,
-						    phys >> PAGE_SHIFT,
-						    unmapped >> PAGE_SHIFT,
-						    false);
+		unlocked += vfio_gather_add(domain->domain, dma, gather,
+					    iova, phys, unmapped);
+
 		iova += unmapped;
 
 		cond_resched();
 	}
 
+	unlocked += vfio_gather_flush(domain->domain, dma, gather);
+
+	kfree(gather);
+	gather = NULL;
+
 	dma->iommu_mapped = false;
 	if (do_accounting) {
 		vfio_lock_acct(dma->task, -unlocked, NULL);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] vfio/type1: Gather TLB-syncs and pages to unpin
  2017-10-13 14:40   ` Joerg Roedel
@ 2017-10-13 17:02   ` Alex Williamson
  -1 siblings, 0 replies; 10+ messages in thread
From: Alex Williamson @ 2017-10-13 17:02 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: iommu, Suravee Suthikulpanit, kvm, linux-kernel, Joerg Roedel

On Fri, 13 Oct 2017 16:40:13 +0200
Joerg Roedel <joro@8bytes.org> wrote:

> From: Joerg Roedel <jroedel@suse.de>
> 
> After every unmap VFIO unpins the pages that where mapped by
> the IOMMU. This requires an IOTLB flush after every unmap
> and puts a high load on the IOMMU hardware and the device
> TLBs.
> 
> Gather up to 32 ranges to flush and unpin and do the IOTLB
> flush once for all these ranges. This significantly reduces
> the number of IOTLB flushes in the unmapping path.
> 
> Signed-off-by: Joerg Roedel <jroedel@suse.de>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 106 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 101 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 2b1e81f..86fc1da 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -107,6 +107,92 @@ struct vfio_pfn {
>  
>  static int put_pfn(unsigned long pfn, int prot);
>  
> +static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
> +				    unsigned long pfn, long npage,
> +				    bool do_accounting);
> +
> +#define GATHER_ENTRIES	32

What heuristics make us arrive at this number and how would we evaluate
changing it in the future?  Ideally we'd only want to do a single
flush, but we can't unpin pages until after the iommu sync and we need
the iommu to track iova-phys mappings, so it's a matter of how much
we want to allocate to buffer those translations.  I wonder if a cache
pool would help here, but this is probably fine for a first pass with
some comment about this trade-off and why 32 was chosen.
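
A cache pool would presumably mean a kmem_cache shared by all the
unmap paths, roughly like this (an illustrative sketch only; the
cache name and init site are assumptions, not part of this series):

	static struct kmem_cache *vfio_gather_cache;

	/* once, e.g. at module init */
	vfio_gather_cache = kmem_cache_create("vfio_gather",
					      sizeof(struct vfio_gather),
					      0, 0, NULL);

	/* in vfio_unmap_unpin(), instead of kzalloc()/kfree() */
	gather = kmem_cache_zalloc(vfio_gather_cache, GFP_KERNEL);
	/* ... gather/flush/unpin loop as in the patch ... */
	kmem_cache_free(vfio_gather_cache, gather);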

> +
> +/*
> + * Gather TLB flushes before unpinning pages
> + */
> +struct vfio_gather_entry {
> +	dma_addr_t iova;
> +	phys_addr_t phys;
> +	size_t size;
> +};
> +
> +struct vfio_gather {
> +	unsigned fill;
> +	struct vfio_gather_entry entries[GATHER_ENTRIES];
> +};
> +
> +/*
> + * The vfio_gather* functions below keep track of flushing the IOMMU TLB
> + * and unpinning the pages. It is safe to call them with gather == NULL,
> + * in which case they fall back to flushing the TLB and unpinning the
> + * pages on every call.
> + */
> +static long vfio_gather_flush(struct iommu_domain *domain,
> +			      struct vfio_dma *dma,
> +			      struct vfio_gather *gather)
> +{
> +	long unlocked = 0;
> +	unsigned i;
> +
> +	if (!gather)

|| !gather->fill

We might have gotten lucky that our last add triggered a flush.

> +		goto out;
> +
> +	/* First flush unmapped TLB entries */
> +	iommu_tlb_sync(domain);
> +
> +	for (i = 0; i < gather->fill; i++) {
> +		dma_addr_t iova = gather->entries[i].iova;
> +		phys_addr_t phys = gather->entries[i].phys;
> +		size_t size = gather->entries[i].size;
> +
> +		unlocked += vfio_unpin_pages_remote(dma, iova,
> +						    phys >> PAGE_SHIFT,
> +						    size >> PAGE_SHIFT,
> +						    false);
> +	}
> +
> +	gather->fill = 0;


A struct vfio_gather_entry* would clean this up and eliminate some
variables, including i.
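
Roughly (an untested sketch of that suggestion):

	struct vfio_gather_entry *entry;

	for (entry = gather->entries;
	     entry < gather->entries + gather->fill; entry++)
		unlocked += vfio_unpin_pages_remote(dma, entry->iova,
						    entry->phys >> PAGE_SHIFT,
						    entry->size >> PAGE_SHIFT,
						    false);

	gather->fill = 0;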

> +
> +out:
> +	return unlocked;
> +}
> +
> +static long vfio_gather_add(struct iommu_domain *domain,
> +			    struct vfio_dma *dma,
> +			    struct vfio_gather *gather,
> +			    dma_addr_t iova, phys_addr_t phys, size_t size)
> +{
> +	long unlocked = 0;
> +
> +	if (gather) {
> +		unsigned index;
> +
> +		if (gather->fill == GATHER_ENTRIES)
> +			unlocked = vfio_gather_flush(domain, dma, gather);

                        unlocked += vfio_unpin_pages_remote(...);
                } else {

IOW, vfio_gather_flush() has already done the iommu_tlb_sync() for the
mapping that called us, so there's no point in adding these to our list;
unpin them immediately.
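
Something along these lines, perhaps (an untested sketch of the
suggestion, reusing the helpers from the patch):

	if (gather) {
		if (gather->fill == GATHER_ENTRIES) {
			/* the sync in the flush already covers this range */
			unlocked  = vfio_gather_flush(domain, dma, gather);
			unlocked += vfio_unpin_pages_remote(dma, iova,
							    phys >> PAGE_SHIFT,
							    size >> PAGE_SHIFT,
							    false);
		} else {
			unsigned index = gather->fill++;

			gather->entries[index].iova = iova;
			gather->entries[index].phys = phys;
			gather->entries[index].size = size;
		}
	}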

> +
> +		index = gather->fill++;
> +
> +		gather->entries[index].iova = iova;
> +		gather->entries[index].phys = phys;
> +		gather->entries[index].size = size;

Alternatively, do the test and flush here instead.

Thanks,
Alex

> +	} else {
> +		iommu_tlb_sync(domain);
> +
> +		unlocked = vfio_unpin_pages_remote(dma, iova,
> +						   phys >> PAGE_SHIFT,
> +						   size >> PAGE_SHIFT,
> +						   false);
> +	}
> +
> +	return unlocked;
> +}
> +
>  /*
>   * This code handles mapping and unmapping of user data buffers
>   * into DMA'ble space using the IOMMU
> @@ -653,6 +739,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>  {
>  	dma_addr_t iova = dma->iova, end = dma->iova + dma->size;
>  	struct vfio_domain *domain, *d;
> +	struct vfio_gather *gather;
>  	long unlocked = 0;
>  
>  	if (!dma->size)
> @@ -662,6 +749,12 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>  		return 0;
>  
>  	/*
> +	 * No need to check return value - It is safe to continue with a
> +	 * NULL pointer.
> +	 */
> +	gather = kzalloc(sizeof(*gather), GFP_KERNEL);
> +
> +	/*
>  	 * We use the IOMMU to track the physical addresses, otherwise we'd
>  	 * need a much more complicated tracking system.  Unfortunately that
>  	 * means we need to use one of the iommu domains to figure out the
> @@ -706,17 +799,20 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>  			break;
>  
>  		iommu_tlb_range_add(domain->domain, iova, unmapped);
> -		iommu_tlb_sync(domain->domain);
>  
> -		unlocked += vfio_unpin_pages_remote(dma, iova,
> -						    phys >> PAGE_SHIFT,
> -						    unmapped >> PAGE_SHIFT,
> -						    false);
> +		unlocked += vfio_gather_add(domain->domain, dma, gather,
> +					    iova, phys, unmapped);
> +
>  		iova += unmapped;
>  
>  		cond_resched();
>  	}
>  
> +	unlocked += vfio_gather_flush(domain->domain, dma, gather);
> +
> +	kfree(gather);
> +	gather = NULL;
> +
>  	dma->iommu_mapped = false;
>  	if (do_accounting) {
>  		vfio_lock_acct(dma->task, -unlocked, NULL);

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-10-13 17:02 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-13 14:40 [PATCH 0/4] iommu/amd, vfio: Reduce IOTLB Flush Rate Joerg Roedel
2017-10-13 14:40 ` [PATCH 1/4] iommu/amd: Finish TLB flush in amd_iommu_unmap() Joerg Roedel
2017-10-13 14:40 ` [PATCH 2/4] iommu/amd: Implement IOMMU-API TLB flush interface Joerg Roedel
2017-10-13 14:40 ` [PATCH 3/4] vfio/type1: Make use of iommu_unmap_fast() Joerg Roedel
2017-10-13 14:40 ` [PATCH 4/4] vfio/type1: Gather TLB-syncs and pages to unpin Joerg Roedel
2017-10-13 17:02   ` Alex Williamson
