* [PATCH 0/2] Reduce IOTLB flush when pass-through dGPU devices
@ 2017-11-17 21:11 ` Suravee Suthikulpanit
From: Suravee Suthikulpanit @ 2017-11-17 21:11 UTC (permalink / raw)
  To: linux-kernel, iommu; +Cc: joro, jroedel, alex.williamson, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Currently, when passing through a dGPU to a guest VM, there are thousands
of IOTLB flush commands sent from the IOMMU to the end-point device. This
causes a performance issue when launching new VMs, and can cause IOTLB
invalidation time-outs on certain dGPUs.

This can be avoided by adopting the new fast IOTLB flush APIs.
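
The new interface separates the unmap operation from the IOTLB
invalidation, so the flush can be batched and issued once instead of
once per page range. A rough sketch of the calling pattern (illustration
only, not the exact VFIO or driver code; the helper name is made up):

#include <linux/iommu.h>

/*
 * Unmap a range, queue its invalidation, and issue a single IOTLB
 * sync at the end instead of flushing on every unmap call.
 */
static void unmap_with_deferred_flush(struct iommu_domain *dom,
				      unsigned long iova, size_t size)
{
	size_t unmapped = iommu_unmap_fast(dom, iova, size);

	if (unmapped)
		iommu_tlb_range_add(dom, iova, unmapped); /* queue, no flush yet */

	iommu_tlb_sync(dom); /* one flush for everything queued so far */
}

This keeps the unmap loop cheap and lets the driver decide how to
coalesce the actual invalidation.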

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <jroedel@suse.de>

Suravee Suthikulpanit (2):
  vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
  iommu/amd: Add support for fast IOTLB flushing

 drivers/iommu/amd_iommu.c       | 77 ++++++++++++++++++++++++++++++++++++++++-
 drivers/iommu/amd_iommu_init.c  |  2 --
 drivers/iommu/amd_iommu_types.h |  2 ++
 drivers/vfio/vfio_iommu_type1.c | 12 +++++--
 4 files changed, 87 insertions(+), 6 deletions(-)

-- 
1.8.3.1

* [PATCH 1/2] vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
@ 2017-11-17 21:11   ` Suravee Suthikulpanit
From: Suravee Suthikulpanit @ 2017-11-17 21:11 UTC (permalink / raw)
  To: linux-kernel, iommu; +Cc: joro, jroedel, alex.williamson, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

VFIO IOMMU type1 currently unmaps IOVA pages synchronously, which requires
IOTLB flushing for every unmapping. This results in large IOTLB flushing
overhead when handling pass-through devices with a large number of mapped
IOVAs (e.g. GPUs).

This can be avoided by using the new IOTLB flushing interface.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/vfio/vfio_iommu_type1.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 92155cc..28a7ab6 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -698,10 +698,12 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 				break;
 		}
 
-		unmapped = iommu_unmap(domain->domain, iova, len);
+		unmapped = iommu_unmap_fast(domain->domain, iova, len);
 		if (WARN_ON(!unmapped))
 			break;
 
+		iommu_tlb_range_add(domain->domain, iova, len);
+
 		unlocked += vfio_unpin_pages_remote(dma, iova,
 						    phys >> PAGE_SHIFT,
 						    unmapped >> PAGE_SHIFT,
@@ -710,6 +712,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 
 		cond_resched();
 	}
+	iommu_tlb_sync(domain->domain);
 
 	dma->iommu_mapped = false;
 	if (do_accounting) {
@@ -884,8 +887,11 @@ static int map_try_harder(struct vfio_domain *domain, dma_addr_t iova,
 			break;
 	}
 
-	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE)
-		iommu_unmap(domain->domain, iova, PAGE_SIZE);
+	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE) {
+		iommu_unmap_fast(domain->domain, iova, PAGE_SIZE);
+		iommu_tlb_range_add(domain->domain, iova, PAGE_SIZE);
+	}
+	iommu_tlb_sync(domain->domain);
 
 	return ret;
 }
-- 
1.8.3.1

* [PATCH 2/2] iommu/amd: Add support for fast IOTLB flushing
@ 2017-11-17 21:11   ` Suravee Suthikulpanit
From: Suravee Suthikulpanit @ 2017-11-17 21:11 UTC (permalink / raw)
  To: linux-kernel, iommu; +Cc: joro, jroedel, alex.williamson, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Implement the newly added IOTLB flushing interface by introducing a
per-protection-domain IOTLB flush list, which maintains a list of
IOVAs to be invalidated (by the INVALIDATE_IOTLB_PAGES command) during
IOTLB sync.
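
For reference, the generic IOMMU layer dispatches these operations
through iommu_ops; the inline wrappers in include/linux/iommu.h look
roughly like this (paraphrased, not copied from this patch):

static inline void iommu_tlb_range_add(struct iommu_domain *domain,
				       unsigned long iova, size_t size)
{
	if (domain->ops->iotlb_range_add)
		domain->ops->iotlb_range_add(domain, iova, size);
}

static inline void iommu_tlb_sync(struct iommu_domain *domain)
{
	if (domain->ops->iotlb_sync)
		domain->ops->iotlb_sync(domain);
}

So the amd_iommu_iotlb_range_add() and amd_iommu_iotlb_sync() callbacks
added below are what the iommu_tlb_range_add()/iommu_tlb_sync() calls in
patch 1 end up invoking for AMD protection domains.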

Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd_iommu.c       | 77 ++++++++++++++++++++++++++++++++++++++++-
 drivers/iommu/amd_iommu_init.c  |  2 --
 drivers/iommu/amd_iommu_types.h |  2 ++
 3 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 8e8874d..bf92809 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -130,6 +130,12 @@ struct dma_ops_domain {
 static struct iova_domain reserved_iova_ranges;
 static struct lock_class_key reserved_rbtree_key;
 
+struct iotlb_flush_entry {
+	struct list_head list;
+	unsigned long iova;
+	size_t size;
+};
+
 /****************************************************************************
  *
  * Helper functions
@@ -2838,11 +2844,13 @@ static void protection_domain_free(struct protection_domain *domain)
 static int protection_domain_init(struct protection_domain *domain)
 {
 	spin_lock_init(&domain->lock);
+	spin_lock_init(&domain->iotlb_flush_list_lock);
 	mutex_init(&domain->api_lock);
 	domain->id = domain_id_alloc();
 	if (!domain->id)
 		return -ENOMEM;
 	INIT_LIST_HEAD(&domain->dev_list);
+	INIT_LIST_HEAD(&domain->iotlb_flush_list);
 
 	return 0;
 }
@@ -3047,7 +3055,6 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
 	unmap_size = iommu_unmap_page(domain, iova, page_size);
 	mutex_unlock(&domain->api_lock);
 
-	domain_flush_tlb_pde(domain);
 	domain_flush_complete(domain);
 
 	return unmap_size;
@@ -3167,6 +3174,71 @@ static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
 	return dev_data->defer_attach;
 }
 
+static void amd_iommu_flush_iotlb_all(struct iommu_domain *domain)
+{
+	struct protection_domain *dom = to_pdomain(domain);
+
+	domain_flush_tlb_pde(dom);
+}
+
+static void amd_iommu_iotlb_range_add(struct iommu_domain *domain,
+				      unsigned long iova, size_t size)
+{
+	struct protection_domain *pdom = to_pdomain(domain);
+	struct iotlb_flush_entry *entry, *p;
+	unsigned long flags;
+	bool found = false;
+
+	spin_lock_irqsave(&pdom->iotlb_flush_list_lock, flags);
+	list_for_each_entry(p, &pdom->iotlb_flush_list, list) {
+		if (iova != p->iova)
+			continue;
+
+		if (size > p->size) {
+			p->size = size;
+			pr_debug("%s: update range: iova=%#lx, size = %#lx\n",
+				 __func__, p->iova, p->size);
+		}
+		found = true;
+		break;
+	}
+
+	if (!found) {
+		entry = kzalloc(sizeof(struct iotlb_flush_entry),
+				GFP_ATOMIC);
+		if (!entry)
+			return;
+
+		pr_debug("%s: new range: iova=%lx, size=%#lx\n",
+			 __func__, iova, size);
+
+		entry->iova = iova;
+		entry->size = size;
+		list_add(&entry->list, &pdom->iotlb_flush_list);
+	}
+	spin_unlock_irqrestore(&pdom->iotlb_flush_list_lock, flags);
+}
+
+static void amd_iommu_iotlb_sync(struct iommu_domain *domain)
+{
+	struct protection_domain *pdom = to_pdomain(domain);
+	struct iotlb_flush_entry *entry, *next;
+	unsigned long flags;
+
+	/* Note:
+	 * Currently, IOMMU driver just flushes the whole IO/TLB for
+	 * a given domain. So, just remove entries from the list here.
+	 */
+	spin_lock_irqsave(&pdom->iotlb_flush_list_lock, flags);
+	list_for_each_entry_safe(entry, next, &pdom->iotlb_flush_list, list) {
+		list_del(&entry->list);
+		kfree(entry);
+	}
+	spin_unlock_irqrestore(&pdom->iotlb_flush_list_lock, flags);
+
+	domain_flush_tlb_pde(pdom);
+}
+
 const struct iommu_ops amd_iommu_ops = {
 	.capable = amd_iommu_capable,
 	.domain_alloc = amd_iommu_domain_alloc,
@@ -3185,6 +3257,9 @@ static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
 	.apply_resv_region = amd_iommu_apply_resv_region,
 	.is_attach_deferred = amd_iommu_is_attach_deferred,
 	.pgsize_bitmap	= AMD_IOMMU_PGSIZES,
+	.flush_iotlb_all = amd_iommu_flush_iotlb_all,
+	.iotlb_range_add = amd_iommu_iotlb_range_add,
+	.iotlb_sync = amd_iommu_iotlb_sync,
 };
 
 /*****************************************************************************
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 6fe2d03..1659377 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2489,8 +2489,6 @@ static int __init early_amd_iommu_init(void)
 	 */
 	__set_bit(0, amd_iommu_pd_alloc_bitmap);
 
-	spin_lock_init(&amd_iommu_pd_lock);
-
 	/*
 	 * now the data structures are allocated and basically initialized
 	 * start the real acpi table scan
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index f6b24c7..30e1c68 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -462,9 +462,11 @@ struct amd_iommu_fault {
 struct protection_domain {
 	struct list_head list;  /* for list of all protection domains */
 	struct list_head dev_list; /* List of all devices in this domain */
+	struct list_head iotlb_flush_list; /* store iovas for iotlb sync */
 	struct iommu_domain domain; /* generic domain handle used by
 				       iommu core code */
 	spinlock_t lock;	/* mostly used to lock the page table*/
+	spinlock_t iotlb_flush_list_lock; /* protect iova flush list */
 	struct mutex api_lock;	/* protect page tables in the iommu-api path */
 	u16 id;			/* the domain id written to the device table */
 	int mode;		/* paging mode (0-6 levels) */
-- 
1.8.3.1

* Re: [PATCH 1/2] vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
@ 2017-11-17 21:51     ` Alex Williamson
From: Alex Williamson @ 2017-11-17 21:51 UTC (permalink / raw)
  To: Suravee Suthikulpanit; +Cc: linux-kernel, iommu, joro, jroedel

On Fri, 17 Nov 2017 15:11:19 -0600
Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> wrote:

> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> VFIO IOMMU type1 currently unmaps IOVA pages synchronously, which requires
> IOTLB flushing for every unmapping. This results in large IOTLB flushing
> overhead when handling pass-through devices with a large number of mapped
> IOVAs (e.g. GPUs).

Of course the type of device is really irrelevant, QEMU maps the entire
VM address space for any assigned device.

> This can be avoided by using the new IOTLB flushing interface.
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Joerg Roedel <jroedel@suse.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 92155cc..28a7ab6 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -698,10 +698,12 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>  				break;
>  		}
>  
> -		unmapped = iommu_unmap(domain->domain, iova, len);
> +		unmapped = iommu_unmap_fast(domain->domain, iova, len);
>  		if (WARN_ON(!unmapped))
>  			break;
>  
> +		iommu_tlb_range_add(domain->domain, iova, len);
> +

We should only add @unmapped, not @len, right?

>  		unlocked += vfio_unpin_pages_remote(dma, iova,
>  						    phys >> PAGE_SHIFT,
>  						    unmapped >> PAGE_SHIFT,
> @@ -710,6 +712,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>  
>  		cond_resched();
>  	}
> +	iommu_tlb_sync(domain->domain);
>  
>  	dma->iommu_mapped = false;
>  	if (do_accounting) {
> @@ -884,8 +887,11 @@ static int map_try_harder(struct vfio_domain *domain, dma_addr_t iova,
>  			break;
>  	}
>  
> -	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE)
> -		iommu_unmap(domain->domain, iova, PAGE_SIZE);
> +	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE) {
> +		iommu_unmap_fast(domain->domain, iova, PAGE_SIZE);
> +		iommu_tlb_range_add(domain->domain, iova, PAGE_SIZE);
> +	}
> +	iommu_tlb_sync(domain->domain);
>  
>  	return ret;
>  }

* Re: [PATCH 2/2] iommu/amd: Add support for fast IOTLB flushing
@ 2017-11-17 22:25     ` Tom Lendacky
From: Tom Lendacky @ 2017-11-17 22:25 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu; +Cc: joro, jroedel, alex.williamson

On 11/17/2017 3:11 PM, Suravee Suthikulpanit wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> Implement the newly added IOTLB flushing interface by introducing a
> per-protection-domain IOTLB flush list, which maintains a list of
> IOVAs to be invalidated (by the INVALIDATE_IOTLB_PAGES command) during
> IOTLB sync.
> 
> Cc: Joerg Roedel <jroedel@suse.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   drivers/iommu/amd_iommu.c       | 77 ++++++++++++++++++++++++++++++++++++++++-
>   drivers/iommu/amd_iommu_init.c  |  2 --
>   drivers/iommu/amd_iommu_types.h |  2 ++
>   3 files changed, 78 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 8e8874d..bf92809 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -130,6 +130,12 @@ struct dma_ops_domain {
>   static struct iova_domain reserved_iova_ranges;
>   static struct lock_class_key reserved_rbtree_key;
>   
> +struct iotlb_flush_entry {
> +	struct list_head list;
> +	unsigned long iova;
> +	size_t size;
> +};
> +
>   /****************************************************************************
>    *
>    * Helper functions
> @@ -2838,11 +2844,13 @@ static void protection_domain_free(struct protection_domain *domain)
>   static int protection_domain_init(struct protection_domain *domain)
>   {
>   	spin_lock_init(&domain->lock);
> +	spin_lock_init(&domain->iotlb_flush_list_lock);
>   	mutex_init(&domain->api_lock);
>   	domain->id = domain_id_alloc();
>   	if (!domain->id)
>   		return -ENOMEM;
>   	INIT_LIST_HEAD(&domain->dev_list);
> +	INIT_LIST_HEAD(&domain->iotlb_flush_list);
>   
>   	return 0;
>   }
> @@ -3047,7 +3055,6 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
>   	unmap_size = iommu_unmap_page(domain, iova, page_size);
>   	mutex_unlock(&domain->api_lock);
>   
> -	domain_flush_tlb_pde(domain);
>   	domain_flush_complete(domain);
>   
>   	return unmap_size;
> @@ -3167,6 +3174,71 @@ static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
>   	return dev_data->defer_attach;
>   }
>   
> +static void amd_iommu_flush_iotlb_all(struct iommu_domain *domain)
> +{
> +	struct protection_domain *dom = to_pdomain(domain);
> +
> +	domain_flush_tlb_pde(dom);
> +}
> +
> +static void amd_iommu_iotlb_range_add(struct iommu_domain *domain,
> +				      unsigned long iova, size_t size)
> +{
> +	struct protection_domain *pdom = to_pdomain(domain);
> +	struct iotlb_flush_entry *entry, *p;
> +	unsigned long flags;
> +	bool found = false;
> +
> +	spin_lock_irqsave(&pdom->iotlb_flush_list_lock, flags);
> +	list_for_each_entry(p, &pdom->iotlb_flush_list, list) {
> +		if (iova != p->iova)
> +			continue;
> +
> +		if (size > p->size) {
> +			p->size = size;
> +			pr_debug("%s: update range: iova=%#lx, size = %#lx\n",
> +				 __func__, p->iova, p->size);
> +		}
> +		found = true;
> +		break;
> +	}
> +
> +	if (!found) {
> +		entry = kzalloc(sizeof(struct iotlb_flush_entry),
> +				GFP_ATOMIC);
> +		if (!entry)
> +			return;

You need to release the spinlock before returning here.
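
One way to restructure the error path so every exit drops the lock
(untested sketch; "out_unlock" is a made-up label, not from the patch):

	if (!found) {
		entry = kzalloc(sizeof(*entry), GFP_ATOMIC);
		if (!entry)
			goto out_unlock;	/* still release the lock */

		entry->iova = iova;
		entry->size = size;
		list_add(&entry->list, &pdom->iotlb_flush_list);
	}
out_unlock:
	spin_unlock_irqrestore(&pdom->iotlb_flush_list_lock, flags);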

Thanks,
Tom

> +
> +		pr_debug("%s: new range: iova=%lx, size=%#lx\n",
> +			 __func__, iova, size);
> +
> +		entry->iova = iova;
> +		entry->size = size;
> +		list_add(&entry->list, &pdom->iotlb_flush_list);
> +	}
> +	spin_unlock_irqrestore(&pdom->iotlb_flush_list_lock, flags);
> +}
> +
> +static void amd_iommu_iotlb_sync(struct iommu_domain *domain)
> +{
> +	struct protection_domain *pdom = to_pdomain(domain);
> +	struct iotlb_flush_entry *entry, *next;
> +	unsigned long flags;
> +
> +	/* Note:
> +	 * Currently, IOMMU driver just flushes the whole IO/TLB for
> +	 * a given domain. So, just remove entries from the list here.
> +	 */
> +	spin_lock_irqsave(&pdom->iotlb_flush_list_lock, flags);
> +	list_for_each_entry_safe(entry, next, &pdom->iotlb_flush_list, list) {
> +		list_del(&entry->list);
> +		kfree(entry);
> +	}
> +	spin_unlock_irqrestore(&pdom->iotlb_flush_list_lock, flags);
> +
> +	domain_flush_tlb_pde(pdom);
> +}
> +
>   const struct iommu_ops amd_iommu_ops = {
>   	.capable = amd_iommu_capable,
>   	.domain_alloc = amd_iommu_domain_alloc,
> @@ -3185,6 +3257,9 @@ static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain,
>   	.apply_resv_region = amd_iommu_apply_resv_region,
>   	.is_attach_deferred = amd_iommu_is_attach_deferred,
>   	.pgsize_bitmap	= AMD_IOMMU_PGSIZES,
> +	.flush_iotlb_all = amd_iommu_flush_iotlb_all,
> +	.iotlb_range_add = amd_iommu_iotlb_range_add,
> +	.iotlb_sync = amd_iommu_iotlb_sync,
>   };
>   
>   /*****************************************************************************
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 6fe2d03..1659377 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -2489,8 +2489,6 @@ static int __init early_amd_iommu_init(void)
>   	 */
>   	__set_bit(0, amd_iommu_pd_alloc_bitmap);
>   
> -	spin_lock_init(&amd_iommu_pd_lock);
> -
>   	/*
>   	 * now the data structures are allocated and basically initialized
>   	 * start the real acpi table scan
> diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
> index f6b24c7..30e1c68 100644
> --- a/drivers/iommu/amd_iommu_types.h
> +++ b/drivers/iommu/amd_iommu_types.h
> @@ -462,9 +462,11 @@ struct amd_iommu_fault {
>   struct protection_domain {
>   	struct list_head list;  /* for list of all protection domains */
>   	struct list_head dev_list; /* List of all devices in this domain */
> +	struct list_head iotlb_flush_list; /* store iovas for iotlb sync */
>   	struct iommu_domain domain; /* generic domain handle used by
>   				       iommu core code */
>   	spinlock_t lock;	/* mostly used to lock the page table*/
> +	spinlock_t iotlb_flush_list_lock; /* protect iova flush list */
>   	struct mutex api_lock;	/* protect page tables in the iommu-api path */
>   	u16 id;			/* the domain id written to the device table */
>   	int mode;		/* paging mode (0-6 levels) */
> 

* Re: [PATCH 1/2] vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
@ 2017-11-18  4:20       ` Alex Williamson
From: Alex Williamson @ 2017-11-18  4:20 UTC (permalink / raw)
  To: Suravee Suthikulpanit; +Cc: iommu, jroedel, linux-kernel

On Fri, 17 Nov 2017 14:51:52 -0700
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Fri, 17 Nov 2017 15:11:19 -0600
> Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> wrote:
> 
> > From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > 
> > VFIO IOMMU type1 currently unmaps IOVA pages synchronously, which requires
> > IOTLB flushing for every unmapping. This results in large IOTLB flushing
> > overhead when handling pass-through devices with a large number of mapped
> > IOVAs (e.g. GPUs).  
> 
> Of course the type of device is really irrelevant, QEMU maps the entire
> VM address space for any assigned device.
> 
> > This can be avoided by using the new IOTLB flushing interface.
> > 
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Joerg Roedel <jroedel@suse.de>
> > Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > ---
> >  drivers/vfio/vfio_iommu_type1.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> > index 92155cc..28a7ab6 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -698,10 +698,12 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
> >  				break;
> >  		}
> >  
> > -		unmapped = iommu_unmap(domain->domain, iova, len);
> > +		unmapped = iommu_unmap_fast(domain->domain, iova, len);
> >  		if (WARN_ON(!unmapped))
> >  			break;
> >  
> > +		iommu_tlb_range_add(domain->domain, iova, len);
> > +  
> 
> We should only add @unmapped, not @len, right?

Actually, the problems are deeper than that, if we can't guarantee that
the above iommu_unmap_fast has removed the iommu mapping, then we can't
do the unpin below as that would potentially allow the device access to
unknown memory.  Thus, to support this, the unpinning would need to be
pushed until after the sync and we therefore need some mechanism of
remembering the phys addresses that we've unmapped.  Thanks,

Alex
 
> >  		unlocked += vfio_unpin_pages_remote(dma, iova,
> >  						    phys >> PAGE_SHIFT,
> >  						    unmapped >> PAGE_SHIFT,
> > @@ -710,6 +712,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
> >  
> >  		cond_resched();
> >  	}
> > +	iommu_tlb_sync(domain->domain);
> >  
> >  	dma->iommu_mapped = false;
> >  	if (do_accounting) {
> > @@ -884,8 +887,11 @@ static int map_try_harder(struct vfio_domain *domain, dma_addr_t iova,
> >  			break;
> >  	}
> >  
> > -	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE)
> > -		iommu_unmap(domain->domain, iova, PAGE_SIZE);
> > +	for (; i < npage && i > 0; i--, iova -= PAGE_SIZE) {
> > +		iommu_unmap_fast(domain->domain, iova, PAGE_SIZE);
> > +		iommu_tlb_range_add(domain->domain, iova, PAGE_SIZE);
> > +	}
> > +	iommu_tlb_sync(domain->domain);
> >  
> >  	return ret;
> >  }  

* Re: [PATCH 1/2] vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
@ 2017-11-27  8:12     ` Suravee Suthikulpanit
From: Suravee Suthikulpanit @ 2017-11-27  8:12 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-kernel, iommu, joro, jroedel

Hi Alex,

On 11/18/17 4:51 AM, Alex Williamson wrote:
> On Fri, 17 Nov 2017 15:11:19 -0600
> Suravee Suthikulpanit<Suravee.Suthikulpanit@amd.com>  wrote:
> 
>> From: Suravee Suthikulpanit<suravee.suthikulpanit@amd.com>
>>
>> VFIO IOMMU type1 currently unmaps IOVA pages synchronously, which requires
>> IOTLB flushing for every unmapping. This results in large IOTLB flushing
>> overhead when handling pass-through devices with a large number of mapped
>> IOVAs (e.g. GPUs).
> Of course the type of device is really irrelevant, QEMU maps the entire
> VM address space for any assigned device.

Right, in this case certain high-performance dGPUs map a large address space.

> 
>> This can be avoided by using the new IOTLB flushing interface.
>>
>> Cc: Alex Williamson<alex.williamson@redhat.com>
>> Cc: Joerg Roedel<jroedel@suse.de>
>> Signed-off-by: Suravee Suthikulpanit<suravee.suthikulpanit@amd.com>
>> ---
>>   drivers/vfio/vfio_iommu_type1.c | 12 +++++++++---
>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 92155cc..28a7ab6 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -698,10 +698,12 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>>   				break;
>>   		}
>>   
>> -		unmapped = iommu_unmap(domain->domain, iova, len);
>> +		unmapped = iommu_unmap_fast(domain->domain, iova, len);
>>   		if (WARN_ON(!unmapped))
>>   			break;
>>   
>> +		iommu_tlb_range_add(domain->domain, iova, len);
>> +
> We should only add @unmapped, not @len, right?

Hm right, I will change this in V2.

Thanks,
Suravee

* Re: [PATCH 1/2] vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
@ 2017-11-27  8:14         ` Suravee Suthikulpanit
From: Suravee Suthikulpanit @ 2017-11-27  8:14 UTC (permalink / raw)
  To: Alex Williamson; +Cc: iommu, jroedel, linux-kernel

Hi Alex,

On 11/18/17 11:20 AM, Alex Williamson wrote:
> On Fri, 17 Nov 2017 14:51:52 -0700
> Alex Williamson<alex.williamson@redhat.com>  wrote:
> 
>> On Fri, 17 Nov 2017 15:11:19 -0600
>> Suravee Suthikulpanit<Suravee.Suthikulpanit@amd.com>  wrote:
>>
>>> From: Suravee Suthikulpanit<suravee.suthikulpanit@amd.com>
>>>
>>> VFIO IOMMU type1 currently unmaps IOVA pages synchronously, which requires
>>> IOTLB flushing for every unmapping. This results in large IOTLB flushing
>>> overhead when handling pass-through devices with a large number of mapped
>>> IOVAs (e.g. GPUs).
>> Of course the type of device is really irrelevant, QEMU maps the entire
>> VM address space for any assigned device.
>>
>>> This can be avoided by using the new IOTLB flushing interface.
>>>
>>> Cc: Alex Williamson<alex.williamson@redhat.com>
>>> Cc: Joerg Roedel<jroedel@suse.de>
>>> Signed-off-by: Suravee Suthikulpanit<suravee.suthikulpanit@amd.com>
>>> ---
>>>   drivers/vfio/vfio_iommu_type1.c | 12 +++++++++---
>>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>>> index 92155cc..28a7ab6 100644
>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>> @@ -698,10 +698,12 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>>>   				break;
>>>   		}
>>>   
>>> -		unmapped = iommu_unmap(domain->domain, iova, len);
>>> +		unmapped = iommu_unmap_fast(domain->domain, iova, len);
>>>   		if (WARN_ON(!unmapped))
>>>   			break;
>>>   
>>> +		iommu_tlb_range_add(domain->domain, iova, len);
>>> +
>> We should only add @unmapped, not @len, right?
> Actually, the problems are deeper than that, if we can't guarantee that
> the above iommu_unmap_fast has removed the iommu mapping, then we can't
> do the unpin below as that would potentially allow the device access to
> unknown memory.  Thus, to support this, the unpinning would need to be
> pushed until after the sync and we therefore need some mechanism of
> remembering the phys addresses that we've unmapped.  Thanks,
> 
> Alex
>   

If so, I am planning to use a list to temporarily store information for
the unmapped regions to be unpinned after the sync. Please let me know
if that would be all right.
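
Something along these lines, as an untested sketch (the struct and
helper names are placeholders, not from a posted patch); vfio_unmap_unpin()
would stash iova/phys/len on a local list instead of unpinning inside the
loop, then call:

/* Remember what was unmapped so it can be unpinned after the sync. */
struct unmapped_region {
	struct list_head next;
	dma_addr_t iova;
	phys_addr_t phys;
	size_t len;
};

static long unpin_after_sync(struct iommu_domain *dom, struct vfio_dma *dma,
			     struct list_head *regions, bool do_accounting)
{
	struct unmapped_region *r, *tmp;
	long unlocked = 0;

	/* After this the device can no longer reach the old pages. */
	iommu_tlb_sync(dom);

	list_for_each_entry_safe(r, tmp, regions, next) {
		unlocked += vfio_unpin_pages_remote(dma, r->iova,
						    r->phys >> PAGE_SHIFT,
						    r->len >> PAGE_SHIFT,
						    do_accounting);
		list_del(&r->next);
		kfree(r);
	}
	return unlocked;
}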

Suravee

Thread overview: 8 messages
2017-11-17 21:11 [PATCH 0/2] Reduce IOTLB flush when pass-through dGPU devices Suravee Suthikulpanit
2017-11-17 21:11 ` [PATCH 1/2] vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs Suravee Suthikulpanit
2017-11-17 21:51   ` Alex Williamson
2017-11-18  4:20     ` Alex Williamson
2017-11-27  8:14       ` Suravee Suthikulpanit
2017-11-27  8:12     ` Suravee Suthikulpanit
2017-11-17 21:11 ` [PATCH 2/2] iommu/amd: Add support for fast IOTLB flushing Suravee Suthikulpanit
2017-11-17 22:25   ` Tom Lendacky
