linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/9] iommu: Refactor flush queues into iommu-dma
@ 2021-11-23 14:10 Robin Murphy
  2021-11-23 14:10 ` [PATCH 1/9] gpu: host1x: Add missing DMA API include Robin Murphy
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel, john.garry

Hi all,

As promised, this series cleans up the flush queue code and streamlines
it directly into iommu-dma. Since we no longer have per-driver DMA ops
implementations, much of the abstraction is no longer necessary, so
there's a nice degree of simplification in the process. Un-abstracting
the queued page freeing mechanism is also the perfect opportunity to
revise which struct page fields we use, so that we can be better-behaved
from the MM point of view, thanks to Matthew.
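
In short, the struct page revision in patches 5 and 6 means going from
the drivers' existing hand-rolled pattern of chaining pages through
page->freelist (sketched here in condensed form):

	while (freelist) {
		unsigned long p = (unsigned long)page_address(freelist);

		freelist = freelist->freelist;
		free_page(p);
	}

to collecting pages on a regular list_head through page->lru and handing
the whole list back to the core mm:

	LIST_HEAD(freelist);

	list_add_tail(&page->lru, &freelist);	/* for each pagetable page */
	/* ... */
	put_pages_list(&freelist);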

These changes should also make it viable to start using the gather
freelist in io-pgtable-arm, and to eliminate some more synchronous
invalidations from the normal flow there. However, that is proving to
need a bit more careful thought than I have time for in this cycle, so
I've parked it again for now and will revisit it in the new year.

For convenience, branch at:
  https://gitlab.arm.com/linux-arm/linux-rm/-/tree/iommu/iova

I've build-tested for x86_64, and boot-tested arm64 to the point of
confirming that put_pages_list() gets passed a valid empty list when
flushing, while everything else still works.

Cheers,
Robin.


Matthew Wilcox (Oracle) (2):
  iommu/amd: Use put_pages_list
  iommu/vt-d: Use put_pages_list

Robin Murphy (7):
  gpu: host1x: Add missing DMA API include
  iommu/iova: Squash entry_dtor abstraction
  iommu/iova: Squash flush_cb abstraction
  iommu/amd: Simplify pagetable freeing
  iommu/iova: Consolidate flush queue code
  iommu/iova: Move flush queue code to iommu-dma
  iommu: Move flush queue data into iommu_dma_cookie

 drivers/gpu/host1x/bus.c       |   1 +
 drivers/iommu/amd/io_pgtable.c | 116 ++++++--------
 drivers/iommu/dma-iommu.c      | 266 +++++++++++++++++++++++++++------
 drivers/iommu/intel/iommu.c    |  89 ++++-------
 drivers/iommu/iova.c           | 200 -------------------------
 include/linux/iommu.h          |   3 +-
 include/linux/iova.h           |  69 +--------
 7 files changed, 295 insertions(+), 449 deletions(-)

-- 
2.28.0.dirty


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/9] gpu: host1x: Add missing DMA API include
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
@ 2021-11-23 14:10 ` Robin Murphy
  2021-11-24 14:05   ` Robin Murphy
  2021-11-23 14:10 ` [PATCH 2/9] iommu/iova: Squash entry_dtor abstraction Robin Murphy
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel,
	john.garry, Thierry Reding, Mikko Perttunen, dri-devel,
	linux-tegra

Host1x seems to be relying on picking up dma-mapping.h transitively from
iova.h, which has no reason to include it in the first place. Fix the
former issue before we totally break things by fixing the latter one.

CC: Thierry Reding <thierry.reding@gmail.com>
CC: Mikko Perttunen <mperttunen@nvidia.com>
CC: dri-devel@lists.freedesktop.org
CC: linux-tegra@vger.kernel.org
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

Feel free to pick this into drm-misc-next or drm-misc-fixes straight
away if that suits - it's only to avoid a build breakage once the rest
of the series gets queued.

Robin.

 drivers/gpu/host1x/bus.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c
index 218e3718fd68..881fad5c3307 100644
--- a/drivers/gpu/host1x/bus.c
+++ b/drivers/gpu/host1x/bus.c
@@ -5,6 +5,7 @@
  */
 
 #include <linux/debugfs.h>
+#include <linux/dma-mapping.h>
 #include <linux/host1x.h>
 #include <linux/of.h>
 #include <linux/seq_file.h>
-- 
2.28.0.dirty


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/9] iommu/iova: Squash entry_dtor abstraction
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
  2021-11-23 14:10 ` [PATCH 1/9] gpu: host1x: Add missing DMA API include Robin Murphy
@ 2021-11-23 14:10 ` Robin Murphy
  2021-11-23 14:10 ` [PATCH 3/9] iommu/iova: Squash flush_cb abstraction Robin Murphy
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel, john.garry

All flush queues are driven by iommu-dma now, so there is no need to
abstract entry_dtor or its data any more. Squash the now-canonical
implementation directly into the IOVA code to get it out of the way.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 17 ++---------------
 drivers/iommu/iova.c      | 28 +++++++++++++++-------------
 include/linux/iova.h      | 26 +++-----------------------
 3 files changed, 20 insertions(+), 51 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index b42e38a0dbe2..fa21b9141b71 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -64,18 +64,6 @@ static int __init iommu_dma_forcedac_setup(char *str)
 }
 early_param("iommu.forcedac", iommu_dma_forcedac_setup);
 
-static void iommu_dma_entry_dtor(unsigned long data)
-{
-	struct page *freelist = (struct page *)data;
-
-	while (freelist) {
-		unsigned long p = (unsigned long)page_address(freelist);
-
-		freelist = freelist->freelist;
-		free_page(p);
-	}
-}
-
 static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie)
 {
 	if (cookie->type == IOMMU_DMA_IOVA_COOKIE)
@@ -324,8 +312,7 @@ int iommu_dma_init_fq(struct iommu_domain *domain)
 	if (cookie->fq_domain)
 		return 0;
 
-	ret = init_iova_flush_queue(&cookie->iovad, iommu_dma_flush_iotlb_all,
-				    iommu_dma_entry_dtor);
+	ret = init_iova_flush_queue(&cookie->iovad, iommu_dma_flush_iotlb_all);
 	if (ret) {
 		pr_warn("iova flush queue initialization failed\n");
 		return ret;
@@ -479,7 +466,7 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
 	else if (gather && gather->queued)
 		queue_iova(iovad, iova_pfn(iovad, iova),
 				size >> iova_shift(iovad),
-				(unsigned long)gather->freelist);
+				gather->freelist);
 	else
 		free_iova_fast(iovad, iova_pfn(iovad, iova),
 				size >> iova_shift(iovad));
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 9e8bc802ac05..982e2779b981 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -92,11 +92,9 @@ static void free_iova_flush_queue(struct iova_domain *iovad)
 
 	iovad->fq         = NULL;
 	iovad->flush_cb   = NULL;
-	iovad->entry_dtor = NULL;
 }
 
-int init_iova_flush_queue(struct iova_domain *iovad,
-			  iova_flush_cb flush_cb, iova_entry_dtor entry_dtor)
+int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb)
 {
 	struct iova_fq __percpu *queue;
 	int cpu;
@@ -109,7 +107,6 @@ int init_iova_flush_queue(struct iova_domain *iovad,
 		return -ENOMEM;
 
 	iovad->flush_cb   = flush_cb;
-	iovad->entry_dtor = entry_dtor;
 
 	for_each_possible_cpu(cpu) {
 		struct iova_fq *fq;
@@ -539,6 +536,16 @@ free_iova_fast(struct iova_domain *iovad, unsigned long pfn, unsigned long size)
 }
 EXPORT_SYMBOL_GPL(free_iova_fast);
 
+static void fq_entry_dtor(struct page *freelist)
+{
+	while (freelist) {
+		unsigned long p = (unsigned long)page_address(freelist);
+
+		freelist = freelist->freelist;
+		free_page(p);
+	}
+}
+
 #define fq_ring_for_each(i, fq) \
 	for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) % IOVA_FQ_SIZE)
 
@@ -571,9 +578,7 @@ static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq)
 		if (fq->entries[idx].counter >= counter)
 			break;
 
-		if (iovad->entry_dtor)
-			iovad->entry_dtor(fq->entries[idx].data);
-
+		fq_entry_dtor(fq->entries[idx].freelist);
 		free_iova_fast(iovad,
 			       fq->entries[idx].iova_pfn,
 			       fq->entries[idx].pages);
@@ -598,15 +603,12 @@ static void fq_destroy_all_entries(struct iova_domain *iovad)
 	 * bother to free iovas, just call the entry_dtor on all remaining
 	 * entries.
 	 */
-	if (!iovad->entry_dtor)
-		return;
-
 	for_each_possible_cpu(cpu) {
 		struct iova_fq *fq = per_cpu_ptr(iovad->fq, cpu);
 		int idx;
 
 		fq_ring_for_each(idx, fq)
-			iovad->entry_dtor(fq->entries[idx].data);
+			fq_entry_dtor(fq->entries[idx].freelist);
 	}
 }
 
@@ -631,7 +633,7 @@ static void fq_flush_timeout(struct timer_list *t)
 
 void queue_iova(struct iova_domain *iovad,
 		unsigned long pfn, unsigned long pages,
-		unsigned long data)
+		struct page *freelist)
 {
 	struct iova_fq *fq;
 	unsigned long flags;
@@ -665,7 +667,7 @@ void queue_iova(struct iova_domain *iovad,
 
 	fq->entries[idx].iova_pfn = pfn;
 	fq->entries[idx].pages    = pages;
-	fq->entries[idx].data     = data;
+	fq->entries[idx].freelist = freelist;
 	fq->entries[idx].counter  = atomic64_read(&iovad->fq_flush_start_cnt);
 
 	spin_unlock_irqrestore(&fq->lock, flags);
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 71d8a2de6635..e746d8e41449 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -40,9 +40,6 @@ struct iova_domain;
 /* Call-Back from IOVA code into IOMMU drivers */
 typedef void (* iova_flush_cb)(struct iova_domain *domain);
 
-/* Destructor for per-entry data */
-typedef void (* iova_entry_dtor)(unsigned long data);
-
 /* Number of entries per Flush Queue */
 #define IOVA_FQ_SIZE	256
 
@@ -53,7 +50,7 @@ typedef void (* iova_entry_dtor)(unsigned long data);
 struct iova_fq_entry {
 	unsigned long iova_pfn;
 	unsigned long pages;
-	unsigned long data;
+	struct page *freelist;
 	u64 counter; /* Flush counter when this entrie was added */
 };
 
@@ -88,9 +85,6 @@ struct iova_domain {
 	iova_flush_cb	flush_cb;	/* Call-Back function to flush IOMMU
 					   TLBs */
 
-	iova_entry_dtor entry_dtor;	/* IOMMU driver specific destructor for
-					   iova entry */
-
 	struct timer_list fq_timer;		/* Timer to regularily empty the
 						   flush-queues */
 	atomic_t fq_timer_on;			/* 1 when timer is active, 0
@@ -146,15 +140,14 @@ void free_iova_fast(struct iova_domain *iovad, unsigned long pfn,
 		    unsigned long size);
 void queue_iova(struct iova_domain *iovad,
 		unsigned long pfn, unsigned long pages,
-		unsigned long data);
+		struct page *freelist);
 unsigned long alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 			      unsigned long limit_pfn, bool flush_rcache);
 struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
 	unsigned long pfn_hi);
 void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 	unsigned long start_pfn);
-int init_iova_flush_queue(struct iova_domain *iovad,
-			  iova_flush_cb flush_cb, iova_entry_dtor entry_dtor);
+int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb);
 struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
 void put_iova_domain(struct iova_domain *iovad);
 #else
@@ -189,12 +182,6 @@ static inline void free_iova_fast(struct iova_domain *iovad,
 {
 }
 
-static inline void queue_iova(struct iova_domain *iovad,
-			      unsigned long pfn, unsigned long pages,
-			      unsigned long data)
-{
-}
-
 static inline unsigned long alloc_iova_fast(struct iova_domain *iovad,
 					    unsigned long size,
 					    unsigned long limit_pfn,
@@ -216,13 +203,6 @@ static inline void init_iova_domain(struct iova_domain *iovad,
 {
 }
 
-static inline int init_iova_flush_queue(struct iova_domain *iovad,
-					iova_flush_cb flush_cb,
-					iova_entry_dtor entry_dtor)
-{
-	return -ENODEV;
-}
-
 static inline struct iova *find_iova(struct iova_domain *iovad,
 				     unsigned long pfn)
 {
-- 
2.28.0.dirty


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/9] iommu/iova: Squash flush_cb abstraction
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
  2021-11-23 14:10 ` [PATCH 1/9] gpu: host1x: Add missing DMA API include Robin Murphy
  2021-11-23 14:10 ` [PATCH 2/9] iommu/iova: Squash entry_dtor abstraction Robin Murphy
@ 2021-11-23 14:10 ` Robin Murphy
  2021-11-23 14:10 ` [PATCH 4/9] iommu/amd: Simplify pagetable freeing Robin Murphy
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel, john.garry

Once again, with iommu-dma now being the only flush queue user, we no
longer need the extra level of indirection through flush_cb. Squash that
and let the flush queue code call the domain method directly.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
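
The indirection being removed here really is this thin - a condensed
sketch of the flush path before and after:

	/* before: bounce out through the callback registered by iommu-dma... */
	iovad->flush_cb(iovad);
	/* ...which container_of()s its way back to the cookie's fq_domain
	 * and calls domain->ops->flush_iotlb_all(domain) */

	/* after: call the domain op directly */
	iovad->fq_domain->ops->flush_iotlb_all(iovad->fq_domain);
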
 drivers/iommu/dma-iommu.c | 13 +------------
 drivers/iommu/iova.c      | 11 +++++------
 include/linux/iova.h      | 11 +++--------
 3 files changed, 9 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index fa21b9141b71..cde887530549 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -282,17 +282,6 @@ static int iova_reserve_iommu_regions(struct device *dev,
 	return ret;
 }
 
-static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
-{
-	struct iommu_dma_cookie *cookie;
-	struct iommu_domain *domain;
-
-	cookie = container_of(iovad, struct iommu_dma_cookie, iovad);
-	domain = cookie->fq_domain;
-
-	domain->ops->flush_iotlb_all(domain);
-}
-
 static bool dev_is_untrusted(struct device *dev)
 {
 	return dev_is_pci(dev) && to_pci_dev(dev)->untrusted;
@@ -312,7 +301,7 @@ int iommu_dma_init_fq(struct iommu_domain *domain)
 	if (cookie->fq_domain)
 		return 0;
 
-	ret = init_iova_flush_queue(&cookie->iovad, iommu_dma_flush_iotlb_all);
+	ret = init_iova_flush_queue(&cookie->iovad, domain);
 	if (ret) {
 		pr_warn("iova flush queue initialization failed\n");
 		return ret;
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 982e2779b981..7619ccb726cc 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -63,7 +63,7 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 	iovad->start_pfn = start_pfn;
 	iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad));
 	iovad->max32_alloc_size = iovad->dma_32bit_pfn;
-	iovad->flush_cb = NULL;
+	iovad->fq_domain = NULL;
 	iovad->fq = NULL;
 	iovad->anchor.pfn_lo = iovad->anchor.pfn_hi = IOVA_ANCHOR;
 	rb_link_node(&iovad->anchor.node, NULL, &iovad->rbroot.rb_node);
@@ -91,10 +91,10 @@ static void free_iova_flush_queue(struct iova_domain *iovad)
 	free_percpu(iovad->fq);
 
 	iovad->fq         = NULL;
-	iovad->flush_cb   = NULL;
+	iovad->fq_domain  = NULL;
 }
 
-int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb)
+int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_domain)
 {
 	struct iova_fq __percpu *queue;
 	int cpu;
@@ -106,8 +106,6 @@ int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb)
 	if (!queue)
 		return -ENOMEM;
 
-	iovad->flush_cb   = flush_cb;
-
 	for_each_possible_cpu(cpu) {
 		struct iova_fq *fq;
 
@@ -118,6 +116,7 @@ int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb)
 		spin_lock_init(&fq->lock);
 	}
 
+	iovad->fq_domain = fq_domain;
 	iovad->fq = queue;
 
 	timer_setup(&iovad->fq_timer, fq_flush_timeout, 0);
@@ -590,7 +589,7 @@ static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq)
 static void iova_domain_flush(struct iova_domain *iovad)
 {
 	atomic64_inc(&iovad->fq_flush_start_cnt);
-	iovad->flush_cb(iovad);
+	iovad->fq_domain->ops->flush_iotlb_all(iovad->fq_domain);
 	atomic64_inc(&iovad->fq_flush_finish_cnt);
 }
 
diff --git a/include/linux/iova.h b/include/linux/iova.h
index e746d8e41449..99be4fcea4f3 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -14,6 +14,7 @@
 #include <linux/rbtree.h>
 #include <linux/atomic.h>
 #include <linux/dma-mapping.h>
+#include <linux/iommu.h>
 
 /* iova structure */
 struct iova {
@@ -35,11 +36,6 @@ struct iova_rcache {
 	struct iova_cpu_rcache __percpu *cpu_rcaches;
 };
 
-struct iova_domain;
-
-/* Call-Back from IOVA code into IOMMU drivers */
-typedef void (* iova_flush_cb)(struct iova_domain *domain);
-
 /* Number of entries per Flush Queue */
 #define IOVA_FQ_SIZE	256
 
@@ -82,8 +78,7 @@ struct iova_domain {
 	struct iova	anchor;		/* rbtree lookup anchor */
 	struct iova_rcache rcaches[IOVA_RANGE_CACHE_MAX_SIZE];	/* IOVA range caches */
 
-	iova_flush_cb	flush_cb;	/* Call-Back function to flush IOMMU
-					   TLBs */
+	struct iommu_domain *fq_domain;
 
 	struct timer_list fq_timer;		/* Timer to regularily empty the
 						   flush-queues */
@@ -147,7 +142,7 @@ struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
 	unsigned long pfn_hi);
 void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 	unsigned long start_pfn);
-int init_iova_flush_queue(struct iova_domain *iovad, iova_flush_cb flush_cb);
+int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_domain);
 struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
 void put_iova_domain(struct iova_domain *iovad);
 #else
-- 
2.28.0.dirty


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/9] iommu/amd: Simplify pagetable freeing
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
                   ` (2 preceding siblings ...)
  2021-11-23 14:10 ` [PATCH 3/9] iommu/iova: Squash flush_cb abstraction Robin Murphy
@ 2021-11-23 14:10 ` Robin Murphy
  2021-12-06 12:40   ` Joerg Roedel
  2021-11-23 14:10 ` [PATCH 5/9] iommu/amd: Use put_pages_list Robin Murphy
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel, john.garry

For reasons unclear, pagetable freeing is an effectively recursive
method implemented via an elaborate system of templated functions that
turns out to account for 25% of the object file size. Implementing it
using regular straightforward recursion makes the code simpler, and
seems like a good thing to do before we work on it further. As part of
that, also fix the types to avoid all the needless casting back and
forth which just gets in the way.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/amd/io_pgtable.c | 78 +++++++++++++---------------------
 1 file changed, 30 insertions(+), 48 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 182c93a43efd..f92ecb3e21d7 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -84,49 +84,41 @@ static void free_page_list(struct page *freelist)
 	}
 }
 
-static struct page *free_pt_page(unsigned long pt, struct page *freelist)
+static struct page *free_pt_page(u64 *pt, struct page *freelist)
 {
-	struct page *p = virt_to_page((void *)pt);
+	struct page *p = virt_to_page(pt);
 
 	p->freelist = freelist;
 
 	return p;
 }
 
-#define DEFINE_FREE_PT_FN(LVL, FN)						\
-static struct page *free_pt_##LVL (unsigned long __pt, struct page *freelist)	\
-{										\
-	unsigned long p;							\
-	u64 *pt;								\
-	int i;									\
-										\
-	pt = (u64 *)__pt;							\
-										\
-	for (i = 0; i < 512; ++i) {						\
-		/* PTE present? */						\
-		if (!IOMMU_PTE_PRESENT(pt[i]))					\
-			continue;						\
-										\
-		/* Large PTE? */						\
-		if (PM_PTE_LEVEL(pt[i]) == 0 ||					\
-		    PM_PTE_LEVEL(pt[i]) == 7)					\
-			continue;						\
-										\
-		p = (unsigned long)IOMMU_PTE_PAGE(pt[i]);			\
-		freelist = FN(p, freelist);					\
-	}									\
-										\
-	return free_pt_page((unsigned long)pt, freelist);			\
+static struct page *free_pt_lvl(u64 *pt, struct page *freelist, int lvl)
+{
+	u64 *p;
+	int i;
+
+	for (i = 0; i < 512; ++i) {
+		/* PTE present? */
+		if (!IOMMU_PTE_PRESENT(pt[i]))
+			continue;
+
+		/* Large PTE? */
+		if (PM_PTE_LEVEL(pt[i]) == 0 ||
+		    PM_PTE_LEVEL(pt[i]) == 7)
+			continue;
+
+		p = IOMMU_PTE_PAGE(pt[i]);
+		if (lvl > 2)
+			freelist = free_pt_lvl(p, freelist, lvl - 1);
+		else
+			freelist = free_pt_page(p, freelist);
+	}
+
+	return free_pt_page(pt, freelist);
 }
 
-DEFINE_FREE_PT_FN(l2, free_pt_page)
-DEFINE_FREE_PT_FN(l3, free_pt_l2)
-DEFINE_FREE_PT_FN(l4, free_pt_l3)
-DEFINE_FREE_PT_FN(l5, free_pt_l4)
-DEFINE_FREE_PT_FN(l6, free_pt_l5)
-
-static struct page *free_sub_pt(unsigned long root, int mode,
-				struct page *freelist)
+static struct page *free_sub_pt(u64 *root, int mode, struct page *freelist)
 {
 	switch (mode) {
 	case PAGE_MODE_NONE:
@@ -136,19 +128,11 @@ static struct page *free_sub_pt(unsigned long root, int mode,
 		freelist = free_pt_page(root, freelist);
 		break;
 	case PAGE_MODE_2_LEVEL:
-		freelist = free_pt_l2(root, freelist);
-		break;
 	case PAGE_MODE_3_LEVEL:
-		freelist = free_pt_l3(root, freelist);
-		break;
 	case PAGE_MODE_4_LEVEL:
-		freelist = free_pt_l4(root, freelist);
-		break;
 	case PAGE_MODE_5_LEVEL:
-		freelist = free_pt_l5(root, freelist);
-		break;
 	case PAGE_MODE_6_LEVEL:
-		freelist = free_pt_l6(root, freelist);
+		free_pt_lvl(root, freelist, mode);
 		break;
 	default:
 		BUG();
@@ -364,7 +348,7 @@ static u64 *fetch_pte(struct amd_io_pgtable *pgtable,
 
 static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
 {
-	unsigned long pt;
+	u64 *pt;
 	int mode;
 
 	while (cmpxchg64(pte, pteval, 0) != pteval) {
@@ -375,7 +359,7 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
 	if (!IOMMU_PTE_PRESENT(pteval))
 		return freelist;
 
-	pt   = (unsigned long)IOMMU_PTE_PAGE(pteval);
+	pt   = IOMMU_PTE_PAGE(pteval);
 	mode = IOMMU_PTE_MODE(pteval);
 
 	return free_sub_pt(pt, mode, freelist);
@@ -512,7 +496,6 @@ static void v1_free_pgtable(struct io_pgtable *iop)
 	struct amd_io_pgtable *pgtable = container_of(iop, struct amd_io_pgtable, iop);
 	struct protection_domain *dom;
 	struct page *freelist = NULL;
-	unsigned long root;
 
 	if (pgtable->mode == PAGE_MODE_NONE)
 		return;
@@ -529,8 +512,7 @@ static void v1_free_pgtable(struct io_pgtable *iop)
 	BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
 	       pgtable->mode > PAGE_MODE_6_LEVEL);
 
-	root = (unsigned long)pgtable->root;
-	freelist = free_sub_pt(root, pgtable->mode, freelist);
+	freelist = free_sub_pt(pgtable->root, pgtable->mode, freelist);
 
 	free_page_list(freelist);
 }
-- 
2.28.0.dirty


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/9] iommu/amd: Use put_pages_list
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
                   ` (3 preceding siblings ...)
  2021-11-23 14:10 ` [PATCH 4/9] iommu/amd: Simplify pagetable freeing Robin Murphy
@ 2021-11-23 14:10 ` Robin Murphy
  2021-11-23 14:10 ` [PATCH 6/9] iommu/vt-d: " Robin Murphy
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel, john.garry

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

page->freelist is for the use of slab.  We already have the ability
to free a list of pages in the core mm, but it requires the use of a
list_head and for the pages to be chained together through page->lru.
Switch the AMD IOMMU code over to using put_pages_list().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
[rm: split from original patch, cosmetic tweaks]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
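
After the conversion, freeing a table boils down to the usual
put_pages_list() pattern - a rough sketch of what the callers below
end up doing:

	LIST_HEAD(freelist);

	/* chain up the no-longer-needed pagetable pages via page->lru... */
	free_sub_pt(pgtable->root, pgtable->mode, &freelist);

	/* ...then hand the whole list back to the mm in one call */
	put_pages_list(&freelist);
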
 drivers/iommu/amd/io_pgtable.c | 50 ++++++++++++----------------------
 1 file changed, 18 insertions(+), 32 deletions(-)

diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index f92ecb3e21d7..be2eba61b4d3 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -74,26 +74,14 @@ static u64 *first_pte_l7(u64 *pte, unsigned long *page_size,
  *
  ****************************************************************************/
 
-static void free_page_list(struct page *freelist)
-{
-	while (freelist != NULL) {
-		unsigned long p = (unsigned long)page_address(freelist);
-
-		freelist = freelist->freelist;
-		free_page(p);
-	}
-}
-
-static struct page *free_pt_page(u64 *pt, struct page *freelist)
+static void free_pt_page(u64 *pt, struct list_head *freelist)
 {
 	struct page *p = virt_to_page(pt);
 
-	p->freelist = freelist;
-
-	return p;
+	list_add_tail(&p->lru, freelist);
 }
 
-static struct page *free_pt_lvl(u64 *pt, struct page *freelist, int lvl)
+static void free_pt_lvl(u64 *pt, struct list_head *freelist, int lvl)
 {
 	u64 *p;
 	int i;
@@ -110,22 +98,22 @@ static struct page *free_pt_lvl(u64 *pt, struct page *freelist, int lvl)
 
 		p = IOMMU_PTE_PAGE(pt[i]);
 		if (lvl > 2)
-			freelist = free_pt_lvl(p, freelist, lvl - 1);
+			free_pt_lvl(p, freelist, lvl - 1);
 		else
-			freelist = free_pt_page(p, freelist);
+			free_pt_page(p, freelist);
 	}
 
-	return free_pt_page(pt, freelist);
+	free_pt_page(pt, freelist);
 }
 
-static struct page *free_sub_pt(u64 *root, int mode, struct page *freelist)
+static void free_sub_pt(u64 *root, int mode, struct list_head *freelist)
 {
 	switch (mode) {
 	case PAGE_MODE_NONE:
 	case PAGE_MODE_7_LEVEL:
 		break;
 	case PAGE_MODE_1_LEVEL:
-		freelist = free_pt_page(root, freelist);
+		free_pt_page(root, freelist);
 		break;
 	case PAGE_MODE_2_LEVEL:
 	case PAGE_MODE_3_LEVEL:
@@ -137,8 +125,6 @@ static struct page *free_sub_pt(u64 *root, int mode, struct page *freelist)
 	default:
 		BUG();
 	}
-
-	return freelist;
 }
 
 void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
@@ -346,7 +332,7 @@ static u64 *fetch_pte(struct amd_io_pgtable *pgtable,
 	return pte;
 }
 
-static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
+static void free_clear_pte(u64 *pte, u64 pteval, struct list_head *freelist)
 {
 	u64 *pt;
 	int mode;
@@ -357,12 +343,12 @@ static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
 	}
 
 	if (!IOMMU_PTE_PRESENT(pteval))
-		return freelist;
+		return;
 
 	pt   = IOMMU_PTE_PAGE(pteval);
 	mode = IOMMU_PTE_MODE(pteval);
 
-	return free_sub_pt(pt, mode, freelist);
+	free_sub_pt(pt, mode, freelist);
 }
 
 /*
@@ -376,7 +362,7 @@ static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
 			  phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
 	struct protection_domain *dom = io_pgtable_ops_to_domain(ops);
-	struct page *freelist = NULL;
+	LIST_HEAD(freelist);
 	bool updated = false;
 	u64 __pte, *pte;
 	int ret, i, count;
@@ -396,9 +382,9 @@ static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
 		goto out;
 
 	for (i = 0; i < count; ++i)
-		freelist = free_clear_pte(&pte[i], pte[i], freelist);
+		free_clear_pte(&pte[i], pte[i], &freelist);
 
-	if (freelist != NULL)
+	if (!list_empty(&freelist))
 		updated = true;
 
 	if (count > 1) {
@@ -433,7 +419,7 @@ static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
 	}
 
 	/* Everything flushed out, free pages now */
-	free_page_list(freelist);
+	put_pages_list(&freelist);
 
 	return ret;
 }
@@ -495,7 +481,7 @@ static void v1_free_pgtable(struct io_pgtable *iop)
 {
 	struct amd_io_pgtable *pgtable = container_of(iop, struct amd_io_pgtable, iop);
 	struct protection_domain *dom;
-	struct page *freelist = NULL;
+	LIST_HEAD(freelist);
 
 	if (pgtable->mode == PAGE_MODE_NONE)
 		return;
@@ -512,9 +498,9 @@ static void v1_free_pgtable(struct io_pgtable *iop)
 	BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
 	       pgtable->mode > PAGE_MODE_6_LEVEL);
 
-	freelist = free_sub_pt(pgtable->root, pgtable->mode, freelist);
+	free_sub_pt(pgtable->root, pgtable->mode, &freelist);
 
-	free_page_list(freelist);
+	put_pages_list(&freelist);
 }
 
 static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
-- 
2.28.0.dirty


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 6/9] iommu/vt-d: Use put_pages_list
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
                   ` (4 preceding siblings ...)
  2021-11-23 14:10 ` [PATCH 5/9] iommu/amd: Use put_pages_list Robin Murphy
@ 2021-11-23 14:10 ` Robin Murphy
  2021-11-23 14:10 ` [PATCH 7/9] iommu/iova: Consolidate flush queue code Robin Murphy
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel, john.garry

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

page->freelist is for the use of slab.  We already have the ability
to free a list of pages in the core mm, but it requires the use of a
list_head and for the pages to be chained together through page->lru.
Switch the Intel IOMMU and IOVA code over to using put_pages_list().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
[rm: split from original patch, cosmetic tweaks, fix fq entries]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
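
On the flush queue side, each queue entry now embeds a list_head of its
own rather than holding a single page pointer, so (as a rough sketch of
the relevant hunks below) the entries get initialised up front and then
spliced into:

	/* at flush queue init */
	for (i = 0; i < IOVA_FQ_SIZE; i++)
		INIT_LIST_HEAD(&fq->entries[i].freelist);

	/* in queue_iova(), taking ownership of the caller's pages */
	list_splice(freelist, &fq->entries[idx].freelist);
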
 drivers/iommu/dma-iommu.c   |  2 +-
 drivers/iommu/intel/iommu.c | 89 +++++++++++++------------------------
 drivers/iommu/iova.c        | 26 ++++-------
 include/linux/iommu.h       |  3 +-
 include/linux/iova.h        |  4 +-
 5 files changed, 45 insertions(+), 79 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index cde887530549..f139b77caee0 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -455,7 +455,7 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
 	else if (gather && gather->queued)
 		queue_iova(iovad, iova_pfn(iovad, iova),
 				size >> iova_shift(iovad),
-				gather->freelist);
+				&gather->freelist);
 	else
 		free_iova_fast(iovad, iova_pfn(iovad, iova),
 				size >> iova_shift(iovad));
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 0bde0c8b4126..f65206dac485 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1303,35 +1303,30 @@ static void dma_pte_free_pagetable(struct dmar_domain *domain,
    know the hardware page-walk will no longer touch them.
    The 'pte' argument is the *parent* PTE, pointing to the page that is to
    be freed. */
-static struct page *dma_pte_list_pagetables(struct dmar_domain *domain,
-					    int level, struct dma_pte *pte,
-					    struct page *freelist)
+static void dma_pte_list_pagetables(struct dmar_domain *domain,
+				    int level, struct dma_pte *pte,
+				    struct list_head *freelist)
 {
 	struct page *pg;
 
 	pg = pfn_to_page(dma_pte_addr(pte) >> PAGE_SHIFT);
-	pg->freelist = freelist;
-	freelist = pg;
+	list_add_tail(&pg->lru, freelist);
 
 	if (level == 1)
-		return freelist;
+		return;
 
 	pte = page_address(pg);
 	do {
 		if (dma_pte_present(pte) && !dma_pte_superpage(pte))
-			freelist = dma_pte_list_pagetables(domain, level - 1,
-							   pte, freelist);
+			dma_pte_list_pagetables(domain, level - 1, pte, freelist);
 		pte++;
 	} while (!first_pte_in_page(pte));
-
-	return freelist;
 }
 
-static struct page *dma_pte_clear_level(struct dmar_domain *domain, int level,
-					struct dma_pte *pte, unsigned long pfn,
-					unsigned long start_pfn,
-					unsigned long last_pfn,
-					struct page *freelist)
+static void dma_pte_clear_level(struct dmar_domain *domain, int level,
+				struct dma_pte *pte, unsigned long pfn,
+				unsigned long start_pfn, unsigned long last_pfn,
+				struct list_head *freelist)
 {
 	struct dma_pte *first_pte = NULL, *last_pte = NULL;
 
@@ -1352,7 +1347,7 @@ static struct page *dma_pte_clear_level(struct dmar_domain *domain, int level,
 			/* These suborbinate page tables are going away entirely. Don't
 			   bother to clear them; we're just going to *free* them. */
 			if (level > 1 && !dma_pte_superpage(pte))
-				freelist = dma_pte_list_pagetables(domain, level - 1, pte, freelist);
+				dma_pte_list_pagetables(domain, level - 1, pte, freelist);
 
 			dma_clear_pte(pte);
 			if (!first_pte)
@@ -1360,10 +1355,10 @@ static struct page *dma_pte_clear_level(struct dmar_domain *domain, int level,
 			last_pte = pte;
 		} else if (level > 1) {
 			/* Recurse down into a level that isn't *entirely* obsolete */
-			freelist = dma_pte_clear_level(domain, level - 1,
-						       phys_to_virt(dma_pte_addr(pte)),
-						       level_pfn, start_pfn, last_pfn,
-						       freelist);
+			dma_pte_clear_level(domain, level - 1,
+					    phys_to_virt(dma_pte_addr(pte)),
+					    level_pfn, start_pfn, last_pfn,
+					    freelist);
 		}
 next:
 		pfn += level_size(level);
@@ -1372,47 +1367,28 @@ static struct page *dma_pte_clear_level(struct dmar_domain *domain, int level,
 	if (first_pte)
 		domain_flush_cache(domain, first_pte,
 				   (void *)++last_pte - (void *)first_pte);
-
-	return freelist;
 }
 
 /* We can't just free the pages because the IOMMU may still be walking
    the page tables, and may have cached the intermediate levels. The
    pages can only be freed after the IOTLB flush has been done. */
-static struct page *domain_unmap(struct dmar_domain *domain,
-				 unsigned long start_pfn,
-				 unsigned long last_pfn,
-				 struct page *freelist)
+static void domain_unmap(struct dmar_domain *domain, unsigned long start_pfn,
+			 unsigned long last_pfn, struct list_head *freelist)
 {
 	BUG_ON(!domain_pfn_supported(domain, start_pfn));
 	BUG_ON(!domain_pfn_supported(domain, last_pfn));
 	BUG_ON(start_pfn > last_pfn);
 
 	/* we don't need lock here; nobody else touches the iova range */
-	freelist = dma_pte_clear_level(domain, agaw_to_level(domain->agaw),
-				       domain->pgd, 0, start_pfn, last_pfn,
-				       freelist);
+	dma_pte_clear_level(domain, agaw_to_level(domain->agaw),
+			    domain->pgd, 0, start_pfn, last_pfn, freelist);
 
 	/* free pgd */
 	if (start_pfn == 0 && last_pfn == DOMAIN_MAX_PFN(domain->gaw)) {
 		struct page *pgd_page = virt_to_page(domain->pgd);
-		pgd_page->freelist = freelist;
-		freelist = pgd_page;
-
+		list_add_tail(&pgd_page->lru, freelist);
 		domain->pgd = NULL;
 	}
-
-	return freelist;
-}
-
-static void dma_free_pagelist(struct page *freelist)
-{
-	struct page *pg;
-
-	while ((pg = freelist)) {
-		freelist = pg->freelist;
-		free_pgtable_page(page_address(pg));
-	}
 }
 
 /* iommu handling */
@@ -2097,11 +2073,10 @@ static void domain_exit(struct dmar_domain *domain)
 	domain_remove_dev_info(domain);
 
 	if (domain->pgd) {
-		struct page *freelist;
+		LIST_HEAD(freelist);
 
-		freelist = domain_unmap(domain, 0,
-					DOMAIN_MAX_PFN(domain->gaw), NULL);
-		dma_free_pagelist(freelist);
+		domain_unmap(domain, 0, DOMAIN_MAX_PFN(domain->gaw), &freelist);
+		put_pages_list(&freelist);
 	}
 
 	free_domain_mem(domain);
@@ -4194,19 +4169,17 @@ static int intel_iommu_memory_notifier(struct notifier_block *nb,
 		{
 			struct dmar_drhd_unit *drhd;
 			struct intel_iommu *iommu;
-			struct page *freelist;
+			LIST_HEAD(freelist);
 
-			freelist = domain_unmap(si_domain,
-						start_vpfn, last_vpfn,
-						NULL);
+			domain_unmap(si_domain, start_vpfn, last_vpfn, &freelist);
 
 			rcu_read_lock();
 			for_each_active_iommu(iommu, drhd)
 				iommu_flush_iotlb_psi(iommu, si_domain,
 					start_vpfn, mhp->nr_pages,
-					!freelist, 0);
+					list_empty(&freelist), 0);
 			rcu_read_unlock();
-			dma_free_pagelist(freelist);
+			put_pages_list(&freelist);
 		}
 		break;
 	}
@@ -5213,8 +5186,7 @@ static size_t intel_iommu_unmap(struct iommu_domain *domain,
 	start_pfn = iova >> VTD_PAGE_SHIFT;
 	last_pfn = (iova + size - 1) >> VTD_PAGE_SHIFT;
 
-	gather->freelist = domain_unmap(dmar_domain, start_pfn,
-					last_pfn, gather->freelist);
+	domain_unmap(dmar_domain, start_pfn, last_pfn, &gather->freelist);
 
 	if (dmar_domain->max_addr == iova + size)
 		dmar_domain->max_addr = iova;
@@ -5250,9 +5222,10 @@ static void intel_iommu_tlb_sync(struct iommu_domain *domain,
 
 	for_each_domain_iommu(iommu_id, dmar_domain)
 		iommu_flush_iotlb_psi(g_iommus[iommu_id], dmar_domain,
-				      start_pfn, nrpages, !gather->freelist, 0);
+				      start_pfn, nrpages,
+				      list_empty(&gather->freelist), 0);
 
-	dma_free_pagelist(gather->freelist);
+	put_pages_list(&gather->freelist);
 }
 
 static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 7619ccb726cc..a32007c950e5 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -97,7 +97,7 @@ static void free_iova_flush_queue(struct iova_domain *iovad)
 int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_domain)
 {
 	struct iova_fq __percpu *queue;
-	int cpu;
+	int i, cpu;
 
 	atomic64_set(&iovad->fq_flush_start_cnt,  0);
 	atomic64_set(&iovad->fq_flush_finish_cnt, 0);
@@ -114,6 +114,9 @@ int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_dom
 		fq->tail = 0;
 
 		spin_lock_init(&fq->lock);
+
+		for (i = 0; i < IOVA_FQ_SIZE; i++)
+			INIT_LIST_HEAD(&fq->entries[i].freelist);
 	}
 
 	iovad->fq_domain = fq_domain;
@@ -535,16 +538,6 @@ free_iova_fast(struct iova_domain *iovad, unsigned long pfn, unsigned long size)
 }
 EXPORT_SYMBOL_GPL(free_iova_fast);
 
-static void fq_entry_dtor(struct page *freelist)
-{
-	while (freelist) {
-		unsigned long p = (unsigned long)page_address(freelist);
-
-		freelist = freelist->freelist;
-		free_page(p);
-	}
-}
-
 #define fq_ring_for_each(i, fq) \
 	for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) % IOVA_FQ_SIZE)
 
@@ -577,7 +570,7 @@ static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq)
 		if (fq->entries[idx].counter >= counter)
 			break;
 
-		fq_entry_dtor(fq->entries[idx].freelist);
+		put_pages_list(&fq->entries[idx].freelist);
 		free_iova_fast(iovad,
 			       fq->entries[idx].iova_pfn,
 			       fq->entries[idx].pages);
@@ -599,15 +592,14 @@ static void fq_destroy_all_entries(struct iova_domain *iovad)
 
 	/*
 	 * This code runs when the iova_domain is being detroyed, so don't
-	 * bother to free iovas, just call the entry_dtor on all remaining
-	 * entries.
+	 * bother to free iovas, just free any remaining pagetable pages.
 	 */
 	for_each_possible_cpu(cpu) {
 		struct iova_fq *fq = per_cpu_ptr(iovad->fq, cpu);
 		int idx;
 
 		fq_ring_for_each(idx, fq)
-			fq_entry_dtor(fq->entries[idx].freelist);
+			put_pages_list(&fq->entries[idx].freelist);
 	}
 }
 
@@ -632,7 +624,7 @@ static void fq_flush_timeout(struct timer_list *t)
 
 void queue_iova(struct iova_domain *iovad,
 		unsigned long pfn, unsigned long pages,
-		struct page *freelist)
+		struct list_head *freelist)
 {
 	struct iova_fq *fq;
 	unsigned long flags;
@@ -666,8 +658,8 @@ void queue_iova(struct iova_domain *iovad,
 
 	fq->entries[idx].iova_pfn = pfn;
 	fq->entries[idx].pages    = pages;
-	fq->entries[idx].freelist = freelist;
 	fq->entries[idx].counter  = atomic64_read(&iovad->fq_flush_start_cnt);
+	list_splice(freelist, &fq->entries[idx].freelist);
 
 	spin_unlock_irqrestore(&fq->lock, flags);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d2f3435e7d17..de0c57a567c8 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -186,7 +186,7 @@ struct iommu_iotlb_gather {
 	unsigned long		start;
 	unsigned long		end;
 	size_t			pgsize;
-	struct page		*freelist;
+	struct list_head	freelist;
 	bool			queued;
 };
 
@@ -399,6 +399,7 @@ static inline void iommu_iotlb_gather_init(struct iommu_iotlb_gather *gather)
 {
 	*gather = (struct iommu_iotlb_gather) {
 		.start	= ULONG_MAX,
+		.freelist = LIST_HEAD_INIT(gather->freelist),
 	};
 }
 
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 99be4fcea4f3..072a09c06e8a 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -46,7 +46,7 @@ struct iova_rcache {
 struct iova_fq_entry {
 	unsigned long iova_pfn;
 	unsigned long pages;
-	struct page *freelist;
+	struct list_head freelist;
 	u64 counter; /* Flush counter when this entrie was added */
 };
 
@@ -135,7 +135,7 @@ void free_iova_fast(struct iova_domain *iovad, unsigned long pfn,
 		    unsigned long size);
 void queue_iova(struct iova_domain *iovad,
 		unsigned long pfn, unsigned long pages,
-		struct page *freelist);
+		struct list_head *freelist);
 unsigned long alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 			      unsigned long limit_pfn, bool flush_rcache);
 struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
-- 
2.28.0.dirty


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 7/9] iommu/iova: Consolidate flush queue code
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
                   ` (5 preceding siblings ...)
  2021-11-23 14:10 ` [PATCH 6/9] iommu/vt-d: " Robin Murphy
@ 2021-11-23 14:10 ` Robin Murphy
  2021-11-23 14:10 ` [PATCH 8/9] iommu/iova: Move flush queue code to iommu-dma Robin Murphy
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel, john.garry

Squash and simplify some of the freeing code, and move the init
and free routines down into the rest of the flush queue code to
obviate the forward declarations.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iova.c | 132 +++++++++++++++++++------------------------
 1 file changed, 58 insertions(+), 74 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index a32007c950e5..159acd34501b 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -24,8 +24,6 @@ static unsigned long iova_rcache_get(struct iova_domain *iovad,
 static void init_iova_rcaches(struct iova_domain *iovad);
 static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
-static void fq_destroy_all_entries(struct iova_domain *iovad);
-static void fq_flush_timeout(struct timer_list *t);
 
 static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
 {
@@ -73,61 +71,6 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 }
 EXPORT_SYMBOL_GPL(init_iova_domain);
 
-static bool has_iova_flush_queue(struct iova_domain *iovad)
-{
-	return !!iovad->fq;
-}
-
-static void free_iova_flush_queue(struct iova_domain *iovad)
-{
-	if (!has_iova_flush_queue(iovad))
-		return;
-
-	if (timer_pending(&iovad->fq_timer))
-		del_timer(&iovad->fq_timer);
-
-	fq_destroy_all_entries(iovad);
-
-	free_percpu(iovad->fq);
-
-	iovad->fq         = NULL;
-	iovad->fq_domain  = NULL;
-}
-
-int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_domain)
-{
-	struct iova_fq __percpu *queue;
-	int i, cpu;
-
-	atomic64_set(&iovad->fq_flush_start_cnt,  0);
-	atomic64_set(&iovad->fq_flush_finish_cnt, 0);
-
-	queue = alloc_percpu(struct iova_fq);
-	if (!queue)
-		return -ENOMEM;
-
-	for_each_possible_cpu(cpu) {
-		struct iova_fq *fq;
-
-		fq = per_cpu_ptr(queue, cpu);
-		fq->head = 0;
-		fq->tail = 0;
-
-		spin_lock_init(&fq->lock);
-
-		for (i = 0; i < IOVA_FQ_SIZE; i++)
-			INIT_LIST_HEAD(&fq->entries[i].freelist);
-	}
-
-	iovad->fq_domain = fq_domain;
-	iovad->fq = queue;
-
-	timer_setup(&iovad->fq_timer, fq_flush_timeout, 0);
-	atomic_set(&iovad->fq_timer_on, 0);
-
-	return 0;
-}
-
 static struct rb_node *
 __get_cached_rbnode(struct iova_domain *iovad, unsigned long limit_pfn)
 {
@@ -586,23 +529,6 @@ static void iova_domain_flush(struct iova_domain *iovad)
 	atomic64_inc(&iovad->fq_flush_finish_cnt);
 }
 
-static void fq_destroy_all_entries(struct iova_domain *iovad)
-{
-	int cpu;
-
-	/*
-	 * This code runs when the iova_domain is being detroyed, so don't
-	 * bother to free iovas, just free any remaining pagetable pages.
-	 */
-	for_each_possible_cpu(cpu) {
-		struct iova_fq *fq = per_cpu_ptr(iovad->fq, cpu);
-		int idx;
-
-		fq_ring_for_each(idx, fq)
-			put_pages_list(&fq->entries[idx].freelist);
-	}
-}
-
 static void fq_flush_timeout(struct timer_list *t)
 {
 	struct iova_domain *iovad = from_timer(iovad, t, fq_timer);
@@ -670,6 +596,64 @@ void queue_iova(struct iova_domain *iovad,
 			  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
 }
 
+static void free_iova_flush_queue(struct iova_domain *iovad)
+{
+	int cpu, idx;
+
+	if (!iovad->fq)
+		return;
+
+	del_timer(&iovad->fq_timer);
+	/*
+	 * This code runs when the iova_domain is being detroyed, so don't
+	 * bother to free iovas, just free any remaining pagetable pages.
+	 */
+	for_each_possible_cpu(cpu) {
+		struct iova_fq *fq = per_cpu_ptr(iovad->fq, cpu);
+
+		fq_ring_for_each(idx, fq)
+			put_pages_list(&fq->entries[idx].freelist);
+	}
+
+	free_percpu(iovad->fq);
+
+	iovad->fq = NULL;
+	iovad->fq_domain = NULL;
+}
+
+int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_domain)
+{
+	struct iova_fq __percpu *queue;
+	int i, cpu;
+
+	atomic64_set(&iovad->fq_flush_start_cnt,  0);
+	atomic64_set(&iovad->fq_flush_finish_cnt, 0);
+
+	queue = alloc_percpu(struct iova_fq);
+	if (!queue)
+		return -ENOMEM;
+
+	for_each_possible_cpu(cpu) {
+		struct iova_fq *fq = per_cpu_ptr(queue, cpu);
+
+		fq->head = 0;
+		fq->tail = 0;
+
+		spin_lock_init(&fq->lock);
+
+		for (i = 0; i < IOVA_FQ_SIZE; i++)
+			INIT_LIST_HEAD(&fq->entries[i].freelist);
+	}
+
+	iovad->fq_domain = fq_domain;
+	iovad->fq = queue;
+
+	timer_setup(&iovad->fq_timer, fq_flush_timeout, 0);
+	atomic_set(&iovad->fq_timer_on, 0);
+
+	return 0;
+}
+
 /**
  * put_iova_domain - destroys the iova domain
  * @iovad: - iova domain in question.
-- 
2.28.0.dirty


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 8/9] iommu/iova: Move flush queue code to iommu-dma
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
                   ` (6 preceding siblings ...)
  2021-11-23 14:10 ` [PATCH 7/9] iommu/iova: Consolidate flush queue code Robin Murphy
@ 2021-11-23 14:10 ` Robin Murphy
  2021-11-23 14:10 ` [PATCH 9/9] iommu: Move flush queue data into iommu_dma_cookie Robin Murphy
  2021-11-24 17:21 ` [PATCH 0/9] iommu: Refactor flush queues into iommu-dma John Garry
  9 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel, john.garry

Flush queues are specific to DMA ops, which are now handled exclusively
by iommu-dma. As such, now that the historical artefacts from being
shared directly with drivers have been cleaned up, move the flush queue
code into iommu-dma itself to get it out of the way of other IOVA users.

This is pure code movement with no functional change; refactoring to
clean up the headers and definitions will follow.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 179 +++++++++++++++++++++++++++++++++++++-
 drivers/iommu/iova.c      | 175 -------------------------------------
 2 files changed, 178 insertions(+), 176 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f139b77caee0..ddf75e7c2ebc 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -64,6 +64,181 @@ static int __init iommu_dma_forcedac_setup(char *str)
 }
 early_param("iommu.forcedac", iommu_dma_forcedac_setup);
 
+
+#define fq_ring_for_each(i, fq) \
+	for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) % IOVA_FQ_SIZE)
+
+static inline bool fq_full(struct iova_fq *fq)
+{
+	assert_spin_locked(&fq->lock);
+	return (((fq->tail + 1) % IOVA_FQ_SIZE) == fq->head);
+}
+
+static inline unsigned fq_ring_add(struct iova_fq *fq)
+{
+	unsigned idx = fq->tail;
+
+	assert_spin_locked(&fq->lock);
+
+	fq->tail = (idx + 1) % IOVA_FQ_SIZE;
+
+	return idx;
+}
+
+static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq)
+{
+	u64 counter = atomic64_read(&iovad->fq_flush_finish_cnt);
+	unsigned idx;
+
+	assert_spin_locked(&fq->lock);
+
+	fq_ring_for_each(idx, fq) {
+
+		if (fq->entries[idx].counter >= counter)
+			break;
+
+		put_pages_list(&fq->entries[idx].freelist);
+		free_iova_fast(iovad,
+			       fq->entries[idx].iova_pfn,
+			       fq->entries[idx].pages);
+
+		fq->head = (fq->head + 1) % IOVA_FQ_SIZE;
+	}
+}
+
+static void iova_domain_flush(struct iova_domain *iovad)
+{
+	atomic64_inc(&iovad->fq_flush_start_cnt);
+	iovad->fq_domain->ops->flush_iotlb_all(iovad->fq_domain);
+	atomic64_inc(&iovad->fq_flush_finish_cnt);
+}
+
+static void fq_flush_timeout(struct timer_list *t)
+{
+	struct iova_domain *iovad = from_timer(iovad, t, fq_timer);
+	int cpu;
+
+	atomic_set(&iovad->fq_timer_on, 0);
+	iova_domain_flush(iovad);
+
+	for_each_possible_cpu(cpu) {
+		unsigned long flags;
+		struct iova_fq *fq;
+
+		fq = per_cpu_ptr(iovad->fq, cpu);
+		spin_lock_irqsave(&fq->lock, flags);
+		fq_ring_free(iovad, fq);
+		spin_unlock_irqrestore(&fq->lock, flags);
+	}
+}
+
+void queue_iova(struct iova_domain *iovad,
+		unsigned long pfn, unsigned long pages,
+		struct list_head *freelist)
+{
+	struct iova_fq *fq;
+	unsigned long flags;
+	unsigned idx;
+
+	/*
+	 * Order against the IOMMU driver's pagetable update from unmapping
+	 * @pte, to guarantee that iova_domain_flush() observes that if called
+	 * from a different CPU before we release the lock below. Full barrier
+	 * so it also pairs with iommu_dma_init_fq() to avoid seeing partially
+	 * written fq state here.
+	 */
+	smp_mb();
+
+	fq = raw_cpu_ptr(iovad->fq);
+	spin_lock_irqsave(&fq->lock, flags);
+
+	/*
+	 * First remove all entries from the flush queue that have already been
+	 * flushed out on another CPU. This makes the fq_full() check below less
+	 * likely to be true.
+	 */
+	fq_ring_free(iovad, fq);
+
+	if (fq_full(fq)) {
+		iova_domain_flush(iovad);
+		fq_ring_free(iovad, fq);
+	}
+
+	idx = fq_ring_add(fq);
+
+	fq->entries[idx].iova_pfn = pfn;
+	fq->entries[idx].pages    = pages;
+	fq->entries[idx].counter  = atomic64_read(&iovad->fq_flush_start_cnt);
+	list_splice(freelist, &fq->entries[idx].freelist);
+
+	spin_unlock_irqrestore(&fq->lock, flags);
+
+	/* Avoid false sharing as much as possible. */
+	if (!atomic_read(&iovad->fq_timer_on) &&
+	    !atomic_xchg(&iovad->fq_timer_on, 1))
+		mod_timer(&iovad->fq_timer,
+			  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
+}
+
+static void free_iova_flush_queue(struct iova_domain *iovad)
+{
+	int cpu, idx;
+
+	if (!iovad->fq)
+		return;
+
+	del_timer(&iovad->fq_timer);
+	/*
+	 * This code runs when the iova_domain is being detroyed, so don't
+	 * bother to free iovas, just free any remaining pagetable pages.
+	 */
+	for_each_possible_cpu(cpu) {
+		struct iova_fq *fq = per_cpu_ptr(iovad->fq, cpu);
+
+		fq_ring_for_each(idx, fq)
+			put_pages_list(&fq->entries[idx].freelist);
+	}
+
+	free_percpu(iovad->fq);
+
+	iovad->fq = NULL;
+	iovad->fq_domain = NULL;
+}
+
+int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_domain)
+{
+	struct iova_fq __percpu *queue;
+	int i, cpu;
+
+	atomic64_set(&iovad->fq_flush_start_cnt,  0);
+	atomic64_set(&iovad->fq_flush_finish_cnt, 0);
+
+	queue = alloc_percpu(struct iova_fq);
+	if (!queue)
+		return -ENOMEM;
+
+	for_each_possible_cpu(cpu) {
+		struct iova_fq *fq = per_cpu_ptr(queue, cpu);
+
+		fq->head = 0;
+		fq->tail = 0;
+
+		spin_lock_init(&fq->lock);
+
+		for (i = 0; i < IOVA_FQ_SIZE; i++)
+			INIT_LIST_HEAD(&fq->entries[i].freelist);
+	}
+
+	iovad->fq_domain = fq_domain;
+	iovad->fq = queue;
+
+	timer_setup(&iovad->fq_timer, fq_flush_timeout, 0);
+	atomic_set(&iovad->fq_timer_on, 0);
+
+	return 0;
+}
+
+
 static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie)
 {
 	if (cookie->type == IOMMU_DMA_IOVA_COOKIE)
@@ -144,8 +319,10 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 	if (!cookie)
 		return;
 
-	if (cookie->type == IOMMU_DMA_IOVA_COOKIE && cookie->iovad.granule)
+	if (cookie->type == IOMMU_DMA_IOVA_COOKIE && cookie->iovad.granule) {
+		free_iova_flush_queue(&cookie->iovad);
 		put_iova_domain(&cookie->iovad);
+	}
 
 	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) {
 		list_del(&msi->list);
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 159acd34501b..6673dfa8e7c5 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -481,179 +481,6 @@ free_iova_fast(struct iova_domain *iovad, unsigned long pfn, unsigned long size)
 }
 EXPORT_SYMBOL_GPL(free_iova_fast);
 
-#define fq_ring_for_each(i, fq) \
-	for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) % IOVA_FQ_SIZE)
-
-static inline bool fq_full(struct iova_fq *fq)
-{
-	assert_spin_locked(&fq->lock);
-	return (((fq->tail + 1) % IOVA_FQ_SIZE) == fq->head);
-}
-
-static inline unsigned fq_ring_add(struct iova_fq *fq)
-{
-	unsigned idx = fq->tail;
-
-	assert_spin_locked(&fq->lock);
-
-	fq->tail = (idx + 1) % IOVA_FQ_SIZE;
-
-	return idx;
-}
-
-static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq)
-{
-	u64 counter = atomic64_read(&iovad->fq_flush_finish_cnt);
-	unsigned idx;
-
-	assert_spin_locked(&fq->lock);
-
-	fq_ring_for_each(idx, fq) {
-
-		if (fq->entries[idx].counter >= counter)
-			break;
-
-		put_pages_list(&fq->entries[idx].freelist);
-		free_iova_fast(iovad,
-			       fq->entries[idx].iova_pfn,
-			       fq->entries[idx].pages);
-
-		fq->head = (fq->head + 1) % IOVA_FQ_SIZE;
-	}
-}
-
-static void iova_domain_flush(struct iova_domain *iovad)
-{
-	atomic64_inc(&iovad->fq_flush_start_cnt);
-	iovad->fq_domain->ops->flush_iotlb_all(iovad->fq_domain);
-	atomic64_inc(&iovad->fq_flush_finish_cnt);
-}
-
-static void fq_flush_timeout(struct timer_list *t)
-{
-	struct iova_domain *iovad = from_timer(iovad, t, fq_timer);
-	int cpu;
-
-	atomic_set(&iovad->fq_timer_on, 0);
-	iova_domain_flush(iovad);
-
-	for_each_possible_cpu(cpu) {
-		unsigned long flags;
-		struct iova_fq *fq;
-
-		fq = per_cpu_ptr(iovad->fq, cpu);
-		spin_lock_irqsave(&fq->lock, flags);
-		fq_ring_free(iovad, fq);
-		spin_unlock_irqrestore(&fq->lock, flags);
-	}
-}
-
-void queue_iova(struct iova_domain *iovad,
-		unsigned long pfn, unsigned long pages,
-		struct list_head *freelist)
-{
-	struct iova_fq *fq;
-	unsigned long flags;
-	unsigned idx;
-
-	/*
-	 * Order against the IOMMU driver's pagetable update from unmapping
-	 * @pte, to guarantee that iova_domain_flush() observes that if called
-	 * from a different CPU before we release the lock below. Full barrier
-	 * so it also pairs with iommu_dma_init_fq() to avoid seeing partially
-	 * written fq state here.
-	 */
-	smp_mb();
-
-	fq = raw_cpu_ptr(iovad->fq);
-	spin_lock_irqsave(&fq->lock, flags);
-
-	/*
-	 * First remove all entries from the flush queue that have already been
-	 * flushed out on another CPU. This makes the fq_full() check below less
-	 * likely to be true.
-	 */
-	fq_ring_free(iovad, fq);
-
-	if (fq_full(fq)) {
-		iova_domain_flush(iovad);
-		fq_ring_free(iovad, fq);
-	}
-
-	idx = fq_ring_add(fq);
-
-	fq->entries[idx].iova_pfn = pfn;
-	fq->entries[idx].pages    = pages;
-	fq->entries[idx].counter  = atomic64_read(&iovad->fq_flush_start_cnt);
-	list_splice(freelist, &fq->entries[idx].freelist);
-
-	spin_unlock_irqrestore(&fq->lock, flags);
-
-	/* Avoid false sharing as much as possible. */
-	if (!atomic_read(&iovad->fq_timer_on) &&
-	    !atomic_xchg(&iovad->fq_timer_on, 1))
-		mod_timer(&iovad->fq_timer,
-			  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
-}
-
-static void free_iova_flush_queue(struct iova_domain *iovad)
-{
-	int cpu, idx;
-
-	if (!iovad->fq)
-		return;
-
-	del_timer(&iovad->fq_timer);
-	/*
-	 * This code runs when the iova_domain is being detroyed, so don't
-	 * bother to free iovas, just free any remaining pagetable pages.
-	 */
-	for_each_possible_cpu(cpu) {
-		struct iova_fq *fq = per_cpu_ptr(iovad->fq, cpu);
-
-		fq_ring_for_each(idx, fq)
-			put_pages_list(&fq->entries[idx].freelist);
-	}
-
-	free_percpu(iovad->fq);
-
-	iovad->fq = NULL;
-	iovad->fq_domain = NULL;
-}
-
-int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_domain)
-{
-	struct iova_fq __percpu *queue;
-	int i, cpu;
-
-	atomic64_set(&iovad->fq_flush_start_cnt,  0);
-	atomic64_set(&iovad->fq_flush_finish_cnt, 0);
-
-	queue = alloc_percpu(struct iova_fq);
-	if (!queue)
-		return -ENOMEM;
-
-	for_each_possible_cpu(cpu) {
-		struct iova_fq *fq = per_cpu_ptr(queue, cpu);
-
-		fq->head = 0;
-		fq->tail = 0;
-
-		spin_lock_init(&fq->lock);
-
-		for (i = 0; i < IOVA_FQ_SIZE; i++)
-			INIT_LIST_HEAD(&fq->entries[i].freelist);
-	}
-
-	iovad->fq_domain = fq_domain;
-	iovad->fq = queue;
-
-	timer_setup(&iovad->fq_timer, fq_flush_timeout, 0);
-	atomic_set(&iovad->fq_timer_on, 0);
-
-	return 0;
-}
-
 /**
  * put_iova_domain - destroys the iova domain
  * @iovad: - iova domain in question.
@@ -665,8 +492,6 @@ void put_iova_domain(struct iova_domain *iovad)
 
 	cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD,
 					    &iovad->cpuhp_dead);
-
-	free_iova_flush_queue(iovad);
 	free_iova_rcaches(iovad);
 	rbtree_postorder_for_each_entry_safe(iova, tmp, &iovad->rbroot, node)
 		free_iova_mem(iova);
-- 
2.28.0.dirty


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 9/9] iommu: Move flush queue data into iommu_dma_cookie
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
                   ` (7 preceding siblings ...)
  2021-11-23 14:10 ` [PATCH 8/9] iommu/iova: Move flush queue code to iommu-dma Robin Murphy
@ 2021-11-23 14:10 ` Robin Murphy
  2021-11-23 22:40   ` kernel test robot
  2021-11-24 17:25   ` John Garry
  2021-11-24 17:21 ` [PATCH 0/9] iommu: Refactor flush queues into iommu-dma John Garry
  9 siblings, 2 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-23 14:10 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel, john.garry

Complete the move into iommu-dma by refactoring the flush queues
themselves to belong to the DMA cookie rather than the IOVA domain.

The refactoring may as well extend to some minor cosmetic aspects
too, to help us stay one step ahead of the style police.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 171 +++++++++++++++++++++-----------------
 drivers/iommu/iova.c      |   2 -
 include/linux/iova.h      |  44 +---------
 3 files changed, 95 insertions(+), 122 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ddf75e7c2ebc..8a1aa980d376 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -9,9 +9,12 @@
  */
 
 #include <linux/acpi_iort.h>
+#include <linux/atomic.h>
+#include <linux/crash_dump.h>
 #include <linux/device.h>
-#include <linux/dma-map-ops.h>
+#include <linux/dma-direct.h>
 #include <linux/dma-iommu.h>
+#include <linux/dma-map-ops.h>
 #include <linux/gfp.h>
 #include <linux/huge_mm.h>
 #include <linux/iommu.h>
@@ -20,11 +23,10 @@
 #include <linux/mm.h>
 #include <linux/mutex.h>
 #include <linux/pci.h>
-#include <linux/swiotlb.h>
 #include <linux/scatterlist.h>
+#include <linux/spinlock.h>
+#include <linux/swiotlb.h>
 #include <linux/vmalloc.h>
-#include <linux/crash_dump.h>
-#include <linux/dma-direct.h>
 
 struct iommu_dma_msi_page {
 	struct list_head	list;
@@ -41,7 +43,19 @@ struct iommu_dma_cookie {
 	enum iommu_dma_cookie_type	type;
 	union {
 		/* Full allocator for IOMMU_DMA_IOVA_COOKIE */
-		struct iova_domain	iovad;
+		struct {
+			struct iova_domain	iovad;
+
+			struct iova_fq __percpu *fq;	/* Flush queue */
+			/* Number of TLB flushes that have been started */
+			atomic64_t		fq_flush_start_cnt;
+			/* Number of TLB flushes that have been finished */
+			atomic64_t		fq_flush_finish_cnt;
+			/* Timer to regularily empty the flush queues */
+			struct timer_list	fq_timer;
+			/* 1 when timer is active, 0 when not */
+			atomic_t		fq_timer_on;
+		};
 		/* Trivial linear page allocator for IOMMU_DMA_MSI_COOKIE */
 		dma_addr_t		msi_iova;
 	};
@@ -65,6 +79,27 @@ static int __init iommu_dma_forcedac_setup(char *str)
 early_param("iommu.forcedac", iommu_dma_forcedac_setup);
 
 
+/* Number of entries per flush queue */
+#define IOVA_FQ_SIZE	256
+
+/* Timeout (in ms) after which entries are flushed from the queue */
+#define IOVA_FQ_TIMEOUT	10
+
+/* Flush queue entry for deferred flushing */
+struct iova_fq_entry {
+	unsigned long iova_pfn;
+	unsigned long pages;
+	struct list_head freelist;
+	u64 counter; /* Flush counter when this entry was added */
+};
+
+/* Per-CPU flush queue structure */
+struct iova_fq {
+	struct iova_fq_entry entries[IOVA_FQ_SIZE];
+	unsigned int head, tail;
+	spinlock_t lock;
+};
+
 #define fq_ring_for_each(i, fq) \
 	for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) % IOVA_FQ_SIZE)
 
@@ -74,9 +109,9 @@ static inline bool fq_full(struct iova_fq *fq)
 	return (((fq->tail + 1) % IOVA_FQ_SIZE) == fq->head);
 }
 
-static inline unsigned fq_ring_add(struct iova_fq *fq)
+static inline unsigned int fq_ring_add(struct iova_fq *fq)
 {
-	unsigned idx = fq->tail;
+	unsigned int idx = fq->tail;
 
 	assert_spin_locked(&fq->lock);
 
@@ -85,10 +120,10 @@ static inline unsigned fq_ring_add(struct iova_fq *fq)
 	return idx;
 }
 
-static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq)
+static void fq_ring_free(struct iommu_dma_cookie *cookie, struct iova_fq *fq)
 {
-	u64 counter = atomic64_read(&iovad->fq_flush_finish_cnt);
-	unsigned idx;
+	u64 counter = atomic64_read(&cookie->fq_flush_finish_cnt);
+	unsigned int idx;
 
 	assert_spin_locked(&fq->lock);
 
@@ -98,7 +133,7 @@ static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq)
 			break;
 
 		put_pages_list(&fq->entries[idx].freelist);
-		free_iova_fast(iovad,
+		free_iova_fast(&cookie->iovad,
 			       fq->entries[idx].iova_pfn,
 			       fq->entries[idx].pages);
 
@@ -106,50 +141,50 @@ static void fq_ring_free(struct iova_domain *iovad, struct iova_fq *fq)
 	}
 }
 
-static void iova_domain_flush(struct iova_domain *iovad)
+static void fq_flush_iotlb(struct iommu_dma_cookie *cookie)
 {
-	atomic64_inc(&iovad->fq_flush_start_cnt);
-	iovad->fq_domain->ops->flush_iotlb_all(iovad->fq_domain);
-	atomic64_inc(&iovad->fq_flush_finish_cnt);
+	atomic64_inc(&cookie->fq_flush_start_cnt);
+	cookie->fq_domain->ops->flush_iotlb_all(cookie->fq_domain);
+	atomic64_inc(&cookie->fq_flush_finish_cnt);
 }
 
 static void fq_flush_timeout(struct timer_list *t)
 {
-	struct iova_domain *iovad = from_timer(iovad, t, fq_timer);
+	struct iommu_dma_cookie *cookie = from_timer(cookie, t, fq_timer);
 	int cpu;
 
-	atomic_set(&iovad->fq_timer_on, 0);
-	iova_domain_flush(iovad);
+	atomic_set(&cookie->fq_timer_on, 0);
+	fq_flush_iotlb(cookie);
 
 	for_each_possible_cpu(cpu) {
 		unsigned long flags;
 		struct iova_fq *fq;
 
-		fq = per_cpu_ptr(iovad->fq, cpu);
+		fq = per_cpu_ptr(cookie->fq, cpu);
 		spin_lock_irqsave(&fq->lock, flags);
-		fq_ring_free(iovad, fq);
+		fq_ring_free(cookie, fq);
 		spin_unlock_irqrestore(&fq->lock, flags);
 	}
 }
 
-void queue_iova(struct iova_domain *iovad,
+static void queue_iova(struct iommu_dma_cookie *cookie,
 		unsigned long pfn, unsigned long pages,
 		struct list_head *freelist)
 {
 	struct iova_fq *fq;
 	unsigned long flags;
-	unsigned idx;
+	unsigned int idx;
 
 	/*
 	 * Order against the IOMMU driver's pagetable update from unmapping
-	 * @pte, to guarantee that iova_domain_flush() observes that if called
+	 * @pte, to guarantee that fq_flush_iotlb() observes that if called
 	 * from a different CPU before we release the lock below. Full barrier
 	 * so it also pairs with iommu_dma_init_fq() to avoid seeing partially
 	 * written fq state here.
 	 */
 	smp_mb();
 
-	fq = raw_cpu_ptr(iovad->fq);
+	fq = raw_cpu_ptr(cookie->fq);
 	spin_lock_irqsave(&fq->lock, flags);
 
 	/*
@@ -157,65 +192,66 @@ void queue_iova(struct iova_domain *iovad,
 	 * flushed out on another CPU. This makes the fq_full() check below less
 	 * likely to be true.
 	 */
-	fq_ring_free(iovad, fq);
+	fq_ring_free(cookie, fq);
 
 	if (fq_full(fq)) {
-		iova_domain_flush(iovad);
-		fq_ring_free(iovad, fq);
+		fq_flush_iotlb(cookie);
+		fq_ring_free(cookie, fq);
 	}
 
 	idx = fq_ring_add(fq);
 
 	fq->entries[idx].iova_pfn = pfn;
 	fq->entries[idx].pages    = pages;
-	fq->entries[idx].counter  = atomic64_read(&iovad->fq_flush_start_cnt);
+	fq->entries[idx].counter  = atomic64_read(&cookie->fq_flush_start_cnt);
 	list_splice(freelist, &fq->entries[idx].freelist);
 
 	spin_unlock_irqrestore(&fq->lock, flags);
 
 	/* Avoid false sharing as much as possible. */
-	if (!atomic_read(&iovad->fq_timer_on) &&
-	    !atomic_xchg(&iovad->fq_timer_on, 1))
-		mod_timer(&iovad->fq_timer,
+	if (!atomic_read(&cookie->fq_timer_on) &&
+	    !atomic_xchg(&cookie->fq_timer_on, 1))
+		mod_timer(&cookie->fq_timer,
 			  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
 }
 
-static void free_iova_flush_queue(struct iova_domain *iovad)
+static void iommu_dma_free_fq(struct iommu_dma_cookie *cookie)
 {
 	int cpu, idx;
 
-	if (!iovad->fq)
+	if (!cookie->fq)
 		return;
 
-	del_timer(&iovad->fq_timer);
-	/*
-	 * This code runs when the iova_domain is being detroyed, so don't
-	 * bother to free iovas, just free any remaining pagetable pages.
-	 */
+	del_timer(&cookie->fq_timer);
+	/* The IOVAs will be torn down separately, so just free our queued pages */
 	for_each_possible_cpu(cpu) {
-		struct iova_fq *fq = per_cpu_ptr(iovad->fq, cpu);
+		struct iova_fq *fq = per_cpu_ptr(cookie->fq, cpu);
 
 		fq_ring_for_each(idx, fq)
 			put_pages_list(&fq->entries[idx].freelist);
 	}
 
-	free_percpu(iovad->fq);
-
-	iovad->fq = NULL;
-	iovad->fq_domain = NULL;
+	free_percpu(cookie->fq);
 }
 
-int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_domain)
+/* sysfs updates are serialised by the mutex of the group owning @domain */
+int iommu_dma_init_fq(struct iommu_domain *domain)
 {
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_fq __percpu *queue;
 	int i, cpu;
 
-	atomic64_set(&iovad->fq_flush_start_cnt,  0);
-	atomic64_set(&iovad->fq_flush_finish_cnt, 0);
+	if (cookie->fq_domain)
+		return 0;
+
+	atomic64_set(&cookie->fq_flush_start_cnt,  0);
+	atomic64_set(&cookie->fq_flush_finish_cnt, 0);
 
 	queue = alloc_percpu(struct iova_fq);
-	if (!queue)
+	if (!queue) {
+		pr_warn("iova flush queue initialization failed\n");
 		return -ENOMEM;
+	}
 
 	for_each_possible_cpu(cpu) {
 		struct iova_fq *fq = per_cpu_ptr(queue, cpu);
@@ -229,12 +265,16 @@ int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_dom
 			INIT_LIST_HEAD(&fq->entries[i].freelist);
 	}
 
-	iovad->fq_domain = fq_domain;
-	iovad->fq = queue;
-
-	timer_setup(&iovad->fq_timer, fq_flush_timeout, 0);
-	atomic_set(&iovad->fq_timer_on, 0);
+	cookie->fq = queue;
 
+	timer_setup(&cookie->fq_timer, fq_flush_timeout, 0);
+	atomic_set(&cookie->fq_timer_on, 0);
+	/*
+	 * Prevent incomplete fq state being observable. Pairs with path from
+	 * __iommu_dma_unmap() through iommu_dma_free_iova() to queue_iova()
+	 */
+	smp_wmb();
+	WRITE_ONCE(cookie->fq_domain, domain);
 	return 0;
 }
 
@@ -320,7 +360,7 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
 		return;
 
 	if (cookie->type == IOMMU_DMA_IOVA_COOKIE && cookie->iovad.granule) {
-		free_iova_flush_queue(&cookie->iovad);
+		iommu_dma_free_fq(cookie);
 		put_iova_domain(&cookie->iovad);
 	}
 
@@ -469,29 +509,6 @@ static bool dev_use_swiotlb(struct device *dev)
 	return IS_ENABLED(CONFIG_SWIOTLB) && dev_is_untrusted(dev);
 }
 
-/* sysfs updates are serialised by the mutex of the group owning @domain */
-int iommu_dma_init_fq(struct iommu_domain *domain)
-{
-	struct iommu_dma_cookie *cookie = domain->iova_cookie;
-	int ret;
-
-	if (cookie->fq_domain)
-		return 0;
-
-	ret = init_iova_flush_queue(&cookie->iovad, domain);
-	if (ret) {
-		pr_warn("iova flush queue initialization failed\n");
-		return ret;
-	}
-	/*
-	 * Prevent incomplete iovad->fq being observable. Pairs with path from
-	 * __iommu_dma_unmap() through iommu_dma_free_iova() to queue_iova()
-	 */
-	smp_wmb();
-	WRITE_ONCE(cookie->fq_domain, domain);
-	return 0;
-}
-
 /**
  * iommu_dma_init_domain - Initialise a DMA mapping domain
  * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
@@ -630,7 +647,7 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
 	if (cookie->type == IOMMU_DMA_MSI_COOKIE)
 		cookie->msi_iova -= size;
 	else if (gather && gather->queued)
-		queue_iova(iovad, iova_pfn(iovad, iova),
+		queue_iova(cookie, iova_pfn(iovad, iova),
 				size >> iova_shift(iovad),
 				&gather->freelist);
 	else
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 6673dfa8e7c5..72ac25831584 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -61,8 +61,6 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 	iovad->start_pfn = start_pfn;
 	iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad));
 	iovad->max32_alloc_size = iovad->dma_32bit_pfn;
-	iovad->fq_domain = NULL;
-	iovad->fq = NULL;
 	iovad->anchor.pfn_lo = iovad->anchor.pfn_hi = IOVA_ANCHOR;
 	rb_link_node(&iovad->anchor.node, NULL, &iovad->rbroot.rb_node);
 	rb_insert_color(&iovad->anchor.node, &iovad->rbroot);
diff --git a/include/linux/iova.h b/include/linux/iova.h
index 072a09c06e8a..0abd48c5e622 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -12,9 +12,6 @@
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/rbtree.h>
-#include <linux/atomic.h>
-#include <linux/dma-mapping.h>
-#include <linux/iommu.h>
 
 /* iova structure */
 struct iova {
@@ -36,27 +33,6 @@ struct iova_rcache {
 	struct iova_cpu_rcache __percpu *cpu_rcaches;
 };
 
-/* Number of entries per Flush Queue */
-#define IOVA_FQ_SIZE	256
-
-/* Timeout (in ms) after which entries are flushed from the Flush-Queue */
-#define IOVA_FQ_TIMEOUT	10
-
-/* Flush Queue entry for defered flushing */
-struct iova_fq_entry {
-	unsigned long iova_pfn;
-	unsigned long pages;
-	struct list_head freelist;
-	u64 counter; /* Flush counter when this entrie was added */
-};
-
-/* Per-CPU Flush Queue structure */
-struct iova_fq {
-	struct iova_fq_entry entries[IOVA_FQ_SIZE];
-	unsigned head, tail;
-	spinlock_t lock;
-};
-
 /* holds all the iova translations for a domain */
 struct iova_domain {
 	spinlock_t	iova_rbtree_lock; /* Lock to protect update of rbtree */
@@ -67,23 +43,9 @@ struct iova_domain {
 	unsigned long	start_pfn;	/* Lower limit for this domain */
 	unsigned long	dma_32bit_pfn;
 	unsigned long	max32_alloc_size; /* Size of last failed allocation */
-	struct iova_fq __percpu *fq;	/* Flush Queue */
-
-	atomic64_t	fq_flush_start_cnt;	/* Number of TLB flushes that
-						   have been started */
-
-	atomic64_t	fq_flush_finish_cnt;	/* Number of TLB flushes that
-						   have been finished */
-
 	struct iova	anchor;		/* rbtree lookup anchor */
+
 	struct iova_rcache rcaches[IOVA_RANGE_CACHE_MAX_SIZE];	/* IOVA range caches */
-
-	struct iommu_domain *fq_domain;
-
-	struct timer_list fq_timer;		/* Timer to regularily empty the
-						   flush-queues */
-	atomic_t fq_timer_on;			/* 1 when timer is active, 0
-						   when not */
 	struct hlist_node	cpuhp_dead;
 };
 
@@ -133,16 +95,12 @@ struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
 	bool size_aligned);
 void free_iova_fast(struct iova_domain *iovad, unsigned long pfn,
 		    unsigned long size);
-void queue_iova(struct iova_domain *iovad,
-		unsigned long pfn, unsigned long pages,
-		struct list_head *freelist);
 unsigned long alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 			      unsigned long limit_pfn, bool flush_rcache);
 struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
 	unsigned long pfn_hi);
 void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 	unsigned long start_pfn);
-int init_iova_flush_queue(struct iova_domain *iovad, struct iommu_domain *fq_domain);
 struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
 void put_iova_domain(struct iova_domain *iovad);
 #else
-- 
2.28.0.dirty


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 9/9] iommu: Move flush queue data into iommu_dma_cookie
  2021-11-23 14:10 ` [PATCH 9/9] iommu: Move flush queue data into iommu_dma_cookie Robin Murphy
@ 2021-11-23 22:40   ` kernel test robot
  2021-11-24 17:25   ` John Garry
  1 sibling, 0 replies; 19+ messages in thread
From: kernel test robot @ 2021-11-23 22:40 UTC (permalink / raw)
  To: Robin Murphy, joro, will
  Cc: kbuild-all, iommu, suravee.suthikulpanit, baolu.lu, willy,
	linux-kernel, john.garry

Hi Robin,

I love your patch! Perhaps something to improve:

[auto build test WARNING on joro-iommu/next]
[also build test WARNING on v5.16-rc2 next-20211123]
[cannot apply to tegra-drm/drm/tegra/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Robin-Murphy/iommu-Refactor-flush-queues-into-iommu-dma/20211123-221220
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: arm-defconfig (https://download.01.org/0day-ci/archive/20211124/202111240645.30neUyaq-lkp@intel.com/config.gz)
compiler: arm-linux-gnueabi-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/d4623bb02366503fa7c3805228fa9534c9490d20
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Robin-Murphy/iommu-Refactor-flush-queues-into-iommu-dma/20211123-221220
        git checkout d4623bb02366503fa7c3805228fa9534c9490d20
        # save the config file to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=arm 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/tegra/vic.c: In function 'vic_exit':
   drivers/gpu/drm/tegra/vic.c:196:17: error: implicit declaration of function 'dma_unmap_single'; did you mean 'mount_single'? [-Werror=implicit-function-declaration]
     196 |                 dma_unmap_single(vic->dev, vic->falcon.firmware.phys,
         |                 ^~~~~~~~~~~~~~~~
         |                 mount_single
   drivers/gpu/drm/tegra/vic.c:197:61: error: 'DMA_TO_DEVICE' undeclared (first use in this function); did you mean 'MT_DEVICE'?
     197 |                                  vic->falcon.firmware.size, DMA_TO_DEVICE);
         |                                                             ^~~~~~~~~~~~~
         |                                                             MT_DEVICE
   drivers/gpu/drm/tegra/vic.c:197:61: note: each undeclared identifier is reported only once for each function it appears in
   drivers/gpu/drm/tegra/vic.c:202:17: error: implicit declaration of function 'dma_free_coherent' [-Werror=implicit-function-declaration]
     202 |                 dma_free_coherent(vic->dev, vic->falcon.firmware.size,
         |                 ^~~~~~~~~~~~~~~~~
   drivers/gpu/drm/tegra/vic.c: In function 'vic_load_firmware':
   drivers/gpu/drm/tegra/vic.c:234:24: error: implicit declaration of function 'dma_alloc_coherent' [-Werror=implicit-function-declaration]
     234 |                 virt = dma_alloc_coherent(vic->dev, size, &iova, GFP_KERNEL);
         |                        ^~~~~~~~~~~~~~~~~~
>> drivers/gpu/drm/tegra/vic.c:234:22: warning: assignment to 'void *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
     234 |                 virt = dma_alloc_coherent(vic->dev, size, &iova, GFP_KERNEL);
         |                      ^
   drivers/gpu/drm/tegra/vic.c:236:23: error: implicit declaration of function 'dma_mapping_error' [-Werror=implicit-function-declaration]
     236 |                 err = dma_mapping_error(vic->dev, iova);
         |                       ^~~~~~~~~~~~~~~~~
   drivers/gpu/drm/tegra/vic.c:258:24: error: implicit declaration of function 'dma_map_single' [-Werror=implicit-function-declaration]
     258 |                 phys = dma_map_single(vic->dev, virt, size, DMA_TO_DEVICE);
         |                        ^~~~~~~~~~~~~~
   drivers/gpu/drm/tegra/vic.c:258:61: error: 'DMA_TO_DEVICE' undeclared (first use in this function); did you mean 'MT_DEVICE'?
     258 |                 phys = dma_map_single(vic->dev, virt, size, DMA_TO_DEVICE);
         |                                                             ^~~~~~~~~~~~~
         |                                                             MT_DEVICE
   drivers/gpu/drm/tegra/vic.c: In function 'vic_probe':
   drivers/gpu/drm/tegra/vic.c:412:15: error: implicit declaration of function 'dma_coerce_mask_and_coherent' [-Werror=implicit-function-declaration]
     412 |         err = dma_coerce_mask_and_coherent(dev, *dev->parent->dma_mask);
         |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +234 drivers/gpu/drm/tegra/vic.c

0ae797a8ba05a2 Arto Merilainen 2016-12-14  214  
77a0b09dd993c8 Thierry Reding  2019-02-01  215  static int vic_load_firmware(struct vic *vic)
77a0b09dd993c8 Thierry Reding  2019-02-01  216  {
20e7dce255e96a Thierry Reding  2019-10-28  217  	struct host1x_client *client = &vic->client.base;
20e7dce255e96a Thierry Reding  2019-10-28  218  	struct tegra_drm *tegra = vic->client.drm;
d972d624762805 Thierry Reding  2019-10-28  219  	dma_addr_t iova;
20e7dce255e96a Thierry Reding  2019-10-28  220  	size_t size;
20e7dce255e96a Thierry Reding  2019-10-28  221  	void *virt;
77a0b09dd993c8 Thierry Reding  2019-02-01  222  	int err;
77a0b09dd993c8 Thierry Reding  2019-02-01  223  
d972d624762805 Thierry Reding  2019-10-28  224  	if (vic->falcon.firmware.virt)
77a0b09dd993c8 Thierry Reding  2019-02-01  225  		return 0;
77a0b09dd993c8 Thierry Reding  2019-02-01  226  
77a0b09dd993c8 Thierry Reding  2019-02-01  227  	err = falcon_read_firmware(&vic->falcon, vic->config->firmware);
77a0b09dd993c8 Thierry Reding  2019-02-01  228  	if (err < 0)
20e7dce255e96a Thierry Reding  2019-10-28  229  		return err;
20e7dce255e96a Thierry Reding  2019-10-28  230  
20e7dce255e96a Thierry Reding  2019-10-28  231  	size = vic->falcon.firmware.size;
20e7dce255e96a Thierry Reding  2019-10-28  232  
20e7dce255e96a Thierry Reding  2019-10-28  233  	if (!client->group) {
d972d624762805 Thierry Reding  2019-10-28 @234  		virt = dma_alloc_coherent(vic->dev, size, &iova, GFP_KERNEL);
20e7dce255e96a Thierry Reding  2019-10-28  235  
d972d624762805 Thierry Reding  2019-10-28  236  		err = dma_mapping_error(vic->dev, iova);
20e7dce255e96a Thierry Reding  2019-10-28  237  		if (err < 0)
20e7dce255e96a Thierry Reding  2019-10-28  238  			return err;
20e7dce255e96a Thierry Reding  2019-10-28  239  	} else {
d972d624762805 Thierry Reding  2019-10-28  240  		virt = tegra_drm_alloc(tegra, size, &iova);
20e7dce255e96a Thierry Reding  2019-10-28  241  	}
20e7dce255e96a Thierry Reding  2019-10-28  242  
d972d624762805 Thierry Reding  2019-10-28  243  	vic->falcon.firmware.virt = virt;
d972d624762805 Thierry Reding  2019-10-28  244  	vic->falcon.firmware.iova = iova;
77a0b09dd993c8 Thierry Reding  2019-02-01  245  
77a0b09dd993c8 Thierry Reding  2019-02-01  246  	err = falcon_load_firmware(&vic->falcon);
77a0b09dd993c8 Thierry Reding  2019-02-01  247  	if (err < 0)
77a0b09dd993c8 Thierry Reding  2019-02-01  248  		goto cleanup;
77a0b09dd993c8 Thierry Reding  2019-02-01  249  
20e7dce255e96a Thierry Reding  2019-10-28  250  	/*
20e7dce255e96a Thierry Reding  2019-10-28  251  	 * In this case we have received an IOVA from the shared domain, so we
20e7dce255e96a Thierry Reding  2019-10-28  252  	 * need to make sure to get the physical address so that the DMA API
20e7dce255e96a Thierry Reding  2019-10-28  253  	 * knows what memory pages to flush the cache for.
20e7dce255e96a Thierry Reding  2019-10-28  254  	 */
20e7dce255e96a Thierry Reding  2019-10-28  255  	if (client->group) {
d972d624762805 Thierry Reding  2019-10-28  256  		dma_addr_t phys;
d972d624762805 Thierry Reding  2019-10-28  257  
20e7dce255e96a Thierry Reding  2019-10-28  258  		phys = dma_map_single(vic->dev, virt, size, DMA_TO_DEVICE);
20e7dce255e96a Thierry Reding  2019-10-28  259  
20e7dce255e96a Thierry Reding  2019-10-28  260  		err = dma_mapping_error(vic->dev, phys);
20e7dce255e96a Thierry Reding  2019-10-28  261  		if (err < 0)
20e7dce255e96a Thierry Reding  2019-10-28  262  			goto cleanup;
20e7dce255e96a Thierry Reding  2019-10-28  263  
d972d624762805 Thierry Reding  2019-10-28  264  		vic->falcon.firmware.phys = phys;
20e7dce255e96a Thierry Reding  2019-10-28  265  	}
20e7dce255e96a Thierry Reding  2019-10-28  266  
77a0b09dd993c8 Thierry Reding  2019-02-01  267  	return 0;
77a0b09dd993c8 Thierry Reding  2019-02-01  268  
77a0b09dd993c8 Thierry Reding  2019-02-01  269  cleanup:
20e7dce255e96a Thierry Reding  2019-10-28  270  	if (!client->group)
d972d624762805 Thierry Reding  2019-10-28  271  		dma_free_coherent(vic->dev, size, virt, iova);
20e7dce255e96a Thierry Reding  2019-10-28  272  	else
d972d624762805 Thierry Reding  2019-10-28  273  		tegra_drm_free(tegra, size, virt, iova);
20e7dce255e96a Thierry Reding  2019-10-28  274  
77a0b09dd993c8 Thierry Reding  2019-02-01  275  	return err;
77a0b09dd993c8 Thierry Reding  2019-02-01  276  }
77a0b09dd993c8 Thierry Reding  2019-02-01  277  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/9] gpu: host1x: Add missing DMA API include
  2021-11-23 14:10 ` [PATCH 1/9] gpu: host1x: Add missing DMA API include Robin Murphy
@ 2021-11-24 14:05   ` Robin Murphy
  2021-12-06 12:20     ` Joerg Roedel
  0 siblings, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2021-11-24 14:05 UTC (permalink / raw)
  To: Thierry Reding, Mikko Perttunen
  Cc: linux-kernel, willy, iommu, dri-devel, linux-tegra, joro, will

On 2021-11-23 14:10, Robin Murphy wrote:
> Host1x seems to be relying on picking up dma-mapping.h transitively from
> iova.h, which has no reason to include it in the first place. Fix the
> former issue before we totally break things by fixing the latter one.
> 
> CC: Thierry Reding <thierry.reding@gmail.com>
> CC: Mikko Perttunen <mperttunen@nvidia.com>
> CC: dri-devel@lists.freedesktop.org
> CC: linux-tegra@vger.kernel.org
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
> 
> Feel free to pick this into drm-misc-next or drm-misc-fixes straight
> away if that suits - it's only to avoid a build breakage once the rest
> of the series gets queued.

Bah, it seems tegra-vic needs the same treatment too, but it wasn't
enabled in my local config. Should I squash that into a respin of this
patch on the grounds of being vaguely related, or would you prefer it
separate?

(Either way I'll wait a little while to see if the buildbots uncover any 
more...)
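
For the record, the treatment would just be the same one-liner as patch
1, added alongside the existing includes at the top of
drivers/gpu/drm/tegra/vic.c (exact placement to be confirmed in the
respin):

	#include <linux/dma-mapping.h>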

Cheers,
Robin.

>   drivers/gpu/host1x/bus.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c
> index 218e3718fd68..881fad5c3307 100644
> --- a/drivers/gpu/host1x/bus.c
> +++ b/drivers/gpu/host1x/bus.c
> @@ -5,6 +5,7 @@
>    */
>   
>   #include <linux/debugfs.h>
> +#include <linux/dma-mapping.h>
>   #include <linux/host1x.h>
>   #include <linux/of.h>
>   #include <linux/seq_file.h>
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/9] iommu: Refactor flush queues into iommu-dma
  2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
                   ` (8 preceding siblings ...)
  2021-11-23 14:10 ` [PATCH 9/9] iommu: Move flush queue data into iommu_dma_cookie Robin Murphy
@ 2021-11-24 17:21 ` John Garry
  2021-11-24 18:33   ` Robin Murphy
  9 siblings, 1 reply; 19+ messages in thread
From: John Garry @ 2021-11-24 17:21 UTC (permalink / raw)
  To: Robin Murphy, joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel

On 23/11/2021 14:10, Robin Murphy wrote:
> As promised, this series cleans up the flush queue code and streamlines
> it directly into iommu-dma. Since we no longer have per-driver DMA ops
> implementations, a lot of the abstraction is now no longer necessary, so
> there's a nice degree of simplification in the process. Un-abstracting
> the queued page freeing mechanism is also the perfect opportunity to
> revise which struct page fields we use so we can be better-behaved
> from the MM point of view, thanks to Matthew.
> 
> These changes should also make it viable to start using the gather
> freelist in io-pgtable-arm, and eliminate some more synchronous
> invalidations from the normal flow there, but that is proving to need a
> bit more careful thought than I have time for in this cycle, so I've
> parked that again for now and will revisit it in the new year.
> 
> For convenience, branch at:
>    https://gitlab.arm.com/linux-arm/linux-rm/-/tree/iommu/iova
> 
> I've build-tested for x86_64, and boot-tested arm64 to the point of
> confirming that put_pages_list() gets passed a valid empty list when
> flushing, while everything else still works.
My interest is in patches 2, 3, 7, 8 and 9, and they look OK. I did a
bit of testing for strict and non-strict mode on my arm64 system and
saw no problems.

Apart from this, I noticed one possible optimization: avoiding so many
reads of fq_flush_finish_cnt. We have a pattern of
fq_flush_iotlb()->atomic64_inc(fq_flush_finish_cnt) followed by a read
of fq_flush_finish_cnt in fq_ring_free(), so we could use
atomic64_inc_return(fq_flush_finish_cnt) and reuse the returned value.
I think any races in the fq_flush_finish_cnt accesses are latent ones
rather than something this change would introduce, but maybe there is a
flaw in that reasoning. However, I tried something along these lines
(sketch below) and got a 2.4% throughput gain for my storage scenario.
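
Roughly what I tried, against the code as it ends up after patch 9
(only lightly tested on my setup, so treat it as a sketch rather than a
finished patch):

	static u64 fq_flush_iotlb(struct iommu_dma_cookie *cookie)
	{
		atomic64_inc(&cookie->fq_flush_start_cnt);
		cookie->fq_domain->ops->flush_iotlb_all(cookie->fq_domain);
		/* Hand the new count back so callers need not re-read it */
		return atomic64_inc_return(&cookie->fq_flush_finish_cnt);
	}

	static void fq_ring_free(struct iommu_dma_cookie *cookie,
				 struct iova_fq *fq, u64 counter)
	{
		unsigned int idx;

		assert_spin_locked(&fq->lock);

		fq_ring_for_each(idx, fq) {
			if (fq->entries[idx].counter >= counter)
				break;

			put_pages_list(&fq->entries[idx].freelist);
			free_iova_fast(&cookie->iovad,
				       fq->entries[idx].iova_pfn,
				       fq->entries[idx].pages);

			fq->head = (fq->head + 1) % IOVA_FQ_SIZE;
		}
	}

Callers which haven't just flushed (the first fq_ring_free() call in
queue_iova()) would simply pass atomic64_read(&cookie->fq_flush_finish_cnt)
as before; fq_flush_timeout() and the fq_full() path would pass the
value returned by fq_flush_iotlb().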

Thanks,
John

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 9/9] iommu: Move flush queue data into iommu_dma_cookie
  2021-11-23 14:10 ` [PATCH 9/9] iommu: Move flush queue data into iommu_dma_cookie Robin Murphy
  2021-11-23 22:40   ` kernel test robot
@ 2021-11-24 17:25   ` John Garry
  2021-11-24 18:00     ` Robin Murphy
  1 sibling, 1 reply; 19+ messages in thread
From: John Garry @ 2021-11-24 17:25 UTC (permalink / raw)
  To: Robin Murphy, joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel

On 23/11/2021 14:10, Robin Murphy wrote:
> ruct iommu_dma_msi_page {
>   	struct list_head	list;
> @@ -41,7 +43,19 @@ struct iommu_dma_cookie {
>   	enum iommu_dma_cookie_type	type;
>   	union {
>   		/* Full allocator for IOMMU_DMA_IOVA_COOKIE */
> -		struct iova_domain	iovad;
> +		struct {
> +			struct iova_domain	iovad;
> +
> +			struct iova_fq __percpu *fq;	/* Flush queue */
> +			/* Number of TLB flushes that have been started */
> +			atomic64_t		fq_flush_start_cnt;
> +			/* Number of TLB flushes that have been finished */
> +			atomic64_t		fq_flush_finish_cnt;
> +			/* Timer to regularily empty the flush queues */
> +			struct timer_list	fq_timer;
> +			/* 1 when timer is active, 0 when not */
> +			atomic_t		fq_timer_on;

I thought that putting all the FQ stuff in its own structure would be 
neater, but that's just personal preference.

Thanks,
John

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 9/9] iommu: Move flush queue data into iommu_dma_cookie
  2021-11-24 17:25   ` John Garry
@ 2021-11-24 18:00     ` Robin Murphy
  0 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-24 18:00 UTC (permalink / raw)
  To: John Garry, joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel

On 2021-11-24 17:25, John Garry wrote:
> On 23/11/2021 14:10, Robin Murphy wrote:
>> ruct iommu_dma_msi_page {
>>       struct list_head    list;
>> @@ -41,7 +43,19 @@ struct iommu_dma_cookie {
>>       enum iommu_dma_cookie_type    type;
>>       union {
>>           /* Full allocator for IOMMU_DMA_IOVA_COOKIE */
>> -        struct iova_domain    iovad;
>> +        struct {
>> +            struct iova_domain    iovad;
>> +
>> +            struct iova_fq __percpu *fq;    /* Flush queue */
>> +            /* Number of TLB flushes that have been started */
>> +            atomic64_t        fq_flush_start_cnt;
>> +            /* Number of TLB flushes that have been finished */
>> +            atomic64_t        fq_flush_finish_cnt;
>> +            /* Timer to regularily empty the flush queues */
>> +            struct timer_list    fq_timer;
>> +            /* 1 when timer is active, 0 when not */
>> +            atomic_t        fq_timer_on;
> 
> I thought that putting all the FQ stuff in its own structure would be 
> neater, but that's just personal preference.

But look, it is! ;)

The iova_domain is still a fundamental part of the flush queue built 
around it; the rest of the machinery can't stand in isolation. It's just 
an anonymous structure because I don't feel like needlessly cluttering 
up the code with "cookie->fq.fq" silliness.
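
For comparison, a named sub-structure would end up looking something
like this (purely illustrative, not something I'm proposing):

	struct {
		struct iova_fq __percpu	*fq;	/* Flush queue */
		atomic64_t		fq_flush_start_cnt;
		atomic64_t		fq_flush_finish_cnt;
		struct timer_list	fq_timer;
		atomic_t		fq_timer_on;
	} fq;

at which point every access in the code turns into cookie->fq.fq,
cookie->fq.fq_timer_on and so on, for no real gain.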

Cheers,
Robin.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/9] iommu: Refactor flush queues into iommu-dma
  2021-11-24 17:21 ` [PATCH 0/9] iommu: Refactor flush queues into iommu-dma John Garry
@ 2021-11-24 18:33   ` Robin Murphy
  0 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2021-11-24 18:33 UTC (permalink / raw)
  To: John Garry, joro, will
  Cc: iommu, suravee.suthikulpanit, baolu.lu, willy, linux-kernel

On 2021-11-24 17:21, John Garry wrote:
> On 23/11/2021 14:10, Robin Murphy wrote:
>> As promised, this series cleans up the flush queue code and streamlines
>> it directly into iommu-dma. Since we no longer have per-driver DMA ops
>> implementations, a lot of the abstraction is now no longer necessary, so
>> there's a nice degree of simplification in the process. Un-abstracting
>> the queued page freeing mechanism is also the perfect opportunity to
>> revise which struct page fields we use so we can be better-behaved
>> from the MM point of view, thanks to Matthew.
>>
>> These changes should also make it viable to start using the gather
>> freelist in io-pgtable-arm, and eliminate some more synchronous
>> invalidations from the normal flow there, but that is proving to need a
>> bit more careful thought than I have time for in this cycle, so I've
>> parked that again for now and will revisit it in the new year.
>>
>> For convenience, branch at:
>>    https://gitlab.arm.com/linux-arm/linux-rm/-/tree/iommu/iova
>>
>> I've build-tested for x86_64, and boot-tested arm64 to the point of
>> confirming that put_pages_list() gets passed a valid empty list when
>> flushing, while everything else still works.
> My interest is in patches 2, 3, 7, 8, 9, and they look ok. I did a bit 
> of testing for strict and non-strict mode on my arm64 system and no 
> problems.
> 
> Apart from this, I noticed that one possible optimization could be to 
> avoid so many reads of fq_flush_finish_cnt, as we seem to have a pattern 
> of fq_flush_iotlb()->atomic64_inc(fq_flush_finish_cnt) followed by a 
> read of fq_flush_finish_cnt in fq_ring_free(), so we could use 
> atomic64_inc_return(fq_flush_finish_cnt) and reuse the value. I think 
> that any racing in fq_flush_finish_cnt accesses are latent, but maybe 
> there is a flaw in this. However I tried something along these lines and 
> got a 2.4% throughput gain for my storage scenario.

Yes, that sounds reasonable - off-hand I can't see any more potential
for harmful races either. The only thing that jumps out is the case
where the flush count gets bumped via queue_iova() while another CPU is
already running fq_flush_timeout(): freeing the newer IOVAs added since
the timeout is then more likely to be left to the local CPU, or
postponed until the next flush cycle entirely, rather than being piled
on to the CPU already processing the for_each_possible_cpu() loop. And
I can't help thinking that could only be a *good* thing, given how the
FQ timeout seems to be a smoking gun in your "performance never
recovers after falling off the cliff" scenario :)

Robin.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/9] gpu: host1x: Add missing DMA API include
  2021-11-24 14:05   ` Robin Murphy
@ 2021-12-06 12:20     ` Joerg Roedel
  0 siblings, 0 replies; 19+ messages in thread
From: Joerg Roedel @ 2021-12-06 12:20 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Thierry Reding, Mikko Perttunen, linux-kernel, willy, iommu,
	dri-devel, linux-tegra, will

Hi Robin,

On Wed, Nov 24, 2021 at 02:05:15PM +0000, Robin Murphy wrote:
> Bah, seems like tegra-vic needs the same treatment too, but wasn't in my
> local config. Should I squash that into a respin of this patch on the
> grounds of being vaguely related, or would you prefer it separate?

In case this fix gets queued in the iommu-tree too, please put it all in
one patch.

Thanks,

	Joerg

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/9] iommu/amd: Simplify pagetable freeing
  2021-11-23 14:10 ` [PATCH 4/9] iommu/amd: Simplify pagetable freeing Robin Murphy
@ 2021-12-06 12:40   ` Joerg Roedel
  2021-12-06 13:28     ` Robin Murphy
  0 siblings, 1 reply; 19+ messages in thread
From: Joerg Roedel @ 2021-12-06 12:40 UTC (permalink / raw)
  To: Robin Murphy
  Cc: will, iommu, suravee.suthikulpanit, baolu.lu, willy,
	linux-kernel, john.garry

On Tue, Nov 23, 2021 at 02:10:39PM +0000, Robin Murphy wrote:
> For reasons unclear, pagetable freeing is an effectively recursive
> method implemented via an elaborate system of templated functions that
> turns out to account for 25% of the object file size. Implementing it
> using regular straightforward recursion makes the code simpler, and
> seems like a good thing to do before we work on it further. As part of
> that, also fix the types to avoid all the needless casting back and
> forth which just gets in the way.

Nice cleanup! The stack of functions came from the fact that recursion
was pretty much discouraged in the kernel. But in this case it looks
well bounded and should be fine.

> +static struct page *free_pt_lvl(u64 *pt, struct page *freelist, int lvl)
> +{
> +	u64 *p;
> +	int i;
> +
> +	for (i = 0; i < 512; ++i) {
> +		/* PTE present? */
> +		if (!IOMMU_PTE_PRESENT(pt[i]))
> +			continue;
> +
> +		/* Large PTE? */
> +		if (PM_PTE_LEVEL(pt[i]) == 0 ||
> +		    PM_PTE_LEVEL(pt[i]) == 7)
> +			continue;
> +
> +		p = IOMMU_PTE_PAGE(pt[i]);
> +		if (lvl > 2)

I think this function deserves a couple of comments. It took me a while
to make sense of the 'lvl > 2' comparison. I think it is right, but if
I have to think again I'd appreciate a comment :)

Regards,

	Joerg

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/9] iommu/amd: Simplify pagetable freeing
  2021-12-06 12:40   ` Joerg Roedel
@ 2021-12-06 13:28     ` Robin Murphy
  0 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2021-12-06 13:28 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: will, iommu, suravee.suthikulpanit, baolu.lu, willy,
	linux-kernel, john.garry

On 2021-12-06 12:40, Joerg Roedel wrote:
> On Tue, Nov 23, 2021 at 02:10:39PM +0000, Robin Murphy wrote:
>> For reasons unclear, pagetable freeing is an effectively recursive
>> method implemented via an elaborate system of templated functions that
>> turns out to account for 25% of the object file size. Implementing it
>> using regular straightforward recursion makes the code simpler, and
>> seems like a good thing to do before we work on it further. As part of
>> that, also fix the types to avoid all the needless casting back and
>> forth which just gets in the way.
> 
> Nice cleanup! The stack of functions came from the fact that recursion
> was pretty much discouraged in the kernel. But in this case it looks
> well bounded and should be fine.

I did wonder about explicitly clamping lvl to ensure that it couldn't 
possibly recurse any further than the multi-function version, but given 
that you'd need to craft a suitable bogus pagetable in addition to 
corrupting the arguments to be able to exploit it at all, that seemed 
perhaps a little too paranoid. Happy to add something like:

	if (WARN_ON(lvl > PAGE_MODE_7_LEVEL))
		return NULL;

if you like, though.

>> +static struct page *free_pt_lvl(u64 *pt, struct page *freelist, int lvl)
>> +{
>> +	u64 *p;
>> +	int i;
>> +
>> +	for (i = 0; i < 512; ++i) {
>> +		/* PTE present? */
>> +		if (!IOMMU_PTE_PRESENT(pt[i]))
>> +			continue;
>> +
>> +		/* Large PTE? */
>> +		if (PM_PTE_LEVEL(pt[i]) == 0 ||
>> +		    PM_PTE_LEVEL(pt[i]) == 7)
>> +			continue;
>> +
>> +		p = IOMMU_PTE_PAGE(pt[i]);
>> +		if (lvl > 2)
> 
> I think this function deserves a couple of comments. It took me a while
> to make sense of the 'lvl > 2' comparison. I think it is right, but if
> I have to think again I'd appreciate a comment :)

Heh, it's merely a direct transformation of the logic encoded in the 
existing "DEFINE_FREE_PT_FN(...)" cases - I assume that's just an 
optimisation, so I'll add a comment to that effect.
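
Something along these lines above the check, perhaps (rough wording, to
be polished in the respin):

	/*
	 * A level-1 table's PTEs only point at mapped data pages, not
	 * at further tables, so once p is a level-1 table (i.e. when
	 * lvl == 2) there is nothing below it left to collect - only
	 * deeper tables need the recursive treatment.
	 */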

Thanks,
Robin.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-12-06 13:28 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-23 14:10 [PATCH 0/9] iommu: Refactor flush queues into iommu-dma Robin Murphy
2021-11-23 14:10 ` [PATCH 1/9] gpu: host1x: Add missing DMA API include Robin Murphy
2021-11-24 14:05   ` Robin Murphy
2021-12-06 12:20     ` Joerg Roedel
2021-11-23 14:10 ` [PATCH 2/9] iommu/iova: Squash entry_dtor abstraction Robin Murphy
2021-11-23 14:10 ` [PATCH 3/9] iommu/iova: Squash flush_cb abstraction Robin Murphy
2021-11-23 14:10 ` [PATCH 4/9] iommu/amd: Simplify pagetable freeing Robin Murphy
2021-12-06 12:40   ` Joerg Roedel
2021-12-06 13:28     ` Robin Murphy
2021-11-23 14:10 ` [PATCH 5/9] iommu/amd: Use put_pages_list Robin Murphy
2021-11-23 14:10 ` [PATCH 6/9] iommu/vt-d: " Robin Murphy
2021-11-23 14:10 ` [PATCH 7/9] iommu/iova: Consolidate flush queue code Robin Murphy
2021-11-23 14:10 ` [PATCH 8/9] iommu/iova: Move flush queue code to iommu-dma Robin Murphy
2021-11-23 14:10 ` [PATCH 9/9] iommu: Move flush queue data into iommu_dma_cookie Robin Murphy
2021-11-23 22:40   ` kernel test robot
2021-11-24 17:25   ` John Garry
2021-11-24 18:00     ` Robin Murphy
2021-11-24 17:21 ` [PATCH 0/9] iommu: Refactor flush queues into iommu-dma John Garry
2021-11-24 18:33   ` Robin Murphy
