linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/3] iommu/iova: Solve longterm IOVA issue
@ 2020-12-09 18:23 John Garry
  2020-12-09 18:23 ` [PATCH v4 1/3] iommu/iova: Add free_all_cpu_cached_iovas() John Garry
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: John Garry @ 2020-12-09 18:23 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, linux-kernel, linuxarm, robin.murphy, thunder.leizhen, John Garry

This series contains a patch to solve the longterm IOVA issue which
leizhen originally tried to address at [0].

A sieved kernel log is at the following, showing periodic dumps of IOVA
sizes, per CPU and per depot bin, per IOVA size granule:
https://raw.githubusercontent.com/hisilicon/kernel-dev/topic-iommu-5.10-iova-debug-v3/aging_test

Notice, for example, the following logs:
[13175.355584] print_iova1 cpu_total=40135 depot_total=3866 total=44001
[83483.457858] print_iova1 cpu_total=62532 depot_total=24476 total=87008

Where total IOVA rcache size has grown from 44K->87K over a long time.

Along with this patch, I included the following:
- A smaller helper to clear all IOVAs for a domain
- Change polarity of the IOVA magazine helpers

Differences to v3:
- Drop Cong's patch, already accepted
- Add tags
- Rebased

Differnces to v2:
- Update commit message for patch 3/4

Differences to v1:
- Add IOVA clearing helper
- Add patch to change polarity of mag helpers
- Avoid logically-redundant extra variable in __iova_rcache_insert()

[0] https://lore.kernel.org/linux-iommu/20190815121104.29140-3-thunder.leizhen@huawei.com/
[1] https://lore.kernel.org/linux-iommu/4b74d40a-22d1-af53-fcb6-5d70183705a8@huawei.com/

John Garry (3):
  iommu/iova: Add free_all_cpu_cached_iovas()
  iommu/iova: Avoid double-negatives in magazine helpers
  iommu/iova: Flush CPU rcache for when a depot fills

 drivers/iommu/iova.c | 58 ++++++++++++++++++++++++--------------------
 1 file changed, 32 insertions(+), 26 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 1/3] iommu/iova: Add free_all_cpu_cached_iovas()
  2020-12-09 18:23 [PATCH v4 0/3] iommu/iova: Solve longterm IOVA issue John Garry
@ 2020-12-09 18:23 ` John Garry
  2021-01-15 17:28   ` Jean-Philippe Brucker
  2020-12-09 18:23 ` [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers John Garry
  2020-12-09 18:23 ` [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills John Garry
  2 siblings, 1 reply; 15+ messages in thread
From: John Garry @ 2020-12-09 18:23 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, linux-kernel, linuxarm, robin.murphy, thunder.leizhen, John Garry

Add a helper function to free the CPU rcache for all online CPUs.

There also exists a function of the same name in
drivers/iommu/intel/iommu.c, but the parameters are different, and there
should be no conflict.

Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/iova.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index f9c35852018d..cf1aacda2fe4 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -238,6 +238,14 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	return -ENOMEM;
 }
 
+static void free_all_cpu_cached_iovas(struct iova_domain *iovad)
+{
+	unsigned int cpu;
+
+	for_each_online_cpu(cpu)
+		free_cpu_cached_iovas(cpu, iovad);
+}
+
 static struct kmem_cache *iova_cache;
 static unsigned int iova_cache_users;
 static DEFINE_MUTEX(iova_cache_mutex);
@@ -435,15 +443,12 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 retry:
 	new_iova = alloc_iova(iovad, size, limit_pfn, true);
 	if (!new_iova) {
-		unsigned int cpu;
-
 		if (!flush_rcache)
 			return 0;
 
 		/* Try replenishing IOVAs by flushing rcache. */
 		flush_rcache = false;
-		for_each_online_cpu(cpu)
-			free_cpu_cached_iovas(cpu, iovad);
+		free_all_cpu_cached_iovas(iovad);
 		free_global_cached_iovas(iovad);
 		goto retry;
 	}
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers
  2020-12-09 18:23 [PATCH v4 0/3] iommu/iova: Solve longterm IOVA issue John Garry
  2020-12-09 18:23 ` [PATCH v4 1/3] iommu/iova: Add free_all_cpu_cached_iovas() John Garry
@ 2020-12-09 18:23 ` John Garry
  2021-01-15 17:30   ` Jean-Philippe Brucker
  2020-12-09 18:23 ` [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills John Garry
  2 siblings, 1 reply; 15+ messages in thread
From: John Garry @ 2020-12-09 18:23 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, linux-kernel, linuxarm, robin.murphy, thunder.leizhen, John Garry

A similar crash to the following could be observed if initial CPU rcache
magazine allocations fail in init_iova_rcaches():

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:

  free_iova_fast+0xfc/0x280
  iommu_dma_free_iova+0x64/0x70
  __iommu_dma_unmap+0x9c/0xf8
  iommu_dma_unmap_sg+0xa8/0xc8
  dma_unmap_sg_attrs+0x28/0x50
  cq_thread_v3_hw+0x2dc/0x528
  irq_thread_fn+0x2c/0xa0
  irq_thread+0x130/0x1e0
  kthread+0x154/0x158
  ret_from_fork+0x10/0x34

The issue is that expression !iova_magazine_full(NULL) evaluates true; this
falls over in __iova_rcache_insert() when we attempt to cache a mag and
cpu_rcache->loaded == NULL:

if (!iova_magazine_full(cpu_rcache->loaded)) {
	can_insert = true;
...

if (can_insert)
	iova_magazine_push(cpu_rcache->loaded, iova_pfn);

As above, can_insert is evaluated true, which it shouldn't be, and we try
to insert pfns in a NULL mag, which is not safe.

To avoid this, stop using double-negatives, like !iova_magazine_full() and
!iova_magazine_empty(), and use positive tests, like
iova_magazine_has_space() and iova_magazine_has_pfns(), respectively; these
can safely deal with cpu_rcache->{loaded, prev} = NULL.

Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/iova.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index cf1aacda2fe4..732ee687e0e2 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -767,14 +767,18 @@ iova_magazine_free_pfns(struct iova_magazine *mag, struct iova_domain *iovad)
 	mag->size = 0;
 }
 
-static bool iova_magazine_full(struct iova_magazine *mag)
+static bool iova_magazine_has_space(struct iova_magazine *mag)
 {
-	return (mag && mag->size == IOVA_MAG_SIZE);
+	if (!mag)
+		return false;
+	return mag->size < IOVA_MAG_SIZE;
 }
 
-static bool iova_magazine_empty(struct iova_magazine *mag)
+static bool iova_magazine_has_pfns(struct iova_magazine *mag)
 {
-	return (!mag || mag->size == 0);
+	if (!mag)
+		return false;
+	return mag->size;
 }
 
 static unsigned long iova_magazine_pop(struct iova_magazine *mag,
@@ -783,7 +787,7 @@ static unsigned long iova_magazine_pop(struct iova_magazine *mag,
 	int i;
 	unsigned long pfn;
 
-	BUG_ON(iova_magazine_empty(mag));
+	BUG_ON(!iova_magazine_has_pfns(mag));
 
 	/* Only fall back to the rbtree if we have no suitable pfns at all */
 	for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
@@ -799,7 +803,7 @@ static unsigned long iova_magazine_pop(struct iova_magazine *mag,
 
 static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
 {
-	BUG_ON(iova_magazine_full(mag));
+	BUG_ON(!iova_magazine_has_space(mag));
 
 	mag->pfns[mag->size++] = pfn;
 }
@@ -845,9 +849,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 	cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
 	spin_lock_irqsave(&cpu_rcache->lock, flags);
 
-	if (!iova_magazine_full(cpu_rcache->loaded)) {
+	if (iova_magazine_has_space(cpu_rcache->loaded)) {
 		can_insert = true;
-	} else if (!iova_magazine_full(cpu_rcache->prev)) {
+	} else if (iova_magazine_has_space(cpu_rcache->prev)) {
 		swap(cpu_rcache->prev, cpu_rcache->loaded);
 		can_insert = true;
 	} else {
@@ -856,8 +860,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 		if (new_mag) {
 			spin_lock(&rcache->lock);
 			if (rcache->depot_size < MAX_GLOBAL_MAGS) {
-				rcache->depot[rcache->depot_size++] =
-						cpu_rcache->loaded;
+				if (cpu_rcache->loaded)
+					rcache->depot[rcache->depot_size++] =
+							cpu_rcache->loaded;
 			} else {
 				mag_to_free = cpu_rcache->loaded;
 			}
@@ -908,9 +913,9 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
 	cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
 	spin_lock_irqsave(&cpu_rcache->lock, flags);
 
-	if (!iova_magazine_empty(cpu_rcache->loaded)) {
+	if (iova_magazine_has_pfns(cpu_rcache->loaded)) {
 		has_pfn = true;
-	} else if (!iova_magazine_empty(cpu_rcache->prev)) {
+	} else if (iova_magazine_has_pfns(cpu_rcache->prev)) {
 		swap(cpu_rcache->prev, cpu_rcache->loaded);
 		has_pfn = true;
 	} else {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills
  2020-12-09 18:23 [PATCH v4 0/3] iommu/iova: Solve longterm IOVA issue John Garry
  2020-12-09 18:23 ` [PATCH v4 1/3] iommu/iova: Add free_all_cpu_cached_iovas() John Garry
  2020-12-09 18:23 ` [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers John Garry
@ 2020-12-09 18:23 ` John Garry
  2021-01-15 17:32   ` Jean-Philippe Brucker
  2 siblings, 1 reply; 15+ messages in thread
From: John Garry @ 2020-12-09 18:23 UTC (permalink / raw)
  To: joro, will
  Cc: iommu, linux-kernel, linuxarm, robin.murphy, thunder.leizhen, John Garry

Leizhen reported some time ago that IOVA performance may degrade over time
[0], but unfortunately his solution to fix this problem was not given
attention.

To summarize, the issue is that as time goes by, the CPU rcache and depot
rcache continue to grow. As such, IOVA RB tree access time also continues
to grow.

At a certain point, a depot may become full, and also some CPU rcaches may
also be full when inserting another IOVA is attempted. For this scenario,
currently the "loaded" CPU rcache is freed and a new one is created. This
freeing means that many IOVAs in the RB tree need to be freed, which
makes IO throughput performance fall off a cliff in some storage scenarios:

Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6314MB/0KB/0KB /s] [1616K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [5669MB/0KB/0KB /s] [1451K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6673MB/0KB/0KB /s] [1708K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6761MB/0KB/0KB /s] [1731K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6685MB/0KB/0KB /s] [1711K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6178MB/0KB/0KB /s] [1582K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6731MB/0KB/0KB /s] [1723K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [2387MB/0KB/0KB /s] [611K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [2689MB/0KB/0KB /s] [688K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [2278MB/0KB/0KB /s] [583K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [1288MB/0KB/0KB /s] [330K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [1632MB/0KB/0KB /s] [418K/0/0 iops]
Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [1765MB/0KB/0KB /s] [452K/0/0 iops]

And continue in this fashion, without recovering. Note that in this
example it was required to wait 16 hours for this to occur. Also note that
IO throughput also becomes gradually becomes more unstable leading up to
this point.

This problem is only seen for non-strict mode. For strict mode, the rcaches
stay quite compact.

As a solution to this issue, judge that the IOVA caches have grown too big
when cached magazines need to be free, and just flush all the CPUs rcaches
instead.

The depot rcaches, however, are not flushed, as they can be used to
immediately replenish active CPUs.

In future, some IOVA compaction could be implemented to solve the
instability issue, which I figure could be quite complex to implement.

[0] https://lore.kernel.org/linux-iommu/20190815121104.29140-3-thunder.leizhen@huawei.com/

Analyzed-by: Zhen Lei <thunder.leizhen@huawei.com>
Reported-by: Xiang Chen <chenxiang66@hisilicon.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/iova.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 732ee687e0e2..39b7488de8bb 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -841,7 +841,6 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 				 struct iova_rcache *rcache,
 				 unsigned long iova_pfn)
 {
-	struct iova_magazine *mag_to_free = NULL;
 	struct iova_cpu_rcache *cpu_rcache;
 	bool can_insert = false;
 	unsigned long flags;
@@ -863,13 +862,12 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 				if (cpu_rcache->loaded)
 					rcache->depot[rcache->depot_size++] =
 							cpu_rcache->loaded;
-			} else {
-				mag_to_free = cpu_rcache->loaded;
+				can_insert = true;
+				cpu_rcache->loaded = new_mag;
 			}
 			spin_unlock(&rcache->lock);
-
-			cpu_rcache->loaded = new_mag;
-			can_insert = true;
+			if (!can_insert)
+				iova_magazine_free(new_mag);
 		}
 	}
 
@@ -878,10 +876,8 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
 
 	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
 
-	if (mag_to_free) {
-		iova_magazine_free_pfns(mag_to_free, iovad);
-		iova_magazine_free(mag_to_free);
-	}
+	if (!can_insert)
+		free_all_cpu_cached_iovas(iovad);
 
 	return can_insert;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/3] iommu/iova: Add free_all_cpu_cached_iovas()
  2020-12-09 18:23 ` [PATCH v4 1/3] iommu/iova: Add free_all_cpu_cached_iovas() John Garry
@ 2021-01-15 17:28   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 15+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-15 17:28 UTC (permalink / raw)
  To: John Garry; +Cc: joro, will, linux-kernel, linuxarm, iommu, robin.murphy

On Thu, Dec 10, 2020 at 02:23:07AM +0800, John Garry wrote:
> Add a helper function to free the CPU rcache for all online CPUs.
> 
> There also exists a function of the same name in
> drivers/iommu/intel/iommu.c, but the parameters are different, and there
> should be no conflict.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>

Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

(unless we find a better solution for patch 3)

> ---
>  drivers/iommu/iova.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index f9c35852018d..cf1aacda2fe4 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -238,6 +238,14 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>  	return -ENOMEM;
>  }
>  
> +static void free_all_cpu_cached_iovas(struct iova_domain *iovad)
> +{
> +	unsigned int cpu;
> +
> +	for_each_online_cpu(cpu)
> +		free_cpu_cached_iovas(cpu, iovad);
> +}
> +
>  static struct kmem_cache *iova_cache;
>  static unsigned int iova_cache_users;
>  static DEFINE_MUTEX(iova_cache_mutex);
> @@ -435,15 +443,12 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
>  retry:
>  	new_iova = alloc_iova(iovad, size, limit_pfn, true);
>  	if (!new_iova) {
> -		unsigned int cpu;
> -
>  		if (!flush_rcache)
>  			return 0;
>  
>  		/* Try replenishing IOVAs by flushing rcache. */
>  		flush_rcache = false;
> -		for_each_online_cpu(cpu)
> -			free_cpu_cached_iovas(cpu, iovad);
> +		free_all_cpu_cached_iovas(iovad);
>  		free_global_cached_iovas(iovad);
>  		goto retry;
>  	}
> -- 
> 2.26.2
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers
  2020-12-09 18:23 ` [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers John Garry
@ 2021-01-15 17:30   ` Jean-Philippe Brucker
  2021-01-18  9:24     ` John Garry
  0 siblings, 1 reply; 15+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-15 17:30 UTC (permalink / raw)
  To: John Garry; +Cc: joro, will, linux-kernel, linuxarm, iommu, robin.murphy

On Thu, Dec 10, 2020 at 02:23:08AM +0800, John Garry wrote:
> A similar crash to the following could be observed if initial CPU rcache
> magazine allocations fail in init_iova_rcaches():

Any idea why that's happening?  This fix seems ok but if we're expecting
allocation failures for the loaded magazine then we could easily get it
for cpu_rcaches too, and get a similar abort at runtime.

Thanks,
Jean

> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> Mem abort info:
> 
>   free_iova_fast+0xfc/0x280
>   iommu_dma_free_iova+0x64/0x70
>   __iommu_dma_unmap+0x9c/0xf8
>   iommu_dma_unmap_sg+0xa8/0xc8
>   dma_unmap_sg_attrs+0x28/0x50
>   cq_thread_v3_hw+0x2dc/0x528
>   irq_thread_fn+0x2c/0xa0
>   irq_thread+0x130/0x1e0
>   kthread+0x154/0x158
>   ret_from_fork+0x10/0x34
> 
> The issue is that expression !iova_magazine_full(NULL) evaluates true; this
> falls over in __iova_rcache_insert() when we attempt to cache a mag and
> cpu_rcache->loaded == NULL:
> 
> if (!iova_magazine_full(cpu_rcache->loaded)) {
> 	can_insert = true;
> ...
> 
> if (can_insert)
> 	iova_magazine_push(cpu_rcache->loaded, iova_pfn);
> 
> As above, can_insert is evaluated true, which it shouldn't be, and we try
> to insert pfns in a NULL mag, which is not safe.
> 
> To avoid this, stop using double-negatives, like !iova_magazine_full() and
> !iova_magazine_empty(), and use positive tests, like
> iova_magazine_has_space() and iova_magazine_has_pfns(), respectively; these
> can safely deal with cpu_rcache->{loaded, prev} = NULL.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>

> ---
>  drivers/iommu/iova.c | 29 +++++++++++++++++------------
>  1 file changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index cf1aacda2fe4..732ee687e0e2 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -767,14 +767,18 @@ iova_magazine_free_pfns(struct iova_magazine *mag, struct iova_domain *iovad)
>  	mag->size = 0;
>  }
>  
> -static bool iova_magazine_full(struct iova_magazine *mag)
> +static bool iova_magazine_has_space(struct iova_magazine *mag)
>  {
> -	return (mag && mag->size == IOVA_MAG_SIZE);
> +	if (!mag)
> +		return false;
> +	return mag->size < IOVA_MAG_SIZE;
>  }
>  
> -static bool iova_magazine_empty(struct iova_magazine *mag)
> +static bool iova_magazine_has_pfns(struct iova_magazine *mag)
>  {
> -	return (!mag || mag->size == 0);
> +	if (!mag)
> +		return false;
> +	return mag->size;
>  }
>  
>  static unsigned long iova_magazine_pop(struct iova_magazine *mag,
> @@ -783,7 +787,7 @@ static unsigned long iova_magazine_pop(struct iova_magazine *mag,
>  	int i;
>  	unsigned long pfn;
>  
> -	BUG_ON(iova_magazine_empty(mag));
> +	BUG_ON(!iova_magazine_has_pfns(mag));
>  
>  	/* Only fall back to the rbtree if we have no suitable pfns at all */
>  	for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
> @@ -799,7 +803,7 @@ static unsigned long iova_magazine_pop(struct iova_magazine *mag,
>  
>  static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
>  {
> -	BUG_ON(iova_magazine_full(mag));
> +	BUG_ON(!iova_magazine_has_space(mag));
>  
>  	mag->pfns[mag->size++] = pfn;
>  }
> @@ -845,9 +849,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>  	cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
>  	spin_lock_irqsave(&cpu_rcache->lock, flags);
>  
> -	if (!iova_magazine_full(cpu_rcache->loaded)) {
> +	if (iova_magazine_has_space(cpu_rcache->loaded)) {
>  		can_insert = true;
> -	} else if (!iova_magazine_full(cpu_rcache->prev)) {
> +	} else if (iova_magazine_has_space(cpu_rcache->prev)) {
>  		swap(cpu_rcache->prev, cpu_rcache->loaded);
>  		can_insert = true;
>  	} else {
> @@ -856,8 +860,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>  		if (new_mag) {
>  			spin_lock(&rcache->lock);
>  			if (rcache->depot_size < MAX_GLOBAL_MAGS) {
> -				rcache->depot[rcache->depot_size++] =
> -						cpu_rcache->loaded;
> +				if (cpu_rcache->loaded)
> +					rcache->depot[rcache->depot_size++] =
> +							cpu_rcache->loaded;
>  			} else {
>  				mag_to_free = cpu_rcache->loaded;
>  			}
> @@ -908,9 +913,9 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
>  	cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
>  	spin_lock_irqsave(&cpu_rcache->lock, flags);
>  
> -	if (!iova_magazine_empty(cpu_rcache->loaded)) {
> +	if (iova_magazine_has_pfns(cpu_rcache->loaded)) {
>  		has_pfn = true;
> -	} else if (!iova_magazine_empty(cpu_rcache->prev)) {
> +	} else if (iova_magazine_has_pfns(cpu_rcache->prev)) {
>  		swap(cpu_rcache->prev, cpu_rcache->loaded);
>  		has_pfn = true;
>  	} else {
> -- 
> 2.26.2
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills
  2020-12-09 18:23 ` [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills John Garry
@ 2021-01-15 17:32   ` Jean-Philippe Brucker
  2021-01-15 19:21     ` Robin Murphy
  0 siblings, 1 reply; 15+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-15 17:32 UTC (permalink / raw)
  To: John Garry; +Cc: joro, will, linux-kernel, linuxarm, iommu, robin.murphy

On Thu, Dec 10, 2020 at 02:23:09AM +0800, John Garry wrote:
> Leizhen reported some time ago that IOVA performance may degrade over time
> [0], but unfortunately his solution to fix this problem was not given
> attention.
> 
> To summarize, the issue is that as time goes by, the CPU rcache and depot
> rcache continue to grow. As such, IOVA RB tree access time also continues
> to grow.
> 
> At a certain point, a depot may become full, and also some CPU rcaches may
> also be full when inserting another IOVA is attempted. For this scenario,
> currently the "loaded" CPU rcache is freed and a new one is created. This
> freeing means that many IOVAs in the RB tree need to be freed, which
> makes IO throughput performance fall off a cliff in some storage scenarios:
> 
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6314MB/0KB/0KB /s] [1616K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [5669MB/0KB/0KB /s] [1451K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6673MB/0KB/0KB /s] [1708K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6761MB/0KB/0KB /s] [1731K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6685MB/0KB/0KB /s] [1711K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6178MB/0KB/0KB /s] [1582K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6731MB/0KB/0KB /s] [1723K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [2387MB/0KB/0KB /s] [611K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [2689MB/0KB/0KB /s] [688K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [2278MB/0KB/0KB /s] [583K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [1288MB/0KB/0KB /s] [330K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [1632MB/0KB/0KB /s] [418K/0/0 iops]
> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [1765MB/0KB/0KB /s] [452K/0/0 iops]
> 
> And continue in this fashion, without recovering. Note that in this
> example it was required to wait 16 hours for this to occur. Also note that
> IO throughput also becomes gradually becomes more unstable leading up to
> this point.
> 
> This problem is only seen for non-strict mode. For strict mode, the rcaches
> stay quite compact.

It would be good to understand why the rcache doesn't stabilize. Could be
a bug, or just need some tuning

In strict mode, if a driver does Alloc-Free-Alloc and the first alloc
misses the rcache, the second allocation hits it. The same sequence in
non-strict mode misses the cache twice, because the IOVA is added to the
flush queue on Free. 

So rather than AFAFAF.. we get AAA..FFF.., only once the fq_timer triggers
or the FQ is full. Interestingly the FQ size is 2x IOVA_MAG_SIZE, so we
could allocate 2 magazines worth of fresh IOVAs before alloc starts
hitting the cache. If a job allocates more than that, some magazines are
going to the depot, and with multi-CPU jobs those will get used on other
CPUs during the next alloc bursts, causing the progressive increase in
rcache consumption. I wonder if setting IOVA_MAG_SIZE > IOVA_FQ_SIZE helps
reuse of IOVAs?

Then again I haven't worked out the details, might be entirely wrong. I'll
have another look next week.

Thanks,
Jean

> As a solution to this issue, judge that the IOVA caches have grown too big
> when cached magazines need to be free, and just flush all the CPUs rcaches
> instead.
> 
> The depot rcaches, however, are not flushed, as they can be used to
> immediately replenish active CPUs.
> 
> In future, some IOVA compaction could be implemented to solve the
> instability issue, which I figure could be quite complex to implement.
> 
> [0] https://lore.kernel.org/linux-iommu/20190815121104.29140-3-thunder.leizhen@huawei.com/
> 
> Analyzed-by: Zhen Lei <thunder.leizhen@huawei.com>
> Reported-by: Xiang Chen <chenxiang66@hisilicon.com>
> Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
> Signed-off-by: John Garry <john.garry@huawei.com>
> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/iommu/iova.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 732ee687e0e2..39b7488de8bb 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -841,7 +841,6 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>  				 struct iova_rcache *rcache,
>  				 unsigned long iova_pfn)
>  {
> -	struct iova_magazine *mag_to_free = NULL;
>  	struct iova_cpu_rcache *cpu_rcache;
>  	bool can_insert = false;
>  	unsigned long flags;
> @@ -863,13 +862,12 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>  				if (cpu_rcache->loaded)
>  					rcache->depot[rcache->depot_size++] =
>  							cpu_rcache->loaded;
> -			} else {
> -				mag_to_free = cpu_rcache->loaded;
> +				can_insert = true;
> +				cpu_rcache->loaded = new_mag;
>  			}
>  			spin_unlock(&rcache->lock);
> -
> -			cpu_rcache->loaded = new_mag;
> -			can_insert = true;
> +			if (!can_insert)
> +				iova_magazine_free(new_mag);
>  		}
>  	}
>  
> @@ -878,10 +876,8 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>  
>  	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
>  
> -	if (mag_to_free) {
> -		iova_magazine_free_pfns(mag_to_free, iovad);
> -		iova_magazine_free(mag_to_free);
> -	}
> +	if (!can_insert)
> +		free_all_cpu_cached_iovas(iovad);
>  
>  	return can_insert;
>  }
> -- 
> 2.26.2
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills
  2021-01-15 17:32   ` Jean-Philippe Brucker
@ 2021-01-15 19:21     ` Robin Murphy
  2021-01-18 12:38       ` John Garry
  0 siblings, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2021-01-15 19:21 UTC (permalink / raw)
  To: Jean-Philippe Brucker, John Garry
  Cc: joro, will, linux-kernel, linuxarm, iommu

On 2021-01-15 17:32, Jean-Philippe Brucker wrote:
> On Thu, Dec 10, 2020 at 02:23:09AM +0800, John Garry wrote:
>> Leizhen reported some time ago that IOVA performance may degrade over time
>> [0], but unfortunately his solution to fix this problem was not given
>> attention.
>>
>> To summarize, the issue is that as time goes by, the CPU rcache and depot
>> rcache continue to grow. As such, IOVA RB tree access time also continues
>> to grow.
>>
>> At a certain point, a depot may become full, and also some CPU rcaches may
>> also be full when inserting another IOVA is attempted. For this scenario,
>> currently the "loaded" CPU rcache is freed and a new one is created. This
>> freeing means that many IOVAs in the RB tree need to be freed, which
>> makes IO throughput performance fall off a cliff in some storage scenarios:
>>
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6314MB/0KB/0KB /s] [1616K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [5669MB/0KB/0KB /s] [1451K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6673MB/0KB/0KB /s] [1708K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6761MB/0KB/0KB /s] [1731K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6685MB/0KB/0KB /s] [1711K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6178MB/0KB/0KB /s] [1582K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [6731MB/0KB/0KB /s] [1723K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [2387MB/0KB/0KB /s] [611K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [2689MB/0KB/0KB /s] [688K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [2278MB/0KB/0KB /s] [583K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [1288MB/0KB/0KB /s] [330K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [1632MB/0KB/0KB /s] [418K/0/0 iops]
>> Jobs: 12 (f=12): [RRRRRRRRRRRR] [0.0% done] [1765MB/0KB/0KB /s] [452K/0/0 iops]
>>
>> And continue in this fashion, without recovering. Note that in this
>> example it was required to wait 16 hours for this to occur. Also note that
>> IO throughput also becomes gradually becomes more unstable leading up to
>> this point.
>>
>> This problem is only seen for non-strict mode. For strict mode, the rcaches
>> stay quite compact.
> 
> It would be good to understand why the rcache doesn't stabilize. Could be
> a bug, or just need some tuning
> 
> In strict mode, if a driver does Alloc-Free-Alloc and the first alloc
> misses the rcache, the second allocation hits it. The same sequence in
> non-strict mode misses the cache twice, because the IOVA is added to the
> flush queue on Free.
> 
> So rather than AFAFAF.. we get AAA..FFF.., only once the fq_timer triggers
> or the FQ is full. Interestingly the FQ size is 2x IOVA_MAG_SIZE, so we
> could allocate 2 magazines worth of fresh IOVAs before alloc starts
> hitting the cache. If a job allocates more than that, some magazines are
> going to the depot, and with multi-CPU jobs those will get used on other
> CPUs during the next alloc bursts, causing the progressive increase in
> rcache consumption. I wonder if setting IOVA_MAG_SIZE > IOVA_FQ_SIZE helps
> reuse of IOVAs?
> 
> Then again I haven't worked out the details, might be entirely wrong. I'll
> have another look next week.

I did start digging into the data (thanks for that!) before Christmas, 
but between being generally frazzled and trying to remember how to write 
Perl to massage the numbers out of the log dump I never got round to 
responding, sorry.

The partial thoughts that I can recall right now are firstly that the 
total numbers of IOVAs are actually pretty meaningless, it really needs 
to be broken down by size (that's where my Perl-hacking stalled...); 
secondly that the pattern is far more than just a steady increase - the 
CPU rcache count looks to be heading asymptotically towards ~65K IOVAs 
all the time, representing (IIRC) two sizes being in heavy rotation, 
while the depot is happily ticking along in a steady state as expected, 
until it suddenly explodes out of nowhere; thirdly, I'd really like to 
see instrumentation of the flush queues at the same time, since I think 
they're the real culprit.

My theory so far is that everyone is calling queue_iova() frequently 
enough to keep the timer at bay and their own queues drained. Then at 
the ~16H mark, *something* happens that pauses unmaps long enough for 
the timer to fire, and at that point all hell breaks loose. The depot is 
suddenly flooded with IOVAs of *all* sizes, indicative of all the queues 
being flushed at once (note that the two most common sizes have been 
hovering perilously close to "full" the whole time), but then, 
crucially, *that keeps happening*. My guess is that the load of 
fq_flush_timeout() slows things down enough that the the timer then 
keeps getting the chance to expire and repeat the situation.

The main conclusion I draw from this is the same one that was my initial 
gut feeling; that MAX_GLOBAL_MAGS = 32 is utter bollocks. The CPU rcache 
capacity scales with the number of CPUs; the flush queue capacity scales 
with the number of CPUs; it is nonsensical that the depot size does not 
correspondingly scale with the number of CPUs (I note that the testing 
on the original patchset cites a 16-CPU system, where that depot 
capacity is conveniently equal to the total rcache capacity).

Now yes, purging the rcaches when the depot is full does indeed help 
mitigate this scenario - I assume it provides enough of a buffer where 
the regular free_iova_fast() calls don't hit queue_iova() for a while 
(and gives fq_ring_free() some reprieve on the CPU handling the 
timeout), giving enough leeway for the flood to finish before anyone 
starts hitting queues/locks/etc. and stalling again, and thus break the 
self-perpetuating timeout cycle. But that's still only a damage 
limitation exercise! It's planning for failure to just lie down and 
assume that the depot is going to be full if fq_flush_timeout() ever 
fires because it's something like an order of magnitude smaller than the 
flush queue capacity (even for a uniform distribution of IOVA sizes) on 
super-large systems.

I'm honestly tempted to move my position further towards a hard NAK on 
this approach, because all the evidence so far points to it being a 
bodge around a clear and easily-fixed scalability oversight. At the very 
least I'd now want to hear a reasoned justification for why you want to 
keep the depot at an arbitrary fixed size while the whole rest of the 
system scales (I'm assuming that since my previous suggestion to try 
changes in that area seems to have been ignored).

Cheers,
Robin.

> 
> Thanks,
> Jean
> 
>> As a solution to this issue, judge that the IOVA caches have grown too big
>> when cached magazines need to be free, and just flush all the CPUs rcaches
>> instead.
>>
>> The depot rcaches, however, are not flushed, as they can be used to
>> immediately replenish active CPUs.
>>
>> In future, some IOVA compaction could be implemented to solve the
>> instability issue, which I figure could be quite complex to implement.
>>
>> [0] https://lore.kernel.org/linux-iommu/20190815121104.29140-3-thunder.leizhen@huawei.com/
>>
>> Analyzed-by: Zhen Lei <thunder.leizhen@huawei.com>
>> Reported-by: Xiang Chen <chenxiang66@hisilicon.com>
>> Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>   drivers/iommu/iova.c | 16 ++++++----------
>>   1 file changed, 6 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index 732ee687e0e2..39b7488de8bb 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -841,7 +841,6 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>>   				 struct iova_rcache *rcache,
>>   				 unsigned long iova_pfn)
>>   {
>> -	struct iova_magazine *mag_to_free = NULL;
>>   	struct iova_cpu_rcache *cpu_rcache;
>>   	bool can_insert = false;
>>   	unsigned long flags;
>> @@ -863,13 +862,12 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>>   				if (cpu_rcache->loaded)
>>   					rcache->depot[rcache->depot_size++] =
>>   							cpu_rcache->loaded;
>> -			} else {
>> -				mag_to_free = cpu_rcache->loaded;
>> +				can_insert = true;
>> +				cpu_rcache->loaded = new_mag;
>>   			}
>>   			spin_unlock(&rcache->lock);
>> -
>> -			cpu_rcache->loaded = new_mag;
>> -			can_insert = true;
>> +			if (!can_insert)
>> +				iova_magazine_free(new_mag);
>>   		}
>>   	}
>>   
>> @@ -878,10 +876,8 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>>   
>>   	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
>>   
>> -	if (mag_to_free) {
>> -		iova_magazine_free_pfns(mag_to_free, iovad);
>> -		iova_magazine_free(mag_to_free);
>> -	}
>> +	if (!can_insert)
>> +		free_all_cpu_cached_iovas(iovad);
>>   
>>   	return can_insert;
>>   }
>> -- 
>> 2.26.2
>>
>> _______________________________________________
>> iommu mailing list
>> iommu@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers
  2021-01-15 17:30   ` Jean-Philippe Brucker
@ 2021-01-18  9:24     ` John Garry
  2021-01-18 10:08       ` Jean-Philippe Brucker
  0 siblings, 1 reply; 15+ messages in thread
From: John Garry @ 2021-01-18  9:24 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: joro, will, linux-kernel, linuxarm, iommu, robin.murphy

On 15/01/2021 17:30, Jean-Philippe Brucker wrote:
> On Thu, Dec 10, 2020 at 02:23:08AM +0800, John Garry wrote:
>> A similar crash to the following could be observed if initial CPU rcache
>> magazine allocations fail in init_iova_rcaches():
> 

thanks for having a look

> Any idea why that's happening?  This fix seems ok but if we're expecting
> allocation failures for the loaded magazine then we could easily get it
> for cpu_rcaches too, and get a similar abort at runtime.

It's not specifically that we expect them (allocation failures for the 
loaded magazine), rather we should make safe against it.

So could you be more specific in your concern for the cpu_rcache 
failure? cpu_rcache magazine assignment comes from this logic.

Anyway, logic like "if not full" or "if not empty" is poor as the 
outcome for NULL is ambiguous (maybe there's a better word) and the code 
is not safe against it, and so I replace with "if space" or "if have an 
IOVA", respectively.

Thanks,
John

> 
> Thanks,
> Jean
> 
>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>> Mem abort info:
>>
>>    free_iova_fast+0xfc/0x280
>>    iommu_dma_free_iova+0x64/0x70
>>    __iommu_dma_unmap+0x9c/0xf8
>>    iommu_dma_unmap_sg+0xa8/0xc8
>>    dma_unmap_sg_attrs+0x28/0x50
>>    cq_thread_v3_hw+0x2dc/0x528
>>    irq_thread_fn+0x2c/0xa0
>>    irq_thread+0x130/0x1e0
>>    kthread+0x154/0x158
>>    ret_from_fork+0x10/0x34
>>
>> The issue is that expression !iova_magazine_full(NULL) evaluates true; this
>> falls over in __iova_rcache_insert() when we attempt to cache a mag and
>> cpu_rcache->loaded == NULL:
>>
>> if (!iova_magazine_full(cpu_rcache->loaded)) {
>> 	can_insert = true;
>> ...
>>
>> if (can_insert)
>> 	iova_magazine_push(cpu_rcache->loaded, iova_pfn);
>>
>> As above, can_insert is evaluated true, which it shouldn't be, and we try
>> to insert pfns in a NULL mag, which is not safe.
>>
>> To avoid this, stop using double-negatives, like !iova_magazine_full() and
>> !iova_magazine_empty(), and use positive tests, like
>> iova_magazine_has_space() and iova_magazine_has_pfns(), respectively; these
>> can safely deal with cpu_rcache->{loaded, prev} = NULL.
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
>> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
> 
>> ---
>>   drivers/iommu/iova.c | 29 +++++++++++++++++------------
>>   1 file changed, 17 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index cf1aacda2fe4..732ee687e0e2 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -767,14 +767,18 @@ iova_magazine_free_pfns(struct iova_magazine *mag, struct iova_domain *iovad)
>>   	mag->size = 0;
>>   }
>>   
>> -static bool iova_magazine_full(struct iova_magazine *mag)
>> +static bool iova_magazine_has_space(struct iova_magazine *mag)
>>   {
>> -	return (mag && mag->size == IOVA_MAG_SIZE);
>> +	if (!mag)
>> +		return false;
>> +	return mag->size < IOVA_MAG_SIZE;
>>   }
>>   
>> -static bool iova_magazine_empty(struct iova_magazine *mag)
>> +static bool iova_magazine_has_pfns(struct iova_magazine *mag)
>>   {
>> -	return (!mag || mag->size == 0);
>> +	if (!mag)
>> +		return false;
>> +	return mag->size;
>>   }
>>   
>>   static unsigned long iova_magazine_pop(struct iova_magazine *mag,
>> @@ -783,7 +787,7 @@ static unsigned long iova_magazine_pop(struct iova_magazine *mag,
>>   	int i;
>>   	unsigned long pfn;
>>   
>> -	BUG_ON(iova_magazine_empty(mag));
>> +	BUG_ON(!iova_magazine_has_pfns(mag));
>>   
>>   	/* Only fall back to the rbtree if we have no suitable pfns at all */
>>   	for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
>> @@ -799,7 +803,7 @@ static unsigned long iova_magazine_pop(struct iova_magazine *mag,
>>   
>>   static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
>>   {
>> -	BUG_ON(iova_magazine_full(mag));
>> +	BUG_ON(!iova_magazine_has_space(mag));
>>   
>>   	mag->pfns[mag->size++] = pfn;
>>   }
>> @@ -845,9 +849,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>>   	cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
>>   	spin_lock_irqsave(&cpu_rcache->lock, flags);
>>   
>> -	if (!iova_magazine_full(cpu_rcache->loaded)) {
>> +	if (iova_magazine_has_space(cpu_rcache->loaded)) {
>>   		can_insert = true;
>> -	} else if (!iova_magazine_full(cpu_rcache->prev)) {
>> +	} else if (iova_magazine_has_space(cpu_rcache->prev)) {
>>   		swap(cpu_rcache->prev, cpu_rcache->loaded);
>>   		can_insert = true;
>>   	} else {
>> @@ -856,8 +860,9 @@ static bool __iova_rcache_insert(struct iova_domain *iovad,
>>   		if (new_mag) {
>>   			spin_lock(&rcache->lock);
>>   			if (rcache->depot_size < MAX_GLOBAL_MAGS) {
>> -				rcache->depot[rcache->depot_size++] =
>> -						cpu_rcache->loaded;
>> +				if (cpu_rcache->loaded)
>> +					rcache->depot[rcache->depot_size++] =
>> +							cpu_rcache->loaded;
>>   			} else {
>>   				mag_to_free = cpu_rcache->loaded;
>>   			}
>> @@ -908,9 +913,9 @@ static unsigned long __iova_rcache_get(struct iova_rcache *rcache,
>>   	cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
>>   	spin_lock_irqsave(&cpu_rcache->lock, flags);
>>   
>> -	if (!iova_magazine_empty(cpu_rcache->loaded)) {
>> +	if (iova_magazine_has_pfns(cpu_rcache->loaded)) {
>>   		has_pfn = true;
>> -	} else if (!iova_magazine_empty(cpu_rcache->prev)) {
>> +	} else if (iova_magazine_has_pfns(cpu_rcache->prev)) {
>>   		swap(cpu_rcache->prev, cpu_rcache->loaded);
>>   		has_pfn = true;
>>   	} else {
>> -- 
>> 2.26.2
>>
>> _______________________________________________
>> iommu mailing list
>> iommu@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> .
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers
  2021-01-18  9:24     ` John Garry
@ 2021-01-18 10:08       ` Jean-Philippe Brucker
  2021-01-18 10:55         ` John Garry
  0 siblings, 1 reply; 15+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-18 10:08 UTC (permalink / raw)
  To: John Garry; +Cc: joro, will, linux-kernel, linuxarm, iommu, robin.murphy

On Mon, Jan 18, 2021 at 09:24:17AM +0000, John Garry wrote:
> On 15/01/2021 17:30, Jean-Philippe Brucker wrote:
> > On Thu, Dec 10, 2020 at 02:23:08AM +0800, John Garry wrote:
> > > A similar crash to the following could be observed if initial CPU rcache
> > > magazine allocations fail in init_iova_rcaches():
> > 
> 
> thanks for having a look
> 
> > Any idea why that's happening?  This fix seems ok but if we're expecting
> > allocation failures for the loaded magazine then we could easily get it
> > for cpu_rcaches too, and get a similar abort at runtime.
> 
> It's not specifically that we expect them (allocation failures for the
> loaded magazine), rather we should make safe against it.
> 
> So could you be more specific in your concern for the cpu_rcache failure?
> cpu_rcache magazine assignment comes from this logic.

If this fails:

drivers/iommu/iova.c:847: rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache), cache_line_size());

then we'll get an Oops in __iova_rcache_get(). So if we're making the
module safer against magazine allocation failure, shouldn't we also
protect against cpu_rcaches allocation failure?

Thanks,
Jean

> 
> Anyway, logic like "if not full" or "if not empty" is poor as the outcome
> for NULL is ambiguous (maybe there's a better word) and the code is not safe
> against it, and so I replace with "if space" or "if have an IOVA",
> respectively.
> 
> Thanks,
> John
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers
  2021-01-18 10:08       ` Jean-Philippe Brucker
@ 2021-01-18 10:55         ` John Garry
  2021-01-18 12:40           ` Jean-Philippe Brucker
  2021-01-18 12:59           ` Robin Murphy
  0 siblings, 2 replies; 15+ messages in thread
From: John Garry @ 2021-01-18 10:55 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: joro, will, linux-kernel, linuxarm, iommu, robin.murphy

On 18/01/2021 10:08, Jean-Philippe Brucker wrote:
>>> Any idea why that's happening?  This fix seems ok but if we're expecting
>>> allocation failures for the loaded magazine then we could easily get it
>>> for cpu_rcaches too, and get a similar abort at runtime.
>> It's not specifically that we expect them (allocation failures for the
>> loaded magazine), rather we should make safe against it.
>>
>> So could you be more specific in your concern for the cpu_rcache failure?
>> cpu_rcache magazine assignment comes from this logic.
> If this fails:
> 
> drivers/iommu/iova.c:847: rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache), cache_line_size());
> 
> then we'll get an Oops in __iova_rcache_get(). So if we're making the
> module safer against magazine allocation failure, shouldn't we also
> protect against cpu_rcaches allocation failure?

Ah, gotcha. So we have the WARN there, but that's not much use as this 
would still crash, as you say.

So maybe we can embed the cpu rcaches in iova_domain struct, to avoid 
the separate (failable) cpu rcache allocation.

Alternatively, we could add NULL checks __iova_rcache_get() et al for 
this allocation failure but that's not preferable as it's fastpath.

Finally so we could pass back an error code from init_iova_rcache() to 
its only caller, init_iova_domain(); but that has multiple callers and 
would need to be fixed up.

Not sure which is best or on other options.

Thanks,
John

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills
  2021-01-15 19:21     ` Robin Murphy
@ 2021-01-18 12:38       ` John Garry
  0 siblings, 0 replies; 15+ messages in thread
From: John Garry @ 2021-01-18 12:38 UTC (permalink / raw)
  To: Robin Murphy, Jean-Philippe Brucker
  Cc: joro, will, linux-kernel, linuxarm, iommu

On 15/01/2021 19:21, Robin Murphy wrote:
>>
>> It would be good to understand why the rcache doesn't stabilize. Could be
>> a bug, or just need some tuning
>>
>> In strict mode, if a driver does Alloc-Free-Alloc and the first alloc
>> misses the rcache, the second allocation hits it. The same sequence in
>> non-strict mode misses the cache twice, because the IOVA is added to the
>> flush queue on Free.
>>
>> So rather than AFAFAF.. we get AAA..FFF.., only once the fq_timer 
>> triggers
>> or the FQ is full.

Sounds right

>> Interestingly the FQ size is 2x IOVA_MAG_SIZE, so we
>> could allocate 2 magazines worth of fresh IOVAs before alloc starts
>> hitting the cache. If a job allocates more than that, some magazines are
>> going to the depot, and with multi-CPU jobs those will get used on other
>> CPUs during the next alloc bursts, causing the progressive increase in
>> rcache consumption. I wonder if setting IOVA_MAG_SIZE > IOVA_FQ_SIZE 
>> helps
>> reuse of IOVAs?

Looking back through the lore history, I don't know where the 
IOVA_FQ_SIZE = 256 came from. I guess it's size of 2x IOVA_MAG_SIZE (1x 
for loaded and 1x for prev) for the reason you mention.

>>
>> Then again I haven't worked out the details, might be entirely wrong. 
>> I'll
>> have another look next week.
> 

cheers

> I did start digging into the data (thanks for that!) before Christmas, 
> but between being generally frazzled and trying to remember how to write 
> Perl to massage the numbers out of the log dump I never got round to 
> responding, sorry.

As you may have seen:
https://raw.githubusercontent.com/hisilicon/kernel-dev/064c4dc8869b3f2ad07edffceafde0b129f276b0/lsi3008_dmesg

I had to change some block configs via sysfs to ever get IOVA locations 
for size > 0. And even then, I still got none bigger than 
IOVA_RANGE_CACHE_MAX_SIZE.

Note: For a log like:
[13175.361915] print_iova2 iova_allocs(=5000000 ... too_big=47036

47036 is number of IOVA size > IOVA_RANGE_CACHE_MAX_SIZE, in case it was 
not clear.

And I never hit the critical point of a depot bin filling, but it may 
just take even longer.

However with IOVA size = 0 always occurring, then I noticed that the 
depot size = 0 bin fills up relatively quickly. As such, I am now 
slightly skeptical of the approach I have taken here, i.e purge the 
whole rcache.

> 
> The partial thoughts that I can recall right now are firstly that the 
> total numbers of IOVAs are actually pretty meaningless, it really needs 
> to be broken down by size (that's where my Perl-hacking stalled...); 
> secondly that the pattern is far more than just a steady increase - the 
> CPU rcache count looks to be heading asymptotically towards ~65K IOVAs 
> all the time, representing (IIRC) two sizes being in heavy rotation, 
> while the depot is happily ticking along in a steady state as expected, 
> until it suddenly explodes out of nowhere; thirdly, I'd really like to 
> see instrumentation of the flush queues at the same time, since I think 
> they're the real culprit.
> 
> My theory so far is that everyone is calling queue_iova() frequently 
> enough to keep the timer at bay and their own queues drained. Then at 
> the ~16H mark, *something* happens that pauses unmaps long enough for 
> the timer to fire, and at that point all hell breaks loose.

So do you think that the freeing the IOVA magazines when the depot fills 
is the cause of this? That was our analysis.

> The depot is 
> suddenly flooded with IOVAs of *all* sizes, indicative of all the queues 
> being flushed at once (note that the two most common sizes have been 
> hovering perilously close to "full" the whole time), but then, 
> crucially, *that keeps happening*. My guess is that the load of 
> fq_flush_timeout() slows things down enough that the the timer then 
> keeps getting the chance to expire and repeat the situation.

Not sure on that one.

> 
> The main conclusion I draw from this is the same one that was my initial 
> gut feeling; that MAX_GLOBAL_MAGS = 32 is utter bollocks.

Yeah, I tend to agree with that. Or, more specifically, how things work 
today is broken, and MAX_GLOBAL_MAGS = 32 is very much involved with that.

> The CPU rcache 
> capacity scales with the number of CPUs; the flush queue capacity scales 
> with the number of CPUs; it is nonsensical that the depot size does not 
> correspondingly scale with the number of CPUs (I note that the testing 
> on the original patchset cites a 16-CPU system, where that depot 
> capacity is conveniently equal to the total rcache capacity).
> 
> Now yes, purging the rcaches when the depot is full does indeed help 
> mitigate this scenario - I assume it provides enough of a buffer where 
> the regular free_iova_fast() calls don't hit queue_iova() for a while 
> (and gives fq_ring_free() some reprieve on the CPU handling the 
> timeout), giving enough leeway for the flood to finish before anyone 
> starts hitting queues/locks/etc. and stalling again, and thus break the 
> self-perpetuating timeout cycle. But that's still only a damage 
> limitation exercise! It's planning for failure to just lie down and 
> assume that the depot is going to be full if fq_flush_timeout() ever 
> fires because it's something like an order of magnitude smaller than the 
> flush queue capacity (even for a uniform distribution of IOVA sizes) on 
> super-large systems.
> 
> I'm honestly tempted to move my position further towards a hard NAK on 
> this approach, because all the evidence so far points to it being a 
> bodge around a clear and easily-fixed scalability oversight. At the very 
> least I'd now want to hear a reasoned justification for why you want to 
> keep the depot at an arbitrary fixed size while the whole rest of the 
> system scales 


>(I'm assuming that since my previous suggestion to try 
> changes in that area seems to have been ignored).
> 

So I said that it should fix the problem of the throughput going through 
the floor at this 16h mark.

But we see 2x tightly coupled problems:
a. leading up to the ~16H critical point, throughput is slowly degrading 
and becomes quite unstable (not shown in the log)
For the LSI3008 card, we don't see that. But then no IOVA size > 
IOVA_RANGE_CACHE_MAX_SIZE occur there.

b. at the critical point, throughput goes through the floor

So b. should be fixed with the suggestion to have unlimited/higher depot 
max bin size, but I reckon that we would still see a. And I put that 
down to the fact that we have IOVA sizes > IOVA_RANGE_CACHE_MAX_SIZE at 
a certain rate always. As the rb tree grows over time, they become 
slower and slower to alloc+free - that's our theory. Allowing the depot 
to grow further isn’t going to help that.

Maybe Leizhen's idea to trim the rcache periodically is overall better, 
but I am concerned on implementation.

If not, then if we allow depot bin size to scale/grow, I would like to 
see more efficient handling for IOVA size > IOVA_RANGE_CACHE_MAX_SIZE.

Thanks,
John

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers
  2021-01-18 10:55         ` John Garry
@ 2021-01-18 12:40           ` Jean-Philippe Brucker
  2021-01-18 12:59           ` Robin Murphy
  1 sibling, 0 replies; 15+ messages in thread
From: Jean-Philippe Brucker @ 2021-01-18 12:40 UTC (permalink / raw)
  To: John Garry; +Cc: joro, will, linux-kernel, linuxarm, iommu, robin.murphy

On Mon, Jan 18, 2021 at 10:55:52AM +0000, John Garry wrote:
> On 18/01/2021 10:08, Jean-Philippe Brucker wrote:
> > > > Any idea why that's happening?  This fix seems ok but if we're expecting
> > > > allocation failures for the loaded magazine then we could easily get it
> > > > for cpu_rcaches too, and get a similar abort at runtime.
> > > It's not specifically that we expect them (allocation failures for the
> > > loaded magazine), rather we should make safe against it.
> > > 
> > > So could you be more specific in your concern for the cpu_rcache failure?
> > > cpu_rcache magazine assignment comes from this logic.
> > If this fails:
> > 
> > drivers/iommu/iova.c:847: rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache), cache_line_size());
> > 
> > then we'll get an Oops in __iova_rcache_get(). So if we're making the
> > module safer against magazine allocation failure, shouldn't we also
> > protect against cpu_rcaches allocation failure?
> 
> Ah, gotcha. So we have the WARN there, but that's not much use as this would
> still crash, as you say.
> 
> So maybe we can embed the cpu rcaches in iova_domain struct, to avoid the
> separate (failable) cpu rcache allocation.
> 
> Alternatively, we could add NULL checks __iova_rcache_get() et al for this
> allocation failure but that's not preferable as it's fastpath.
> 
> Finally so we could pass back an error code from init_iova_rcache() to its
> only caller, init_iova_domain(); but that has multiple callers and would
> need to be fixed up.
> 
> Not sure which is best or on other options.

I would have initially gone with option 2 which seems the simplest, but I
don't have a setup to measure that overhead

Thanks,
Jean

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers
  2021-01-18 10:55         ` John Garry
  2021-01-18 12:40           ` Jean-Philippe Brucker
@ 2021-01-18 12:59           ` Robin Murphy
  2021-01-18 15:09             ` John Garry
  1 sibling, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2021-01-18 12:59 UTC (permalink / raw)
  To: John Garry, Jean-Philippe Brucker
  Cc: joro, will, linux-kernel, linuxarm, iommu

On 2021-01-18 10:55, John Garry wrote:
> On 18/01/2021 10:08, Jean-Philippe Brucker wrote:
>>>> Any idea why that's happening?  This fix seems ok but if we're 
>>>> expecting
>>>> allocation failures for the loaded magazine then we could easily get it
>>>> for cpu_rcaches too, and get a similar abort at runtime.
>>> It's not specifically that we expect them (allocation failures for the
>>> loaded magazine), rather we should make safe against it.
>>>
>>> So could you be more specific in your concern for the cpu_rcache 
>>> failure?
>>> cpu_rcache magazine assignment comes from this logic.
>> If this fails:
>>
>> drivers/iommu/iova.c:847: rcache->cpu_rcaches = 
>> __alloc_percpu(sizeof(*cpu_rcache), cache_line_size());
>>
>> then we'll get an Oops in __iova_rcache_get(). So if we're making the
>> module safer against magazine allocation failure, shouldn't we also
>> protect against cpu_rcaches allocation failure?
> 
> Ah, gotcha. So we have the WARN there, but that's not much use as this 
> would still crash, as you say.
> 
> So maybe we can embed the cpu rcaches in iova_domain struct, to avoid 
> the separate (failable) cpu rcache allocation.

Is that even possible? The size of percpu data isn't known at compile 
time, so at best it would add ugly runtime complexity to any allocation 
of a struct iova_domain by itself, but worse than that it means that 
embedding iova_domain in any other structure becomes completely broken, no?

Robin.

> Alternatively, we could add NULL checks __iova_rcache_get() et al for 
> this allocation failure but that's not preferable as it's fastpath.
> 
> Finally so we could pass back an error code from init_iova_rcache() to 
> its only caller, init_iova_domain(); but that has multiple callers and 
> would need to be fixed up.
> 
> Not sure which is best or on other options.
> 
> Thanks,
> John

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers
  2021-01-18 12:59           ` Robin Murphy
@ 2021-01-18 15:09             ` John Garry
  0 siblings, 0 replies; 15+ messages in thread
From: John Garry @ 2021-01-18 15:09 UTC (permalink / raw)
  To: Robin Murphy, Jean-Philippe Brucker
  Cc: joro, will, linux-kernel, linuxarm, iommu

On 18/01/2021 12:59, Robin Murphy wrote:
>>>>> for cpu_rcaches too, and get a similar abort at runtime.
>>>> It's not specifically that we expect them (allocation failures for the
>>>> loaded magazine), rather we should make safe against it.
>>>>
>>>> So could you be more specific in your concern for the cpu_rcache 
>>>> failure?
>>>> cpu_rcache magazine assignment comes from this logic.
>>> If this fails:
>>>
>>> drivers/iommu/iova.c:847: rcache->cpu_rcaches = 
>>> __alloc_percpu(sizeof(*cpu_rcache), cache_line_size());
>>>
>>> then we'll get an Oops in __iova_rcache_get(). So if we're making the
>>> module safer against magazine allocation failure, shouldn't we also
>>> protect against cpu_rcaches allocation failure?
>>
>> Ah, gotcha. So we have the WARN there, but that's not much use as this 
>> would still crash, as you say.
>>
>> So maybe we can embed the cpu rcaches in iova_domain struct, to avoid 
>> the separate (failable) cpu rcache allocation.
> 
> Is that even possible? The size of percpu data isn't known at compile 
> time, so at best it would add ugly runtime complexity to any allocation 
> of a struct iova_domain by itself, but worse than that it means that 
> embedding iova_domain in any other structure becomes completely broken, no?

Ah, now I see that it's not possible. I was thinking of using 
DEFINE_PER_CPU(), but it's not permitted.

So even though this patch saves us from cpu_rcache->loaded / ->prev == 
NULL, I still prefer not to add explicit checks for cpu_rcache == NULL 
in the IOVA alloc/free paths, and would rather pass an error back in 
init_iova_rcaches(), but adding code for tidy-up looks messy.

Thanks,
John

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-01-18 15:20 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-09 18:23 [PATCH v4 0/3] iommu/iova: Solve longterm IOVA issue John Garry
2020-12-09 18:23 ` [PATCH v4 1/3] iommu/iova: Add free_all_cpu_cached_iovas() John Garry
2021-01-15 17:28   ` Jean-Philippe Brucker
2020-12-09 18:23 ` [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers John Garry
2021-01-15 17:30   ` Jean-Philippe Brucker
2021-01-18  9:24     ` John Garry
2021-01-18 10:08       ` Jean-Philippe Brucker
2021-01-18 10:55         ` John Garry
2021-01-18 12:40           ` Jean-Philippe Brucker
2021-01-18 12:59           ` Robin Murphy
2021-01-18 15:09             ` John Garry
2020-12-09 18:23 ` [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills John Garry
2021-01-15 17:32   ` Jean-Philippe Brucker
2021-01-15 19:21     ` Robin Murphy
2021-01-18 12:38       ` John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).