* [PATCH v5 1/2] iommu/iova: Retry from last rb tree node if iova search fails
@ 2020-09-30  7:44 ` vjitta
  0 siblings, 0 replies; 18+ messages in thread
From: vjitta @ 2020-09-30  7:44 UTC (permalink / raw)
  To: joro, iommu, linux-kernel; +Cc: vinmenon, kernel-team, vjitta, robin.murphy

From: Vijayanand Jitta <vjitta@codeaurora.org>

Whenever a new iova alloc request comes in, the iova is always searched
for from the cached node and the nodes before it. So, even if there is
free iova space available in the nodes after the cached node, iova
allocation can still fail because of this approach.

Consider the following sequence of iova allocs and frees on
1GB of iova space:

1) alloc - 500MB
2) alloc - 12MB
3) alloc - 499MB
4) free -  12MB which was allocated in step 2
5) alloc - 13MB

After the above sequence we will have 12MB of free iova space, and the
cached node will be pointing to the iova pfn of the last alloc of 13MB,
which will be the lowest iova pfn of that iova space. Now if we get an
alloc request of 2MB, we just search from the cached node and then look
at lower iova pfns for free iova; as there aren't any, the iova alloc
fails even though there is 12MB of free iova space.

To avoid such iova search failures, do a retry from the last rb tree node
when the iova search fails. This will search the entire tree and get an
iova if it's available.

Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
---
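Not part of the patch: a minimal userspace sketch of the two-pass search
described above, assuming a sorted array stands in for the rb tree and
that sizes are in MB with no alignment. The names struct range,
search_down and alloc_with_retry are hypothetical; only the loop
structure follows the diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* One allocated range, inclusive bounds, in MB for readability. */
struct range {
	unsigned long lo, hi;
};

/*
 * Walk downwards from r[start], looking for a hole of `size` that ends
 * below `high` and starts at or above `low`. On success the start of
 * the hole is stored in *out.
 */
static bool search_down(const struct range *r, int start, unsigned long high,
			unsigned long low, unsigned long size,
			unsigned long *out)
{
	assert(size > 0);
	for (int i = start; i >= 0; i--) {
		unsigned long cand;

		if (r[i].lo < high)
			high = r[i].lo;
		if (high < size)		/* subtraction would underflow */
			return false;
		cand = high - size;
		if (cand < low)			/* walked past the lower bound */
			return false;
		if (i == 0 || cand > r[i - 1].hi) {
			*out = cand;
			return true;
		}
	}
	return false;
}

/*
 * First pass: from the cached entry down to 0. Second pass (the retry):
 * from the anchor at the top of the array down to just above the cached
 * entry, so the two passes together cover the whole space.
 */
static bool alloc_with_retry(const struct range *r, int anchor, int cached,
			     unsigned long limit, unsigned long size,
			     unsigned long *out)
{
	unsigned long retry_low = r[cached].hi + 1;

	if (search_down(r, cached, limit, 0, size, out))
		return true;
	if (retry_low < limit)
		return search_down(r, anchor, limit, retry_low, size, out);
	return false;
}
```

With the ranges left by the sequence above, {0,12}, {13,511}, {524,1023}
plus an anchor entry at ~0UL, and the cached index pointing at the lowest
range, a 2MB request with limit 1024 fails on the first pass and lands at
522 on the retry, inside the freed 12MB hole.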
 drivers/iommu/iova.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 30d969a..c3a1a8e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -184,8 +184,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	struct rb_node *curr, *prev;
 	struct iova *curr_iova;
 	unsigned long flags;
-	unsigned long new_pfn;
+	unsigned long new_pfn, retry_pfn;
 	unsigned long align_mask = ~0UL;
+	unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
 
 	if (size_aligned)
 		align_mask <<= fls_long(size - 1);
@@ -198,15 +199,25 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 
 	curr = __get_cached_rbnode(iovad, limit_pfn);
 	curr_iova = rb_entry(curr, struct iova, node);
+	retry_pfn = curr_iova->pfn_hi + 1;
+
+retry:
 	do {
-		limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
-		new_pfn = (limit_pfn - size) & align_mask;
+		high_pfn = min(high_pfn, curr_iova->pfn_lo);
+		new_pfn = (high_pfn - size) & align_mask;
 		prev = curr;
 		curr = rb_prev(curr);
 		curr_iova = rb_entry(curr, struct iova, node);
-	} while (curr && new_pfn <= curr_iova->pfn_hi);
-
-	if (limit_pfn < size || new_pfn < iovad->start_pfn) {
+	} while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
+
+	if (high_pfn < size || new_pfn < low_pfn) {
+		if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
+			high_pfn = limit_pfn;
+			low_pfn = retry_pfn;
+			curr = &iovad->anchor.node;
+			curr_iova = rb_entry(curr, struct iova, node);
+			goto retry;
+		}
 		iovad->max32_alloc_size = size;
 		goto iova32_full;
 	}
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
2.7.4


* [PATCH v5 2/2] iommu/iova: Free global iova rcache on iova alloc failure
  2020-09-30  7:44 ` vjitta
@ 2020-09-30  7:44   ` vjitta
  -1 siblings, 0 replies; 18+ messages in thread
From: vjitta @ 2020-09-30  7:44 UTC (permalink / raw)
  To: joro, iommu, linux-kernel; +Cc: vinmenon, kernel-team, vjitta, robin.murphy

From: Vijayanand Jitta <vjitta@codeaurora.org>

Whenever an iova alloc request fails, we free the iova ranges
present in the percpu iova rcaches and then retry. But the
global iova rcache is not freed, so we could still see iova
alloc failure even after the retry, as the global rcache is
holding iovas, which can cause fragmentation. So, free the
global iova rcache as well and then go for the retry.

Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
---
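Not part of the patch: a rough userspace model of what
free_global_cached_iovas() does, assuming plain malloc'd magazines and a
counter in place of the rb tree. All names and sizes here (struct
magazine, MAG_SIZE, DEPOT_MAX, NR_RCACHES, depot_push_full,
free_global_cached) are hypothetical, and the locking the kernel does
with rcache->lock is only noted in a comment.

```c
#include <assert.h>
#include <stdlib.h>

#define MAG_SIZE   4	/* hypothetical: pfns per magazine */
#define DEPOT_MAX  8	/* hypothetical: magazines per depot */
#define NR_RCACHES 2	/* hypothetical: number of size classes */

struct magazine {
	unsigned long size;
	unsigned long pfns[MAG_SIZE];
};

struct rcache {
	unsigned long depot_size;
	struct magazine *depot[DEPOT_MAX];
};

struct domain {
	struct rcache rcaches[NR_RCACHES];
	unsigned long returned;	/* pfns handed back to the allocator */
};

/* Push one full magazine onto the depot; returns 0 on success. */
static int depot_push_full(struct rcache *rc, unsigned long base)
{
	struct magazine *mag;

	if (rc->depot_size >= DEPOT_MAX)
		return -1;
	mag = malloc(sizeof(*mag));
	if (!mag)
		return -1;
	mag->size = MAG_SIZE;
	for (unsigned long i = 0; i < MAG_SIZE; i++)
		mag->pfns[i] = base + i;
	rc->depot[rc->depot_size++] = mag;
	return 0;
}

/* Hand every cached pfn back, then release the magazines themselves. */
static void free_global_cached(struct domain *d)
{
	for (int i = 0; i < NR_RCACHES; i++) {
		struct rcache *rc = &d->rcaches[i];

		/* The kernel version holds rcache->lock across this loop. */
		for (unsigned long j = 0; j < rc->depot_size; j++) {
			assert(rc->depot[j] != NULL);
			d->returned += rc->depot[j]->size;
			free(rc->depot[j]);
			rc->depot[j] = NULL;
		}
		rc->depot_size = 0;
	}
}
```

After pushing a few full magazines into the depots and calling
free_global_cached(), every depot is empty and the cached ranges are
available to the allocator again, which is the state the retry in
alloc_iova_fast() relies on.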
 drivers/iommu/iova.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index c3a1a8e..faf9b13 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -25,6 +25,7 @@ static void init_iova_rcaches(struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
 static void fq_destroy_all_entries(struct iova_domain *iovad);
 static void fq_flush_timeout(struct timer_list *t);
+static void free_global_cached_iovas(struct iova_domain *iovad);
 
 void
 init_iova_domain(struct iova_domain *iovad, unsigned long granule,
@@ -442,6 +443,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 		flush_rcache = false;
 		for_each_online_cpu(cpu)
 			free_cpu_cached_iovas(cpu, iovad);
+		free_global_cached_iovas(iovad);
 		goto retry;
 	}
 
@@ -1057,5 +1059,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
 	}
 }
 
+/*
+ * free all the IOVA ranges of global cache
+ */
+static void free_global_cached_iovas(struct iova_domain *iovad)
+{
+	struct iova_rcache *rcache;
+	unsigned long flags;
+	int i, j;
+
+	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+		rcache = &iovad->rcaches[i];
+		spin_lock_irqsave(&rcache->lock, flags);
+		for (j = 0; j < rcache->depot_size; ++j) {
+			iova_magazine_free_pfns(rcache->depot[j], iovad);
+			iova_magazine_free(rcache->depot[j]);
+			rcache->depot[j] = NULL;
+		}
+		rcache->depot_size = 0;
+		spin_unlock_irqrestore(&rcache->lock, flags);
+	}
+}
 MODULE_AUTHOR("Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>");
 MODULE_LICENSE("GPL");
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
2.7.4


* Re: [PATCH v5 1/2] iommu/iova: Retry from last rb tree node if iova search fails
  2020-09-30  7:44 ` vjitta
@ 2020-10-20  9:17   ` Vijayanand Jitta
  -1 siblings, 0 replies; 18+ messages in thread
From: Vijayanand Jitta @ 2020-10-20  9:17 UTC (permalink / raw)
  To: joro, iommu, linux-kernel; +Cc: vinmenon, kernel-team, robin.murphy



On 9/30/2020 1:14 PM, vjitta@codeaurora.org wrote:
> From: Vijayanand Jitta <vjitta@codeaurora.org>
> 
> When ever a new iova alloc request comes iova is always searched
> from the cached node and the nodes which are previous to cached
> node. So, even if there is free iova space available in the nodes
> which are next to the cached node iova allocation can still fail
> because of this approach.
> 
> Consider the following sequence of iova alloc and frees on
> 1GB of iova space
> 
> 1) alloc - 500MB
> 2) alloc - 12MB
> 3) alloc - 499MB
> 4) free -  12MB which was allocated in step 2
> 5) alloc - 13MB
> 
> After the above sequence we will have 12MB of free iova space and
> cached node will be pointing to the iova pfn of last alloc of 13MB
> which will be the lowest iova pfn of that iova space. Now if we get an
> alloc request of 2MB we just search from cached node and then look
> for lower iova pfn's for free iova and as they aren't any, iova alloc
> fails though there is 12MB of free iova space.
> 
> To avoid such iova search failures do a retry from the last rb tree node
> when iova search fails, this will search the entire tree and get an iova
> if its available.
> 
> Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/iova.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 30d969a..c3a1a8e 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -184,8 +184,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>  	struct rb_node *curr, *prev;
>  	struct iova *curr_iova;
>  	unsigned long flags;
> -	unsigned long new_pfn;
> +	unsigned long new_pfn, retry_pfn;
>  	unsigned long align_mask = ~0UL;
> +	unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
>  
>  	if (size_aligned)
>  		align_mask <<= fls_long(size - 1);
> @@ -198,15 +199,25 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>  
>  	curr = __get_cached_rbnode(iovad, limit_pfn);
>  	curr_iova = rb_entry(curr, struct iova, node);
> +	retry_pfn = curr_iova->pfn_hi + 1;
> +
> +retry:
>  	do {
> -		limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
> -		new_pfn = (limit_pfn - size) & align_mask;
> +		high_pfn = min(high_pfn, curr_iova->pfn_lo);
> +		new_pfn = (high_pfn - size) & align_mask;
>  		prev = curr;
>  		curr = rb_prev(curr);
>  		curr_iova = rb_entry(curr, struct iova, node);
> -	} while (curr && new_pfn <= curr_iova->pfn_hi);
> -
> -	if (limit_pfn < size || new_pfn < iovad->start_pfn) {
> +	} while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
> +
> +	if (high_pfn < size || new_pfn < low_pfn) {
> +		if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
> +			high_pfn = limit_pfn;
> +			low_pfn = retry_pfn;
> +			curr = &iovad->anchor.node;
> +			curr_iova = rb_entry(curr, struct iova, node);
> +			goto retry;
> +		}
>  		iovad->max32_alloc_size = size;
>  		goto iova32_full;
>  	}
> 

Gentle ping.

Thanks,
Vijay
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of Code Aurora Forum, hosted by The Linux Foundation

* Re: [PATCH v5 2/2] iommu/iova: Free global iova rcache on iova alloc failure
  2020-09-30  7:44   ` vjitta
@ 2020-10-20  9:18     ` Vijayanand Jitta
  -1 siblings, 0 replies; 18+ messages in thread
From: Vijayanand Jitta @ 2020-10-20  9:18 UTC (permalink / raw)
  To: joro, iommu, linux-kernel; +Cc: vinmenon, kernel-team, robin.murphy



On 9/30/2020 1:14 PM, vjitta@codeaurora.org wrote:
> From: Vijayanand Jitta <vjitta@codeaurora.org>
> 
> When ever an iova alloc request fails we free the iova
> ranges present in the percpu iova rcaches and then retry
> but the global iova rcache is not freed as a result we could
> still see iova alloc failure even after retry as global
> rcache is holding the iova's which can cause fragmentation.
> So, free the global iova rcache as well and then go for the
> retry.
> 
> Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
> ---
>  drivers/iommu/iova.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index c3a1a8e..faf9b13 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -25,6 +25,7 @@ static void init_iova_rcaches(struct iova_domain *iovad);
>  static void free_iova_rcaches(struct iova_domain *iovad);
>  static void fq_destroy_all_entries(struct iova_domain *iovad);
>  static void fq_flush_timeout(struct timer_list *t);
> +static void free_global_cached_iovas(struct iova_domain *iovad);
>  
>  void
>  init_iova_domain(struct iova_domain *iovad, unsigned long granule,
> @@ -442,6 +443,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
>  		flush_rcache = false;
>  		for_each_online_cpu(cpu)
>  			free_cpu_cached_iovas(cpu, iovad);
> +		free_global_cached_iovas(iovad);
>  		goto retry;
>  	}
>  
> @@ -1057,5 +1059,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
>  	}
>  }
>  
> +/*
> + * free all the IOVA ranges of global cache
> + */
> +static void free_global_cached_iovas(struct iova_domain *iovad)
> +{
> +	struct iova_rcache *rcache;
> +	unsigned long flags;
> +	int i, j;
> +
> +	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
> +		rcache = &iovad->rcaches[i];
> +		spin_lock_irqsave(&rcache->lock, flags);
> +		for (j = 0; j < rcache->depot_size; ++j) {
> +			iova_magazine_free_pfns(rcache->depot[j], iovad);
> +			iova_magazine_free(rcache->depot[j]);
> +			rcache->depot[j] = NULL;
> +		}
> +		rcache->depot_size = 0;
> +		spin_unlock_irqrestore(&rcache->lock, flags);
> +	}
> +}
>  MODULE_AUTHOR("Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>");
>  MODULE_LICENSE("GPL");
> 

Gentle ping.

Thanks,
Vijay
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of Code Aurora Forum, hosted by The Linux Foundation

* Re: [PATCH v5 2/2] iommu/iova: Free global iova rcache on iova alloc failure
  2020-09-30  7:44   ` vjitta
@ 2020-11-03 12:35     ` Robin Murphy
  -1 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2020-11-03 12:35 UTC (permalink / raw)
  To: vjitta, joro, iommu, linux-kernel; +Cc: vinmenon, kernel-team

On 2020-09-30 08:44, vjitta@codeaurora.org wrote:
> From: Vijayanand Jitta <vjitta@codeaurora.org>
> 
> When ever an iova alloc request fails we free the iova
> ranges present in the percpu iova rcaches and then retry
> but the global iova rcache is not freed as a result we could
> still see iova alloc failure even after retry as global
> rcache is holding the iova's which can cause fragmentation.
> So, free the global iova rcache as well and then go for the
> retry.

This looks reasonable to me - it's mildly annoying that we end up with 
so many similar-looking functions, but the necessary differences are 
right down in the middle of the loops so nothing can reasonably be 
factored out :(

Reviewed-by: Robin Murphy <robin.murphy@arm.com>

> Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
> ---
>   drivers/iommu/iova.c | 23 +++++++++++++++++++++++
>   1 file changed, 23 insertions(+)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index c3a1a8e..faf9b13 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -25,6 +25,7 @@ static void init_iova_rcaches(struct iova_domain *iovad);
>   static void free_iova_rcaches(struct iova_domain *iovad);
>   static void fq_destroy_all_entries(struct iova_domain *iovad);
>   static void fq_flush_timeout(struct timer_list *t);
> +static void free_global_cached_iovas(struct iova_domain *iovad);
>   
>   void
>   init_iova_domain(struct iova_domain *iovad, unsigned long granule,
> @@ -442,6 +443,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
>   		flush_rcache = false;
>   		for_each_online_cpu(cpu)
>   			free_cpu_cached_iovas(cpu, iovad);
> +		free_global_cached_iovas(iovad);
>   		goto retry;
>   	}
>   
> @@ -1057,5 +1059,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
>   	}
>   }
>   
> +/*
> + * free all the IOVA ranges of global cache
> + */
> +static void free_global_cached_iovas(struct iova_domain *iovad)
> +{
> +	struct iova_rcache *rcache;
> +	unsigned long flags;
> +	int i, j;
> +
> +	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
> +		rcache = &iovad->rcaches[i];
> +		spin_lock_irqsave(&rcache->lock, flags);
> +		for (j = 0; j < rcache->depot_size; ++j) {
> +			iova_magazine_free_pfns(rcache->depot[j], iovad);
> +			iova_magazine_free(rcache->depot[j]);
> +			rcache->depot[j] = NULL;
> +		}
> +		rcache->depot_size = 0;
> +		spin_unlock_irqrestore(&rcache->lock, flags);
> +	}
> +}
>   MODULE_AUTHOR("Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>");
>   MODULE_LICENSE("GPL");
> 

* Re: [PATCH v5 2/2] iommu/iova: Free global iova rcache on iova alloc failure
  2020-11-03 12:35     ` Robin Murphy
@ 2020-11-03 14:31       ` John Garry
  -1 siblings, 0 replies; 18+ messages in thread
From: John Garry @ 2020-11-03 14:31 UTC (permalink / raw)
  To: Robin Murphy, vjitta, joro, iommu, linux-kernel; +Cc: vinmenon, kernel-team

On 03/11/2020 12:35, Robin Murphy wrote:
> On 2020-09-30 08:44, vjitta@codeaurora.org wrote:
>> From: Vijayanand Jitta <vjitta@codeaurora.org>
>>
>> Whenever an iova alloc request fails, we free the iova
>> ranges present in the percpu iova rcaches and then retry,
>> but the global iova rcache is not freed. As a result, we could
>> still see iova alloc failure even after the retry, as the global
>> rcache is holding iovas, which can cause fragmentation.
>> So, free the global iova rcache as well and then go for the
>> retry.
> 

If we do clear all the CPU rcaches, it would be nice to have something 
immediately available to replenish them, i.e. use the global rcache instead 
of flushing it, if flushing is not required...

> This looks reasonable to me - it's mildly annoying that we end up with 
> so many similar-looking functions,

Well I did add a function to clear all CPU rcaches here, if you would 
like to check:

https://lore.kernel.org/linux-iommu/1603733501-211004-2-git-send-email-john.garry@huawei.com/

> but the necessary differences are 
> right down in the middle of the loops so nothing can reasonably be 
> factored out :(
> 
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> 
>> Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
>> ---
>>   drivers/iommu/iova.c | 23 +++++++++++++++++++++++
>>   1 file changed, 23 insertions(+)
>>
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index c3a1a8e..faf9b13 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -25,6 +25,7 @@ static void init_iova_rcaches(struct iova_domain *iovad);
>>   static void free_iova_rcaches(struct iova_domain *iovad);
>>   static void fq_destroy_all_entries(struct iova_domain *iovad);
>>   static void fq_flush_timeout(struct timer_list *t);
>> +static void free_global_cached_iovas(struct iova_domain *iovad);

A thought: it would be great if the file could be rearranged at some 
point so that we don't require so many forward declarations.
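
For illustration only, here is a hypothetical sketch (not code from iova.c; 
the toy_* names are made up) of why ordering matters: a static helper needs 
a forward declaration only when its definition comes after its first caller.

```c
/* Layout A: the caller comes first, so the static helper must be declared. */
static int toy_flush_global(int depot_size);

int toy_alloc_slow_a(int depot_size)
{
	return toy_flush_global(depot_size);
}

static int toy_flush_global(int depot_size)
{
	/* Pretend this many magazines were released. */
	return depot_size;
}

/* Layout B: the helper is defined before its first use; no declaration. */
static int toy_flush_cpu(int loaded)
{
	/* Pretend this many cached entries were released. */
	return loaded;
}

int toy_alloc_slow_b(int loaded)
{
	return toy_flush_cpu(loaded);
}
```

Moving helpers above their callers, as in layout B, is the kind of 
rearrangement that would let the declarations at the top of the file go away.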

>>   void
>>   init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>> @@ -442,6 +443,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
>>           flush_rcache = false;
>>           for_each_online_cpu(cpu)
>>               free_cpu_cached_iovas(cpu, iovad);
>> +        free_global_cached_iovas(iovad);
>>           goto retry;
>>       }
>> @@ -1057,5 +1059,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
>>       }
>>   }
>> +/*
>> + * free all the IOVA ranges of global cache
>> + */
>> +static void free_global_cached_iovas(struct iova_domain *iovad)
>> +{
>> +    struct iova_rcache *rcache;
>> +    unsigned long flags;
>> +    int i, j;
>> +
>> +    for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
>> +        rcache = &iovad->rcaches[i];
>> +        spin_lock_irqsave(&rcache->lock, flags);
>> +        for (j = 0; j < rcache->depot_size; ++j) {
>> +            iova_magazine_free_pfns(rcache->depot[j], iovad);
>> +            iova_magazine_free(rcache->depot[j]);
>> +            rcache->depot[j] = NULL;

I don't think that the NULLify is strictly necessary.

>> +        }
>> +        rcache->depot_size = 0;
>> +        spin_unlock_irqrestore(&rcache->lock, flags);
>> +    }
>> +}
>>   MODULE_AUTHOR("Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>");
>>   MODULE_LICENSE("GPL");
>>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 2/2] iommu/iova: Free global iova rcache on iova alloc failure
  2020-11-03 14:31       ` John Garry
@ 2020-11-03 15:59         ` Robin Murphy
  -1 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2020-11-03 15:59 UTC (permalink / raw)
  To: John Garry, vjitta, joro, iommu, linux-kernel; +Cc: vinmenon, kernel-team

On 2020-11-03 14:31, John Garry wrote:
> On 03/11/2020 12:35, Robin Murphy wrote:
>> On 2020-09-30 08:44, vjitta@codeaurora.org wrote:
>>> From: Vijayanand Jitta <vjitta@codeaurora.org>
>>>
>>> Whenever an iova alloc request fails, we free the iova
>>> ranges present in the percpu iova rcaches and then retry,
>>> but the global iova rcache is not freed. As a result, we could
>>> still see iova alloc failure even after the retry, as the global
>>> rcache is holding iovas, which can cause fragmentation.
>>> So, free the global iova rcache as well and then go for the
>>> retry.
>>
> 
> If we do clear all the CPU rcaches, it would be nice to have something 
> immediately available to replenish them, i.e. use the global rcache instead 
> of flushing it, if flushing is not required...

If we've reached the point of clearing *any* caches, though, I think any 
hope of maintaining performance is already long gone. We've walked the 
rbtree for the entire address space and found that it's still too full 
to allocate from; we're teetering on the brink of hard failure and this 
is a last-ditch attempt to claw back as much as possible in the hope 
that it gives us a usable space.

TBH I'm not entirely sure what allocation pattern was expected by the 
original code such that purging only some of the caches made sense, nor 
what kind of pattern leads to lots of smaller IOVAs being allocated, 
freed, and never reused to the point of blocking larger allocations, but 
either way the reasoning does at least seem to hold up in abstract.
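
To make the last-ditch sequence concrete, here is a minimal userspace sketch 
of the flush-and-retry idea. Everything in it is an assumption for 
illustration: the toy_* names, the flat pfn array standing in for the rbtree, 
and the flush_depot switch (the patch flushes the depot unconditionally). 
The real alloc_iova_fast() also tries the rcaches before the tree, which the 
sketch leaves out.

```c
#include <stdbool.h>

#define TOY_PFNS 8	/* size of the toy address space */

enum toy_state { TOY_FREE, TOY_IN_USE, TOY_CPU_CACHED, TOY_DEPOT_CACHED };

/* Hypothetical stand-in for an iova domain: one state per pfn. */
struct toy_domain {
	enum toy_state pfn[TOY_PFNS];
};

/* Stand-in for the tree walk: first fit over 'size' contiguous free pfns. */
static int toy_alloc_from_tree(struct toy_domain *d, int size)
{
	int start, i;

	for (start = 0; start + size <= TOY_PFNS; start++) {
		for (i = 0; i < size; i++)
			if (d->pfn[start + i] != TOY_FREE)
				break;
		if (i == size) {
			for (i = 0; i < size; i++)
				d->pfn[start + i] = TOY_IN_USE;
			return start;
		}
	}
	return -1;
}

/* Return every pfn held by the given kind of cache to the free pool. */
static void toy_flush(struct toy_domain *d, enum toy_state which)
{
	int i;

	for (i = 0; i < TOY_PFNS; i++)
		if (d->pfn[i] == which)
			d->pfn[i] = TOY_FREE;
}

/* On failure, flush the caches once and retry; give up if that fails too. */
static int toy_alloc_fast(struct toy_domain *d, int size, bool flush_depot)
{
	bool flush_rcache = true;
	int start;

retry:
	start = toy_alloc_from_tree(d, size);
	if (start < 0 && flush_rcache) {
		flush_rcache = false;
		toy_flush(d, TOY_CPU_CACHED);
		if (flush_depot)
			toy_flush(d, TOY_DEPOT_CACHED);
		goto retry;
	}
	return start;
}
```

With pfns 0 and 7 in use, pfns 1-2 parked in a CPU cache and pfns 4-5 parked 
in the depot, a 4-pfn request still fails when only the CPU cache is flushed, 
and succeeds at pfn 1 once the depot is flushed as well.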

>> This looks reasonable to me - it's mildly annoying that we end up with 
>> so many similar-looking functions,
> 
> Well I did add a function to clear all CPU rcaches here, if you would 
> like to check:
> 
> https://lore.kernel.org/linux-iommu/1603733501-211004-2-git-send-email-john.garry@huawei.com/ 

I was thinking more of the way free_iova_rcaches(), 
free_cpu_cached_iovas(), and free_global_cached_iovas() all look pretty 
much the same shape at a glance.

>> but the necessary differences are right down in the middle of the 
>> loops so nothing can reasonably be factored out :(
>>
>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
>>
>>> Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
>>> ---
>>>   drivers/iommu/iova.c | 23 +++++++++++++++++++++++
>>>   1 file changed, 23 insertions(+)
>>>
>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>>> index c3a1a8e..faf9b13 100644
>>> --- a/drivers/iommu/iova.c
>>> +++ b/drivers/iommu/iova.c
>>> @@ -25,6 +25,7 @@ static void init_iova_rcaches(struct iova_domain *iovad);
>>>   static void free_iova_rcaches(struct iova_domain *iovad);
>>>   static void fq_destroy_all_entries(struct iova_domain *iovad);
>>>   static void fq_flush_timeout(struct timer_list *t);
>>> +static void free_global_cached_iovas(struct iova_domain *iovad);
> 
> a thought: It would be great if the file could be rearranged at some 
> point where we don't require so many forward declarations.
> 
>>>   void
>>>   init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>> @@ -442,6 +443,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
>>>           flush_rcache = false;
>>>           for_each_online_cpu(cpu)
>>>               free_cpu_cached_iovas(cpu, iovad);
>>> +        free_global_cached_iovas(iovad);
>>>           goto retry;
>>>       }
>>> @@ -1057,5 +1059,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
>>>       }
>>>   }
>>> +/*
>>> + * free all the IOVA ranges of global cache
>>> + */
>>> +static void free_global_cached_iovas(struct iova_domain *iovad)
>>> +{
>>> +    struct iova_rcache *rcache;
>>> +    unsigned long flags;
>>> +    int i, j;
>>> +
>>> +    for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
>>> +        rcache = &iovad->rcaches[i];
>>> +        spin_lock_irqsave(&rcache->lock, flags);
>>> +        for (j = 0; j < rcache->depot_size; ++j) {
>>> +            iova_magazine_free_pfns(rcache->depot[j], iovad);
>>> +            iova_magazine_free(rcache->depot[j]);
>>> +            rcache->depot[j] = NULL;
> 
> I don't think that NULLify is strictly necessary

True, we don't explicitly clear depot entries in __iova_rcache_get() for 
normal operation, so there's not much point in doing so here.
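
As a rough userspace sketch of why the stale pointers are harmless: if the 
depot is treated as a stack whose valid entries are exactly those below 
depot_size, nothing ever reads a slot at or above that index. The toy_* 
types and helpers below are hypothetical, and unlike the patch this flush 
does not free the magazines, since the caller owns them here.

```c
#include <stddef.h>

#define TOY_MAX_DEPOT 4

struct toy_magazine {
	int id;
};

/* Hypothetical stand-in for the depot part of struct iova_rcache. */
struct toy_rcache {
	size_t depot_size;	/* number of valid entries in depot[] */
	struct toy_magazine *depot[TOY_MAX_DEPOT];
};

/* Push a magazine; returns 0 on success, -1 if the depot is full. */
static int toy_depot_push(struct toy_rcache *rc, struct toy_magazine *mag)
{
	if (rc->depot_size >= TOY_MAX_DEPOT)
		return -1;
	rc->depot[rc->depot_size++] = mag;
	return 0;
}

/* Pop a magazine; the vacated slot keeps its old pointer on purpose. */
static struct toy_magazine *toy_depot_pop(struct toy_rcache *rc)
{
	if (rc->depot_size == 0)
		return NULL;
	return rc->depot[--rc->depot_size];
}

/* Flush: resetting the count is enough to invalidate every slot. */
static void toy_depot_flush(struct toy_rcache *rc)
{
	rc->depot_size = 0;
}
```

After a pop or a flush the old pointer is still sitting in the array, but the 
next push overwrites it before any pop can return it.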

Robin.

>>> +        }
>>> +        rcache->depot_size = 0;
>>> +        spin_unlock_irqrestore(&rcache->lock, flags);
>>> +    }
>>> +}
>>>   MODULE_AUTHOR("Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>");
>>>   MODULE_LICENSE("GPL");
>>>
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 2/2] iommu/iova: Free global iova rcache on iova alloc failure
  2020-11-03 15:59         ` Robin Murphy
@ 2020-11-09 11:12           ` John Garry
  -1 siblings, 0 replies; 18+ messages in thread
From: John Garry @ 2020-11-09 11:12 UTC (permalink / raw)
  To: Robin Murphy, vjitta, joro, iommu, linux-kernel; +Cc: vinmenon, kernel-team

On 03/11/2020 15:59, Robin Murphy wrote:
>>>> alloc failure even after retry as global
>>>> rcache is holding the iova's which can cause fragmentation.
>>>> So, free the global iova rcache as well and then go for the
>>>> retry.
>>>
>>
>> If we do clear all the CPU rcaches, it would be nice to have something 
>> immediately available to replenish them, i.e. use the global rcache 
>> instead of flushing it, if flushing is not required...
> 
> If we've reached the point of clearing *any* caches, though, I think any 
> hope of maintaining performance is already long gone. We've walked the 
> rbtree for the entire address space and found that it's still too full 
> to allocate from; we're teetering on the brink of hard failure and this 
> is a last-ditch attempt to claw back as much as possible in the hope 
> that it gives us a usable space.
>
> TBH I'm not entirely sure what allocation pattern was expected by the 
> original code such that purging only some of the caches made sense,

I'd say that the assumption is that once the CPU rcaches are flushed, 
then we should have space again. No need to go any further.

> nor 
> what kind of pattern leads to lots of smaller IOVAs being allocated, 
> freed, and never reused to the point of blocking larger allocations, but 
> either way the reasoning does at least seem to hold up in abstract.

Ok, but I'd like to see that hard failure (if you get my meaning). 
Flushing the depot rcache may be papering over some other bug.

Either way, I don't feel too strongly about it, so if you're happy then I 
won't try to block it. So, apart from the comment below:
Acked-by: John Garry <john.garry@huawei.com>

> 
>>> This looks reasonable to me - it's mildly annoying that we end up 
>>> with so many similar-looking functions,
>>
>> Well I did add a function to clear all CPU rcaches here, if you would 
>> like to check:
>>
>> https://lore.kernel.org/linux-iommu/1603733501-211004-2-git-send-email-john.garry@huawei.com/ 
> 
> 
> I was thinking more of the way free_iova_rcaches(), 
> free_cpu_cached_iovas(), and free_global_cached_iovas() all look pretty 
> much the same shape at a glance.
> 
>>> but the necessary differences are right down in the middle of the 
>>> loops so nothing can reasonably be factored out :(
>>>
>>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
>>>
>>>> Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
>>>> ---
>>>>   drivers/iommu/iova.c | 23 +++++++++++++++++++++++
>>>>   1 file changed, 23 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>>>> index c3a1a8e..faf9b13 100644
>>>> --- a/drivers/iommu/iova.c
>>>> +++ b/drivers/iommu/iova.c
>>>> @@ -25,6 +25,7 @@ static void init_iova_rcaches(struct iova_domain *iovad);
>>>>   static void free_iova_rcaches(struct iova_domain *iovad);
>>>>   static void fq_destroy_all_entries(struct iova_domain *iovad);
>>>>   static void fq_flush_timeout(struct timer_list *t);
>>>> +static void free_global_cached_iovas(struct iova_domain *iovad);
>>
>> a thought: It would be great if the file could be rearranged at some 
>> point where we don't require so many forward declarations.
>>
>>>>   void
>>>>   init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>>> @@ -442,6 +443,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
>>>>           flush_rcache = false;
>>>>           for_each_online_cpu(cpu)
>>>>               free_cpu_cached_iovas(cpu, iovad);
>>>> +        free_global_cached_iovas(iovad);
>>>>           goto retry;
>>>>       }
>>>> @@ -1057,5 +1059,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
>>>>       }
>>>>   }
>>>> +/*
>>>> + * free all the IOVA ranges of global cache
>>>> + */
>>>> +static void free_global_cached_iovas(struct iova_domain *iovad)
>>>> +{
>>>> +    struct iova_rcache *rcache;
>>>> +    unsigned long flags;
>>>> +    int i, j;
>>>> +
>>>> +    for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
>>>> +        rcache = &iovad->rcaches[i];
>>>> +        spin_lock_irqsave(&rcache->lock, flags);
>>>> +        for (j = 0; j < rcache->depot_size; ++j) {
>>>> +            iova_magazine_free_pfns(rcache->depot[j], iovad);
>>>> +            iova_magazine_free(rcache->depot[j]);
>>>> +            rcache->depot[j] = NULL;
>>
>> I don't think that NULLify is strictly necessary
> 
> True, we don't explicitly clear depot entries in __iova_rcache_get() for 
> normal operation, so there's not much point in doing so here.

Right, so for consistency, I think that it would be nice not to NULLify 
here either.

> 
> Robin.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/2] iommu/iova: Retry from last rb tree node if iova search fails
  2020-09-30  7:44 ` vjitta
@ 2020-11-17 23:24   ` Will Deacon
  -1 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2020-11-17 23:24 UTC (permalink / raw)
  To: vjitta, joro, linux-kernel, iommu
  Cc: catalin.marinas, kernel-team, Will Deacon, robin.murphy, vinmenon

On Wed, 30 Sep 2020 13:14:23 +0530, vjitta@codeaurora.org wrote:
> Whenever a new iova alloc request comes in, the iova is always searched
> from the cached node and the nodes which are previous to the cached
> node. So, even if there is free iova space available in the nodes
> which are next to the cached node, iova allocation can still fail
> because of this approach.
> 
> Consider the following sequence of iova alloc and frees on
> 1GB of iova space
> 
> [...]
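
The sequence from the patch 1/2 description (500MB, 12MB, 499MB, free the 
12MB, 13MB, then 2MB) can be replayed with a small userspace model. This is 
only a sketch under stated assumptions: the space is tracked in 1MB units, 
"cached" is a single offset rather than an rbtree node, the free-time update 
of the cached offset is a simplification, and all toy_* names are 
hypothetical.

```c
#include <stdbool.h>

#define TOY_SPACE_MB 1024	/* 1GB of iova space, in 1MB units */

struct toy_iova_space {
	bool used[TOY_SPACE_MB];
	int cached;	/* searches start just below this offset */
};

static void toy_init(struct toy_iova_space *s)
{
	int i;

	for (i = 0; i < TOY_SPACE_MB; i++)
		s->used[i] = false;
	s->cached = TOY_SPACE_MB;
}

/* Top-down first fit strictly below 'limit'; returns the start or -1. */
static int toy_search(struct toy_iova_space *s, int size, int limit)
{
	int start, i;

	for (start = limit - size; start >= 0; start--) {
		for (i = 0; i < size; i++)
			if (s->used[start + i])
				break;
		if (i == size)
			return start;
	}
	return -1;
}

/* Search below the cached offset; optionally retry from the very top. */
static int toy_alloc(struct toy_iova_space *s, int size, bool retry_from_top)
{
	int i, start = toy_search(s, size, s->cached);

	if (start < 0 && retry_from_top && s->cached != TOY_SPACE_MB)
		start = toy_search(s, size, TOY_SPACE_MB);
	if (start < 0)
		return -1;
	for (i = 0; i < size; i++)
		s->used[start + i] = true;
	s->cached = start;
	return start;
}

/* Free a range; a hole at or above the cached offset moves the offset up. */
static void toy_free(struct toy_iova_space *s, int start, int size)
{
	int i;

	for (i = 0; i < size; i++)
		s->used[start + i] = false;
	if (start >= s->cached)
		s->cached = start + size;
}
```

In this model the 13MB allocation lands at offset 0 and drags the cached 
offset down with it, so a following 2MB request finds nothing below it and 
fails unless the retry from the top is enabled, in which case it fits into 
the 12MB hole.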

Applied to arm64 (for-next/iommu/iova), thanks!

[1/2] iommu/iova: Retry from last rb tree node if iova search fails
      https://git.kernel.org/arm64/c/4e89dce72521
[2/2] iommu/iova: Free global iova rcache on iova alloc failure
      https://git.kernel.org/arm64/c/6fa3525b455a

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-11-17 23:25 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-30  7:44 [PATCH v5 1/2] iommu/iova: Retry from last rb tree node if iova search fails vjitta
2020-09-30  7:44 ` vjitta
2020-09-30  7:44 ` [PATCH v5 2/2] iommu/iova: Free global iova rcache on iova alloc failure vjitta
2020-09-30  7:44   ` vjitta
2020-10-20  9:18   ` Vijayanand Jitta
2020-10-20  9:18     ` Vijayanand Jitta
2020-11-03 12:35   ` Robin Murphy
2020-11-03 12:35     ` Robin Murphy
2020-11-03 14:31     ` John Garry
2020-11-03 14:31       ` John Garry
2020-11-03 15:59       ` Robin Murphy
2020-11-03 15:59         ` Robin Murphy
2020-11-09 11:12         ` John Garry
2020-11-09 11:12           ` John Garry
2020-10-20  9:17 ` [PATCH v5 1/2] iommu/iova: Retry from last rb tree node if iova search fails Vijayanand Jitta
2020-10-20  9:17   ` Vijayanand Jitta
2020-11-17 23:24 ` Will Deacon
2020-11-17 23:24   ` Will Deacon
