linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks
@ 2019-03-12 17:33 Rasmus Villemoes
  2019-03-12 17:33 ` [PATCH 1/4] irqchip/gic-v3-its: fix comparison logic in lpi_range_cmp Rasmus Villemoes
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Rasmus Villemoes @ 2019-03-12 17:33 UTC (permalink / raw)
  To: Marc Zyngier, linux-kernel; +Cc: Rasmus Villemoes, stable

I noticed that the O(n log n) behaviour of free_lpi_range could easily
be made O(n) (patch 4), though I don't suppose n is ever large enough
to actually matter. While there, I also stumbled on two other
micro-optimizations (2 and 3).

Then while writing the commit log for the last patch, I noticed that
the cmp callback I was removing was actually buggy, so I went back and
added a patch in front suitable for -stable. I'll leave it to others
to decide if it's important enough for that.

Please note that this is only compile-tested.

Rasmus Villemoes (4):
  irqchip/gic-v3-its: fix comparison logic in lpi_range_cmp
  irqchip/gic-v3-its: move allocation outside mutex
  irqchip/gic-v3-its: drop redundant initialization in mk_lpi_range
  irqchip/gic-v3-its: make free_lpi_range a little cheaper

 drivers/irqchip/irq-gic-v3-its.c | 75 +++++++++++++++-----------------
 1 file changed, 36 insertions(+), 39 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] irqchip/gic-v3-its: fix comparison logic in lpi_range_cmp
  2019-03-12 17:33 [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks Rasmus Villemoes
@ 2019-03-12 17:33 ` Rasmus Villemoes
  2019-03-12 17:33 ` [PATCH 2/4] irqchip/gic-v3-its: move allocation outside mutex Rasmus Villemoes
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Rasmus Villemoes @ 2019-03-12 17:33 UTC (permalink / raw)
  To: Marc Zyngier, Thomas Gleixner, Jason Cooper
  Cc: Rasmus Villemoes, stable, linux-kernel

The lpi_range_list is supposed to be sorted in ascending order of
->base_id (at least if the range merging is to work), but the current
comparison function returns a positive value if rb->base_id >
ra->base_id, which means that list_sort() will put A after B in that
case - and vice versa, of course.

Fixes: 880cb3cddd16 (irqchip/gic-v3-its: Refactor LPI allocator)
Cc: stable@vger.kernel.org (v4.19+)
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/irqchip/irq-gic-v3-its.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 2dd1ff0cf558..7577755bdcf4 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1482,7 +1482,7 @@ static int lpi_range_cmp(void *priv, struct list_head *a, struct list_head *b)
 	ra = container_of(a, struct lpi_range, entry);
 	rb = container_of(b, struct lpi_range, entry);
 
-	return rb->base_id - ra->base_id;
+	return ra->base_id - rb->base_id;
 }
 
 static void merge_lpi_ranges(void)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/4] irqchip/gic-v3-its: move allocation outside mutex
  2019-03-12 17:33 [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks Rasmus Villemoes
  2019-03-12 17:33 ` [PATCH 1/4] irqchip/gic-v3-its: fix comparison logic in lpi_range_cmp Rasmus Villemoes
@ 2019-03-12 17:33 ` Rasmus Villemoes
  2019-03-12 17:33 ` [PATCH 3/4] irqchip/gic-v3-its: drop redundant initialization in mk_lpi_range Rasmus Villemoes
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Rasmus Villemoes @ 2019-03-12 17:33 UTC (permalink / raw)
  To: Marc Zyngier, Thomas Gleixner, Jason Cooper
  Cc: Rasmus Villemoes, linux-kernel

There's no reason to do the allocation of the new lpi_range inside the
lpi_range_lock. One could change the code to avoid the allocation
altogether in case the freed range can be merged with one or two
existing ranges (in which case the allocation would naturally be done
under the lock), but it's probably not worth complicating the code for
that.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/irqchip/irq-gic-v3-its.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7577755bdcf4..5c8232cff290 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1532,22 +1532,19 @@ static int alloc_lpi_range(u32 nr_lpis, u32 *base)
 static int free_lpi_range(u32 base, u32 nr_lpis)
 {
 	struct lpi_range *new;
-	int err = 0;
-
-	mutex_lock(&lpi_range_lock);
 
 	new = mk_lpi_range(base, nr_lpis);
-	if (!new) {
-		err = -ENOMEM;
-		goto out;
-	}
+	if (!new)
+		return -ENOMEM;
+
+	mutex_lock(&lpi_range_lock);
 
 	list_add(&new->entry, &lpi_range_list);
 	list_sort(NULL, &lpi_range_list, lpi_range_cmp);
 	merge_lpi_ranges();
-out:
+
 	mutex_unlock(&lpi_range_lock);
-	return err;
+	return 0;
 }
 
 static int __init its_lpi_init(u32 id_bits)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/4] irqchip/gic-v3-its: drop redundant initialization in mk_lpi_range
  2019-03-12 17:33 [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks Rasmus Villemoes
  2019-03-12 17:33 ` [PATCH 1/4] irqchip/gic-v3-its: fix comparison logic in lpi_range_cmp Rasmus Villemoes
  2019-03-12 17:33 ` [PATCH 2/4] irqchip/gic-v3-its: move allocation outside mutex Rasmus Villemoes
@ 2019-03-12 17:33 ` Rasmus Villemoes
  2019-03-12 17:33 ` [PATCH 4/4] irqchip/gic-v3-its: make free_lpi_range a little cheaper Rasmus Villemoes
  2019-03-12 18:03 ` [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks Marc Zyngier
  4 siblings, 0 replies; 6+ messages in thread
From: Rasmus Villemoes @ 2019-03-12 17:33 UTC (permalink / raw)
  To: Marc Zyngier, Thomas Gleixner, Jason Cooper
  Cc: Rasmus Villemoes, linux-kernel

There's no reason to ask kmalloc() to zero the allocation, since all
the fields get initialized immediately afterwards. Except that there's
also not any reason to initialize the ->entry member, since the
element gets added to the lpi_range_list immediately.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/irqchip/irq-gic-v3-its.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 5c8232cff290..2ebc28a53d96 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1465,9 +1465,8 @@ static struct lpi_range *mk_lpi_range(u32 base, u32 span)
 {
 	struct lpi_range *range;
 
-	range = kzalloc(sizeof(*range), GFP_KERNEL);
+	range = kmalloc(sizeof(*range), GFP_KERNEL);
 	if (range) {
-		INIT_LIST_HEAD(&range->entry);
 		range->base_id = base;
 		range->span = span;
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/4] irqchip/gic-v3-its: make free_lpi_range a little cheaper
  2019-03-12 17:33 [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks Rasmus Villemoes
                   ` (2 preceding siblings ...)
  2019-03-12 17:33 ` [PATCH 3/4] irqchip/gic-v3-its: drop redundant initialization in mk_lpi_range Rasmus Villemoes
@ 2019-03-12 17:33 ` Rasmus Villemoes
  2019-03-12 18:03 ` [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks Marc Zyngier
  4 siblings, 0 replies; 6+ messages in thread
From: Rasmus Villemoes @ 2019-03-12 17:33 UTC (permalink / raw)
  To: Marc Zyngier, Thomas Gleixner, Jason Cooper
  Cc: Rasmus Villemoes, linux-kernel

Using list_add + list_sort to insert an element and keeping the list
sorted is a somewhat blunt instrument; one can find the right place to
insert in fewer lines of code than the cmp callback uses. Moreover,
walking the entire list afterwards to merge adjacent ranges is
overkill, since we know that only the just-inserted element may be
merged with its neighbours.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/irqchip/irq-gic-v3-its.c | 61 ++++++++++++++++----------------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 2ebc28a53d96..b362d62d3e91 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -26,7 +26,6 @@
 #include <linux/interrupt.h>
 #include <linux/irqdomain.h>
 #include <linux/list.h>
-#include <linux/list_sort.h>
 #include <linux/log2.h>
 #include <linux/memblock.h>
 #include <linux/mm.h>
@@ -1474,31 +1473,6 @@ static struct lpi_range *mk_lpi_range(u32 base, u32 span)
 	return range;
 }
 
-static int lpi_range_cmp(void *priv, struct list_head *a, struct list_head *b)
-{
-	struct lpi_range *ra, *rb;
-
-	ra = container_of(a, struct lpi_range, entry);
-	rb = container_of(b, struct lpi_range, entry);
-
-	return ra->base_id - rb->base_id;
-}
-
-static void merge_lpi_ranges(void)
-{
-	struct lpi_range *range, *tmp;
-
-	list_for_each_entry_safe(range, tmp, &lpi_range_list, entry) {
-		if (!list_is_last(&range->entry, &lpi_range_list) &&
-		    (tmp->base_id == (range->base_id + range->span))) {
-			tmp->base_id = range->base_id;
-			tmp->span += range->span;
-			list_del(&range->entry);
-			kfree(range);
-		}
-	}
-}
-
 static int alloc_lpi_range(u32 nr_lpis, u32 *base)
 {
 	struct lpi_range *range, *tmp;
@@ -1528,9 +1502,21 @@ static int alloc_lpi_range(u32 nr_lpis, u32 *base)
 	return err;
 }
 
+static void merge_lpi_ranges(struct lpi_range *a, struct lpi_range *b)
+{
+	if (&a->entry == &lpi_range_list || &b->entry == &lpi_range_list)
+		return;
+	if (a->base_id + a->span != b->base_id)
+		return;
+	b->base_id = a->base_id;
+	b->span += a->span;
+	list_del(&a->entry);
+	kfree(a);
+}
+
 static int free_lpi_range(u32 base, u32 nr_lpis)
 {
-	struct lpi_range *new;
+	struct lpi_range *new, *old;
 
 	new = mk_lpi_range(base, nr_lpis);
 	if (!new)
@@ -1538,9 +1524,24 @@ static int free_lpi_range(u32 base, u32 nr_lpis)
 
 	mutex_lock(&lpi_range_lock);
 
-	list_add(&new->entry, &lpi_range_list);
-	list_sort(NULL, &lpi_range_list, lpi_range_cmp);
-	merge_lpi_ranges();
+	list_for_each_entry_reverse(old, &lpi_range_list, entry) {
+		if (old->base_id < base)
+			break;
+	}
+	/*
+	 * old is the last element with ->base_id smaller than base,
+	 * so new goes right after it. If there are no elements with
+	 * ->base_id smaller than base, &old->entry ends up pointing
+	 * at the head of the list, and inserting new it the start of
+	 * the list is the right thing to do in that case as well.
+	 */
+	list_add(&new->entry, &old->entry);
+	/*
+	 * Now check if we can merge with the preceding and/or
+	 * following ranges.
+	 */
+	merge_lpi_ranges(old, new);
+	merge_lpi_ranges(new, list_next_entry(new, entry));
 
 	mutex_unlock(&lpi_range_lock);
 	return 0;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks
  2019-03-12 17:33 [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks Rasmus Villemoes
                   ` (3 preceding siblings ...)
  2019-03-12 17:33 ` [PATCH 4/4] irqchip/gic-v3-its: make free_lpi_range a little cheaper Rasmus Villemoes
@ 2019-03-12 18:03 ` Marc Zyngier
  4 siblings, 0 replies; 6+ messages in thread
From: Marc Zyngier @ 2019-03-12 18:03 UTC (permalink / raw)
  To: Rasmus Villemoes, linux-kernel; +Cc: stable

Hi Rasmus,

On 12/03/2019 17:33, Rasmus Villemoes wrote:
> I noticed that the O(n log n) behaviour of free_lpi_range could easily
> be made O(n) (patch 4), though I don't suppose n is ever large enough
> to actually matter. While there, I also stumbled on two other
> micro-optimizations (2 and 3).

n is usually in the range 1 .. nr_cpus, so pretty small, even on the
biggest machines we have around (256 threads). And actually, nobody ever
frees LPIs, because hey, why would you?

> Then while writing the commit log for the last patch, I noticed that
> the cmp callback I was removing was actually buggy, so I went back and
> added a patch in front suitable for -stable. I'll leave it to others
> to decide if it's important enough for that.

Thanks for that. I'll have a look at the whole thing anyway (I've just
glanced over it so far).

> Please note that this is only compile-tested.

Right, this needs some actual testing then. /me needs to build a guest
that shakes the allocator a bit.

Cheers,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-03-12 18:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-12 17:33 [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks Rasmus Villemoes
2019-03-12 17:33 ` [PATCH 1/4] irqchip/gic-v3-its: fix comparison logic in lpi_range_cmp Rasmus Villemoes
2019-03-12 17:33 ` [PATCH 2/4] irqchip/gic-v3-its: move allocation outside mutex Rasmus Villemoes
2019-03-12 17:33 ` [PATCH 3/4] irqchip/gic-v3-its: drop redundant initialization in mk_lpi_range Rasmus Villemoes
2019-03-12 17:33 ` [PATCH 4/4] irqchip/gic-v3-its: make free_lpi_range a little cheaper Rasmus Villemoes
2019-03-12 18:03 ` [PATCH 0/4] irqchip/gic-v3-its: free_lpi_range tweaks Marc Zyngier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).