linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/7] Remove on-stack cpumask var for irq subsystem
@ 2024-04-16  8:54 Dawei Li
  2024-04-16  8:54 ` [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and() Dawei Li
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Dawei Li @ 2024-04-16  8:54 UTC (permalink / raw)
  To: tglx, yury.norov, rafael
  Cc: akpm, maz, florian.fainelli, chenhuacai, jiaxun.yang, anup,
	palmer, samuel.holland, linux, daniel.lezcano, linux-kernel,
	Dawei Li

Hi,

This is v2 of the previous series [1] on removing on-stack cpumask variables.

Generally it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.
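
As a rough illustration of the stack cost, here is a userspace sketch (not
kernel code; the `_sketch` names are made up for this example). It assumes a
cpumask is backed by an array of unsigned long holding one bit per possible
CPU, so its size grows linearly with NR_CPUS:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long on the build host. */
#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/*
 * Bytes occupied by a bitmap with one bit per possible CPU, rounded up
 * to whole unsigned longs.
 */
static size_t cpumask_bytes_sketch(size_t nr_cpus)
{
	size_t longs;

	assert(nr_cpus > 0);
	longs = (nr_cpus + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH;

	return longs * sizeof(unsigned long);
}
```

Under these assumptions, NR_CPUS=8192 costs 1024 bytes per on-stack mask,
which is a noticeable share of a kernel stack that is only a few pages in size.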

One may argue that alloc_cpumask_var() and its friends are the proper
way to handle these cases. But irq_chip::irq_set_affinity() is called
in atomic context (with a raw spinlock held), where dynamic memory
allocation is best avoided.

So a new helper, cpumask_first_and_and(), is introduced to address the
issues above. It is safe to call in any context and needs no
intermediate cpumask variable, whether on the stack or on the heap.
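
The idea behind the helper can be modelled in userspace as follows. This is
only a sketch under stated assumptions: bitmaps are plain arrays of unsigned
long, and `first_and_and_sketch` is a hypothetical name, not the kernel
function. The point it shows is that the three inputs are combined one word at
a time, so no full-size temporary mask is ever needed:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first bit set in a & b & c, or nbits if there
 * is none. Only a single word of scratch state is live at any time.
 */
static size_t first_and_and_sketch(const unsigned long *a,
				   const unsigned long *b,
				   const unsigned long *c,
				   size_t nbits)
{
	size_t nwords = (nbits + SK_BITS_PER_LONG - 1) / SK_BITS_PER_LONG;
	size_t idx;

	assert(a && b && c);

	for (idx = 0; idx < nwords; idx++) {
		unsigned long val = a[idx] & b[idx] & c[idx];

		if (val) {
			size_t bit = idx * SK_BITS_PER_LONG;

			/* Walk up to the lowest set bit of this word. */
			while (!(val & 1UL)) {
				val >>= 1;
				bit++;
			}
			/* Bits at or past nbits in the last word do not count. */
			return bit < nbits ? bit : nbits;
		}
	}

	return nbits;
}
```

A caller compares the result against the bitmap size to detect an empty
intersection, in the same way the converted drivers compare against nr_cpu_ids.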

The gic-v3-its case (Patch 3) differs from the others, since it does not
involve the intersection of three cpumasks.

Patch 7 is not for the irq subsystem; it is in this series only because
it uses the new helper. Please ignore it if you find it inappropriate
for this series.

Any comments are welcome.

------------

Changes since v1:

- Rebased against tip/irq/core;

- Patch[1]: [Yury]
  - Remove ifdefery nesting on find_first_and_and_bit; 
  - Update commit message;

- Patch[3]: [Marc]
  - Merge two bitmap ops into one;
  - Update commit message;

- Patch[2,4-6]: [Yury]
  - Unwrap lines;

- Patch[7]:
  Newly added. Feel free to drop/ignore it if you find it inappropriate
  for this series.

[1] v1:
https://lore.kernel.org/lkml/20240412105839.2896281-1-dawei.li@shingroup.cn/

Dawei Li (7):
  cpumask: introduce cpumask_first_and_and()
  irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack
  irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack
  irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack
  irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack
  irqchip/sifive-plic: Avoid explicit cpumask allocation on stack
  cpuidle: Avoid explicit cpumask allocation on stack

 drivers/cpuidle/coupled.c                | 13 +++---------
 drivers/irqchip/irq-bcm6345-l1.c         |  6 +-----
 drivers/irqchip/irq-gic-v3-its.c         | 15 ++++++++-----
 drivers/irqchip/irq-loongson-eiointc.c   |  8 ++-----
 drivers/irqchip/irq-riscv-aplic-direct.c |  7 ++----
 drivers/irqchip/irq-sifive-plic.c        |  7 ++----
 include/linux/cpumask.h                  | 17 +++++++++++++++
 include/linux/find.h                     | 27 ++++++++++++++++++++++++
 lib/find_bit.c                           | 12 +++++++++++
 9 files changed, 76 insertions(+), 36 deletions(-)

base-commit: 35d77eb7b974f62aaef5a0dc72d93ddb1ada4074

Thanks,

    Dawei

-- 
2.27.0



* [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and()
  2024-04-16  8:54 [PATCH v2 0/7] Remove on-stack cpumask var for irq subsystem Dawei Li
@ 2024-04-16  8:54 ` Dawei Li
  2024-04-16 17:45   ` Yury Norov
  2024-04-24 20:04   ` [tip: irq/core] cpumask: Introduce cpumask_first_and_and() tip-bot2 for Dawei Li
  2024-04-16  8:54 ` [PATCH v2 2/7] irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack Dawei Li
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Dawei Li @ 2024-04-16  8:54 UTC (permalink / raw)
  To: tglx, yury.norov, rafael
  Cc: akpm, maz, florian.fainelli, chenhuacai, jiaxun.yang, anup,
	palmer, samuel.holland, linux, daniel.lezcano, linux-kernel,
	Dawei Li

Introduce cpumask_first_and_and() to get intersection between 3 cpumasks,
free of any intermediate cpumask variable. Instead, cpumask_first_and_and()
works in-place with all inputs and produce desired output directly.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 include/linux/cpumask.h | 17 +++++++++++++++++
 include/linux/find.h    | 27 +++++++++++++++++++++++++++
 lib/find_bit.c          | 12 ++++++++++++
 3 files changed, 56 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 1c29947db848..c46f9e9e1d66 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -187,6 +187,23 @@ unsigned int cpumask_first_and(const struct cpumask *srcp1, const struct cpumask
 	return find_first_and_bit(cpumask_bits(srcp1), cpumask_bits(srcp2), small_cpumask_bits);
 }
 
+/**
+ * cpumask_first_and_and - return the first cpu from *srcp1 & *srcp2 & *srcp3
+ * @srcp1: the first input
+ * @srcp2: the second input
+ * @srcp3: the third input
+ *
+ * Return: >= nr_cpu_ids if no cpus set in all.
+ */
+static inline
+unsigned int cpumask_first_and_and(const struct cpumask *srcp1,
+				   const struct cpumask *srcp2,
+				   const struct cpumask *srcp3)
+{
+	return find_first_and_and_bit(cpumask_bits(srcp1), cpumask_bits(srcp2),
+				      cpumask_bits(srcp3), small_cpumask_bits);
+}
+
 /**
  * cpumask_last - get the last CPU in a cpumask
  * @srcp:	- the cpumask pointer
diff --git a/include/linux/find.h b/include/linux/find.h
index c69598e383c1..28ec5a03393a 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -29,6 +29,8 @@ unsigned long __find_nth_and_andnot_bit(const unsigned long *addr1, const unsign
 					unsigned long n);
 extern unsigned long _find_first_and_bit(const unsigned long *addr1,
 					 const unsigned long *addr2, unsigned long size);
+unsigned long _find_first_and_and_bit(const unsigned long *addr1, const unsigned long *addr2,
+				      const unsigned long *addr3, unsigned long size);
 extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
 extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
 
@@ -345,6 +347,31 @@ unsigned long find_first_and_bit(const unsigned long *addr1,
 }
 #endif
 
+/**
+ * find_first_and_and_bit - find the first set bit in 3 memory regions
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @addr3: The third address to base the search on
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number for the first set bit
+ * If no bits are set, returns @size.
+ */
+static inline
+unsigned long find_first_and_and_bit(const unsigned long *addr1,
+				     const unsigned long *addr2,
+				     const unsigned long *addr3,
+				     unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr1 & *addr2 & *addr3 & GENMASK(size - 1, 0);
+
+		return val ? __ffs(val) : size;
+	}
+
+	return _find_first_and_and_bit(addr1, addr2, addr3, size);
+}
+
 #ifndef find_first_zero_bit
 /**
  * find_first_zero_bit - find the first cleared bit in a memory region
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 32f99e9a670e..dacadd904250 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -116,6 +116,18 @@ unsigned long _find_first_and_bit(const unsigned long *addr1,
 EXPORT_SYMBOL(_find_first_and_bit);
 #endif
 
+/*
+ * Find the first set bit in three memory regions.
+ */
+unsigned long _find_first_and_and_bit(const unsigned long *addr1,
+				      const unsigned long *addr2,
+				      const unsigned long *addr3,
+				      unsigned long size)
+{
+	return FIND_FIRST_BIT(addr1[idx] & addr2[idx] & addr3[idx], /* nop */, size);
+}
+EXPORT_SYMBOL(_find_first_and_and_bit);
+
 #ifndef find_first_zero_bit
 /*
  * Find the first cleared bit in a memory region.
-- 
2.27.0
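
The small_const_nbits() branch in find_first_and_and_bit() above handles
bitmaps that fit in a single word. A userspace model of that fast path is
sketched below; `low_mask_sketch` stands in for GENMASK(size - 1, 0) and the
bit-walking loop stands in for __ffs(). Both stand-ins and the `_sketch` names
are assumptions made for this example:

```c
#include <assert.h>
#include <limits.h>

#define SK_WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Mask with bits [nbits-1:0] set; valid for 1 <= nbits <= SK_WORD_BITS. */
static unsigned long low_mask_sketch(unsigned int nbits)
{
	assert(nbits >= 1 && nbits <= SK_WORD_BITS);
	return ~0UL >> (SK_WORD_BITS - nbits);
}

/*
 * Single-word case: AND the three words, drop bits at or above nbits,
 * then return the lowest set bit, or nbits if nothing is left.
 */
static unsigned int first_and_and_word_sketch(unsigned long a,
					      unsigned long b,
					      unsigned long c,
					      unsigned int nbits)
{
	unsigned long val = a & b & c & low_mask_sketch(nbits);
	unsigned int bit = 0;

	if (!val)
		return nbits;

	while (!(val & 1UL)) {
		val >>= 1;
		bit++;
	}

	return bit;
}
```

Masking before the search matters: without it, a stray bit above the bitmap
size could be reported as a valid index.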



* [PATCH v2 2/7] irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 [PATCH v2 0/7] Remove on-stack cpumask var for irq subsystem Dawei Li
  2024-04-16  8:54 ` [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and() Dawei Li
@ 2024-04-16  8:54 ` Dawei Li
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  2024-04-16  8:54 ` [PATCH v2 3/7] irqchip/gic-v3-its: " Dawei Li
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Dawei Li @ 2024-04-16  8:54 UTC (permalink / raw)
  To: tglx, yury.norov, rafael
  Cc: akpm, maz, florian.fainelli, chenhuacai, jiaxun.yang, anup,
	palmer, samuel.holland, linux, daniel.lezcano, linux-kernel,
	Dawei Li

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/irqchip/irq-bcm6345-l1.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/irqchip/irq-bcm6345-l1.c b/drivers/irqchip/irq-bcm6345-l1.c
index eb02d203c963..90daa274ef23 100644
--- a/drivers/irqchip/irq-bcm6345-l1.c
+++ b/drivers/irqchip/irq-bcm6345-l1.c
@@ -192,14 +192,10 @@ static int bcm6345_l1_set_affinity(struct irq_data *d,
 	u32 mask = BIT(d->hwirq % IRQS_PER_WORD);
 	unsigned int old_cpu = cpu_for_irq(intc, d);
 	unsigned int new_cpu;
-	struct cpumask valid;
 	unsigned long flags;
 	bool enabled;
 
-	if (!cpumask_and(&valid, &intc->cpumask, dest))
-		return -EINVAL;
-
-	new_cpu = cpumask_any_and(&valid, cpu_online_mask);
+	new_cpu = cpumask_first_and_and(&intc->cpumask, dest, cpu_online_mask);
 	if (new_cpu >= nr_cpu_ids)
 		return -EINVAL;
 
-- 
2.27.0



* [PATCH v2 3/7] irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 [PATCH v2 0/7] Remove on-stack cpumask var for irq subsystem Dawei Li
  2024-04-16  8:54 ` [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and() Dawei Li
  2024-04-16  8:54 ` [PATCH v2 2/7] irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack Dawei Li
@ 2024-04-16  8:54 ` Dawei Li
  2024-04-17 10:56   ` Marc Zyngier
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  2024-04-16  8:54 ` [PATCH v2 4/7] irqchip/loongson-eiointc: " Dawei Li
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Dawei Li @ 2024-04-16  8:54 UTC (permalink / raw)
  To: tglx, yury.norov, rafael
  Cc: akpm, maz, florian.fainelli, chenhuacai, jiaxun.yang, anup,
	palmer, samuel.holland, linux, daniel.lezcano, linux-kernel,
	Dawei Li

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Remove the on-stack cpumask variable and use cpumask_any_and() instead.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/irqchip/irq-gic-v3-its.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index fca888b36680..20f954211c61 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3826,9 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
 				bool force)
 {
 	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
-	struct cpumask common, *table_mask;
+	unsigned int from, cpu = nr_cpu_ids;
+	struct cpumask *table_mask;
 	unsigned long flags;
-	int from, cpu;
 
 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as
@@ -3850,10 +3850,15 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	 * If we are offered another CPU in the same GICv4.1 ITS
 	 * affinity, pick this one. Otherwise, any CPU will do.
 	 */
-	if (table_mask && cpumask_and(&common, mask_val, table_mask))
-		cpu = cpumask_test_cpu(from, &common) ? from : cpumask_first(&common);
-	else
+	if (table_mask)
+		cpu = cpumask_any_and(mask_val, table_mask);
+	if (cpu < nr_cpu_ids) {
+		if (cpumask_test_cpu(from, mask_val) &&
+		    cpumask_test_cpu(from, table_mask))
+			cpu = from;
+	} else {
 		cpu = cpumask_first(mask_val);
+	}
 
 	if (from == cpu)
 		goto out;
-- 
2.27.0
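
The CPU selection in the second hunk above can be modelled in userspace as
sketched below. The assumptions are: masks fit in one unsigned long,
cpumask_any_and() is modelled as "first set bit of the intersection", a NULL
`table_mask` stands for "no GICv4.1 ITS affinity table", and all `_sketch`
names are hypothetical:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SK_WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* First set bit of val below nr_cpus, or nr_cpus if there is none. */
static unsigned int first_cpu_sketch(unsigned long val, unsigned int nr_cpus)
{
	unsigned int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (val & (1UL << cpu))
			return cpu;

	return nr_cpus;
}

static unsigned int pick_vpe_cpu_sketch(unsigned int from,
					unsigned long mask_val,
					const unsigned long *table_mask,
					unsigned int nr_cpus)
{
	unsigned int cpu = nr_cpus;

	assert(nr_cpus >= 1 && nr_cpus <= SK_WORD_BITS && from < nr_cpus);

	if (table_mask)
		cpu = first_cpu_sketch(mask_val & *table_mask, nr_cpus);

	if (cpu < nr_cpus) {
		/* Reaching here implies table_mask is non-NULL. */
		if ((mask_val & (1UL << from)) && (*table_mask & (1UL << from)))
			cpu = from;	/* stay put if the current CPU qualifies */
	} else {
		cpu = first_cpu_sketch(mask_val, nr_cpus);
	}

	return cpu;
}
```

Testing `from` against both masks separately replaces the membership test on
the old `common` temporary, which is why no intermediate mask is required.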




* [PATCH v2 4/7] irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 [PATCH v2 0/7] Remove on-stack cpumask var for irq subsystem Dawei Li
                   ` (2 preceding siblings ...)
  2024-04-16  8:54 ` [PATCH v2 3/7] irqchip/gic-v3-its: " Dawei Li
@ 2024-04-16  8:54 ` Dawei Li
  2024-04-16 18:01   ` Yury Norov
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  2024-04-16  8:54 ` [PATCH v2 5/7] irqchip/riscv-aplic-direct: " Dawei Li
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Dawei Li @ 2024-04-16  8:54 UTC (permalink / raw)
  To: tglx, yury.norov, rafael
  Cc: akpm, maz, florian.fainelli, chenhuacai, jiaxun.yang, anup,
	palmer, samuel.holland, linux, daniel.lezcano, linux-kernel,
	Dawei Li

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/irqchip/irq-loongson-eiointc.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-loongson-eiointc.c b/drivers/irqchip/irq-loongson-eiointc.c
index 4f5e6d21d77d..c7ddebf312ad 100644
--- a/drivers/irqchip/irq-loongson-eiointc.c
+++ b/drivers/irqchip/irq-loongson-eiointc.c
@@ -93,19 +93,15 @@ static int eiointc_set_irq_affinity(struct irq_data *d, const struct cpumask *af
 	unsigned int cpu;
 	unsigned long flags;
 	uint32_t vector, regaddr;
-	struct cpumask intersect_affinity;
 	struct eiointc_priv *priv = d->domain->host_data;
 
 	raw_spin_lock_irqsave(&affinity_lock, flags);
 
-	cpumask_and(&intersect_affinity, affinity, cpu_online_mask);
-	cpumask_and(&intersect_affinity, &intersect_affinity, &priv->cpuspan_map);
-
-	if (cpumask_empty(&intersect_affinity)) {
+	cpu = cpumask_first_and_and(&priv->cpuspan_map, affinity, cpu_online_mask);
+	if (cpu >= nr_cpu_ids) {
 		raw_spin_unlock_irqrestore(&affinity_lock, flags);
 		return -EINVAL;
 	}
-	cpu = cpumask_first(&intersect_affinity);
 
 	vector = d->hwirq;
 	regaddr = EIOINTC_REG_ENABLE + ((vector >> 5) << 2);
-- 
2.27.0



* [PATCH v2 5/7] irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 [PATCH v2 0/7] Remove on-stack cpumask var for irq subsystem Dawei Li
                   ` (3 preceding siblings ...)
  2024-04-16  8:54 ` [PATCH v2 4/7] irqchip/loongson-eiointc: " Dawei Li
@ 2024-04-16  8:54 ` Dawei Li
  2024-04-17 11:22   ` Anup Patel
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  2024-04-16  8:54 ` [PATCH v2 6/7] irqchip/sifive-plic: " Dawei Li
  2024-04-16  8:54 ` [PATCH v2 7/7] cpuidle: " Dawei Li
  6 siblings, 2 replies; 22+ messages in thread
From: Dawei Li @ 2024-04-16  8:54 UTC (permalink / raw)
  To: tglx, yury.norov, rafael
  Cc: akpm, maz, florian.fainelli, chenhuacai, jiaxun.yang, anup,
	palmer, samuel.holland, linux, daniel.lezcano, linux-kernel,
	Dawei Li

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/irqchip/irq-riscv-aplic-direct.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/irqchip/irq-riscv-aplic-direct.c b/drivers/irqchip/irq-riscv-aplic-direct.c
index 06bace9b7497..4a3ffe856d6c 100644
--- a/drivers/irqchip/irq-riscv-aplic-direct.c
+++ b/drivers/irqchip/irq-riscv-aplic-direct.c
@@ -54,15 +54,12 @@ static int aplic_direct_set_affinity(struct irq_data *d, const struct cpumask *m
 	struct aplic_direct *direct = container_of(priv, struct aplic_direct, priv);
 	struct aplic_idc *idc;
 	unsigned int cpu, val;
-	struct cpumask amask;
 	void __iomem *target;
 
-	cpumask_and(&amask, &direct->lmask, mask_val);
-
 	if (force)
-		cpu = cpumask_first(&amask);
+		cpu = cpumask_first_and(&direct->lmask, mask_val);
 	else
-		cpu = cpumask_any_and(&amask, cpu_online_mask);
+		cpu = cpumask_first_and_and(&direct->lmask, mask_val, cpu_online_mask);
 
 	if (cpu >= nr_cpu_ids)
 		return -EINVAL;
-- 
2.27.0
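
The two branches of the converted code differ only in whether the online mask
takes part in the intersection. A userspace sketch of that choice, assuming
single-word masks and hypothetical `_sketch` names:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SK_WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* First set bit of val below nr_cpus, or nr_cpus if there is none. */
static unsigned int first_cpu_sketch(unsigned long val, unsigned int nr_cpus)
{
	unsigned int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (val & (1UL << cpu))
			return cpu;

	return nr_cpus;
}

/*
 * force:  first CPU in lmask & mask_val, online or not.
 * !force: first CPU in lmask & mask_val & online.
 * A result >= nr_cpus means "no usable CPU" (the driver returns -EINVAL).
 */
static unsigned int pick_target_sketch(unsigned long lmask,
				       unsigned long mask_val,
				       unsigned long online,
				       bool force,
				       unsigned int nr_cpus)
{
	assert(nr_cpus >= 1 && nr_cpus <= SK_WORD_BITS);

	if (force)
		return first_cpu_sketch(lmask & mask_val, nr_cpus);

	return first_cpu_sketch(lmask & mask_val & online, nr_cpus);
}
```

For example, with lmask 0xF, mask_val 0x6 and online 0xC, the forced path
selects CPU 1 while the non-forced path selects CPU 2.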



* [PATCH v2 6/7] irqchip/sifive-plic: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 [PATCH v2 0/7] Remove on-stack cpumask var for irq subsystem Dawei Li
                   ` (4 preceding siblings ...)
  2024-04-16  8:54 ` [PATCH v2 5/7] irqchip/riscv-aplic-direct: " Dawei Li
@ 2024-04-16  8:54 ` Dawei Li
  2024-04-17 11:21   ` Anup Patel
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  2024-04-16  8:54 ` [PATCH v2 7/7] cpuidle: " Dawei Li
  6 siblings, 2 replies; 22+ messages in thread
From: Dawei Li @ 2024-04-16  8:54 UTC (permalink / raw)
  To: tglx, yury.norov, rafael
  Cc: akpm, maz, florian.fainelli, chenhuacai, jiaxun.yang, anup,
	palmer, samuel.holland, linux, daniel.lezcano, linux-kernel,
	Dawei Li

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/irqchip/irq-sifive-plic.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index f3d4cb9e34f7..8fb183ced1e7 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -164,15 +164,12 @@ static int plic_set_affinity(struct irq_data *d,
 			     const struct cpumask *mask_val, bool force)
 {
 	unsigned int cpu;
-	struct cpumask amask;
 	struct plic_priv *priv = irq_data_get_irq_chip_data(d);
 
-	cpumask_and(&amask, &priv->lmask, mask_val);
-
 	if (force)
-		cpu = cpumask_first(&amask);
+		cpu = cpumask_first_and(&priv->lmask, mask_val);
 	else
-		cpu = cpumask_any_and(&amask, cpu_online_mask);
+		cpu = cpumask_first_and_and(&priv->lmask, mask_val, cpu_online_mask);
 
 	if (cpu >= nr_cpu_ids)
 		return -EINVAL;
-- 
2.27.0



* [PATCH v2 7/7] cpuidle: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 [PATCH v2 0/7] Remove on-stack cpumask var for irq subsystem Dawei Li
                   ` (5 preceding siblings ...)
  2024-04-16  8:54 ` [PATCH v2 6/7] irqchip/sifive-plic: " Dawei Li
@ 2024-04-16  8:54 ` Dawei Li
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  6 siblings, 1 reply; 22+ messages in thread
From: Dawei Li @ 2024-04-16  8:54 UTC (permalink / raw)
  To: tglx, yury.norov, rafael
  Cc: akpm, maz, florian.fainelli, chenhuacai, jiaxun.yang, anup,
	palmer, samuel.holland, linux, daniel.lezcano, linux-kernel,
	Dawei Li

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() and cpumask_weight_and() to avoid the need
for a temporary cpumask on the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/cpuidle/coupled.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/cpuidle/coupled.c b/drivers/cpuidle/coupled.c
index 9acde71558d5..bb8761c8a42e 100644
--- a/drivers/cpuidle/coupled.c
+++ b/drivers/cpuidle/coupled.c
@@ -439,13 +439,8 @@ static int cpuidle_coupled_clear_pokes(int cpu)
 
 static bool cpuidle_coupled_any_pokes_pending(struct cpuidle_coupled *coupled)
 {
-	cpumask_t cpus;
-	int ret;
-
-	cpumask_and(&cpus, cpu_online_mask, &coupled->coupled_cpus);
-	ret = cpumask_and(&cpus, &cpuidle_coupled_poke_pending, &cpus);
-
-	return ret;
+	return cpumask_first_and_and(cpu_online_mask, &coupled->coupled_cpus,
+				     &cpuidle_coupled_poke_pending) < nr_cpu_ids;
 }
 
 /**
@@ -626,9 +621,7 @@ int cpuidle_enter_state_coupled(struct cpuidle_device *dev,
 
 static void cpuidle_coupled_update_online_cpus(struct cpuidle_coupled *coupled)
 {
-	cpumask_t cpus;
-	cpumask_and(&cpus, cpu_online_mask, &coupled->coupled_cpus);
-	coupled->online_count = cpumask_weight(&cpus);
+	coupled->online_count = cpumask_weight_and(cpu_online_mask, &coupled->coupled_cpus);
 }
 
 /**
-- 
2.27.0
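
Both conversions above reduce to "compute a property of an intersection
without storing the intersection". A userspace sketch, assuming single-word
masks and hypothetical `_sketch` names, with a population count standing in
for cpumask_weight_and() and a non-zero test standing in for the
cpumask_first_and_and() < nr_cpu_ids check:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SK_WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Mask with bits [nr_cpus-1:0] set; valid for 1 <= nr_cpus <= SK_WORD_BITS. */
static unsigned long low_mask_sketch(unsigned int nr_cpus)
{
	assert(nr_cpus >= 1 && nr_cpus <= SK_WORD_BITS);
	return ~0UL >> (SK_WORD_BITS - nr_cpus);
}

/* Number of CPUs set in both a and b. */
static unsigned int weight_and_sketch(unsigned long a, unsigned long b,
				      unsigned int nr_cpus)
{
	unsigned long val = a & b & low_mask_sketch(nr_cpus);
	unsigned int count = 0;

	while (val) {
		val &= val - 1;	/* clear the lowest set bit */
		count++;
	}

	return count;
}

/* True if any online, coupled CPU still has a poke pending. */
static bool any_pokes_pending_sketch(unsigned long online,
				     unsigned long coupled,
				     unsigned long poke_pending,
				     unsigned int nr_cpus)
{
	return (online & coupled & poke_pending & low_mask_sketch(nr_cpus)) != 0;
}
```

In both functions the intersection lives only in a single word of scratch
state, which is the property the series relies on to drop the cpumask_t
temporaries.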



* Re: [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and()
  2024-04-16  8:54 ` [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and() Dawei Li
@ 2024-04-16 17:45   ` Yury Norov
  2024-04-16 17:49     ` Yury Norov
  2024-04-24 20:04   ` [tip: irq/core] cpumask: Introduce cpumask_first_and_and() tip-bot2 for Dawei Li
  1 sibling, 1 reply; 22+ messages in thread
From: Yury Norov @ 2024-04-16 17:45 UTC (permalink / raw)
  To: Dawei Li
  Cc: tglx, rafael, akpm, maz, florian.fainelli, chenhuacai,
	jiaxun.yang, anup, palmer, samuel.holland, linux, daniel.lezcano,
	linux-kernel

On Tue, Apr 16, 2024 at 04:54:48PM +0800, Dawei Li wrote:
> Introduce cpumask_first_and_and() to get intersection between 3 cpumasks,
> free of any intermediate cpumask variable. Instead, cpumask_first_and_and()
> works in-place with all inputs and produce desired output directly.
> 
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

Acked-by: Yury Norov <yury.norov@gmail.com>



* Re: [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and()
  2024-04-16 17:45   ` Yury Norov
@ 2024-04-16 17:49     ` Yury Norov
  2024-04-17  1:35       ` Dawei Li
  0 siblings, 1 reply; 22+ messages in thread
From: Yury Norov @ 2024-04-16 17:49 UTC (permalink / raw)
  To: Dawei Li
  Cc: tglx, rafael, akpm, maz, florian.fainelli, chenhuacai,
	jiaxun.yang, anup, palmer, samuel.holland, linux, daniel.lezcano,
	linux-kernel

On Tue, Apr 16, 2024 at 10:45:54AM -0700, Yury Norov wrote:
> On Tue, Apr 16, 2024 at 04:54:48PM +0800, Dawei Li wrote:
> > Introduce cpumask_first_and_and() to get intersection between 3 cpumasks,
> > free of any intermediate cpumask variable. Instead, cpumask_first_and_and()
> > works in-place with all inputs and produce desired output directly.

Still there: s/produce/produces

But whatever. Also, I think this patch would better go with the rest
of the series, right?

> > 
> > Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
> 
> Acked-by: Yury Norov <yury.norov@gmail.com>
> 


* Re: [PATCH v2 4/7] irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 4/7] irqchip/loongson-eiointc: " Dawei Li
@ 2024-04-16 18:01   ` Yury Norov
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  1 sibling, 0 replies; 22+ messages in thread
From: Yury Norov @ 2024-04-16 18:01 UTC (permalink / raw)
  To: Dawei Li
  Cc: tglx, rafael, akpm, maz, florian.fainelli, chenhuacai,
	jiaxun.yang, anup, palmer, samuel.holland, linux, daniel.lezcano,
	linux-kernel

On Tue, Apr 16, 2024 at 04:54:51PM +0800, Dawei Li wrote:
> In general it's preferable to avoid placing cpumasks on the stack, as
> for large values of NR_CPUS these can consume significant amounts of
> stack space and make stack overflows more likely.
> 
> Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
> the stack.
> 
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
> ---
>  drivers/irqchip/irq-loongson-eiointc.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-loongson-eiointc.c b/drivers/irqchip/irq-loongson-eiointc.c
> index 4f5e6d21d77d..c7ddebf312ad 100644
> --- a/drivers/irqchip/irq-loongson-eiointc.c
> +++ b/drivers/irqchip/irq-loongson-eiointc.c
> @@ -93,19 +93,15 @@ static int eiointc_set_irq_affinity(struct irq_data *d, const struct cpumask *af
>  	unsigned int cpu;
>  	unsigned long flags;
>  	uint32_t vector, regaddr;
> -	struct cpumask intersect_affinity;
>  	struct eiointc_priv *priv = d->domain->host_data;
>  
>  	raw_spin_lock_irqsave(&affinity_lock, flags);
>  
> -	cpumask_and(&intersect_affinity, affinity, cpu_online_mask);
> -	cpumask_and(&intersect_affinity, &intersect_affinity, &priv->cpuspan_map);
> -
> -	if (cpumask_empty(&intersect_affinity)) {

This was unneeded because cpumask_and() returns true if there are set
bits.

For the series:

Reviewed-by: Yury Norov <yury.norov@gmail.com>

> +	cpu = cpumask_first_and_and(&priv->cpuspan_map, affinity, cpu_online_mask);
> +	if (cpu >= nr_cpu_ids) {
>  		raw_spin_unlock_irqrestore(&affinity_lock, flags);
>  		return -EINVAL;
>  	}
> -	cpu = cpumask_first(&intersect_affinity);
>  
>  	vector = d->hwirq;
>  	regaddr = EIOINTC_REG_ENABLE + ((vector >> 5) << 2);
> -- 
> 2.27.0

^ permalink raw reply	[flat|nested] 22+ messages in thread
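
The review comment above notes that the separate emptiness check was redundant, because the intersection helper already reports whether any bit is set. A minimal userspace sketch of that point follows. It models a cpumask as a single 64-bit word, so `mask_and()`, `pick_old()`, `pick_new()`, `check_same()` and `NBITS` are hypothetical stand-ins rather than kernel API, and `__builtin_ctzll()` assumes GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NBITS 64u /* hypothetical CPU count for this model */

/* Model of an "and" helper: stores the intersection and reports
 * whether it contains at least one set bit. */
static bool mask_and(uint64_t *dst, uint64_t a, uint64_t b)
{
	*dst = a & b;
	return *dst != 0;
}

/* Old flow: two intersections into a temporary, then take the first bit.
 * The return value of the second mask_and() already says "empty or not",
 * so no separate emptiness test is required. Returns NBITS when empty. */
static unsigned int pick_old(uint64_t affinity, uint64_t online, uint64_t span)
{
	uint64_t tmp;

	mask_and(&tmp, affinity, online);
	if (!mask_and(&tmp, tmp, span))
		return NBITS;
	return (unsigned int)__builtin_ctzll(tmp);
}

/* New flow: intersect all three inputs and find the first bit in one step. */
static unsigned int pick_new(uint64_t span, uint64_t affinity, uint64_t online)
{
	uint64_t val = span & affinity & online;

	return val ? (unsigned int)__builtin_ctzll(val) : NBITS;
}

/* Both flows are expected to agree for any input. */
static void check_same(uint64_t affinity, uint64_t online, uint64_t span)
{
	assert(pick_old(affinity, online, span) == pick_new(span, affinity, online));
}
```

In this one-word model the temporary is trivially cheap; the saving in the real driver comes from not needing a full NR_CPUS-bit temporary on the stack.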

* Re: [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and()
  2024-04-16 17:49     ` Yury Norov
@ 2024-04-17  1:35       ` Dawei Li
  0 siblings, 0 replies; 22+ messages in thread
From: Dawei Li @ 2024-04-17  1:35 UTC (permalink / raw)
  To: Yury Norov
  Cc: tglx, rafael, akpm, maz, florian.fainelli, chenhuacai,
	jiaxun.yang, anup, palmer, samuel.holland, linux, daniel.lezcano,
	linux-kernel

Hi Yury,

Thanks for the review.

On Tue, Apr 16, 2024 at 10:49:46AM -0700, Yury Norov wrote:
> On Tue, Apr 16, 2024 at 10:45:54AM -0700, Yury Norov wrote:
> > On Tue, Apr 16, 2024 at 04:54:48PM +0800, Dawei Li wrote:
> > > Introduce cpumask_first_and_and() to get intersection between 3 cpumasks,
> > > free of any intermediate cpumask variable. Instead, cpumask_first_and_and()
> > > works in-place with all inputs and produce desired output directly.
> 
> Still there: s/produce/produces

Oops, sorry about that. I will respin as v3 if needed.

> 
> But whatever. Also, I think this patch would better go with the rest
> of the series, right?

I suppose so, this series should be applied as a whole.

> 
> > > 
> > > Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

> > 
> > Acked-by: Yury Norov <yury.norov@gmail.com>

Thanks!

    Dawei
> > 
> > > ---
> > >  include/linux/cpumask.h | 17 +++++++++++++++++
> > >  include/linux/find.h    | 27 +++++++++++++++++++++++++++
> > >  lib/find_bit.c          | 12 ++++++++++++
> > >  3 files changed, 56 insertions(+)
> > > 
> > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > > index 1c29947db848..c46f9e9e1d66 100644
> > > --- a/include/linux/cpumask.h
> > > +++ b/include/linux/cpumask.h
> > > @@ -187,6 +187,23 @@ unsigned int cpumask_first_and(const struct cpumask *srcp1, const struct cpumask
> > >  	return find_first_and_bit(cpumask_bits(srcp1), cpumask_bits(srcp2), small_cpumask_bits);
> > >  }
> > >  
> > > +/**
> > > + * cpumask_first_and_and - return the first cpu from *srcp1 & *srcp2 & *srcp3
> > > + * @srcp1: the first input
> > > + * @srcp2: the second input
> > > + * @srcp3: the third input
> > > + *
> > > + * Return: >= nr_cpu_ids if no cpus set in all.
> > > + */
> > > +static inline
> > > +unsigned int cpumask_first_and_and(const struct cpumask *srcp1,
> > > +				   const struct cpumask *srcp2,
> > > +				   const struct cpumask *srcp3)
> > > +{
> > > +	return find_first_and_and_bit(cpumask_bits(srcp1), cpumask_bits(srcp2),
> > > +				      cpumask_bits(srcp3), small_cpumask_bits);
> > > +}
> > > +
> > >  /**
> > >   * cpumask_last - get the last CPU in a cpumask
> > >   * @srcp:	- the cpumask pointer
> > > diff --git a/include/linux/find.h b/include/linux/find.h
> > > index c69598e383c1..28ec5a03393a 100644
> > > --- a/include/linux/find.h
> > > +++ b/include/linux/find.h
> > > @@ -29,6 +29,8 @@ unsigned long __find_nth_and_andnot_bit(const unsigned long *addr1, const unsign
> > >  					unsigned long n);
> > >  extern unsigned long _find_first_and_bit(const unsigned long *addr1,
> > >  					 const unsigned long *addr2, unsigned long size);
> > > +unsigned long _find_first_and_and_bit(const unsigned long *addr1, const unsigned long *addr2,
> > > +				      const unsigned long *addr3, unsigned long size);
> > >  extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
> > >  extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
> > >  
> > > @@ -345,6 +347,31 @@ unsigned long find_first_and_bit(const unsigned long *addr1,
> > >  }
> > >  #endif
> > >  
> > > +/**
> > > + * find_first_and_and_bit - find the first set bit in 3 memory regions
> > > + * @addr1: The first address to base the search on
> > > + * @addr2: The second address to base the search on
> > > + * @addr3: The third address to base the search on
> > > + * @size: The bitmap size in bits
> > > + *
> > > + * Returns the bit number for the first set bit
> > > + * If no bits are set, returns @size.
> > > + */
> > > +static inline
> > > +unsigned long find_first_and_and_bit(const unsigned long *addr1,
> > > +				     const unsigned long *addr2,
> > > +				     const unsigned long *addr3,
> > > +				     unsigned long size)
> > > +{
> > > +	if (small_const_nbits(size)) {
> > > +		unsigned long val = *addr1 & *addr2 & *addr3 & GENMASK(size - 1, 0);
> > > +
> > > +		return val ? __ffs(val) : size;
> > > +	}
> > > +
> > > +	return _find_first_and_and_bit(addr1, addr2, addr3, size);
> > > +}
> > > +
> > >  #ifndef find_first_zero_bit
> > >  /**
> > >   * find_first_zero_bit - find the first cleared bit in a memory region
> > > diff --git a/lib/find_bit.c b/lib/find_bit.c
> > > index 32f99e9a670e..dacadd904250 100644
> > > --- a/lib/find_bit.c
> > > +++ b/lib/find_bit.c
> > > @@ -116,6 +116,18 @@ unsigned long _find_first_and_bit(const unsigned long *addr1,
> > >  EXPORT_SYMBOL(_find_first_and_bit);
> > >  #endif
> > >  
> > > +/*
> > > + * Find the first set bit in three memory regions.
> > > + */
> > > +unsigned long _find_first_and_and_bit(const unsigned long *addr1,
> > > +				      const unsigned long *addr2,
> > > +				      const unsigned long *addr3,
> > > +				      unsigned long size)
> > > +{
> > > +	return FIND_FIRST_BIT(addr1[idx] & addr2[idx] & addr3[idx], /* nop */, size);
> > > +}
> > > +EXPORT_SYMBOL(_find_first_and_and_bit);
> > > +
> > >  #ifndef find_first_zero_bit
> > >  /*
> > >   * Find the first cleared bit in a memory region.
> > > -- 
> > > 2.27.0
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread
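
The inline find_first_and_and_bit() quoted above has a fast path for bitmaps that fit in one word: it ANDs the three words, masks off bits at or above `size`, and returns the lowest set bit or `size`. A userspace sketch of that branch, under stated assumptions: `first_and_and_small()`, `lowest_set_bit()` and `WORD_BITS` are illustrative names, `__builtin_ctzl()` (GCC/Clang) stands in for the kernel's `__ffs()`, and the shift expression is only meant to produce the same value as `GENMASK(size - 1, 0)`.

```c
#include <assert.h>
#include <limits.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Stand-in for __ffs(): index of the lowest set bit.
 * Must not be called with word == 0. */
static unsigned long lowest_set_bit(unsigned long word)
{
	return (unsigned long)__builtin_ctzl(word);
}

/* Model of the single-word branch: size must be between 1 and WORD_BITS. */
static unsigned long first_and_and_small(unsigned long a, unsigned long b,
					 unsigned long c, unsigned long size)
{
	unsigned long mask, val;

	assert(size >= 1 && size <= WORD_BITS);

	/* Low `size` bits set, same value as GENMASK(size - 1, 0). */
	mask = ~0UL >> (WORD_BITS - size);
	val = a & b & c & mask;

	/* Return `size` when no bit is common to all three inputs. */
	return val ? lowest_set_bit(val) : size;
}
```

For example, `first_and_and_small(0x0c, 0x0a, 0x18, 8)` intersects to `0x08`, so bit 3 is reported; a common bit at or above `size` is masked off and treated as "not found".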

* Re: [PATCH v2 3/7] irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 3/7] irqchip/gic-v3-its: " Dawei Li
@ 2024-04-17 10:56   ` Marc Zyngier
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  1 sibling, 0 replies; 22+ messages in thread
From: Marc Zyngier @ 2024-04-17 10:56 UTC (permalink / raw)
  To: Dawei Li
  Cc: tglx, yury.norov, rafael, akpm, florian.fainelli, chenhuacai,
	jiaxun.yang, anup, palmer, samuel.holland, linux, daniel.lezcano,
	linux-kernel

On Tue, 16 Apr 2024 09:54:50 +0100,
Dawei Li <dawei.li@shingroup.cn> wrote:
> 
> In general it's preferable to avoid placing cpumasks on the stack, as
> for large values of NR_CPUS these can consume significant amounts of
> stack space and make stack overflows more likely.
> 
> Remove cpumask var on stack and use cpumask_any_and() to address it.
> 
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

Reviewed-by: Marc Zyngier <maz@kernel.org>

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 6/7] irqchip/sifive-plic: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 6/7] irqchip/sifive-plic: " Dawei Li
@ 2024-04-17 11:21   ` Anup Patel
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  1 sibling, 0 replies; 22+ messages in thread
From: Anup Patel @ 2024-04-17 11:21 UTC (permalink / raw)
  To: Dawei Li
  Cc: tglx, yury.norov, rafael, akpm, maz, florian.fainelli,
	chenhuacai, jiaxun.yang, palmer, samuel.holland, linux,
	daniel.lezcano, linux-kernel

On Tue, Apr 16, 2024 at 2:26 PM Dawei Li <dawei.li@shingroup.cn> wrote:
>
> In general it's preferable to avoid placing cpumasks on the stack, as
> for large values of NR_CPUS these can consume significant amounts of
> stack space and make stack overflows more likely.
>
> Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
> the stack.
>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

LGTM.

Reviewed-by: Anup Patel <anup@brainfault.org>

Regards,
Anup

> ---
>  drivers/irqchip/irq-sifive-plic.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
> index f3d4cb9e34f7..8fb183ced1e7 100644
> --- a/drivers/irqchip/irq-sifive-plic.c
> +++ b/drivers/irqchip/irq-sifive-plic.c
> @@ -164,15 +164,12 @@ static int plic_set_affinity(struct irq_data *d,
>                              const struct cpumask *mask_val, bool force)
>  {
>         unsigned int cpu;
> -       struct cpumask amask;
>         struct plic_priv *priv = irq_data_get_irq_chip_data(d);
>
> -       cpumask_and(&amask, &priv->lmask, mask_val);
> -
>         if (force)
> -               cpu = cpumask_first(&amask);
> +               cpu = cpumask_first_and(&priv->lmask, mask_val);
>         else
> -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> +               cpu = cpumask_first_and_and(&priv->lmask, mask_val, cpu_online_mask);
>
>         if (cpu >= nr_cpu_ids)
>                 return -EINVAL;
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 22+ messages in thread
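
The patch above keeps two selection paths: with `force` set, the online mask is ignored and the first CPU in `lmask & mask_val` is used; otherwise the online mask is intersected as well. A small userspace sketch of that decision, assuming a 32-bit mask model. `select_cpu()`, `first_bit32()` and `NBITS` are hypothetical names, `-1` stands in for `-EINVAL`, and `__builtin_ctz()` assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stdint.h>

#define NBITS 32u /* hypothetical nr_cpu_ids for this model */

/* First set bit of a 32-bit mask, or NBITS when the mask is empty. */
static unsigned int first_bit32(uint32_t m)
{
	return m ? (unsigned int)__builtin_ctz(m) : NBITS;
}

/* Returns the chosen CPU, or -1 when no CPU qualifies. */
static int select_cpu(uint32_t lmask, uint32_t requested, uint32_t online,
		      bool force)
{
	unsigned int cpu;

	if (force)
		cpu = first_bit32(lmask & requested);          /* two-way intersection */
	else
		cpu = first_bit32(lmask & requested & online); /* three-way intersection */

	return cpu >= NBITS ? -1 : (int)cpu;
}
```

With `lmask = 0x0F`, `requested = 0x06` and `online = 0x0C`, the forced path picks CPU 1 while the non-forced path picks CPU 2, since CPU 1 is not online in this example.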

* Re: [PATCH v2 5/7] irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 5/7] irqchip/riscv-aplic-direct: " Dawei Li
@ 2024-04-17 11:22   ` Anup Patel
  2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
  1 sibling, 0 replies; 22+ messages in thread
From: Anup Patel @ 2024-04-17 11:22 UTC (permalink / raw)
  To: Dawei Li
  Cc: tglx, yury.norov, rafael, akpm, maz, florian.fainelli,
	chenhuacai, jiaxun.yang, palmer, samuel.holland, linux,
	daniel.lezcano, linux-kernel

On Tue, Apr 16, 2024 at 2:26 PM Dawei Li <dawei.li@shingroup.cn> wrote:
>
> In general it's preferable to avoid placing cpumasks on the stack, as
> for large values of NR_CPUS these can consume significant amounts of
> stack space and make stack overflows more likely.
>
> Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
> the stack.
>
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>

LGTM.

Reviewed-by: Anup Patel <anup@brainfault.org>

Regards,
Anup

> ---
>  drivers/irqchip/irq-riscv-aplic-direct.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/irqchip/irq-riscv-aplic-direct.c b/drivers/irqchip/irq-riscv-aplic-direct.c
> index 06bace9b7497..4a3ffe856d6c 100644
> --- a/drivers/irqchip/irq-riscv-aplic-direct.c
> +++ b/drivers/irqchip/irq-riscv-aplic-direct.c
> @@ -54,15 +54,12 @@ static int aplic_direct_set_affinity(struct irq_data *d, const struct cpumask *m
>         struct aplic_direct *direct = container_of(priv, struct aplic_direct, priv);
>         struct aplic_idc *idc;
>         unsigned int cpu, val;
> -       struct cpumask amask;
>         void __iomem *target;
>
> -       cpumask_and(&amask, &direct->lmask, mask_val);
> -
>         if (force)
> -               cpu = cpumask_first(&amask);
> +               cpu = cpumask_first_and(&direct->lmask, mask_val);
>         else
> -               cpu = cpumask_any_and(&amask, cpu_online_mask);
> +               cpu = cpumask_first_and_and(&direct->lmask, mask_val, cpu_online_mask);
>
>         if (cpu >= nr_cpu_ids)
>                 return -EINVAL;
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 22+ messages in thread
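
The commit messages in this series argue that an on-stack cpumask can consume a significant amount of stack for large NR_CPUS. Since a cpumask is a bitmap of NR_CPUS bits stored in unsigned long words, the cost can be estimated with simple arithmetic. The helper below is a rough sketch: `cpumask_bytes()` and `BITS_PER_WORD` are illustrative names, and the NR_CPUS values used with it are example inputs, not figures taken from this thread.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Bytes occupied by a bitmap of nr_cpus bits, rounded up to whole words. */
static size_t cpumask_bytes(size_t nr_cpus)
{
	size_t words = (nr_cpus + BITS_PER_WORD - 1) / BITS_PER_WORD;

	return words * sizeof(unsigned long);
}
```

Under these assumptions, a configuration with 8192 possible CPUS needs 1024 bytes per mask, so a single temporary mask in a deeply nested call chain takes a noticeable share of a kernel stack that is typically only a few pages.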

* [tip: irq/core] irqchip/sifive-plic: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 6/7] irqchip/sifive-plic: " Dawei Li
  2024-04-17 11:21   ` Anup Patel
@ 2024-04-24 20:04   ` tip-bot2 for Dawei Li
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Dawei Li @ 2024-04-24 20:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dawei Li, Thomas Gleixner, Anup Patel, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     a7fb69ffd7ce438a259b2f9fbcebc62f5caf2d4f
Gitweb:        https://git.kernel.org/tip/a7fb69ffd7ce438a259b2f9fbcebc62f5caf2d4f
Author:        Dawei Li <dawei.li@shingroup.cn>
AuthorDate:    Tue, 16 Apr 2024 16:54:53 +08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 24 Apr 2024 21:23:49 +02:00

irqchip/sifive-plic: Avoid explicit cpumask allocation on stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240416085454.3547175-7-dawei.li@shingroup.cn

---
 drivers/irqchip/irq-sifive-plic.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index f3d4cb9..8fb183c 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -164,15 +164,12 @@ static int plic_set_affinity(struct irq_data *d,
 			     const struct cpumask *mask_val, bool force)
 {
 	unsigned int cpu;
-	struct cpumask amask;
 	struct plic_priv *priv = irq_data_get_irq_chip_data(d);
 
-	cpumask_and(&amask, &priv->lmask, mask_val);
-
 	if (force)
-		cpu = cpumask_first(&amask);
+		cpu = cpumask_first_and(&priv->lmask, mask_val);
 	else
-		cpu = cpumask_any_and(&amask, cpu_online_mask);
+		cpu = cpumask_first_and_and(&priv->lmask, mask_val, cpu_online_mask);
 
 	if (cpu >= nr_cpu_ids)
 		return -EINVAL;

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: irq/core] cpuidle: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 7/7] cpuidle: " Dawei Li
@ 2024-04-24 20:04   ` tip-bot2 for Dawei Li
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Dawei Li @ 2024-04-24 20:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dawei Li, Thomas Gleixner, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     6f28c4a852fab8bd759a383149dfd30511477249
Gitweb:        https://git.kernel.org/tip/6f28c4a852fab8bd759a383149dfd30511477249
Author:        Dawei Li <dawei.li@shingroup.cn>
AuthorDate:    Tue, 16 Apr 2024 16:54:54 +08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 24 Apr 2024 21:23:49 +02:00

cpuidle: Avoid explicit cpumask allocation on stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() and cpumask_weight_and() to avoid the need
for a temporary cpumask on the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240416085454.3547175-8-dawei.li@shingroup.cn

---
 drivers/cpuidle/coupled.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/cpuidle/coupled.c b/drivers/cpuidle/coupled.c
index 9acde71..bb8761c 100644
--- a/drivers/cpuidle/coupled.c
+++ b/drivers/cpuidle/coupled.c
@@ -439,13 +439,8 @@ static int cpuidle_coupled_clear_pokes(int cpu)
 
 static bool cpuidle_coupled_any_pokes_pending(struct cpuidle_coupled *coupled)
 {
-	cpumask_t cpus;
-	int ret;
-
-	cpumask_and(&cpus, cpu_online_mask, &coupled->coupled_cpus);
-	ret = cpumask_and(&cpus, &cpuidle_coupled_poke_pending, &cpus);
-
-	return ret;
+	return cpumask_first_and_and(cpu_online_mask, &coupled->coupled_cpus,
+				     &cpuidle_coupled_poke_pending) < nr_cpu_ids;
 }
 
 /**
@@ -626,9 +621,7 @@ out:
 
 static void cpuidle_coupled_update_online_cpus(struct cpuidle_coupled *coupled)
 {
-	cpumask_t cpus;
-	cpumask_and(&cpus, cpu_online_mask, &coupled->coupled_cpus);
-	coupled->online_count = cpumask_weight(&cpus);
+	coupled->online_count = cpumask_weight_and(cpu_online_mask, &coupled->coupled_cpus);
 }
 
 /**

^ permalink raw reply related	[flat|nested] 22+ messages in thread
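
The cpuidle change above replaces "intersect into a temporary, then count or test" with helpers that work on the inputs directly. A userspace sketch of both ideas over word arrays follows. `weight_and()` and `any_common3()` are hypothetical model functions (the former only mirrors the idea behind cpumask_weight_and()), they assume that bits beyond the logical size are zero in the last word, and `__builtin_popcountl()` assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stddef.h>

/* Count the bits set in both a and b without materializing a & b. */
static unsigned int weight_and(const unsigned long *a, const unsigned long *b,
			       size_t nwords)
{
	unsigned int count = 0;

	for (size_t i = 0; i < nwords; i++)
		count += (unsigned int)__builtin_popcountl(a[i] & b[i]);

	return count;
}

/* True if at least one bit is set in all three inputs. This matches the
 * "first common bit < nr_cpu_ids" test used in the patch. */
static bool any_common3(const unsigned long *a, const unsigned long *b,
			const unsigned long *c, size_t nwords)
{
	for (size_t i = 0; i < nwords; i++)
		if (a[i] & b[i] & c[i])
			return true;

	return false;
}
```

Both loops touch each input word once and keep only a scalar in a local variable, which is the property the series relies on to avoid a full-size temporary mask.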

* [tip: irq/core] irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 5/7] irqchip/riscv-aplic-direct: " Dawei Li
  2024-04-17 11:22   ` Anup Patel
@ 2024-04-24 20:04   ` tip-bot2 for Dawei Li
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Dawei Li @ 2024-04-24 20:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dawei Li, Thomas Gleixner, Anup Patel, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     5d650d1eba876717888a0951ed873ef0f1d8cf61
Gitweb:        https://git.kernel.org/tip/5d650d1eba876717888a0951ed873ef0f1d8cf61
Author:        Dawei Li <dawei.li@shingroup.cn>
AuthorDate:    Tue, 16 Apr 2024 16:54:52 +08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 24 Apr 2024 21:23:49 +02:00

irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240416085454.3547175-6-dawei.li@shingroup.cn

---
 drivers/irqchip/irq-riscv-aplic-direct.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/irqchip/irq-riscv-aplic-direct.c b/drivers/irqchip/irq-riscv-aplic-direct.c
index 06bace9..4a3ffe8 100644
--- a/drivers/irqchip/irq-riscv-aplic-direct.c
+++ b/drivers/irqchip/irq-riscv-aplic-direct.c
@@ -54,15 +54,12 @@ static int aplic_direct_set_affinity(struct irq_data *d, const struct cpumask *m
 	struct aplic_direct *direct = container_of(priv, struct aplic_direct, priv);
 	struct aplic_idc *idc;
 	unsigned int cpu, val;
-	struct cpumask amask;
 	void __iomem *target;
 
-	cpumask_and(&amask, &direct->lmask, mask_val);
-
 	if (force)
-		cpu = cpumask_first(&amask);
+		cpu = cpumask_first_and(&direct->lmask, mask_val);
 	else
-		cpu = cpumask_any_and(&amask, cpu_online_mask);
+		cpu = cpumask_first_and_and(&direct->lmask, mask_val, cpu_online_mask);
 
 	if (cpu >= nr_cpu_ids)
 		return -EINVAL;

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: irq/core] irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 4/7] irqchip/loongson-eiointc: " Dawei Li
  2024-04-16 18:01   ` Yury Norov
@ 2024-04-24 20:04   ` tip-bot2 for Dawei Li
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Dawei Li @ 2024-04-24 20:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dawei Li, Thomas Gleixner, Yury Norov, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     2bc32db5a262cc34753cb4208b2d3043d1cd81ae
Gitweb:        https://git.kernel.org/tip/2bc32db5a262cc34753cb4208b2d3043d1cd81ae
Author:        Dawei Li <dawei.li@shingroup.cn>
AuthorDate:    Tue, 16 Apr 2024 16:54:51 +08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 24 Apr 2024 21:23:49 +02:00

irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20240416085454.3547175-5-dawei.li@shingroup.cn

---
 drivers/irqchip/irq-loongson-eiointc.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/irqchip/irq-loongson-eiointc.c b/drivers/irqchip/irq-loongson-eiointc.c
index 4f5e6d2..c7ddebf 100644
--- a/drivers/irqchip/irq-loongson-eiointc.c
+++ b/drivers/irqchip/irq-loongson-eiointc.c
@@ -93,19 +93,15 @@ static int eiointc_set_irq_affinity(struct irq_data *d, const struct cpumask *af
 	unsigned int cpu;
 	unsigned long flags;
 	uint32_t vector, regaddr;
-	struct cpumask intersect_affinity;
 	struct eiointc_priv *priv = d->domain->host_data;
 
 	raw_spin_lock_irqsave(&affinity_lock, flags);
 
-	cpumask_and(&intersect_affinity, affinity, cpu_online_mask);
-	cpumask_and(&intersect_affinity, &intersect_affinity, &priv->cpuspan_map);
-
-	if (cpumask_empty(&intersect_affinity)) {
+	cpu = cpumask_first_and_and(&priv->cpuspan_map, affinity, cpu_online_mask);
+	if (cpu >= nr_cpu_ids) {
 		raw_spin_unlock_irqrestore(&affinity_lock, flags);
 		return -EINVAL;
 	}
-	cpu = cpumask_first(&intersect_affinity);
 
 	vector = d->hwirq;
 	regaddr = EIOINTC_REG_ENABLE + ((vector >> 5) << 2);

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: irq/core] irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 3/7] irqchip/gic-v3-its: " Dawei Li
  2024-04-17 10:56   ` Marc Zyngier
@ 2024-04-24 20:04   ` tip-bot2 for Dawei Li
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Dawei Li @ 2024-04-24 20:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dawei Li, Thomas Gleixner, Marc Zyngier, x86, linux-kernel

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     fcb8af4cbcd122e33ceeadd347b8866d32035af7
Gitweb:        https://git.kernel.org/tip/fcb8af4cbcd122e33ceeadd347b8866d32035af7
Author:        Dawei Li <dawei.li@shingroup.cn>
AuthorDate:    Tue, 16 Apr 2024 16:54:50 +08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 24 Apr 2024 21:23:49 +02:00

irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Remove cpumask var on stack and use cpumask_any_and() to address it.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240416085454.3547175-4-dawei.li@shingroup.cn

---
 drivers/irqchip/irq-gic-v3-its.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index fca888b..20f9542 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3826,9 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
 				bool force)
 {
 	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
-	struct cpumask common, *table_mask;
+	unsigned int from, cpu = nr_cpu_ids;
+	struct cpumask *table_mask;
 	unsigned long flags;
-	int from, cpu;
 
 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as
@@ -3850,10 +3850,15 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	 * If we are offered another CPU in the same GICv4.1 ITS
 	 * affinity, pick this one. Otherwise, any CPU will do.
 	 */
-	if (table_mask && cpumask_and(&common, mask_val, table_mask))
-		cpu = cpumask_test_cpu(from, &common) ? from : cpumask_first(&common);
-	else
+	if (table_mask)
+		cpu = cpumask_any_and(mask_val, table_mask);
+	if (cpu < nr_cpu_ids) {
+		if (cpumask_test_cpu(from, mask_val) &&
+		    cpumask_test_cpu(from, table_mask))
+			cpu = from;
+	} else {
 		cpu = cpumask_first(mask_val);
+	}
 
 	if (from == cpu)
 		goto out;

^ permalink raw reply related	[flat|nested] 22+ messages in thread
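
The gic-v3-its hunk above is the one case in the series that does not use the three-way helper. Its selection order is: look for a CPU common to `mask_val` and `table_mask`; if one exists, prefer the current CPU `from` when it is in both masks; if none exists (or there is no `table_mask`), fall back to the first CPU in `mask_val`. A userspace sketch of that control flow, assuming 32-bit masks. `pick_target()`, `test_bit32()`, `first_bit32()` and `NBITS` are hypothetical names, and "any" is modelled as "first".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NBITS 32u /* hypothetical nr_cpu_ids for this model */

static bool test_bit32(unsigned int cpu, uint32_t mask)
{
	return cpu < NBITS && ((mask >> cpu) & 1u);
}

/* First set bit of a 32-bit mask, or NBITS when the mask is empty. */
static unsigned int first_bit32(uint32_t m)
{
	return m ? (unsigned int)__builtin_ctz(m) : NBITS;
}

/* table_mask == NULL models the case where no table mask is available. */
static unsigned int pick_target(unsigned int from, uint32_t mask_val,
				const uint32_t *table_mask)
{
	unsigned int cpu = NBITS;

	if (table_mask)
		cpu = first_bit32(mask_val & *table_mask);

	if (cpu < NBITS) {
		/* A common CPU exists, so table_mask is non-NULL here. */
		if (test_bit32(from, mask_val) && test_bit32(from, *table_mask))
			cpu = from;
	} else {
		cpu = first_bit32(mask_val);
	}

	return cpu;
}
```

Initializing `cpu` to the "not found" value is what lets the NULL `table_mask` case share the fallback branch, mirroring the `cpu = nr_cpu_ids` initializer in the patch.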

* [tip: irq/core] irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack
  2024-04-16  8:54 ` [PATCH v2 2/7] irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack Dawei Li
@ 2024-04-24 20:04   ` tip-bot2 for Dawei Li
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Dawei Li @ 2024-04-24 20:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Dawei Li, Thomas Gleixner, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     6a9a52f74e3b82ff3f5398810c1b23ad497e2df5
Gitweb:        https://git.kernel.org/tip/6a9a52f74e3b82ff3f5398810c1b23ad497e2df5
Author:        Dawei Li <dawei.li@shingroup.cn>
AuthorDate:    Tue, 16 Apr 2024 16:54:49 +08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 24 Apr 2024 21:23:49 +02:00

irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack

In general it's preferable to avoid placing cpumasks on the stack, as
for large values of NR_CPUS these can consume significant amounts of
stack space and make stack overflows more likely.

Use cpumask_first_and_and() to avoid the need for a temporary cpumask on
the stack.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240416085454.3547175-3-dawei.li@shingroup.cn

---
 drivers/irqchip/irq-bcm6345-l1.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/irqchip/irq-bcm6345-l1.c b/drivers/irqchip/irq-bcm6345-l1.c
index eb02d20..90daa27 100644
--- a/drivers/irqchip/irq-bcm6345-l1.c
+++ b/drivers/irqchip/irq-bcm6345-l1.c
@@ -192,14 +192,10 @@ static int bcm6345_l1_set_affinity(struct irq_data *d,
 	u32 mask = BIT(d->hwirq % IRQS_PER_WORD);
 	unsigned int old_cpu = cpu_for_irq(intc, d);
 	unsigned int new_cpu;
-	struct cpumask valid;
 	unsigned long flags;
 	bool enabled;
 
-	if (!cpumask_and(&valid, &intc->cpumask, dest))
-		return -EINVAL;
-
-	new_cpu = cpumask_any_and(&valid, cpu_online_mask);
+	new_cpu = cpumask_first_and_and(&intc->cpumask, dest, cpu_online_mask);
 	if (new_cpu >= nr_cpu_ids)
 		return -EINVAL;
 

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: irq/core] cpumask: Introduce cpumask_first_and_and()
  2024-04-16  8:54 ` [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and() Dawei Li
  2024-04-16 17:45   ` Yury Norov
@ 2024-04-24 20:04   ` tip-bot2 for Dawei Li
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Dawei Li @ 2024-04-24 20:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dawei Li, Thomas Gleixner, Yury Norov, x86, linux-kernel, maz

The following commit has been merged into the irq/core branch of tip:

Commit-ID:     cdc66553c4130735f0a2db943a5259e54ff1597a
Gitweb:        https://git.kernel.org/tip/cdc66553c4130735f0a2db943a5259e54ff1597a
Author:        Dawei Li <dawei.li@shingroup.cn>
AuthorDate:    Tue, 16 Apr 2024 16:54:48 +08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 24 Apr 2024 21:23:49 +02:00

cpumask: Introduce cpumask_first_and_and()

Introduce cpumask_first_and_and() to get intersection between 3 cpumasks,
free of any intermediate cpumask variable. Instead, cpumask_first_and_and()
works in-place with all inputs and produces desired output directly.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20240416085454.3547175-2-dawei.li@shingroup.cn

---
 include/linux/cpumask.h | 17 +++++++++++++++++
 include/linux/find.h    | 27 +++++++++++++++++++++++++++
 lib/find_bit.c          | 12 ++++++++++++
 3 files changed, 56 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 1c29947..c46f9e9 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -188,6 +188,23 @@ unsigned int cpumask_first_and(const struct cpumask *srcp1, const struct cpumask
 }
 
 /**
+ * cpumask_first_and_and - return the first cpu from *srcp1 & *srcp2 & *srcp3
+ * @srcp1: the first input
+ * @srcp2: the second input
+ * @srcp3: the third input
+ *
+ * Return: >= nr_cpu_ids if no cpus set in all.
+ */
+static inline
+unsigned int cpumask_first_and_and(const struct cpumask *srcp1,
+				   const struct cpumask *srcp2,
+				   const struct cpumask *srcp3)
+{
+	return find_first_and_and_bit(cpumask_bits(srcp1), cpumask_bits(srcp2),
+				      cpumask_bits(srcp3), small_cpumask_bits);
+}
+
+/**
  * cpumask_last - get the last CPU in a cpumask
  * @srcp:	- the cpumask pointer
  *
diff --git a/include/linux/find.h b/include/linux/find.h
index c69598e..28ec5a0 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -29,6 +29,8 @@ unsigned long __find_nth_and_andnot_bit(const unsigned long *addr1, const unsign
 					unsigned long n);
 extern unsigned long _find_first_and_bit(const unsigned long *addr1,
 					 const unsigned long *addr2, unsigned long size);
+unsigned long _find_first_and_and_bit(const unsigned long *addr1, const unsigned long *addr2,
+				      const unsigned long *addr3, unsigned long size);
 extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
 extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
 
@@ -345,6 +347,31 @@ unsigned long find_first_and_bit(const unsigned long *addr1,
 }
 #endif
 
+/**
+ * find_first_and_and_bit - find the first set bit in 3 memory regions
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @addr3: The third address to base the search on
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number for the first set bit
+ * If no bits are set, returns @size.
+ */
+static inline
+unsigned long find_first_and_and_bit(const unsigned long *addr1,
+				     const unsigned long *addr2,
+				     const unsigned long *addr3,
+				     unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr1 & *addr2 & *addr3 & GENMASK(size - 1, 0);
+
+		return val ? __ffs(val) : size;
+	}
+
+	return _find_first_and_and_bit(addr1, addr2, addr3, size);
+}
+
 #ifndef find_first_zero_bit
 /**
  * find_first_zero_bit - find the first cleared bit in a memory region
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 32f99e9..dacadd9 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -116,6 +116,18 @@ unsigned long _find_first_and_bit(const unsigned long *addr1,
 EXPORT_SYMBOL(_find_first_and_bit);
 #endif
 
+/*
+ * Find the first set bit in three memory regions.
+ */
+unsigned long _find_first_and_and_bit(const unsigned long *addr1,
+				      const unsigned long *addr2,
+				      const unsigned long *addr3,
+				      unsigned long size)
+{
+	return FIND_FIRST_BIT(addr1[idx] & addr2[idx] & addr3[idx], /* nop */, size);
+}
+EXPORT_SYMBOL(_find_first_and_and_bit);
+
 #ifndef find_first_zero_bit
 /*
  * Find the first cleared bit in a memory region.

^ permalink raw reply related	[flat|nested] 22+ messages in thread
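
For bitmaps larger than one word, the out-of-line _find_first_and_and_bit() in the commit above walks the three arrays word by word and stops at the first word whose three-way AND is non-zero. The following userspace sketch models that loop as a plain function. It is not the kernel's FIND_FIRST_BIT macro: `first_and_and()` and `WORD_BITS` are illustrative names and `__builtin_ctzl()` assumes GCC or Clang.

```c
#include <limits.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Returns the index of the first bit set in a, b and c, or size if none.
 * Each array must hold at least enough words to cover size bits. */
static unsigned long first_and_and(const unsigned long *a,
				   const unsigned long *b,
				   const unsigned long *c,
				   unsigned long size)
{
	unsigned long idx, val, bit;

	for (idx = 0; idx * WORD_BITS < size; idx++) {
		val = a[idx] & b[idx] & c[idx];
		if (val) {
			bit = idx * WORD_BITS + (unsigned long)__builtin_ctzl(val);
			/* A hit in the padding of the last word counts as "not found". */
			return bit < size ? bit : size;
		}
	}

	return size;
}
```

The clamp to `size` matters when `size` is not a multiple of the word width, because the last word may carry stray bits beyond the logical end of the bitmap.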

end of thread, other threads:[~2024-04-24 20:04 UTC | newest]

Thread overview: 22+ messages
2024-04-16  8:54 [PATCH v2 0/7] Remove on-stack cpumask var for irq subsystem Dawei Li
2024-04-16  8:54 ` [PATCH v2 1/7] cpumask: introduce cpumask_first_and_and() Dawei Li
2024-04-16 17:45   ` Yury Norov
2024-04-16 17:49     ` Yury Norov
2024-04-17  1:35       ` Dawei Li
2024-04-24 20:04   ` [tip: irq/core] cpumask: Introduce cpumask_first_and_and() tip-bot2 for Dawei Li
2024-04-16  8:54 ` [PATCH v2 2/7] irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack Dawei Li
2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
2024-04-16  8:54 ` [PATCH v2 3/7] irqchip/gic-v3-its: " Dawei Li
2024-04-17 10:56   ` Marc Zyngier
2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
2024-04-16  8:54 ` [PATCH v2 4/7] irqchip/loongson-eiointc: " Dawei Li
2024-04-16 18:01   ` Yury Norov
2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
2024-04-16  8:54 ` [PATCH v2 5/7] irqchip/riscv-aplic-direct: " Dawei Li
2024-04-17 11:22   ` Anup Patel
2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
2024-04-16  8:54 ` [PATCH v2 6/7] irqchip/sifive-plic: " Dawei Li
2024-04-17 11:21   ` Anup Patel
2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
2024-04-16  8:54 ` [PATCH v2 7/7] cpuidle: " Dawei Li
2024-04-24 20:04   ` [tip: irq/core] " tip-bot2 for Dawei Li
