* Support 1 of N SPI interrupt for interrupt distribution
@ 2020-11-27 14:15 Hanks Chen
  2020-11-27 14:15 ` [PATCH v1 1/3] irqchip/gic: enable irq target all Hanks Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Hanks Chen @ 2020-11-27 14:15 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper, Marc Zyngier, Matthias Brugger,
	Russell King, Catalin Marinas, Will Deacon, Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-mediatek, CC Hwang,
	Kuohong Wang, Loda Chou, Hanks Chen

The GICv3 supports 1 of N selection of SPI interrupts.
When GICD_IROUTERn.Interrupt_Routing_Mode == 1, the GIC selects
an appropriate core for each SPI.

In practice, dispatching interrupts via the hardware Interrupt Control
Unit is more efficient than irqbalance's software logic, and there is
no need to implement a software controller to pick the target CPU on
each platform.

In order to reduce interrupt latency, all interrupts are targeted to
every online CPU by default via 1 of N selection of SPI interrupts.
That is, SPI interrupts might be serviced simultaneously on different CPUs.
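
For illustration, a minimal sketch of the register-level difference,
using the existing GICD_IROUTER_SPI_MODE_ANY definition
(Interrupt_Routing_Mode == 1); the `spi` index and `cpu` here are
placeholders:

	/* 1:1 routing: pin one SPI to a single CPU (Routing_Mode == 0) */
	affinity = gic_mpidr_to_affinity(cpu_logical_map(cpu));
	gic_write_irouter(affinity, base + GICD_IROUTER + spi * 8);

	/* 1 of N routing: let the GIC pick a core (Routing_Mode == 1) */
	gic_write_irouter(GICD_IROUTER_SPI_MODE_ANY,
			  base + GICD_IROUTER + spi * 8);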

[default policy]
AS-IS (target to boot CPU)
       CPU0       CPU1       CPU2       CPU3
 29:     92          0          0          0     GICv3 141 Level   ttyS0

TO-BE (enable ARM_IRQ_TARGET_ALL)
       CPU0       CPU1       CPU2       CPU3
 29:     23         23         24         22     GICv3 141 Level   ttyS0


Hanks Chen (3):
  irqchip/gic: enable irq target all
  arm: disable irq on cpu shutdown flow
  arm64: disable irq on cpu shutdown flow

 arch/arm/kernel/smp.c              |  10 ++-
 arch/arm64/kernel/smp.c            |   9 ++-
 drivers/irqchip/Kconfig            |  12 ++++
 drivers/irqchip/irq-gic-v3.c       | 107 +++++++++++++++++++++--------
 include/linux/irqchip/arm-gic-v3.h |   1 +
 kernel/irq/cpuhotplug.c            |  22 ++++++
 kernel/irq/manage.c                |   7 ++
 7 files changed, 138 insertions(+), 30 deletions(-)

-- 
2.18.0



* [PATCH v1 1/3] irqchip/gic: enable irq target all
  2020-11-27 14:15 Support 1 of N SPI interrupt for interrupt distribution Hanks Chen
@ 2020-11-27 14:15 ` Hanks Chen
  2020-11-27 18:11   ` Marc Zyngier
  2020-11-27 14:15 ` [PATCH v1 2/3] arm: disable irq on cpu shutdown flow Hanks Chen
  2020-11-27 14:15 ` [PATCH v1 3/3] arm64: " Hanks Chen
  2 siblings, 1 reply; 10+ messages in thread
From: Hanks Chen @ 2020-11-27 14:15 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper, Marc Zyngier, Matthias Brugger,
	Russell King, Catalin Marinas, Will Deacon, Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-mediatek, CC Hwang,
	Kuohong Wang, Loda Chou, Hanks Chen

Add support for an interrupt distribution design for SMP systems.

With this feature enabled, SPI interrupts are routed to all the cores
rather than only the boot core, to achieve better load balancing of
interrupt handling.
That is, interrupts might be serviced simultaneously on different CPUs.

Signed-off-by: Hanks Chen <hanks.chen@mediatek.com>
---
 drivers/irqchip/Kconfig            |  12 ++++
 drivers/irqchip/irq-gic-v3.c       | 107 +++++++++++++++++++++--------
 include/linux/irqchip/arm-gic-v3.h |   1 +
 kernel/irq/cpuhotplug.c            |  22 ++++++
 kernel/irq/manage.c                |   7 ++
 5 files changed, 122 insertions(+), 27 deletions(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index c6098eee0c7c..c88ee7731e92 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -597,4 +597,16 @@ config MST_IRQ
 	help
 	  Support MStar Interrupt Controller.
 
+config ARM_IRQ_TARGET_ALL
+	bool "Distribute interrupts across processors on SMP system"
+	depends on SMP && ARM_GIC_V3
+	help
+	  Support for interrupt distribution design for
+	  SMP system solutions. With this feature enabled ,the
+	  SPI interrupts would be routed to all the cores rather
+	  than boot cpu to achieve better load balance of interrupt
+	  handling
+
+	  If you don't know what to do here, say N.
+
 endmenu
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 16fecc0febe8..62a878ce4681 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -381,6 +381,12 @@ static inline bool gic_supports_nmi(void)
 	       static_branch_likely(&supports_pseudo_nmis);
 }
 
+static inline bool gic_supports_1n(void)
+{
+	return (IS_ENABLED(CONFIG_ARM_IRQ_TARGET_ALL) &&
+		~(readl_relaxed(gic_data.dist_base + GICD_TYPER) & GICD_TYPER_No1N));
+}
+
 static int gic_irq_set_irqchip_state(struct irq_data *d,
 				     enum irqchip_irq_state which, bool val)
 {
@@ -716,6 +722,7 @@ static void __init gic_dist_init(void)
 {
 	unsigned int i;
 	u64 affinity;
+
 	void __iomem *base = gic_data.dist_base;
 	u32 val;
 
@@ -759,16 +766,27 @@ static void __init gic_dist_init(void)
 	/* Enable distributor with ARE, Group1 */
 	writel_relaxed(val, base + GICD_CTLR);
 
-	/*
-	 * Set all global interrupts to the boot CPU only. ARE must be
-	 * enabled.
-	 */
-	affinity = gic_mpidr_to_affinity(cpu_logical_map(smp_processor_id()));
-	for (i = 32; i < GIC_LINE_NR; i++)
-		gic_write_irouter(affinity, base + GICD_IROUTER + i * 8);
+	if (!gic_supports_1n()) {
+		/*
+		 * Set all global interrupts to the boot CPU only. ARE must be
+		 * enabled.
+		 */
+		affinity = gic_mpidr_to_affinity(cpu_logical_map(smp_processor_id()));
+		for (i = 32; i < GIC_LINE_NR; i++)
+			gic_write_irouter(affinity, base + GICD_IROUTER + i * 8);
 
-	for (i = 0; i < GIC_ESPI_NR; i++)
-		gic_write_irouter(affinity, base + GICD_IROUTERnE + i * 8);
+		for (i = 0; i < GIC_ESPI_NR; i++)
+			gic_write_irouter(affinity, base + GICD_IROUTERnE + i * 8);
+	} else {
+		/* default set target all for all SPI */
+		for (i = 32; i < GIC_LINE_NR; i++)
+			gic_write_irouter(GICD_IROUTER_SPI_MODE_ANY,
+					  base + GICD_IROUTER + i * 8);
+
+		for (i = 0; i < GIC_ESPI_NR; i++)
+			gic_write_irouter(GICD_IROUTER_SPI_MODE_ANY,
+					  base + GICD_IROUTERnE + i * 8);
+	}
 }
 
 static int gic_iterate_rdists(int (*fn)(struct redist_region *, void __iomem *))
@@ -1191,29 +1209,64 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
 	if (gic_irq_in_rdist(d))
 		return -EINVAL;
 
-	/* If interrupt was enabled, disable it first */
-	enabled = gic_peek_irq(d, GICD_ISENABLER);
-	if (enabled)
-		gic_mask_irq(d);
+	if (!gic_supports_1n()) {
+		/* If interrupt was enabled, disable it first */
+		enabled = gic_peek_irq(d, GICD_ISENABLER);
+		if (enabled)
+			gic_mask_irq(d);
 
-	offset = convert_offset_index(d, GICD_IROUTER, &index);
-	reg = gic_dist_base(d) + offset + (index * 8);
-	val = gic_mpidr_to_affinity(cpu_logical_map(cpu));
+		offset = convert_offset_index(d, GICD_IROUTER, &index);
+		reg = gic_dist_base(d) + offset + (index * 8);
+		val = gic_mpidr_to_affinity(cpu_logical_map(cpu));
 
-	gic_write_irouter(val, reg);
+		gic_write_irouter(val, reg);
 
-	/*
-	 * If the interrupt was enabled, enabled it again. Otherwise,
-	 * just wait for the distributor to have digested our changes.
-	 */
-	if (enabled)
-		gic_unmask_irq(d);
-	else
-		gic_dist_wait_for_rwp();
+		/*
+		 * If the interrupt was enabled, enabled it again. Otherwise,
+		 * just wait for the distributor to have digested our changes.
+		 */
+		if (enabled)
+			gic_unmask_irq(d);
+		else
+			gic_dist_wait_for_rwp();
+
+		irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+	} else {
+		/*
+		 * no need to update when:
+		 * input mask is equal to the current setting
+		 */
+		if (cpumask_equal(irq_data_get_affinity_mask(d), mask_val))
+			return IRQ_SET_MASK_OK_NOCOPY;
+
+		/* If interrupt was enabled, disable it first */
+		enabled = gic_peek_irq(d, GICD_ISENABLER);
+		if (enabled)
+			gic_mask_irq(d);
+
+		offset = convert_offset_index(d, GICD_IROUTER, &index);
+		reg = gic_dist_base(d) + offset + (index * 8);
 
-	irq_data_update_effective_affinity(d, cpumask_of(cpu));
+		/* GICv3 supports target is 1 or all */
+		if (cpumask_weight(mask_val) > 1)
+			val = GICD_IROUTER_SPI_MODE_ANY;
+		else
+			val = gic_mpidr_to_affinity(cpu_logical_map(cpu));
+
+		gic_write_irouter(val, reg);
+
+		/*
+		 * If the interrupt was enabled, enabled it again. Otherwise,
+		 * just wait for the distributor to have digested our changes.
+		 */
+		if (enabled)
+			gic_unmask_irq(d);
+		else
+			gic_dist_wait_for_rwp();
+	}
 
-	return IRQ_SET_MASK_OK_DONE;
+	return IRQ_SET_MASK_OK;
 }
 #else
 #define gic_set_affinity	NULL
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index f6d092fdb93d..c24336d506a3 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -80,6 +80,7 @@
 #define GICD_CTLR_ENABLE_SS_G0		(1U << 0)
 
 #define GICD_TYPER_RSS			(1U << 26)
+#define GICD_TYPER_No1N			(1U << 25)
 #define GICD_TYPER_LPIS			(1U << 17)
 #define GICD_TYPER_MBIS			(1U << 16)
 #define GICD_TYPER_ESPI			(1U << 8)
diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
index 02236b13b359..779512e44960 100644
--- a/kernel/irq/cpuhotplug.c
+++ b/kernel/irq/cpuhotplug.c
@@ -87,6 +87,18 @@ static bool migrate_one_irq(struct irq_desc *desc)
 		return false;
 	}
 
+#ifdef CONFIG_ARM_IRQ_TARGET_ALL
+	/*
+	 * No move required, if interrupt is 1 of N IRQ.
+	 * write current cpu_online_mask into affinity mask.
+	 */
+	if (cpumask_weight(desc->irq_common_data.affinity) > 1) {
+		cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);
+
+		return false;
+	}
+#endif
+
 	/*
 	 * Complete an eventually pending irq move cleanup. If this
 	 * interrupt was moved in hard irq context, then the vectors need
@@ -191,6 +203,16 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu)
 	struct irq_data *data = irq_desc_get_irq_data(desc);
 	const struct cpumask *affinity = irq_data_get_affinity_mask(data);
 
+#ifdef CONFIG_ARM_IRQ_TARGET_ALL
+	/*
+	 * No restore required, if interrupt is 1 of N IRQ.
+	 */
+	if (cpumask_weight(affinity) > 1) {
+		cpumask_set_cpu(cpu, irq_data_get_affinity_mask(data));
+		return;
+	}
+#endif
+
 	if (!irqd_affinity_is_managed(data) || !desc->action ||
 	    !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity))
 		return;
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index c460e0496006..770b97e326bd 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -270,7 +270,14 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	switch (ret) {
 	case IRQ_SET_MASK_OK:
 	case IRQ_SET_MASK_OK_DONE:
+#ifndef CONFIG_ARM_IRQ_TARGET_ALL
 		cpumask_copy(desc->irq_common_data.affinity, mask);
+#else
+		if (cpumask_weight(mask) > 1)
+			cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);
+		else
+			cpumask_copy(desc->irq_common_data.affinity, mask);
+#endif
 		fallthrough;
 	case IRQ_SET_MASK_OK_NOCOPY:
 		irq_validate_effective_affinity(data);
-- 
2.18.0



* [PATCH v1 2/3] arm: disable irq on cpu shutdown flow
  2020-11-27 14:15 Support 1 of N SPI interrupt for interrupt distribution Hanks Chen
  2020-11-27 14:15 ` [PATCH v1 1/3] irqchip/gic: enable irq target all Hanks Chen
@ 2020-11-27 14:15 ` Hanks Chen
  2020-11-27 14:15 ` [PATCH v1 3/3] arm64: " Hanks Chen
  2 siblings, 0 replies; 10+ messages in thread
From: Hanks Chen @ 2020-11-27 14:15 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper, Marc Zyngier, Matthias Brugger,
	Russell King, Catalin Marinas, Will Deacon, Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-mediatek, CC Hwang,
	Kuohong Wang, Loda Chou, Hanks Chen

Disable IRQs in the CPU shutdown flow to ensure interrupts can no
longer reach this CPU once its status is offline.

This avoids the following suspicious RCU usage splat:
[0:swapper/0]RCU used illegally from offline CPU! ...
[0:swapper/0]lockdep: [name:lockdep&]cpu_id = 0, cpu_is_offline = 1

Signed-off-by: Hanks Chen <hanks.chen@mediatek.com>
---
 arch/arm/kernel/smp.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 48099c6e1e4a..6b8f72490320 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -262,6 +262,12 @@ int __cpu_disable(void)
 	remove_cpu_topology(cpu);
 #endif
 
+	/*
+	 * we disable irq here to ensure interrupts
+	 * did not bother this cpu after status as offline.
+	 */
+	local_irq_disable();
+
 	/*
 	 * Take this CPU offline.  Once we clear this, we can't return,
 	 * and we must not schedule until we're ready to give up the cpu.
@@ -600,11 +606,11 @@ static void ipi_cpu_stop(unsigned int cpu)
 		raw_spin_unlock(&stop_lock);
 	}
 
-	set_cpu_online(cpu, false);
-
 	local_fiq_disable();
 	local_irq_disable();
 
+	set_cpu_online(cpu, false);
+
 	while (1) {
 		cpu_relax();
 		wfe();
-- 
2.18.0



* [PATCH v1 3/3] arm64: disable irq on cpu shutdown flow
  2020-11-27 14:15 Support 1 of N SPI interrupt for interrupt distribution Hanks Chen
  2020-11-27 14:15 ` [PATCH v1 1/3] irqchip/gic: enable irq target all Hanks Chen
  2020-11-27 14:15 ` [PATCH v1 2/3] arm: disable irq on cpu shutdown flow Hanks Chen
@ 2020-11-27 14:15 ` Hanks Chen
  2020-11-27 18:27   ` Marc Zyngier
  2 siblings, 1 reply; 10+ messages in thread
From: Hanks Chen @ 2020-11-27 14:15 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Cooper, Marc Zyngier, Matthias Brugger,
	Russell King, Catalin Marinas, Will Deacon, Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, linux-mediatek, CC Hwang,
	Kuohong Wang, Loda Chou, Hanks Chen

Disable IRQs in the CPU shutdown flow to ensure interrupts can no
longer reach this CPU once its status is offline.

This avoids the following suspicious RCU usage splat:
(0)[0:swapper/0]RCU used illegally from offline CPU! ...
(0)[0:swapper/0]lockdep: [name:lockdep&]cpu_id = 0, cpu_is_offline = 1

Signed-off-by: Hanks Chen <hanks.chen@mediatek.com>
---
 arch/arm64/kernel/smp.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 82e75fc2c903..27a6553fa86f 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -308,6 +308,12 @@ int __cpu_disable(void)
 	remove_cpu_topology(cpu);
 	numa_remove_cpu(cpu);
 
+	/*
+	 * we disable irq here to ensure interrupts
+	 * did not bother this cpu after status as offline.
+	 */
+	local_irq_disable();
+
 	/*
 	 * Take this CPU offline.  Once we clear this, we can't return,
 	 * and we must not schedule until we're ready to give up the cpu.
@@ -842,9 +848,10 @@ void arch_irq_work_raise(void)
 
 static void local_cpu_stop(void)
 {
+	local_daif_mask();
+
 	set_cpu_online(smp_processor_id(), false);
 
-	local_daif_mask();
 	sdei_mask_local_cpu();
 	cpu_park_loop();
 }
-- 
2.18.0



* Re: [PATCH v1 1/3] irqchip/gic: enable irq target all
  2020-11-27 14:15 ` [PATCH v1 1/3] irqchip/gic: enable irq target all Hanks Chen
@ 2020-11-27 18:11   ` Marc Zyngier
  2020-11-27 18:56     ` Catalin Marinas
  2020-12-01 13:54     ` Hanks Chen
  0 siblings, 2 replies; 10+ messages in thread
From: Marc Zyngier @ 2020-11-27 18:11 UTC (permalink / raw)
  To: Hanks Chen
  Cc: Thomas Gleixner, Matthias Brugger, Russell King, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel,
	linux-mediatek, CC Hwang, Kuohong Wang, Loda Chou

On 2020-11-27 14:15, Hanks Chen wrote:
> Support for interrupt distribution design for SMP system solutions.

As far as I know, we have been supporting interrupt distribution on
ARM SMP systems pretty well for the past... what... 15 years?
I'm sure Russell can dig out an ARM926 SMP system that even predates
that.

> With this feature enabled ,the SPI interrupts would be routed to
> all the cores rather than boot core to achieve better
> load balance of interrupt handling.

Please quantify this compared to the current distribution method.

> That is, interrupts might be serviced simultaneously on different CPUs.

They already can. What is new here? Or do you mean the *same* interrupt
being serviced by different CPUs *at the same time*? That'd be fun.

> Signed-off-by: Hanks Chen <hanks.chen@mediatek.com>
> ---
>  drivers/irqchip/Kconfig            |  12 ++++
>  drivers/irqchip/irq-gic-v3.c       | 107 +++++++++++++++++++++--------
>  include/linux/irqchip/arm-gic-v3.h |   1 +
>  kernel/irq/cpuhotplug.c            |  22 ++++++
>  kernel/irq/manage.c                |   7 ++
>  5 files changed, 122 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index c6098eee0c7c..c88ee7731e92 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -597,4 +597,16 @@ config MST_IRQ
>  	help
>  	  Support MStar Interrupt Controller.
> 
> +config ARM_IRQ_TARGET_ALL
> +	bool "Distribute interrupts across processors on SMP system"
> +	depends on SMP && ARM_GIC_V3
> +	help
> +	  Support for interrupt distribution design for
> +	  SMP system solutions. With this feature enabled ,the
> +	  SPI interrupts would be routed to all the cores rather
> +	  than boot cpu to achieve better load balance of interrupt
> +	  handling
> +
> +	  If you don't know what to do here, say N.

There is no way we start introducing conditional compilation for
architectural features. Either this works at all times, or it doesn't
exist.

> +
>  endmenu
> diff --git a/drivers/irqchip/irq-gic-v3.c 
> b/drivers/irqchip/irq-gic-v3.c
> index 16fecc0febe8..62a878ce4681 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -381,6 +381,12 @@ static inline bool gic_supports_nmi(void)
>  	       static_branch_likely(&supports_pseudo_nmis);
>  }
> 
> +static inline bool gic_supports_1n(void)
> +{
> +	return (IS_ENABLED(CONFIG_ARM_IRQ_TARGET_ALL) &&
> +		~(readl_relaxed(gic_data.dist_base + GICD_TYPER) & 
> GICD_TYPER_No1N));
> +}
> +
>  static int gic_irq_set_irqchip_state(struct irq_data *d,
>  				     enum irqchip_irq_state which, bool val)
>  {
> @@ -716,6 +722,7 @@ static void __init gic_dist_init(void)
>  {
>  	unsigned int i;
>  	u64 affinity;
> +

Spurious whitespace.

>  	void __iomem *base = gic_data.dist_base;
>  	u32 val;
> 
> @@ -759,16 +766,27 @@ static void __init gic_dist_init(void)
>  	/* Enable distributor with ARE, Group1 */
>  	writel_relaxed(val, base + GICD_CTLR);
> 
> -	/*
> -	 * Set all global interrupts to the boot CPU only. ARE must be
> -	 * enabled.
> -	 */
> -	affinity = 
> gic_mpidr_to_affinity(cpu_logical_map(smp_processor_id()));
> -	for (i = 32; i < GIC_LINE_NR; i++)
> -		gic_write_irouter(affinity, base + GICD_IROUTER + i * 8);
> +	if (!gic_supports_1n()) {
> +		/*
> +		 * Set all global interrupts to the boot CPU only. ARE must be
> +		 * enabled.
> +		 */
> +		affinity = 
> gic_mpidr_to_affinity(cpu_logical_map(smp_processor_id()));
> +		for (i = 32; i < GIC_LINE_NR; i++)
> +			gic_write_irouter(affinity, base + GICD_IROUTER + i * 8);
> 
> -	for (i = 0; i < GIC_ESPI_NR; i++)
> -		gic_write_irouter(affinity, base + GICD_IROUTERnE + i * 8);
> +		for (i = 0; i < GIC_ESPI_NR; i++)
> +			gic_write_irouter(affinity, base + GICD_IROUTERnE + i * 8);
> +	} else {
> +		/* default set target all for all SPI */
> +		for (i = 32; i < GIC_LINE_NR; i++)
> +			gic_write_irouter(GICD_IROUTER_SPI_MODE_ANY,
> +					  base + GICD_IROUTER + i * 8);
> +
> +		for (i = 0; i < GIC_ESPI_NR; i++)
> +			gic_write_irouter(GICD_IROUTER_SPI_MODE_ANY,
> +					  base + GICD_IROUTERnE + i * 8);

Why should we decide to distribute interrupts to all CPUs by default
when we don't even know where to route them?

> +	}
>  }
> 
>  static int gic_iterate_rdists(int (*fn)(struct redist_region *, void
> __iomem *))
> @@ -1191,29 +1209,64 @@ static int gic_set_affinity(struct irq_data
> *d, const struct cpumask *mask_val,
>  	if (gic_irq_in_rdist(d))
>  		return -EINVAL;
> 
> -	/* If interrupt was enabled, disable it first */
> -	enabled = gic_peek_irq(d, GICD_ISENABLER);
> -	if (enabled)
> -		gic_mask_irq(d);
> +	if (!gic_supports_1n()) {
> +		/* If interrupt was enabled, disable it first */
> +		enabled = gic_peek_irq(d, GICD_ISENABLER);
> +		if (enabled)
> +			gic_mask_irq(d);
> 
> -	offset = convert_offset_index(d, GICD_IROUTER, &index);
> -	reg = gic_dist_base(d) + offset + (index * 8);
> -	val = gic_mpidr_to_affinity(cpu_logical_map(cpu));
> +		offset = convert_offset_index(d, GICD_IROUTER, &index);
> +		reg = gic_dist_base(d) + offset + (index * 8);
> +		val = gic_mpidr_to_affinity(cpu_logical_map(cpu));
> 
> -	gic_write_irouter(val, reg);
> +		gic_write_irouter(val, reg);
> 
> -	/*
> -	 * If the interrupt was enabled, enabled it again. Otherwise,
> -	 * just wait for the distributor to have digested our changes.
> -	 */
> -	if (enabled)
> -		gic_unmask_irq(d);
> -	else
> -		gic_dist_wait_for_rwp();
> +		/*
> +		 * If the interrupt was enabled, enabled it again. Otherwise,
> +		 * just wait for the distributor to have digested our changes.
> +		 */
> +		if (enabled)
> +			gic_unmask_irq(d);
> +		else
> +			gic_dist_wait_for_rwp();
> +
> +		irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +
> +	} else {
> +		/*
> +		 * no need to update when:
> +		 * input mask is equal to the current setting
> +		 */
> +		if (cpumask_equal(irq_data_get_affinity_mask(d), mask_val))
> +			return IRQ_SET_MASK_OK_NOCOPY;
> +
> +		/* If interrupt was enabled, disable it first */
> +		enabled = gic_peek_irq(d, GICD_ISENABLER);
> +		if (enabled)
> +			gic_mask_irq(d);
> +
> +		offset = convert_offset_index(d, GICD_IROUTER, &index);
> +		reg = gic_dist_base(d) + offset + (index * 8);
> 
> -	irq_data_update_effective_affinity(d, cpumask_of(cpu));
> +		/* GICv3 supports target is 1 or all */
> +		if (cpumask_weight(mask_val) > 1)
> +			val = GICD_IROUTER_SPI_MODE_ANY;

There is a massive difference between targeting more than one CPU
and targeting all the CPUs. This breaks all existing drivers and
userspace that need to manage interrupt affinity.

> +		else
> +			val = gic_mpidr_to_affinity(cpu_logical_map(cpu));
> +
> +		gic_write_irouter(val, reg);
> +
> +		/*
> +		 * If the interrupt was enabled, enabled it again. Otherwise,
> +		 * just wait for the distributor to have digested our changes.
> +		 */
> +		if (enabled)
> +			gic_unmask_irq(d);
> +		else
> +			gic_dist_wait_for_rwp();

Why so much code duplication?

> +	}
> 
> -	return IRQ_SET_MASK_OK_DONE;
> +	return IRQ_SET_MASK_OK;
>  }
>  #else
>  #define gic_set_affinity	NULL
> diff --git a/include/linux/irqchip/arm-gic-v3.h
> b/include/linux/irqchip/arm-gic-v3.h
> index f6d092fdb93d..c24336d506a3 100644
> --- a/include/linux/irqchip/arm-gic-v3.h
> +++ b/include/linux/irqchip/arm-gic-v3.h
> @@ -80,6 +80,7 @@
>  #define GICD_CTLR_ENABLE_SS_G0		(1U << 0)
> 
>  #define GICD_TYPER_RSS			(1U << 26)
> +#define GICD_TYPER_No1N			(1U << 25)
>  #define GICD_TYPER_LPIS			(1U << 17)
>  #define GICD_TYPER_MBIS			(1U << 16)
>  #define GICD_TYPER_ESPI			(1U << 8)
> diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
> index 02236b13b359..779512e44960 100644
> --- a/kernel/irq/cpuhotplug.c
> +++ b/kernel/irq/cpuhotplug.c
> @@ -87,6 +87,18 @@ static bool migrate_one_irq(struct irq_desc *desc)
>  		return false;
>  	}
> 
> +#ifdef CONFIG_ARM_IRQ_TARGET_ALL

No way.

> +	/*
> +	 * No move required, if interrupt is 1 of N IRQ.
> +	 * write current cpu_online_mask into affinity mask.
> +	 */
> +	if (cpumask_weight(desc->irq_common_data.affinity) > 1) {
> +		cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);

Again, this is totally bogus.

> +
> +		return false;
> +	}
> +#endif
> +
>  	/*
>  	 * Complete an eventually pending irq move cleanup. If this
>  	 * interrupt was moved in hard irq context, then the vectors need
> @@ -191,6 +203,16 @@ static void irq_restore_affinity_of_irq(struct
> irq_desc *desc, unsigned int cpu)
>  	struct irq_data *data = irq_desc_get_irq_data(desc);
>  	const struct cpumask *affinity = irq_data_get_affinity_mask(data);
> 
> +#ifdef CONFIG_ARM_IRQ_TARGET_ALL
> +	/*
> +	 * No restore required, if interrupt is 1 of N IRQ.
> +	 */
> +	if (cpumask_weight(affinity) > 1) {
> +		cpumask_set_cpu(cpu, irq_data_get_affinity_mask(data));
> +		return;
> +	}

Same thing.

> +#endif
> +
>  	if (!irqd_affinity_is_managed(data) || !desc->action ||
>  	    !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity))
>  		return;
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index c460e0496006..770b97e326bd 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -270,7 +270,14 @@ int irq_do_set_affinity(struct irq_data *data,
> const struct cpumask *mask,
>  	switch (ret) {
>  	case IRQ_SET_MASK_OK:
>  	case IRQ_SET_MASK_OK_DONE:
> +#ifndef CONFIG_ARM_IRQ_TARGET_ALL
>  		cpumask_copy(desc->irq_common_data.affinity, mask);
> +#else
> +		if (cpumask_weight(mask) > 1)
> +			cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);
> +		else
> +			cpumask_copy(desc->irq_common_data.affinity, mask);

And again.

> +#endif
>  		fallthrough;
>  	case IRQ_SET_MASK_OK_NOCOPY:
>  		irq_validate_effective_affinity(data);

To sum it up:

- You claim that the current interrupt distribution model is either
  non-functional or inefficient. I expect you to demonstrate this.

- You claim that this brings some kind of improvement: I expect you to
  describe workloads and numbers that demonstrate the improvements.
  1:N distribution has repeatedly shown to have much worse behaviour
  than 1:1 distribution, so I really wonder what's new here.

- You break existing APIs where a driver or userspace can legitimately
  decide to route an interrupt to 2 CPUs out of 256 if it decides to.
  That's not acceptable.

- You pollute the core code with hacks that should never be there. If
  the current behaviour is a problem, please state what the problem is.

- You don't factor in the requirements of the pseudo-NMI masking,
  meaning the IRI doesn't know about the current PMR and will not be
  able to efficiently dispatch interrupts.

- I don't see anything that configures the participating nodes to the
  1:N distribution, meaning it currently works by chance.

- How does it work with CPU hotplug, where we actively move the
  interrupts away from a CPU going down? With this scheme, the
  interrupts keep being delivered to the wrong CPU, resulting in lost
  interrupts if they were edge triggered.

As it stands, I can't imagine this code making it in the kernel.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH v1 3/3] arm64: disable irq on cpu shutdown flow
  2020-11-27 14:15 ` [PATCH v1 3/3] arm64: " Hanks Chen
@ 2020-11-27 18:27   ` Marc Zyngier
  0 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2020-11-27 18:27 UTC (permalink / raw)
  To: Hanks Chen
  Cc: Thomas Gleixner, Matthias Brugger, Russell King, Catalin Marinas,
	Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel,
	linux-mediatek, CC Hwang, Kuohong Wang, Loda Chou

On 2020-11-27 14:15, Hanks Chen wrote:
> Disable irq on cpu shutdown flow to ensure interrupts
> did not bother this cpu after status as offline.
> 
> To avoid suspicious RCU usage
> (0)[0:swapper/0]RCU used illegally from offline CPU! ...
> (0)[0:swapper/0]lockdep: [name:lockdep&]cpu_id = 0, cpu_is_offline = 1

This needs to be explained *a lot* more.

My hunch is that because a CPU going offline can still receive
interrupts thanks to your interrupt broadcast hack, you break some of
the core expectations, and RCU shouts at you.

If that's indeed the case, I don't think the architecture code needs
fixing (or at least, not for that).

> 
> Signed-off-by: Hanks Chen <hanks.chen@mediatek.com>
> ---
>  arch/arm64/kernel/smp.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 82e75fc2c903..27a6553fa86f 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -308,6 +308,12 @@ int __cpu_disable(void)
>  	remove_cpu_topology(cpu);
>  	numa_remove_cpu(cpu);
> 
> +	/*
> +	 * we disable irq here to ensure interrupts
> +	 * did not bother this cpu after status as offline.
> +	 */
> +	local_irq_disable();
> +
>  	/*
>  	 * Take this CPU offline.  Once we clear this, we can't return,
>  	 * and we must not schedule until we're ready to give up the cpu.

Conveniently, the code that takes care of migrating the interrupts is
just below this comment.  Which strongly suggests that the interrupt
migration is broken by your earlier patch.

> @@ -842,9 +848,10 @@ void arch_irq_work_raise(void)
> 
>  static void local_cpu_stop(void)
>  {
> +	local_daif_mask();
> +
>  	set_cpu_online(smp_processor_id(), false);
> 
> -	local_daif_mask();
>  	sdei_mask_local_cpu();
>  	cpu_park_loop();
>  }

What problem are you addressing here?

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH v1 1/3] irqchip/gic: enable irq target all
  2020-11-27 18:11   ` Marc Zyngier
@ 2020-11-27 18:56     ` Catalin Marinas
  2020-11-27 19:43       ` Marc Zyngier
  2020-12-01 13:54     ` Hanks Chen
  1 sibling, 1 reply; 10+ messages in thread
From: Catalin Marinas @ 2020-11-27 18:56 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Hanks Chen, Thomas Gleixner, Matthias Brugger, Russell King,
	Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel,
	linux-mediatek, CC Hwang, Kuohong Wang, Loda Chou

On Fri, Nov 27, 2020 at 06:11:01PM +0000, Marc Zyngier wrote:
> On 2020-11-27 14:15, Hanks Chen wrote:
> > Support for interrupt distribution design for SMP system solutions.
> 
> As far as I know, we have been supporting interrupt distribution on
> ARM SMP systems pretty well for the past... what... 15 years?
> I'm sure Russell can dig out an ARM926 SMP system that even predates
> that.
> 
> > With this feature enabled ,the SPI interrupts would be routed to
> > all the cores rather than boot core to achieve better
> > load balance of interrupt handling.
> 
> Please quantify this compared to the current distribution method.
> 
> > That is, interrupts might be serviced simultaneously on different CPUs.
> 
> They already can. What is new here? Or do you mean the *same* interrupt
> being serviced by different CPUs *at the same time*? That'd be fun.

IIRC (it's been many years since I looked at the GIC), more than one CPU
is woken and if they all read the INTACK, only one of them gets the
actual IRQ number, the others see a spurious interrupt. I thought we
decided that's not an efficient way to handle interrupts, so a software
irqbalancer is better.
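
For illustration, a minimal sketch of that acknowledge race, using
register names from the kernel's GICv2 driver (`gicc_base` is a
placeholder):

	/* every woken CPU races to acknowledge the interrupt */
	u32 irqstat = readl_relaxed(gicc_base + GIC_CPU_INTACK);
	u32 irqnr = irqstat & GICC_IAR_INT_ID_MASK;

	if (irqnr >= 1020)	/* 1023: spurious, another CPU won */
		return;
	/* ... otherwise handle irqnr ... */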

Has anything changed since then?

I'm also concerned that in a big.LITTLE system, you may see the big CPUs
taking the interrupts all the time, which is nice for energy efficiency.

-- 
Catalin


* Re: [PATCH v1 1/3] irqchip/gic: enable irq target all
  2020-11-27 18:56     ` Catalin Marinas
@ 2020-11-27 19:43       ` Marc Zyngier
  0 siblings, 0 replies; 10+ messages in thread
From: Marc Zyngier @ 2020-11-27 19:43 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Hanks Chen, Thomas Gleixner, Matthias Brugger, Russell King,
	Will Deacon, Mark Rutland, linux-arm-kernel, linux-kernel,
	linux-mediatek, CC Hwang, Kuohong Wang, Loda Chou

On 2020-11-27 18:56, Catalin Marinas wrote:
> On Fri, Nov 27, 2020 at 06:11:01PM +0000, Marc Zyngier wrote:
>> On 2020-11-27 14:15, Hanks Chen wrote:
>> > Support for interrupt distribution design for SMP system solutions.
>> 
>> As far as I know, we have been supporting interrupt distribution on
>> ARM SMP systems pretty well for the past... what... 15 years?
>> I'm sure Russell can dig out an ARM926 SMP system that even predates
>> that.
>> 
>> > With this feature enabled ,the SPI interrupts would be routed to
>> > all the cores rather than boot core to achieve better
>> > load balance of interrupt handling.
>> 
>> Please quantify this compared to the current distribution method.
>> 
>> > That is, interrupts might be serviced simultaneously on different CPUs.
>> 
>> They already can. What is new here? Or do you mean the *same* 
>> interrupt
>> being serviced by different CPUs *at the same time*? That'd be fun.
> 
> IIRC (it's been many years since I looked at the GIC), more than one 
> CPU
> is woken and if they all read the INTACK, only one of them gets the
> actual IRQ number, the others see a spurious interrupt. I thought we
> decided that's not an efficient way to handle interrupts, so a software
> irqbalancer is better.
> 
> Has anything changed since then?

What you describe is mostly the GICv1/v2 behaviour.

GICv3 *can* be slightly better in that respect, as long as you force
the synchronization from the CPU interface back to the redistributor
via PMHE and a DSB SY on each priority change.

Which means that whatever you gain from not interrupting the other
CPUs, you waste by forcing synchronization system wide.
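
For illustration, a minimal sketch of that synchronization, assuming
ICC_CTLR_EL1.PMHE is set (helper names as in the arm64 GICv3 code):

	/* raise the priority mask to block IRQs at the CPU interface */
	gic_write_pmr(GIC_PRIO_IRQOFF);

	/*
	 * DSB SY: force the redistributor to observe the new PMR
	 * before it dispatches the next 1-of-N interrupt.
	 */
	dsb(sy);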

> I'm also concerned that in a big.LITTLE system, you may see the big 
> CPUs
> taking the interrupts all the time, which is nice for energy 
> efficiency.

Yes, this is bound to happen. It means you cannot place interrupts
on a small cluster unless you completely hotplug the big cores off,
thanks to the "more than one means all" nonsense.

And since these patches seem to also break hotplug, that's... fun.

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH v1 1/3] irqchip/gic: enable irq target all
  2020-11-27 18:11   ` Marc Zyngier
  2020-11-27 18:56     ` Catalin Marinas
@ 2020-12-01 13:54     ` Hanks Chen
  2020-12-02 11:09       ` Thomas Gleixner
  1 sibling, 1 reply; 10+ messages in thread
From: Hanks Chen @ 2020-12-01 13:54 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, CC Hwang, Catalin Marinas, Kuohong Wang,
	Russell King, linux-kernel, linux-mediatek, Loda Chou,
	Matthias Brugger, Thomas Gleixner, Will Deacon, linux-arm-kernel

Hi Marc,

Sorry for the late reply.


On Fri, 2020-11-27 at 18:11 +0000, Marc Zyngier wrote:
> On 2020-11-27 14:15, Hanks Chen wrote:
> > Support for interrupt distribution design for SMP system solutions.
> 
> As far as I know, we have been supporting interrupt distribution on
> ARM SMP systems pretty well for the past... what... 15 years?
> I'm sure Russell can dig out an ARM926 SMP system that even predates
> that.
> 
> > With this feature enabled ,the SPI interrupts would be routed to
> > all the cores rather than boot core to achieve better
> > load balance of interrupt handling.
> 
> Please quantify this compared to the current distribution method.
> 
Got it, but I need some time to prepare the demonstration;
I'll share it soon.
Btw, the benefit shows up in IO bandwidth-sensitive applications; we
could also use androbench to observe it.

> > That is, interrupts might be serviced simultaneously on different CPUs.
> 
> They already can. What is new here? Or do you mean the *same* interrupt
> being serviced by different CPUs *at the same time*? That'd be fun.
> 

My fault, I used an imprecise word to describe it.
I did not mean the *same* interrupt being serviced by different CPUs.
You are right that we can use the affinity API to target different
CPUs; however, I am focusing on the HW behavior.
If we don't change the affinity setting (targeting the boot CPU),
SPI IRQs might still be serviced on different CPUs after applying
target-all.

> > Signed-off-by: Hanks Chen <hanks.chen@mediatek.com>
> > ---
> >  drivers/irqchip/Kconfig            |  12 ++++
> >  drivers/irqchip/irq-gic-v3.c       | 107 +++++++++++++++++++++--------
> >  include/linux/irqchip/arm-gic-v3.h |   1 +
> >  kernel/irq/cpuhotplug.c            |  22 ++++++
> >  kernel/irq/manage.c                |   7 ++
> >  5 files changed, 122 insertions(+), 27 deletions(-)
> > 
> > diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> > index c6098eee0c7c..c88ee7731e92 100644
> > --- a/drivers/irqchip/Kconfig
> > +++ b/drivers/irqchip/Kconfig
> > @@ -597,4 +597,16 @@ config MST_IRQ
> >  	help
> >  	  Support MStar Interrupt Controller.
> > 
> > +config ARM_IRQ_TARGET_ALL
> > +	bool "Distribute interrupts across processors on SMP system"
> > +	depends on SMP && ARM_GIC_V3
> > +	help
> > +	  Support for interrupt distribution design for
> > +	  SMP system solutions. With this feature enabled ,the
> > +	  SPI interrupts would be routed to all the cores rather
> > +	  than boot cpu to achieve better load balance of interrupt
> > +	  handling
> > +
> > +	  If you don't know what to do here, say N.
> 
> There is no way we start introducing conditional compilation for
> architectural features. Either this works at all times, or it doesn't
> exist.
> 
I am not sure how to solve it.
Indeed, it is not good to enable the feature on every ARM-based
platform.
Maybe I could use the device tree to determine whether or not the
feature should be enabled, e.g.
gic: interrupt-controller {
	compatible = "arm,gic-v3";
	...
	irq-target-all-enable = <1>;
};

Would this be better?

> > +
> >  endmenu
> > diff --git a/drivers/irqchip/irq-gic-v3.c 
> > b/drivers/irqchip/irq-gic-v3.c
> > index 16fecc0febe8..62a878ce4681 100644
> > --- a/drivers/irqchip/irq-gic-v3.c
> > +++ b/drivers/irqchip/irq-gic-v3.c
> > @@ -381,6 +381,12 @@ static inline bool gic_supports_nmi(void)
> >  	       static_branch_likely(&supports_pseudo_nmis);
> >  }
> > 
> > +static inline bool gic_supports_1n(void)
> > +{
> > +	return (IS_ENABLED(CONFIG_ARM_IRQ_TARGET_ALL) &&
> > +		~(readl_relaxed(gic_data.dist_base + GICD_TYPER) & 
> > GICD_TYPER_No1N));
> > +}
> > +
> >  static int gic_irq_set_irqchip_state(struct irq_data *d,
> >  				     enum irqchip_irq_state which, bool val)
> >  {
> > @@ -716,6 +722,7 @@ static void __init gic_dist_init(void)
> >  {
> >  	unsigned int i;
> >  	u64 affinity;
> > +
> 
> Spurious whitespace.

Got it, I'll fix it.

> 
> >  	void __iomem *base = gic_data.dist_base;
> >  	u32 val;
> > 
> > @@ -759,16 +766,27 @@ static void __init gic_dist_init(void)
> >  	/* Enable distributor with ARE, Group1 */
> >  	writel_relaxed(val, base + GICD_CTLR);
> > 
> > -	/*
> > -	 * Set all global interrupts to the boot CPU only. ARE must be
> > -	 * enabled.
> > -	 */
> > -	affinity = 
> > gic_mpidr_to_affinity(cpu_logical_map(smp_processor_id()));
> > -	for (i = 32; i < GIC_LINE_NR; i++)
> > -		gic_write_irouter(affinity, base + GICD_IROUTER + i * 8);
> > +	if (!gic_supports_1n()) {
> > +		/*
> > +		 * Set all global interrupts to the boot CPU only. ARE must be
> > +		 * enabled.
> > +		 */
> > +		affinity = 
> > gic_mpidr_to_affinity(cpu_logical_map(smp_processor_id()));
> > +		for (i = 32; i < GIC_LINE_NR; i++)
> > +			gic_write_irouter(affinity, base + GICD_IROUTER + i * 8);
> > 
> > -	for (i = 0; i < GIC_ESPI_NR; i++)
> > -		gic_write_irouter(affinity, base + GICD_IROUTERnE + i * 8);
> > +		for (i = 0; i < GIC_ESPI_NR; i++)
> > +			gic_write_irouter(affinity, base + GICD_IROUTERnE + i * 8);
> > +	} else {
> > +		/* default set target all for all SPI */
> > +		for (i = 32; i < GIC_LINE_NR; i++)
> > +			gic_write_irouter(GICD_IROUTER_SPI_MODE_ANY,
> > +					  base + GICD_IROUTER + i * 8);
> > +
> > +		for (i = 0; i < GIC_ESPI_NR; i++)
> > +			gic_write_irouter(GICD_IROUTER_SPI_MODE_ANY,
> > +					  base + GICD_IROUTERnE + i * 8);
> 
> Why should we decide to distribute interrupts to all CPUs by default 
> when
> we don't even know where to route them?
> 
It makes sense to let the GICD dispatch SPI interrupts by default once
we transfer control to the GIC HW.
Tracking IRQ balance in the HW controller is more real-time, and the
policy can be based on CPU state or low-power considerations.

Btw, we also want to introduce the interrupt class feature on our
big.LITTLE system. It is an implementation-defined feature that the
GIC600 provides for 1 of N SPIs.

> > +	}
> >  }
> > 
> >  static int gic_iterate_rdists(int (*fn)(struct redist_region *, void
> > __iomem *))
> > @@ -1191,29 +1209,64 @@ static int gic_set_affinity(struct irq_data
> > *d, const struct cpumask *mask_val,
> >  	if (gic_irq_in_rdist(d))
> >  		return -EINVAL;
> > 
> > -	/* If interrupt was enabled, disable it first */
> > -	enabled = gic_peek_irq(d, GICD_ISENABLER);
> > -	if (enabled)
> > -		gic_mask_irq(d);
> > +	if (!gic_supports_1n()) {
> > +		/* If interrupt was enabled, disable it first */
> > +		enabled = gic_peek_irq(d, GICD_ISENABLER);
> > +		if (enabled)
> > +			gic_mask_irq(d);
> > 
> > -	offset = convert_offset_index(d, GICD_IROUTER, &index);
> > -	reg = gic_dist_base(d) + offset + (index * 8);
> > -	val = gic_mpidr_to_affinity(cpu_logical_map(cpu));
> > +		offset = convert_offset_index(d, GICD_IROUTER, &index);
> > +		reg = gic_dist_base(d) + offset + (index * 8);
> > +		val = gic_mpidr_to_affinity(cpu_logical_map(cpu));
> > 
> > -	gic_write_irouter(val, reg);
> > +		gic_write_irouter(val, reg);
> > 
> > -	/*
> > -	 * If the interrupt was enabled, enabled it again. Otherwise,
> > -	 * just wait for the distributor to have digested our changes.
> > -	 */
> > -	if (enabled)
> > -		gic_unmask_irq(d);
> > -	else
> > -		gic_dist_wait_for_rwp();
> > +		/*
> > +		 * If the interrupt was enabled, enabled it again. Otherwise,
> > +		 * just wait for the distributor to have digested our changes.
> > +		 */
> > +		if (enabled)
> > +			gic_unmask_irq(d);
> > +		else
> > +			gic_dist_wait_for_rwp();
> > +
> > +		irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > +
> > +	} else {
> > +		/*
> > +		 * no need to update when:
> > +		 * input mask is equal to the current setting
> > +		 */
> > +		if (cpumask_equal(irq_data_get_affinity_mask(d), mask_val))
> > +			return IRQ_SET_MASK_OK_NOCOPY;
> > +
> > +		/* If interrupt was enabled, disable it first */
> > +		enabled = gic_peek_irq(d, GICD_ISENABLER);
> > +		if (enabled)
> > +			gic_mask_irq(d);
> > +
> > +		offset = convert_offset_index(d, GICD_IROUTER, &index);
> > +		reg = gic_dist_base(d) + offset + (index * 8);
> > 
> > -	irq_data_update_effective_affinity(d, cpumask_of(cpu));
> > +		/* GICv3 supports target is 1 or all */
> > +		if (cpumask_weight(mask_val) > 1)
> > +			val = GICD_IROUTER_SPI_MODE_ANY;
> 
> There is a massive difference between targeting more than one CPU
> and targeting all the CPUs. This breaks all existing drivers and
> userspace that need to manage interrupt affinity.
> 
Yes, it is not a good implementation; it confuses user space.
Should I modify the code as follows, redefining the coverage so that
only a mask covering all online CPUs selects 1-of-N routing?

if (bitmap_equal(mask_val, online_mask, nr_cpumask_bits))
	val = GICD_IROUTER_SPI_MODE_ANY;

> > +		else
> > +			val = gic_mpidr_to_affinity(cpu_logical_map(cpu));
> > +
> > +		gic_write_irouter(val, reg);
> > +
> > +		/*
> > +		 * If the interrupt was enabled, enabled it again. Otherwise,
> > +		 * just wait for the distributor to have digested our changes.
> > +		 */
> > +		if (enabled)
> > +			gic_unmask_irq(d);
> > +		else
> > +			gic_dist_wait_for_rwp();
> 
> Why so much code duplication?

Got it, I'll remove the duplicated code.

> 
> > +	}
> > 
> > -	return IRQ_SET_MASK_OK_DONE;
> > +	return IRQ_SET_MASK_OK;
> >  }
> >  #else
> >  #define gic_set_affinity	NULL
> > diff --git a/include/linux/irqchip/arm-gic-v3.h
> > b/include/linux/irqchip/arm-gic-v3.h
> > index f6d092fdb93d..c24336d506a3 100644
> > --- a/include/linux/irqchip/arm-gic-v3.h
> > +++ b/include/linux/irqchip/arm-gic-v3.h
> > @@ -80,6 +80,7 @@
> >  #define GICD_CTLR_ENABLE_SS_G0		(1U << 0)
> > 
> >  #define GICD_TYPER_RSS			(1U << 26)
> > +#define GICD_TYPER_No1N			(1U << 25)
> >  #define GICD_TYPER_LPIS			(1U << 17)
> >  #define GICD_TYPER_MBIS			(1U << 16)
> >  #define GICD_TYPER_ESPI			(1U << 8)
> > diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
> > index 02236b13b359..779512e44960 100644
> > --- a/kernel/irq/cpuhotplug.c
> > +++ b/kernel/irq/cpuhotplug.c
> > @@ -87,6 +87,18 @@ static bool migrate_one_irq(struct irq_desc *desc)
> >  		return false;
> >  	}
> > 
> > +#ifdef CONFIG_ARM_IRQ_TARGET_ALL
> 
> No way.
> 
> > +	/*
> > +	 * No move required, if interrupt is 1 of N IRQ.
> > +	 * write current cpu_online_mask into affinity mask.
> > +	 */
> > +	if (cpumask_weight(desc->irq_common_data.affinity) > 1) {
> > +		cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);
> 
> Again, this is totally bogus.
> 

If I add a target-all flag to the irq_common_data structure to
distinguish it, would this be better?

remove #ifdef CONFIG_ARM_IRQ_TARGET_ALL

and

if (desc->irq_common_data.irq_target_all)
	cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);


> > +
> > +		return false;
> > +	}
> > +#endif
> > +
> >  	/*
> >  	 * Complete an eventually pending irq move cleanup. If this
> >  	 * interrupt was moved in hard irq context, then the vectors need
> > @@ -191,6 +203,16 @@ static void irq_restore_affinity_of_irq(struct
> > irq_desc *desc, unsigned int cpu)
> >  	struct irq_data *data = irq_desc_get_irq_data(desc);
> >  	const struct cpumask *affinity = irq_data_get_affinity_mask(data);
> > 
> > +#ifdef CONFIG_ARM_IRQ_TARGET_ALL
> > +	/*
> > +	 * No restore required, if interrupt is 1 of N IRQ.
> > +	 */
> > +	if (cpumask_weight(affinity) > 1) {
> > +		cpumask_set_cpu(cpu, irq_data_get_affinity_mask(data));
> > +		return;
> > +	}
> 
> Same thing.
> 

as above

remove #ifdef CONFIG_ARM_IRQ_TARGET_ALL

and

if (desc->irq_common_data.irq_target_all) {
	cpumask_set_cpu(cpu, irq_data_get_affinity_mask(data));
	return;
}

> > +#endif
> > +
> >  	if (!irqd_affinity_is_managed(data) || !desc->action ||
> >  	    !irq_data_get_irq_chip(data) || !cpumask_test_cpu(cpu, affinity))
> >  		return;
> > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > index c460e0496006..770b97e326bd 100644
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -270,7 +270,14 @@ int irq_do_set_affinity(struct irq_data *data,
> > const struct cpumask *mask,
> >  	switch (ret) {
> >  	case IRQ_SET_MASK_OK:
> >  	case IRQ_SET_MASK_OK_DONE:
> > +#ifndef CONFIG_ARM_IRQ_TARGET_ALL
> >  		cpumask_copy(desc->irq_common_data.affinity, mask);
> > +#else
> > +		if (cpumask_weight(mask) > 1)
> > +			cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);
> > +		else
> > +			cpumask_copy(desc->irq_common_data.affinity, mask);
> 
> And again.
> 

as above
if (!desc->irq_common_data.irq_target_all)
	cpumask_copy(desc->irq_common_data.affinity, mask);
else
	cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);

> > +#endif
> >  		fallthrough;
> >  	case IRQ_SET_MASK_OK_NOCOPY:
> >  		irq_validate_effective_affinity(data);
> 
> To sum it up:
> 
> - You claim that the current interrupt distribution model is either non
>    functional or inefficient. I expect you to demonstrate this.
> 
> - You claim that this brings some kind of improvement: I expect you to
>    describe workloads and numbers that demonstrate the improvements.
>    1:N distribution has repeatedly shown to have much worse behaviour
>    that 1:1 distribution, so I really wonder what's new here.
> 
I'll show the demonstration to answer the first two questions.

> - You break existing APIs where a driver or userspace can legitimately
>    decide to route an interrupt to 2 CPUs out of 256 if it decides to.
>    That's not acceptable.
> 
I redefined the coverage as above.

> - You pollute the core code with hacks that should never be there. If 
> the
>    current behaviour is a problem, please state what the problem is.
> 

We found the RCU warning when IRQs target an offline CPU without the
I bit masked in the CPU hot-plug flow.
I'll reproduce the issue again and show the log analysis for it.

> - You don't factor in the requirements of the pseudo-NMI masking, 
> meaning
>    the IRI doesn't know about the current PMR and will not be able to
>    efficiently dispatch interrupts.
> 
Yes, masking interrupts using the PMR becomes very expensive (it
requires a DSB). We don't use pseudo-NMI on our platform.
Thank you for pointing that out; I need some time to mull it over.

> - I don't see anything that configures the participating nodes to the 
> 1:N
>    distribution, meaning it currently works by chance.
> 
1 of N SPIs are dispatched by the GICD with a round-robin policy by
default. However, the GIC supports an interface to control the policy
for dispatching SPI interrupts.
On the MTK platform, the GICD can dispatch SPI interrupts to a
suitable target CPU based on CPU state or low-power considerations.

> - How does it work with CPU hotplug, where we actively move the 
> interrupts
>    away from a CPU going down? With this scheme, the interrupts keep 
> being
>    delivered to the wrong CPU, resulting in lost interrupts if they were
>    edge triggered.
> 

A hot-plugged-off CPU asserts GICR_WAKER.ProcessorSleep and
GICR_CTLR.DPG in Trusted Firmware-A (TF-A):
1) When ProcessorSleep is 1, all interrupts that arrive at the
   redistributor are not communicated to the CPU interface.
2) When DPG is 1, any interrupts that have not reached a core at the
   time of the change are recalled and reprioritized by the GIC.

Hot-plug off sequence (PSCI in TF-A):
1. Mask interrupts on the core
2. Clear the CPU interface enables
3. GICR_WAKER.ProcessorSleep = 1
4. GICR_CTLR.DPG = 1 (we need to add this when the 1 of N SPI feature
   is enabled)

According to the above, I think the GIC would retain an edge-triggered
interrupt rather than lose it.
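
For illustration, a rough kernel-style sketch of step 3, using the
GICR_WAKER definitions from arm-gic-v3.h (`rbase`, the redistributor
base, is a placeholder; TF-A does the real work here):

	u32 val;

	/* tell the IRI this PE is going to sleep */
	val = readl_relaxed(rbase + GICR_WAKER);
	val |= GICR_WAKER_ProcessorSleep;
	writel_relaxed(val, rbase + GICR_WAKER);

	/* wait until the redistributor stops forwarding interrupts */
	while (!(readl_relaxed(rbase + GICR_WAKER) &
		 GICR_WAKER_ChildrenAsleep))
		cpu_relax();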


> As it stands, I can't imagine this code making it in the kernel.
> 
> Thanks,
> 
>          M.

On the MTK platform, we change the dispatch policy of 1:N SPIs through
the interrupt routing infrastructure, and it improves our system
performance.

GICR_CTLR.DPG, cpu_active and the interrupt class could also change
the policy of 1:N SPIs, so we want to introduce an interface for
1 of N SPIs to provide more choices.
But I need to carefully consider the various scenarios on all arm64
systems without breaking the established APIs.
I need some time to mull it over.

Thank you for your review

Hanks





* Re: [PATCH v1 1/3] irqchip/gic: enable irq target all
  2020-12-01 13:54     ` Hanks Chen
@ 2020-12-02 11:09       ` Thomas Gleixner
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Gleixner @ 2020-12-02 11:09 UTC (permalink / raw)
  To: Hanks Chen, Marc Zyngier
  Cc: Mark Rutland, CC Hwang, Catalin Marinas, Kuohong Wang,
	Russell King, linux-kernel, linux-mediatek, Loda Chou,
	Matthias Brugger, Will Deacon, linux-arm-kernel

Hanks,

On Tue, Dec 01 2020 at 21:54, Hanks Chen wrote:
> On Fri, 2020-11-27 at 18:11 +0000, Marc Zyngier wrote:
>> On 2020-11-27 14:15, Hanks Chen wrote:
>> > +	/*
>> > +	 * No move required, if interrupt is 1 of N IRQ.
>> > +	 * write current cpu_online_mask into affinity mask.
>> > +	 */
>> > +	if (cpumask_weight(desc->irq_common_data.affinity) > 1) {
>> > +		cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);
>> 
>> Again, this is totally bogus.
>> 
>
> If I add the target all flag into irq_common_data stucture to
> distinguish it. Would this be better?
>
> remove #ifdef CONFIG_ARM_IRQ_TARGET_ALL
>
> and
>
> if (desc->irq_common_data.irq_target_all)
> 	cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);

No. We are not adding random members to irq_common_data just to support
this. If at all, this is a flag of the interrupt chip.

Also this copy is wrong to begin with. The affinity mask is only updated
on cpu hot-unplug when the outgoing CPU is the last online CPU in the
mask. Then we break the user/kernel supplied affinity mask and we only
do that when the interrupt is not affinity managed.

We do not remove offline CPUs from the affinity mask. The affinity code
of the irq chip has to ensure that only online CPUs can be
targeted. That's what ends up in effective_affinity.
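
For illustration, a minimal sketch of that contract, abbreviated from
what a GIC-style ->irq_set_affinity() callback does (the hardware
programming step is elided):

	static int chip_set_affinity(struct irq_data *d,
				     const struct cpumask *mask_val,
				     bool force)
	{
		int cpu = cpumask_any_and(mask_val, cpu_online_mask);

		if (cpu >= nr_cpu_ids)
			return -EINVAL;

		/* ... program the route to 'cpu' in hardware ... */

		/*
		 * Record where the interrupt really went; the
		 * user/kernel-supplied affinity mask stays untouched.
		 */
		irq_data_update_effective_affinity(d, cpumask_of(cpu));
		return IRQ_SET_MASK_OK_DONE;
	}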

>> > +#ifdef CONFIG_ARM_IRQ_TARGET_ALL
>> > +	/*
>> > +	 * No restore required, if interrupt is 1 of N IRQ.
>> > +	 */
>> > +	if (cpumask_weight(affinity) > 1) {
>> > +		cpumask_set_cpu(cpu, irq_data_get_affinity_mask(data));
>> > +		return;
>> > +	}

Heck no. This breaks managed interrupts and some more. Fiddling with the
irq affinity mask is wrong to begin with. Don't touch it ever.

>> > --- a/kernel/irq/manage.c
>> > +++ b/kernel/irq/manage.c
>> > @@ -270,7 +270,14 @@ int irq_do_set_affinity(struct irq_data *data,
>> > const struct cpumask *mask,
>> >  	switch (ret) {
>> >  	case IRQ_SET_MASK_OK:
>> >  	case IRQ_SET_MASK_OK_DONE:
>> > +#ifndef CONFIG_ARM_IRQ_TARGET_ALL
>> >  		cpumask_copy(desc->irq_common_data.affinity, mask);
>> > +#else
>> > +		if (cpumask_weight(mask) > 1)
>> > +			cpumask_copy(desc->irq_common_data.affinity, cpu_online_mask);
>> > +		else
>> > +			cpumask_copy(desc->irq_common_data.affinity, mask);

Again, no. Touching the affinity mask is a NONO. The affinity mask is
handed in from either the kernel or from user space. This code has no
business to override that.

The only case where the kernel touches it is on CPU hot-unplug when the
last cpu in the affinity mask goes offline and if the interrupt is not
affinity managed.

>> - You pollute the core code with hacks that should never be there. If 
>> the
>>    current behaviour is a problem, please state what the problem is.
>> 
> We found the RCU warn when IRQs target to a offline CPU without I bit
> masked in CPU hot-plug flow
> I'll reproduce the issue again and show the log analysis for it.

That has absolutely nothing to do with any of the functions where you
fiddle with the affinity mask.

Thanks,

        tglx

