* [PATCH] ARM: irq: Add IRQ_SET_MASK_OK_DONE handling in migrate_one_irq() @ 2019-01-08 13:58 Dietmar Eggemann 2019-01-08 14:16 ` Marc Zyngier 0 siblings, 1 reply; 5+ messages in thread From: Dietmar Eggemann @ 2019-01-08 13:58 UTC (permalink / raw) To: linux-arm-kernel Cc: Russell King - ARM Linux, Marc Zyngier, Sudeep Holla, linux-kernel Arm TC2 (multi_v7_defconfig plus CONFIG_ARM_BIG_LITTLE_CPUFREQ=y and CONFIG_ARM_VEXPRESS_SPC_CPUFREQ=y) fails hotplug stress tests. This issue was tracked down to a missing copy of the new affinity cpumask of the vexpress-spc interrupt into struct irq_common_data.affinity when the interrupt is migrated in migrate_one_irq(). Commit 0407daceedfe ("irqchip/gic: Return IRQ_SET_MASK_OK_DONE in the set_affinity method") changed the return value of the irq_set_affinity() function of the GIC from IRQ_SET_MASK_OK to IRQ_SET_MASK_OK_DONE. In migrate_one_irq() if the current irq affinity mask and the cpu online mask do not share any CPU, the affinity mask is set to the cpu online mask. In this case (ret == true) and when the irq chip function irq_set_affinity() returns successfully (IRQ_SET_MASK_OK), struct irq_common_data.affinity should also be updated. Add IRQ_SET_MASK_OK_DONE next to IRQ_SET_MASK_OK when checking that the irq chip function irq_set_affinity() returns successfully. Commit 2cb625478f8c ("genirq: Add IRQ_SET_MASK_OK_DONE to support stacked irqchip") only added IRQ_SET_MASK_OK_DONE handling to irq_do_set_affinity() in the irq core and not to the Arm32 irq code. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> --- The hotplug issue on Arm TC2 happens because the vexpress-spc interrupt (irq=22) is affine to CPU0. This occurs since it is setup early when the cpu_online_mask is still 0. But the problem with the missing copy of the affinity mask should occur with every interrupt which is forced to migrate. With additional debug in irq_setup_affinity(): [0.000619] irq_setup_affinity(): irq=17 mask=0 cpu_online_mask=0 set=0-4 [0.007065] irq_setup_affinity(): irq=22 mask=0 cpu_online_mask=0 set=0-4 [3.372907] irq_setup_affinity(): irq=47 mask=0-4 cpu_online_mask=0-4 set=0-4 cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 CPU4 22: 316 0 0 0 0 GIC-0 127 Level vexpress-spc cat /proc/irq/22/smp_affinity_list 0 arch/arm/kernel/irq.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c index 9908dacf9229..ddb828b0235b 100644 --- a/arch/arm/kernel/irq.c +++ b/arch/arm/kernel/irq.c @@ -117,6 +117,7 @@ static bool migrate_one_irq(struct irq_desc *desc) const struct cpumask *affinity = irq_data_get_affinity_mask(d); struct irq_chip *c; bool ret = false; + int ret2; /* * If this is a per-CPU interrupt, or the affinity does not @@ -131,9 +132,14 @@ static bool migrate_one_irq(struct irq_desc *desc) } c = irq_data_get_irq_chip(d); - if (!c->irq_set_affinity) + if (!c->irq_set_affinity) { pr_debug("IRQ%u: unable to set affinity\n", d->irq); - else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret) + return ret; + } + + ret2 = c->irq_set_affinity(d, affinity, false); + + if ((ret2 == IRQ_SET_MASK_OK || ret2 == IRQ_SET_MASK_OK_DONE) && ret) cpumask_copy(irq_data_get_affinity_mask(d), affinity); return ret; -- 2.11.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] ARM: irq: Add IRQ_SET_MASK_OK_DONE handling in migrate_one_irq() 2019-01-08 13:58 [PATCH] ARM: irq: Add IRQ_SET_MASK_OK_DONE handling in migrate_one_irq() Dietmar Eggemann @ 2019-01-08 14:16 ` Marc Zyngier 2019-01-09 15:47 ` Dietmar Eggemann 0 siblings, 1 reply; 5+ messages in thread From: Marc Zyngier @ 2019-01-08 14:16 UTC (permalink / raw) To: Dietmar Eggemann, linux-arm-kernel Cc: Russell King - ARM Linux, Sudeep Holla, linux-kernel Hi Dietmar, On 08/01/2019 13:58, Dietmar Eggemann wrote: > Arm TC2 (multi_v7_defconfig plus CONFIG_ARM_BIG_LITTLE_CPUFREQ=y and > CONFIG_ARM_VEXPRESS_SPC_CPUFREQ=y) fails hotplug stress tests. > > This issue was tracked down to a missing copy of the new affinity > cpumask of the vexpress-spc interrupt into struct > irq_common_data.affinity when the interrupt is migrated in > migrate_one_irq(). > > Commit 0407daceedfe ("irqchip/gic: Return IRQ_SET_MASK_OK_DONE in the > set_affinity method") changed the return value of the irq_set_affinity() > function of the GIC from IRQ_SET_MASK_OK to IRQ_SET_MASK_OK_DONE. > > In migrate_one_irq() if the current irq affinity mask and the cpu > online mask do not share any CPU, the affinity mask is set to the cpu > online mask. In this case (ret == true) and when the irq chip > function irq_set_affinity() returns successfully (IRQ_SET_MASK_OK), > struct irq_common_data.affinity should also be updated. > > Add IRQ_SET_MASK_OK_DONE next to IRQ_SET_MASK_OK when checking that the > irq chip function irq_set_affinity() returns successfully. > > Commit 2cb625478f8c ("genirq: Add IRQ_SET_MASK_OK_DONE to support > stacked irqchip") only added IRQ_SET_MASK_OK_DONE handling to > irq_do_set_affinity() in the irq core and not to the Arm32 irq code. > > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> > --- > > The hotplug issue on Arm TC2 happens because the vexpress-spc interrupt > (irq=22) is affine to CPU0. This occurs since it is setup early when the > cpu_online_mask is still 0. > But the problem with the missing copy of the affinity mask should occur > with every interrupt which is forced to migrate. > > With additional debug in irq_setup_affinity(): > > [0.000619] irq_setup_affinity(): irq=17 mask=0 cpu_online_mask=0 set=0-4 > [0.007065] irq_setup_affinity(): irq=22 mask=0 cpu_online_mask=0 set=0-4 > [3.372907] irq_setup_affinity(): irq=47 mask=0-4 cpu_online_mask=0-4 > set=0-4 > > cat /proc/interrupts > CPU0 CPU1 CPU2 CPU3 CPU4 > 22: 316 0 0 0 0 GIC-0 127 > Level vexpress-spc > > cat /proc/irq/22/smp_affinity_list > 0 > > arch/arm/kernel/irq.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c > index 9908dacf9229..ddb828b0235b 100644 > --- a/arch/arm/kernel/irq.c > +++ b/arch/arm/kernel/irq.c > @@ -117,6 +117,7 @@ static bool migrate_one_irq(struct irq_desc *desc) > const struct cpumask *affinity = irq_data_get_affinity_mask(d); > struct irq_chip *c; > bool ret = false; > + int ret2; > > /* > * If this is a per-CPU interrupt, or the affinity does not > @@ -131,9 +132,14 @@ static bool migrate_one_irq(struct irq_desc *desc) > } > > c = irq_data_get_irq_chip(d); > - if (!c->irq_set_affinity) > + if (!c->irq_set_affinity) { > pr_debug("IRQ%u: unable to set affinity\n", d->irq); > - else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret) > + return ret; > + } > + > + ret2 = c->irq_set_affinity(d, affinity, false); > + > + if ((ret2 == IRQ_SET_MASK_OK || ret2 == IRQ_SET_MASK_OK_DONE) && ret) > cpumask_copy(irq_data_get_affinity_mask(d), affinity); > > return ret; > On the arm64 side, we've solved the exact same issue by getting rid of this code and using the generic implementation. See 217d453d473c5 ("arm64: fix a migrating irq bug when hotplug cpu"), which uses irq_migrate_all_off_this_cpu instead. I'm not sure there is much value in not using the core code in this case. Thanks, M. -- Jazz is not dead. It just smells funny... ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] ARM: irq: Add IRQ_SET_MASK_OK_DONE handling in migrate_one_irq() 2019-01-08 14:16 ` Marc Zyngier @ 2019-01-09 15:47 ` Dietmar Eggemann 2019-01-09 16:21 ` Marc Zyngier 0 siblings, 1 reply; 5+ messages in thread From: Dietmar Eggemann @ 2019-01-09 15:47 UTC (permalink / raw) To: Marc Zyngier, linux-arm-kernel Cc: Russell King - ARM Linux, Sudeep Holla, linux-kernel Hi Marc, On 1/8/19 3:16 PM, Marc Zyngier wrote: > Hi Dietmar, > > On 08/01/2019 13:58, Dietmar Eggemann wrote: >> Arm TC2 (multi_v7_defconfig plus CONFIG_ARM_BIG_LITTLE_CPUFREQ=y and >> CONFIG_ARM_VEXPRESS_SPC_CPUFREQ=y) fails hotplug stress tests. >> >> This issue was tracked down to a missing copy of the new affinity >> cpumask of the vexpress-spc interrupt into struct >> irq_common_data.affinity when the interrupt is migrated in >> migrate_one_irq(). >> >> Commit 0407daceedfe ("irqchip/gic: Return IRQ_SET_MASK_OK_DONE in the >> set_affinity method") changed the return value of the irq_set_affinity() >> function of the GIC from IRQ_SET_MASK_OK to IRQ_SET_MASK_OK_DONE. >> >> In migrate_one_irq() if the current irq affinity mask and the cpu >> online mask do not share any CPU, the affinity mask is set to the cpu >> online mask. In this case (ret == true) and when the irq chip >> function irq_set_affinity() returns successfully (IRQ_SET_MASK_OK), >> struct irq_common_data.affinity should also be updated. >> >> Add IRQ_SET_MASK_OK_DONE next to IRQ_SET_MASK_OK when checking that the >> irq chip function irq_set_affinity() returns successfully. >> >> Commit 2cb625478f8c ("genirq: Add IRQ_SET_MASK_OK_DONE to support >> stacked irqchip") only added IRQ_SET_MASK_OK_DONE handling to >> irq_do_set_affinity() in the irq core and not to the Arm32 irq code. >> >> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> >> --- >> >> The hotplug issue on Arm TC2 happens because the vexpress-spc interrupt >> (irq=22) is affine to CPU0. This occurs since it is setup early when the >> cpu_online_mask is still 0. >> But the problem with the missing copy of the affinity mask should occur >> with every interrupt which is forced to migrate. >> >> With additional debug in irq_setup_affinity(): >> >> [0.000619] irq_setup_affinity(): irq=17 mask=0 cpu_online_mask=0 set=0-4 >> [0.007065] irq_setup_affinity(): irq=22 mask=0 cpu_online_mask=0 set=0-4 >> [3.372907] irq_setup_affinity(): irq=47 mask=0-4 cpu_online_mask=0-4 >> set=0-4 >> >> cat /proc/interrupts >> CPU0 CPU1 CPU2 CPU3 CPU4 >> 22: 316 0 0 0 0 GIC-0 127 >> Level vexpress-spc >> >> cat /proc/irq/22/smp_affinity_list >> 0 [...] > > On the arm64 side, we've solved the exact same issue by getting rid of > this code and using the generic implementation. See 217d453d473c5 > ("arm64: fix a migrating irq bug when hotplug cpu"), which uses > irq_migrate_all_off_this_cpu instead. > > I'm not sure there is much value in not using the core code in this case. Thanks for the hint! Much more elegant! I tried the following on TC2 and it worked just fine. I'm not aware on any drawbacks of using the generic irq migration for Arm32 as well. -- >8 -- diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 664e918e2624..26524b75970a 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1400,6 +1400,7 @@ config NR_CPUS config HOTPLUG_CPU bool "Support for hot-pluggable CPUs" depends on SMP + select GENERIC_IRQ_MIGRATION help Say Y here to experiment with turning CPUs off and on. CPUs can be controlled through /sys/devices/system/cpu. diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h index c883fcbe93b6..46d41140df27 100644 --- a/arch/arm/include/asm/irq.h +++ b/arch/arm/include/asm/irq.h @@ -25,7 +25,6 @@ #ifndef __ASSEMBLY__ struct irqaction; struct pt_regs; -extern void migrate_irqs(void); extern void asm_do_IRQ(unsigned int, struct pt_regs *); void handle_IRQ(unsigned int, struct pt_regs *); diff --git a/arch/arm/kernel/irq.c b/arch/arm/kernel/irq.c index 9908dacf9229..844861368cd5 100644 --- a/arch/arm/kernel/irq.c +++ b/arch/arm/kernel/irq.c @@ -31,7 +31,6 @@ #include <linux/smp.h> #include <linux/init.h> #include <linux/seq_file.h> -#include <linux/ratelimit.h> #include <linux/errno.h> #include <linux/list.h> #include <linux/kallsyms.h> @@ -109,64 +108,3 @@ int __init arch_probe_nr_irqs(void) return nr_irqs; } #endif - -#ifdef CONFIG_HOTPLUG_CPU -static bool migrate_one_irq(struct irq_desc *desc) -{ - struct irq_data *d = irq_desc_get_irq_data(desc); - const struct cpumask *affinity = irq_data_get_affinity_mask(d); - struct irq_chip *c; - bool ret = false; - - /* - * If this is a per-CPU interrupt, or the affinity does not - * include this CPU, then we have nothing to do. - */ - if (irqd_is_per_cpu(d) || !cpumask_test_cpu(smp_processor_id(), affinity)) - return false; - - if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) { - affinity = cpu_online_mask; - ret = true; - } - - c = irq_data_get_irq_chip(d); - if (!c->irq_set_affinity) - pr_debug("IRQ%u: unable to set affinity\n", d->irq); - else if (c->irq_set_affinity(d, affinity, false) == IRQ_SET_MASK_OK && ret) - cpumask_copy(irq_data_get_affinity_mask(d), affinity); - - return ret; -} - -/* - * The current CPU has been marked offline. Migrate IRQs off this CPU. - * If the affinity settings do not allow other CPUs, force them onto any - * available CPU. - * - * Note: we must iterate over all IRQs, whether they have an attached - * action structure or not, as we need to get chained interrupts too. - */ -void migrate_irqs(void) -{ - unsigned int i; - struct irq_desc *desc; - unsigned long flags; - - local_irq_save(flags); - - for_each_irq_desc(i, desc) { - bool affinity_broken; - - raw_spin_lock(&desc->lock); - affinity_broken = migrate_one_irq(desc); - raw_spin_unlock(&desc->lock); - - if (affinity_broken) - pr_warn_ratelimited("IRQ%u no longer affine to CPU%u\n", - i, smp_processor_id()); - } - - local_irq_restore(flags); -} -#endif /* CONFIG_HOTPLUG_CPU */ diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 3bf82232b1be..1d6f5ea522f4 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -254,7 +254,7 @@ int __cpu_disable(void) /* * OK - migrate IRQs away from this CPU */ - migrate_irqs(); + irq_migrate_all_off_this_cpu(); /* * Flush user cache and TLB mappings, and then remove this CPU ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] ARM: irq: Add IRQ_SET_MASK_OK_DONE handling in migrate_one_irq() 2019-01-09 15:47 ` Dietmar Eggemann @ 2019-01-09 16:21 ` Marc Zyngier 2019-01-10 10:12 ` Dietmar Eggemann 0 siblings, 1 reply; 5+ messages in thread From: Marc Zyngier @ 2019-01-09 16:21 UTC (permalink / raw) To: Dietmar Eggemann, linux-arm-kernel Cc: Russell King - ARM Linux, Sudeep Holla, linux-kernel On 09/01/2019 15:47, Dietmar Eggemann wrote: > Hi Marc, > > On 1/8/19 3:16 PM, Marc Zyngier wrote: >> Hi Dietmar, >> >> On 08/01/2019 13:58, Dietmar Eggemann wrote: >>> Arm TC2 (multi_v7_defconfig plus CONFIG_ARM_BIG_LITTLE_CPUFREQ=y and >>> CONFIG_ARM_VEXPRESS_SPC_CPUFREQ=y) fails hotplug stress tests. >>> >>> This issue was tracked down to a missing copy of the new affinity >>> cpumask of the vexpress-spc interrupt into struct >>> irq_common_data.affinity when the interrupt is migrated in >>> migrate_one_irq(). >>> >>> Commit 0407daceedfe ("irqchip/gic: Return IRQ_SET_MASK_OK_DONE in the >>> set_affinity method") changed the return value of the irq_set_affinity() >>> function of the GIC from IRQ_SET_MASK_OK to IRQ_SET_MASK_OK_DONE. >>> >>> In migrate_one_irq() if the current irq affinity mask and the cpu >>> online mask do not share any CPU, the affinity mask is set to the cpu >>> online mask. In this case (ret == true) and when the irq chip >>> function irq_set_affinity() returns successfully (IRQ_SET_MASK_OK), >>> struct irq_common_data.affinity should also be updated. >>> >>> Add IRQ_SET_MASK_OK_DONE next to IRQ_SET_MASK_OK when checking that the >>> irq chip function irq_set_affinity() returns successfully. >>> >>> Commit 2cb625478f8c ("genirq: Add IRQ_SET_MASK_OK_DONE to support >>> stacked irqchip") only added IRQ_SET_MASK_OK_DONE handling to >>> irq_do_set_affinity() in the irq core and not to the Arm32 irq code. >>> >>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> >>> --- >>> >>> The hotplug issue on Arm TC2 happens because the vexpress-spc interrupt >>> (irq=22) is affine to CPU0. This occurs since it is setup early when the >>> cpu_online_mask is still 0. >>> But the problem with the missing copy of the affinity mask should occur >>> with every interrupt which is forced to migrate. >>> >>> With additional debug in irq_setup_affinity(): >>> >>> [0.000619] irq_setup_affinity(): irq=17 mask=0 cpu_online_mask=0 set=0-4 >>> [0.007065] irq_setup_affinity(): irq=22 mask=0 cpu_online_mask=0 set=0-4 >>> [3.372907] irq_setup_affinity(): irq=47 mask=0-4 cpu_online_mask=0-4 >>> set=0-4 >>> >>> cat /proc/interrupts >>> CPU0 CPU1 CPU2 CPU3 CPU4 >>> 22: 316 0 0 0 0 GIC-0 127 >>> Level vexpress-spc >>> >>> cat /proc/irq/22/smp_affinity_list >>> 0 > > [...] > >> >> On the arm64 side, we've solved the exact same issue by getting rid of >> this code and using the generic implementation. See 217d453d473c5 >> ("arm64: fix a migrating irq bug when hotplug cpu"), which uses >> irq_migrate_all_off_this_cpu instead. >> >> I'm not sure there is much value in not using the core code in this case. > > Thanks for the hint! Much more elegant! I tried the following on TC2 and > it worked just fine. I'm not aware on any drawbacks of using the generic > irq migration for Arm32 as well. [...] Sounds great! Can you put it in a proper patch and resend it? Thanks, M. -- Jazz is not dead. It just smells funny... ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] ARM: irq: Add IRQ_SET_MASK_OK_DONE handling in migrate_one_irq() 2019-01-09 16:21 ` Marc Zyngier @ 2019-01-10 10:12 ` Dietmar Eggemann 0 siblings, 0 replies; 5+ messages in thread From: Dietmar Eggemann @ 2019-01-10 10:12 UTC (permalink / raw) To: Marc Zyngier, linux-arm-kernel Cc: Russell King - ARM Linux, Sudeep Holla, linux-kernel On 1/9/19 5:21 PM, Marc Zyngier wrote: > On 09/01/2019 15:47, Dietmar Eggemann wrote: >> Hi Marc, >> >> On 1/8/19 3:16 PM, Marc Zyngier wrote: >>> Hi Dietmar, >>> >>> On 08/01/2019 13:58, Dietmar Eggemann wrote: [...] >>> On the arm64 side, we've solved the exact same issue by getting rid of >>> this code and using the generic implementation. See 217d453d473c5 >>> ("arm64: fix a migrating irq bug when hotplug cpu"), which uses >>> irq_migrate_all_off_this_cpu instead. >>> >>> I'm not sure there is much value in not using the core code in this case. >> >> Thanks for the hint! Much more elegant! I tried the following on TC2 and >> it worked just fine. I'm not aware on any drawbacks of using the generic >> irq migration for Arm32 as well. > > [...] > > Sounds great! Can you put it in a proper patch and resend it? Yes. I renamed the patch to "arm: fix a migrating irq bug when hotplug cpu" so it's closer to its arm64 counterpart patch. https://lore.kernel.org/lkml/20190110100922.20280-1-dietmar.eggemann@arm.com/ ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-01-10 10:12 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-01-08 13:58 [PATCH] ARM: irq: Add IRQ_SET_MASK_OK_DONE handling in migrate_one_irq() Dietmar Eggemann 2019-01-08 14:16 ` Marc Zyngier 2019-01-09 15:47 ` Dietmar Eggemann 2019-01-09 16:21 ` Marc Zyngier 2019-01-10 10:12 ` Dietmar Eggemann
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).