* [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus
@ 2017-04-24  7:46 Chen Yu
  2017-06-04 20:04 ` Thomas Gleixner
  0 siblings, 1 reply; 7+ messages in thread
From: Chen Yu @ 2017-04-24  7:46 UTC (permalink / raw)
  To: Chen Yu
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Borislav Petkov,
	Len Brown, Rafael J . Wysocki, x86, linux-kernel

fixup_irqs() sets an appropriate irq affinity once a CPU
has been brought down. However, it should also adjust
desc->irq_common_data.affinity, otherwise we are left with an
incorrect irq mask after cpu offline:

cat /proc/irq/31/smp_affinity
00000000,80000000
echo 0 > /sys/devices/system/cpu/cpu31/online
cat /proc/irq/31/smp_affinity
00000000,80000000

This can cause real problems; as reported, we saw a flood of
do_IRQ: 1.51 No irq handler for vector
messages during hibernation restore, possibly because some
drivers read an incorrect irq mask during hibernation.

Fix this by invoking irq_set_affinity_locked(), which also
updates desc->irq_common_data.affinity.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=188281
Reported-and-tested-by: Thomas Mitterfellner <thomas@mitterfellner.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Len Brown <len.brown@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 arch/x86/kernel/irq.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 4d8183b..a108ed2 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -480,13 +480,14 @@ void fixup_irqs(void)
 		if (!irqd_can_move_in_process_context(data) && chip->irq_mask)
 			chip->irq_mask(data);
 
-		if (chip->irq_set_affinity) {
-			ret = chip->irq_set_affinity(data, affinity, true);
-			if (ret == -ENOSPC)
+		ret = irq_set_affinity_locked(data, affinity, true);
+		if (ret) {
+			if (ret == -ENOSPC) {
 				pr_crit("IRQ %d set affinity failed because there are no available vectors.  The device assigned to this IRQ is unstable.\n", irq);
-		} else {
-			if (!(warned++))
-				set_affinity = 0;
+			} else {
+				if (!(warned++))
+					set_affinity = 0;
+			}
 		}
 
 		/*
-- 
2.7.4


* Re: [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus
  2017-04-24  7:46 [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus Chen Yu
@ 2017-06-04 20:04 ` Thomas Gleixner
  2017-07-03 13:51   ` Chen Yu
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2017-06-04 20:04 UTC (permalink / raw)
  To: Chen Yu
  Cc: Ingo Molnar, H. Peter Anvin, Borislav Petkov, Len Brown,
	Rafael J . Wysocki, x86, linux-kernel

On Mon, 24 Apr 2017, Chen Yu wrote:

> fixup_irqs() sets an appropriate irq affinity once a CPU
> has been brought down. However, it should also adjust
> desc->irq_common_data.affinity, otherwise we are left with an
> incorrect irq mask after cpu offline:
> 
> cat /proc/irq/31/smp_affinity
> 00000000,80000000
> echo 0 > /sys/devices/system/cpu/cpu31/online
> cat /proc/irq/31/smp_affinity
> 00000000,80000000
> 
> This can cause real problems; as reported, we saw a flood of
> do_IRQ: 1.51 No irq handler for vector
> messages during hibernation restore, possibly because some
> drivers read an incorrect irq mask during hibernation.
> 
> Fix this by invoking irq_set_affinity_locked(), which also
> updates desc->irq_common_data.affinity.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=188281
> Reported-and-tested-by: Thomas Mitterfellner <thomas@mitterfellner.at>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Len Brown <len.brown@intel.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
>  arch/x86/kernel/irq.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 4d8183b..a108ed2 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -480,13 +480,14 @@ void fixup_irqs(void)
>  		if (!irqd_can_move_in_process_context(data) && chip->irq_mask)
>  			chip->irq_mask(data);
>  
> -		if (chip->irq_set_affinity) {
> -			ret = chip->irq_set_affinity(data, affinity, true);
> -			if (ret == -ENOSPC)
> +		ret = irq_set_affinity_locked(data, affinity, true);

This can't work. For interrupts which cannot set the affinity in normal
context irq_set_affinity_locked() will queue the interrupt to move at the
next arrival of an interrupt. So the irq stays affine to the dying
CPU.

After looking at the callsites, it's safe to change
irq_set_affinity_locked() so that it uses the direct affinity setter
function when force == true.

So we need that change first, before we can switch fixup_irqs() over.
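
For illustration, the idea would look roughly like the following against
kernel/irq/manage.c (a sketch only, not a tested patch; the affinity
notifier handling at the end is left out):

int irq_set_affinity_locked(struct irq_data *data,
			    const struct cpumask *mask, bool force)
{
	struct irq_chip *chip = irq_data_get_irq_chip(data);
	struct irq_desc *desc = irq_data_to_desc(data);
	int ret = 0;

	if (!chip || !chip->irq_set_affinity)
		return -EINVAL;

	/* Take the direct setter path when force is set, even if the
	 * chip cannot normally be moved in process context. */
	if (force || irq_can_move_pcntxt(data)) {
		ret = irq_do_set_affinity(data, mask, force);
	} else {
		irqd_set_move_pending(data);
		irq_copy_pending(desc, mask);
	}

	/* ... affinity notifier and IRQD_AFFINITY_SET handling as before ... */
	return ret;
}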

Thanks,

	tglx


* Re: [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus
  2017-06-04 20:04 ` Thomas Gleixner
@ 2017-07-03 13:51   ` Chen Yu
  2017-07-04  8:50     ` Thomas Gleixner
  0 siblings, 1 reply; 7+ messages in thread
From: Chen Yu @ 2017-07-03 13:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, H. Peter Anvin, Borislav Petkov, Len Brown,
	Rafael J . Wysocki, x86, linux-kernel

On Sun, Jun 04, 2017 at 10:04:53PM +0200, Thomas Gleixner wrote:
> On Mon, 24 Apr 2017, Chen Yu wrote:
> 
> > fixup_irqs() sets an appropriate irq affinity once a CPU
> > has been brought down. However, it should also adjust
> > desc->irq_common_data.affinity, otherwise we are left with an
> > incorrect irq mask after cpu offline:
> > 
> > cat /proc/irq/31/smp_affinity
> > 00000000,80000000
> > echo 0 > /sys/devices/system/cpu/cpu31/online
> > cat /proc/irq/31/smp_affinity
> > 00000000,80000000
> > 
> > This can cause real problems; as reported, we saw a flood of
> > do_IRQ: 1.51 No irq handler for vector
> > messages during hibernation restore, possibly because some
> > drivers read an incorrect irq mask during hibernation.
> > 
> > Fix this by invoking irq_set_affinity_locked(), which also
> > updates desc->irq_common_data.affinity.
> > 
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=188281
> > Reported-and-tested-by: Thomas Mitterfellner <thomas@mitterfellner.at>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Len Brown <len.brown@intel.com>
> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Cc: x86@kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> > ---
> >  arch/x86/kernel/irq.c | 13 +++++++------
> >  1 file changed, 7 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> > index 4d8183b..a108ed2 100644
> > --- a/arch/x86/kernel/irq.c
> > +++ b/arch/x86/kernel/irq.c
> > @@ -480,13 +480,14 @@ void fixup_irqs(void)
> >  		if (!irqd_can_move_in_process_context(data) && chip->irq_mask)
> >  			chip->irq_mask(data);
> >  
> > -		if (chip->irq_set_affinity) {
> > -			ret = chip->irq_set_affinity(data, affinity, true);
> > -			if (ret == -ENOSPC)
> > +		ret = irq_set_affinity_locked(data, affinity, true);
> 
> This can't work. For interrupts which cannot set the affinity in normal
> context irq_set_affinity_locked() will queue the interrupt to move at the
> next arrival of an interrupt. So the irq stays affine to the dying
> CPU.
>
Ok, got it.
> After looking at the callsites, it's safe to change
> irq_set_affinity_locked() so that it uses the direct affinity setter
> function when force == true.
> 
Sorry it took me some time to understand this point (which is why I did not
reply to you right away :-)
I thought the definition of the word 'safe' here was that we should
not adjust the irq affinity in process context while the ISR is
still running, otherwise there might be a race condition.

Currently, there are four drivers that set the force flag to true (i.e.,
they invoke irq_force_affinity()).

1. exynos4_mct_starting_cpu()
   The irq affinity is set before the clockevent is registered,
   so there would be no interrupt triggered when adjusting
   the irq affinity in the process context. Safe.

2. sirfsoc_local_timer_starting_cpu()
   The same as above. Safe.

3. arm_perf_starting_cpu()
   During cpu offline, the pmu interrupt (a non-percpu pmu interrupt)
   might be migrated to other online cpus. Once the same cpu is
   brought back online, the interrupt is set back to this cpu by
   irq_force_affinity(), but at that point the pmu interrupt might
   still be running on other cpus, so would it be unsafe to adjust
   its irq affinity in process context?
   
4. sunhv_migrate_hvcons_irq()
   The cpu that encountered a panic needs to migrate the hvcons irq to
   the currently alive cpu, and sends an IPI to stop the other cpus. So
   at the time the irq affinity of the hvcons is adjusted, its interrupt
   might still be running, and it might be unsafe to adjust the irq
   affinity in process context?
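
For reference, the calling pattern shared by these drivers looks roughly
like this (a simplified sketch with made-up names, not copied from any of
the four call sites):

static int foo_timer_starting_cpu(unsigned int cpu)
{
	struct foo_timer *t = per_cpu_ptr(&foo_timers, cpu);

	/* Bind the per-cpu timer irq to the cpu that is coming up.
	 * force is used so the chip accepts a target cpu it might
	 * otherwise reject, e.g. one not yet marked online. */
	irq_force_affinity(t->irq, cpumask_of(cpu));
	enable_irq(t->irq);
	return 0;
}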

Not sure if my understanding is correct, or did I miss something?

thanks,
Yu

> So we need that change first, before we can switch fixup_irqs() over.
> 
> Thanks,
> 
> 	tglx


* Re: [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus
  2017-07-03 13:51   ` Chen Yu
@ 2017-07-04  8:50     ` Thomas Gleixner
  2017-07-05  3:20       ` Chen Yu
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2017-07-04  8:50 UTC (permalink / raw)
  To: Chen Yu
  Cc: Ingo Molnar, H. Peter Anvin, Borislav Petkov, Len Brown,
	Rafael J . Wysocki, x86, LKML

On Mon, 3 Jul 2017, Chen Yu wrote:
> On Sun, Jun 04, 2017 at 10:04:53PM +0200, Thomas Gleixner wrote:
> > After looking at the callsites, it's safe to change
> > irq_set_affinity_locked() so that it uses the direct affinity setter
> > function when force == true.
> > 
> Sorry it took me some time to understand this point (which is why I did not
> reply to you right away :-)
> I thought the definition of the word 'safe' here was that we should
> not adjust the irq affinity in process context while the ISR is
> still running, otherwise there might be a race condition.
> 
> Currently, there are four drivers that set the force flag to true (i.e.,
> they invoke irq_force_affinity()).
> 
> 1. exynos4_mct_starting_cpu()
>    The irq affinity is set before the clockevent is registered,
>    so there would be no interrupt triggered when adjusting
>    the irq affinity in the process context. Safe.
> 
> 2. sirfsoc_local_timer_starting_cpu()
>    The same as above. Safe.
> 
> 3. arm_perf_starting_cpu()
>    During cpu offline, the pmu interrupt (a non-percpu pmu interrupt)
>    might be migrated to other online cpus. Once the same cpu is
>    brought back online, the interrupt is set back to this cpu by
>    irq_force_affinity(), but at that point the pmu interrupt might
>    still be running on other cpus, so would it be unsafe to adjust
>    its irq affinity in process context?

No, that's not an issue. The ability to move interrupts in process context,
or better said in any context, has nothing to do with a concurrent
interrupt. The normal mechanics for most architectures/interrupt
controllers is just to program the new affinity setting which will take
effect with the next delivered interrupt.

We just have these ill designed hardware implementations which do not allow
that. They require to change the interrupt affinity at the point when the
interrupt is handled on the original target CPU. But that's hard to achieve
when the CPU is about to go offline, because we might wait forever for an
interrupt to be raised. So in that case we need to forcefully move them
away and take the risk of colliding with an actual interrupt being raised
in hardware concurrently which has the potential to confuse the interrupt
chip.
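
Roughly speaking, that deferred move is the pending-move handling in
kernel/irq/migration.c, which (trimmed down to the relevant part) does:

void irq_move_masked_irq(struct irq_data *idata)
{
	struct irq_desc *desc = irq_data_to_desc(idata);

	if (likely(!irqd_is_setaffinity_pending(idata)))
		return;

	irqd_clr_move_pending(idata);

	/* The move happens here, i.e. from the interrupt handling path
	 * on the original target cpu - which a cpu that is going
	 * offline cannot wait for. */
	if (!cpumask_empty(desc->pending_mask))
		irq_do_set_affinity(idata, desc->pending_mask, false);

	cpumask_clear(desc->pending_mask);
}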

> 4. sunhv_migrate_hvcons_irq()
>    The cpu that encountered a panic needs to migrate the hvcons irq to
>    the currently alive cpu, and sends an IPI to stop the other cpus. So
>    at the time the irq affinity of the hvcons is adjusted, its interrupt
>    might still be running, and it might be unsafe to adjust the irq
>    affinity in process context?

None of these are related to that problem. All of these architectures can
move interrupts in process context unconditionally. It's also not relevant
which callsites invoke irq_set_affinity_locked() with force=true.

The point is whether we can change the semantics of irq_set_affinity_locked()
without breaking something.

But please answer my other mail in that thread [1] first before we start
about changing anything in that area. The affinity related changes are in
Linus tree now, so please retest against that as well.

Thanks,

	tglx

[1] http://lkml.kernel.org/r/alpine.DEB.2.20.1706282036330.1890@nanos


* Re: [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus
  2017-07-04  8:50     ` Thomas Gleixner
@ 2017-07-05  3:20       ` Chen Yu
  2017-07-05  6:46         ` Thomas Gleixner
  0 siblings, 1 reply; 7+ messages in thread
From: Chen Yu @ 2017-07-05  3:20 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, H. Peter Anvin, Borislav Petkov, Len Brown,
	Rafael J . Wysocki, x86, LKML

On Tue, Jul 04, 2017 at 10:50:33AM +0200, Thomas Gleixner wrote:
> On Mon, 3 Jul 2017, Chen Yu wrote:
> > On Sun, Jun 04, 2017 at 10:04:53PM +0200, Thomas Gleixner wrote:
> > > After looking at the callsites, it's safe to change
> > > irq_set_affinity_locked() so that it uses the direct affinity setter
> > > function when force == true.
> > > 
> > Sorry it took me some time to understand this point (which is why I did not
> > reply to you right away :-)
> > I thought the definition of the word 'safe' here was that we should
> > not adjust the irq affinity in process context while the ISR is
> > still running, otherwise there might be a race condition.
> > 
> > Currently, there are four drivers that set the force flag to true (i.e.,
> > they invoke irq_force_affinity()).
> > 
> > 1. exynos4_mct_starting_cpu()
> >    The irq affinity is set before the clockevent is registered,
> >    so there would be no interrupt triggered when adjusting
> >    the irq affinity in the process context. Safe.
> > 
> > 2. sirfsoc_local_timer_starting_cpu()
> >    The same as above. Safe.
> > 
> > 3. arm_perf_starting_cpu()
> >    During cpu offline, the pmu interrupt (a non-percpu pmu interrupt)
> >    might be migrated to other online cpus. Once the same cpu is
> >    brought back online, the interrupt is set back to this cpu by
> >    irq_force_affinity(), but at that point the pmu interrupt might
> >    still be running on other cpus, so would it be unsafe to adjust
> >    its irq affinity in process context?
> 
> No, that's not an issue. The ability to move interrupts in process context,
> or better said in any context, has nothing to do with a concurrent
> interrupt. The normal mechanics for most architectures/interrupt
> controllers is just to program the new affinity setting which will take
> effect with the next delivered interrupt.
> 
> We just have these ill designed hardware implementations which do not allow
> that. They require to change the interrupt affinity at the point when the
> interrupt is handled on the original target CPU. But that's hard to achieve
> when the CPU is about to go offline, because we might wait forever for an
> interrupt to be raised. So in that case we need to forcefully move them
> away and take the risk of colliding with an actual interrupt being raised
> in hardware concurrently which has the potential to confuse the interrupt
> chip.
>
Thanks for the detailed explanation! Got it.
> > 4. sunhv_migrate_hvcons_irq()
> >    The cpu that encountered a panic needs to migrate the hvcons irq to
> >    the currently alive cpu, and sends an IPI to stop the other cpus. So
> >    at the time the irq affinity of the hvcons is adjusted, its interrupt
> >    might still be running, and it might be unsafe to adjust the irq
> >    affinity in process context?
> 
> None of these are related to that problem. All of these architectures can
> move interrupts in process context unconditionally. It's also not relevant
> which callsites invoke irq_set_affinity_locked() with force=true.
> 
Okay.
> The point is whether we can change the semantics of irq_set_affinity_locked()
> without breaking something.
> 
Yes, this might change the semantics of the force flag.
> But please answer my other mail in that thread [1] first before we start
> about changing anything in that area. The affinity related changes are in
> Linus tree now, so please retest against that as well.
Okay, I'll switch to that thread.
Here's the test result for affinity:
# uname -r
4.12.0+
# cat /proc/irq/32/smp_affinity
00000000,80000000
# echo 0 > /sys/devices/system/cpu/cpu31/online 
# cat /proc/irq/32/smp_affinity
00000000,ffffffff
Looks like cpu31 is still included in the irq mask.

Thanks,
Yu
> 
> Thanks,
> 
> 	tglx
> 
> [1] http://lkml.kernel.org/r/alpine.DEB.2.20.1706282036330.1890@nanos
> 
> 


* Re: [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus
  2017-07-05  3:20       ` Chen Yu
@ 2017-07-05  6:46         ` Thomas Gleixner
  2017-07-05 14:10           ` Chen Yu
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2017-07-05  6:46 UTC (permalink / raw)
  To: Chen Yu
  Cc: Ingo Molnar, H. Peter Anvin, Borislav Petkov, Len Brown,
	Rafael J . Wysocki, x86, LKML

On Wed, 5 Jul 2017, Chen Yu wrote:
> On Tue, Jul 04, 2017 at 10:50:33AM +0200, Thomas Gleixner wrote:
> Here's the test result for affinity:
> # uname -r
> 4.12.0+
> # cat /proc/irq/32/smp_affinity
> 00000000,80000000
> # echo 0 > /sys/devices/system/cpu/cpu31/online 
> # cat /proc/irq/32/smp_affinity
> 00000000,ffffffff
> Looks like cpu31 is still included in the irq mask.

Yes, because the offline broke the affinity and it is now reset to the default
affinity mask. /proc/irq/*/smp_affinity is the affinity which is allowed by the
admin, not the one which is actually in effect.

Look at /proc/irq/32/effective_affinity to see where it's really routed to.

Thanks,

	tglx


* Re: [PATCH][RFC] x86: Fix the irq affinity in fixup_cpus
  2017-07-05  6:46         ` Thomas Gleixner
@ 2017-07-05 14:10           ` Chen Yu
  0 siblings, 0 replies; 7+ messages in thread
From: Chen Yu @ 2017-07-05 14:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, H. Peter Anvin, Borislav Petkov, Len Brown,
	Rafael J . Wysocki, x86, LKML

On Wed, Jul 05, 2017 at 08:46:08AM +0200, Thomas Gleixner wrote:
> On Wed, 5 Jul 2017, Chen Yu wrote:
> > On Tue, Jul 04, 2017 at 10:50:33AM +0200, Thomas Gleixner wrote:
> > Here's the test result for affinity:
> > # uname -r
> > 4.12.0+
> > # cat /proc/irq/32/smp_affinity
> > 00000000,80000000
> > # echo 0 > /sys/devices/system/cpu/cpu31/online 
> > # cat /proc/irq/32/smp_affinity
> > 00000000,ffffffff
> > Looks like cpu31 is still included in the irq mask.
> 
> Yes, because the offline broke the affinity and it is now reset to the default
> affinity mask. /proc/irq/*/smp_affinity is the affinity which is allowed by the
> admin, not the one which is actually in effect.
> 
> Look at /proc/irq/32/effective_affinity to see where it's really routed to.
>
OK, got it, this is normal now.
cat /proc/irq/32/effective_affinity
00000000,00000001

Thanks,
	Yu
