All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated
@ 2020-08-21  0:42 Ashok Raj
  2020-08-21  3:16 ` Randy Dunlap
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Ashok Raj @ 2020-08-21  0:42 UTC (permalink / raw)
  To: linux-kernel, Thomas Gleixner
  Cc: Ashok Raj, Sukumar Ghorai, Srikanth Nandamuri, Evan Green,
	Mathias Nyman, Bjorn Helgaas, stable

When offlining CPUs, fixup_irqs() migrates all interrupts away from the
outgoing CPU to an online CPU. It's always possible the device sent an
interrupt to the previous CPU destination. Pending interrupt bit in IRR in
LAPIC identifies such interrupts. apic_soft_disable() will not capture any
new interrupts in IRR. This causes interrupts from device to be lost during
CPU offline. The issue was found when explicitly setting MSI affinity to a
CPU and immediately offlining it. It was simple to recreate with a USB
ethernet device and doing I/O to it while the CPU is offlined. Lost
interrupts happen even when Interrupt Remapping is enabled.

Current code does apic_soft_disable() before migrating interrupts.

native_cpu_disable()
{
	...
	apic_soft_disable();
	cpu_disable_common();
	  --> fixup_irqs(); // Too late to capture anything in IRR.
}

Just flipping the above call sequence seems to hit the IRR checks
and the lost interrupt is fixed for both legacy MSI and when
interrupt remapping is enabled.

Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")
Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@nanos.tec.linutronix.de/
Reported-by: Evan Green <evgreen@chromium.org>
Tested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
---
v2:
- Typos and fixes suggested by Randy Dunlap

To: linux-kernel@vger.kernel.org
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Sukumar Ghorai <sukumar.ghorai@intel.com>
Cc: Srikanth Nandamuri <srikanth.nandamuri@intel.com>
Cc: Evan Green <evgreen@chromium.org>
Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/smpboot.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 27aa04a95702..3016c3b627ce 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1594,13 +1594,20 @@ int native_cpu_disable(void)
 	if (ret)
 		return ret;
 
+	cpu_disable_common();
 	/*
 	 * Disable the local APIC. Otherwise IPI broadcasts will reach
 	 * it. It still responds normally to INIT, NMI, SMI, and SIPI
-	 * messages.
+	 * messages. It's important to do apic_soft_disable() after
+	 * fixup_irqs(), because fixup_irqs() called from cpu_disable_common()
+	 * depends on IRR being set. After apic_soft_disable() CPU preserves
+	 * currently set IRR/ISR but new interrupts will not set IRR.
+	 * This causes interrupts sent to outgoing CPU before completion
+	 * of IRQ migration to be lost. Check SDM Vol 3 "10.4.7.2 Local
+	 * APIC State after It Has been Software Disabled" section for more
+	 * details.
 	 */
 	apic_soft_disable();
-	cpu_disable_common();
 
 	return 0;
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated
  2020-08-21  0:42 [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated Ashok Raj
@ 2020-08-21  3:16 ` Randy Dunlap
  2020-08-23 16:48 ` Raj, Ashok
  2020-08-26  0:40 ` Thomas Gleixner
  2 siblings, 0 replies; 5+ messages in thread
From: Randy Dunlap @ 2020-08-21  3:16 UTC (permalink / raw)
  To: Ashok Raj, linux-kernel, Thomas Gleixner
  Cc: Sukumar Ghorai, Srikanth Nandamuri, Evan Green, Mathias Nyman,
	Bjorn Helgaas, stable

On 8/20/20 5:42 PM, Ashok Raj wrote:
> When offlining CPUs, fixup_irqs() migrates all interrupts away from the
> outgoing CPU to an online CPU. It's always possible the device sent an
> interrupt to the previous CPU destination. Pending interrupt bit in IRR in
> LAPIC identifies such interrupts. apic_soft_disable() will not capture any
> new interrupts in IRR. This causes interrupts from device to be lost during
> CPU offline. The issue was found when explicitly setting MSI affinity to a
> CPU and immediately offlining it. It was simple to recreate with a USB
> ethernet device and doing I/O to it while the CPU is offlined. Lost
> interrupts happen even when Interrupt Remapping is enabled.
> 
> Current code does apic_soft_disable() before migrating interrupts.
> 
> native_cpu_disable()
> {
> 	...
> 	apic_soft_disable();
> 	cpu_disable_common();
> 	  --> fixup_irqs(); // Too late to capture anything in IRR.
> }
> 
> Just flipping the above call sequence seems to hit the IRR checks
> and the lost interrupt is fixed for both legacy MSI and when
> interrupt remapping is enabled.
> 
> Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")
> Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@nanos.tec.linutronix.de/
> Reported-by: Evan Green <evgreen@chromium.org>
> Tested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> Tested-by: Evan Green <evgreen@chromium.org>
> Reviewed-by: Evan Green <evgreen@chromium.org>
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> ---
> v2:
> - Typos and fixes suggested by Randy Dunlap

Those all look good now.  Thanks for the update.

> To: linux-kernel@vger.kernel.org
> To: Thomas Gleixner <tglx@linutronix.de>
> Cc: Sukumar Ghorai <sukumar.ghorai@intel.com>
> Cc: Srikanth Nandamuri <srikanth.nandamuri@intel.com>
> Cc: Evan Green <evgreen@chromium.org>
> Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/kernel/smpboot.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)


-- 
~Randy


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated
  2020-08-21  0:42 [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated Ashok Raj
  2020-08-21  3:16 ` Randy Dunlap
@ 2020-08-23 16:48 ` Raj, Ashok
  2020-08-26  0:40 ` Thomas Gleixner
  2 siblings, 0 replies; 5+ messages in thread
From: Raj, Ashok @ 2020-08-23 16:48 UTC (permalink / raw)
  To: linux-kernel, Thomas Gleixner
  Cc: Sukumar Ghorai, Srikanth Nandamuri, Evan Green, Mathias Nyman,
	Bjorn Helgaas, stable, x86, Ashok Raj

Hi Thomas,

I was wondering if you got a chance to take a look at this fix?

I had some mail issues recently and they showed up at lore after 2
days. I wasn't sure if you got the original mail, or maybe it didn't
make it. 

If you had a different way to fix it, we can try those out. 


On Thu, Aug 20, 2020 at 05:42:03PM -0700, Ashok Raj wrote:
> When offlining CPUs, fixup_irqs() migrates all interrupts away from the
> outgoing CPU to an online CPU. It's always possible the device sent an
> interrupt to the previous CPU destination. Pending interrupt bit in IRR in
> LAPIC identifies such interrupts. apic_soft_disable() will not capture any
> new interrupts in IRR. This causes interrupts from device to be lost during
> CPU offline. The issue was found when explicitly setting MSI affinity to a
> CPU and immediately offlining it. It was simple to recreate with a USB
> ethernet device and doing I/O to it while the CPU is offlined. Lost
> interrupts happen even when Interrupt Remapping is enabled.
> 
> Current code does apic_soft_disable() before migrating interrupts.
> 
> native_cpu_disable()
> {
> 	...
> 	apic_soft_disable();
> 	cpu_disable_common();
> 	  --> fixup_irqs(); // Too late to capture anything in IRR.
> }
> 
> Just flipping the above call sequence seems to hit the IRR checks
> and the lost interrupt is fixed for both legacy MSI and when
> interrupt remapping is enabled.

On another note, we have tested both with and without the read
after write when programming MSI addr/data on the device. It didn't
seem to change the results. But I think its a useful one to add
for correctness.

https://lore.kernel.org/lkml/878si6rx7f.fsf@nanos.tec.linutronix.de/

This bug been eluding for a while. Looking for your feedback.

> 
> Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")
> Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@nanos.tec.linutronix.de/
> Reported-by: Evan Green <evgreen@chromium.org>
> Tested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> Tested-by: Evan Green <evgreen@chromium.org>
> Reviewed-by: Evan Green <evgreen@chromium.org>
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> ---
> v2:
> - Typos and fixes suggested by Randy Dunlap
> 
> To: linux-kernel@vger.kernel.org
> To: Thomas Gleixner <tglx@linutronix.de>
> Cc: Sukumar Ghorai <sukumar.ghorai@intel.com>
> Cc: Srikanth Nandamuri <srikanth.nandamuri@intel.com>
> Cc: Evan Green <evgreen@chromium.org>
> Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/kernel/smpboot.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 27aa04a95702..3016c3b627ce 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1594,13 +1594,20 @@ int native_cpu_disable(void)
>  	if (ret)
>  		return ret;
>  
> +	cpu_disable_common();
>  	/*
>  	 * Disable the local APIC. Otherwise IPI broadcasts will reach
>  	 * it. It still responds normally to INIT, NMI, SMI, and SIPI
> -	 * messages.
> +	 * messages. It's important to do apic_soft_disable() after
> +	 * fixup_irqs(), because fixup_irqs() called from cpu_disable_common()
> +	 * depends on IRR being set. After apic_soft_disable() CPU preserves
> +	 * currently set IRR/ISR but new interrupts will not set IRR.
> +	 * This causes interrupts sent to outgoing CPU before completion
> +	 * of IRQ migration to be lost. Check SDM Vol 3 "10.4.7.2 Local
> +	 * APIC State after It Has been Software Disabled" section for more
> +	 * details.
>  	 */
>  	apic_soft_disable();
> -	cpu_disable_common();
>  
>  	return 0;
>  }
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated
  2020-08-21  0:42 [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated Ashok Raj
  2020-08-21  3:16 ` Randy Dunlap
  2020-08-23 16:48 ` Raj, Ashok
@ 2020-08-26  0:40 ` Thomas Gleixner
  2020-08-26  2:50   ` Raj, Ashok
  2 siblings, 1 reply; 5+ messages in thread
From: Thomas Gleixner @ 2020-08-26  0:40 UTC (permalink / raw)
  To: Ashok Raj, linux-kernel
  Cc: Ashok Raj, Sukumar Ghorai, Srikanth Nandamuri, Evan Green,
	Mathias Nyman, Bjorn Helgaas, stable

Ashok,

On Thu, Aug 20 2020 at 17:42, Ashok Raj wrote:
> When offlining CPUs, fixup_irqs() migrates all interrupts away from the
> outgoing CPU to an online CPU. It's always possible the device sent an
> interrupt to the previous CPU destination. Pending interrupt bit in IRR in
> LAPIC identifies such interrupts. apic_soft_disable() will not capture any
> new interrupts in IRR. This causes interrupts from device to be lost during
> CPU offline. The issue was found when explicitly setting MSI affinity to a
> CPU and immediately offlining it. It was simple to recreate with a USB
> ethernet device and doing I/O to it while the CPU is offlined. Lost
> interrupts happen even when Interrupt Remapping is enabled.

New lines exist for a reason. They help to structure information. For
the content, please see below.

> Current code does apic_soft_disable() before migrating interrupts.
>
> native_cpu_disable()
> {
> 	...
> 	apic_soft_disable();
> 	cpu_disable_common();
> 	  --> fixup_irqs(); // Too late to capture anything in IRR.
> }
>
> Just flipping the above call sequence seems to hit the IRR checks
> and the lost interrupt is fixed for both legacy MSI and when
> interrupt remapping is enabled.

Seems to hit? Come on, we really want changelogs which are based on
facts and not on assumptions.

Aside of that, yes that's a really subtle one and thanks for tracking it
down! For some reason I never looked at that ordering, but now that you
stick it in front of me, it's pretty clear that this is the root cause.

>  	/*
>  	 * Disable the local APIC. Otherwise IPI broadcasts will reach
>  	 * it. It still responds normally to INIT, NMI, SMI, and SIPI
> -	 * messages.
> +	 * messages. It's important to do apic_soft_disable() after
> +	 * fixup_irqs(), because fixup_irqs() called from cpu_disable_common()
> +	 * depends on IRR being set.

That sentence does not make sense to me.

> +       .... After apic_soft_disable() CPU preserves
> +	 * currently set IRR/ISR but new interrupts will not set IRR.

I agree with the IRR part, but ISR is simply impossible to be set in
this situation.

> +	 * This causes interrupts sent to outgoing CPU before completion
> +	 * of IRQ migration to be lost. Check SDM Vol 3 "10.4.7.2 Local
> +	 * APIC State after It Has been Software Disabled" section for more
> +	 * details.

Please do not use the SDM chapter number of today. It's going to be a
different one with the next version.

Something like this perhaps?

  	/*
  	 * Disable the local APIC. Otherwise IPI broadcasts will reach
  	 * it. It still responds normally to INIT, NMI, SMI, and SIPI
 	 * messages.
         *
         * Disabling the APIC must happen after cpu_disable_common()
  	 * which invokes fixup_irqs().
         *
         * Disabling the APIC preserves already set bits in IRR, but
         * an interrupt arriving after disabling the local APIC does not
         * set the corresponding IRR bit.
         *
         * fixup_irqs() scans IRR for set bits so it can raise a not
  	 * yet handled interrupt on the new destination CPU via an IPI
         * but obviously it can't do so for IRR bits which are not set.
         * IOW, interrupts arriving after disabling the local APIC will
         * be lost.
         */

Hmm?

The changelog wants to have a corresponding update.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated
  2020-08-26  0:40 ` Thomas Gleixner
@ 2020-08-26  2:50   ` Raj, Ashok
  0 siblings, 0 replies; 5+ messages in thread
From: Raj, Ashok @ 2020-08-26  2:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, Sukumar Ghorai, Srikanth Nandamuri, Evan Green,
	Mathias Nyman, Bjorn Helgaas, stable, Ashok Raj

Hi Thomas,

On Wed, Aug 26, 2020 at 02:40:45AM +0200, Thomas Gleixner wrote:
> Ashok,
> 
> On Thu, Aug 20 2020 at 17:42, Ashok Raj wrote:
> > When offlining CPUs, fixup_irqs() migrates all interrupts away from the
> > outgoing CPU to an online CPU. It's always possible the device sent an
> > interrupt to the previous CPU destination. Pending interrupt bit in IRR in
> > LAPIC identifies such interrupts. apic_soft_disable() will not capture any
> > new interrupts in IRR. This causes interrupts from device to be lost during
> > CPU offline. The issue was found when explicitly setting MSI affinity to a
> > CPU and immediately offlining it. It was simple to recreate with a USB
> > ethernet device and doing I/O to it while the CPU is offlined. Lost
> > interrupts happen even when Interrupt Remapping is enabled.
> 
> New lines exist for a reason. They help to structure information. For
> the content, please see below.

Will work on that :-)

> 
> > Current code does apic_soft_disable() before migrating interrupts.
> >
> > native_cpu_disable()
> > {
> > 	...
> > 	apic_soft_disable();
> > 	cpu_disable_common();
> > 	  --> fixup_irqs(); // Too late to capture anything in IRR.
> > }
> >
> > Just flipping the above call sequence seems to hit the IRR checks
> > and the lost interrupt is fixed for both legacy MSI and when
> > interrupt remapping is enabled.
> 
> Seems to hit? Come on, we really want changelogs which are based on
> facts and not on assumptions.

What I intended to convay was by placing a debug trace_printk() at
fixup_irqs(), it was *indeed* observed. Before the change I never noticed
that path being covered.

Just my Inglish (Indian English) tricking you :-).
Will make them sensible in the next update.

> 
> Aside of that, yes that's a really subtle one and thanks for tracking it
> down! For some reason I never looked at that ordering, but now that you
> stick it in front of me, it's pretty clear that this is the root cause.
> 
> >  	/*
> >  	 * Disable the local APIC. Otherwise IPI broadcasts will reach
> >  	 * it. It still responds normally to INIT, NMI, SMI, and SIPI
> > -	 * messages.
> > +	 * messages. It's important to do apic_soft_disable() after
> > +	 * fixup_irqs(), because fixup_irqs() called from cpu_disable_common()
> > +	 * depends on IRR being set.
> 
> That sentence does not make sense to me.

Right, I was just stating the obvious. Since fixup_irqs() isn't called
right in that function, it was suggested to make that connection explicit.

Your writeup below is crystal.. so will replace with what you have below.


> 
> > +       .... After apic_soft_disable() CPU preserves
> > +	 * currently set IRR/ISR but new interrupts will not set IRR.
> 
> I agree with the IRR part, but ISR is simply impossible to be set in
> this situation. 

You are correct. I was trying to convey what the SDM said, but its probably
irrelavant for this discussion. 

> 
> > +	 * This causes interrupts sent to outgoing CPU before completion
> > +	 * of IRQ migration to be lost. Check SDM Vol 3 "10.4.7.2 Local
> > +	 * APIC State after It Has been Software Disabled" section for more
> > +	 * details.
> 
> Please do not use the SDM chapter number of today. It's going to be a
> different one with the next version.
> 
> Something like this perhaps?
> 
>   	/*
>   	 * Disable the local APIC. Otherwise IPI broadcasts will reach
>   	 * it. It still responds normally to INIT, NMI, SMI, and SIPI
>  	 * messages.
>          *
>          * Disabling the APIC must happen after cpu_disable_common()
>   	 * which invokes fixup_irqs().
>          *
>          * Disabling the APIC preserves already set bits in IRR, but
>          * an interrupt arriving after disabling the local APIC does not
>          * set the corresponding IRR bit.
>          *
>          * fixup_irqs() scans IRR for set bits so it can raise a not
>   	 * yet handled interrupt on the new destination CPU via an IPI
>          * but obviously it can't do so for IRR bits which are not set.
>          * IOW, interrupts arriving after disabling the local APIC will
>          * be lost.
>          */
> 
> Hmm?
> 
> The changelog wants to have a corresponding update.

Will do ...

Cheers,
Ashok

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-08-26  2:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-21  0:42 [PATCH v2] x86/hotplug: Silence APIC only after all irq's are migrated Ashok Raj
2020-08-21  3:16 ` Randy Dunlap
2020-08-23 16:48 ` Raj, Ashok
2020-08-26  0:40 ` Thomas Gleixner
2020-08-26  2:50   ` Raj, Ashok

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.