linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86/nmi: Fix "in NMI handler" check
@ 2024-02-07 16:52 Breno Leitao
  2024-02-07 18:30 ` Paul E. McKenney
  2024-02-26 22:50 ` [tip: x86/misc] x86/nmi: Fix the inverse " tip-bot2 for Breno Leitao
  0 siblings, 2 replies; 5+ messages in thread
From: Breno Leitao @ 2024-02-07 16:52 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Paul E. McKenney
  Cc: leit, Mark Rutland, Ingo Molnar,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)

Commit 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are
ignored") creates a super nice framework to diagnose NMIs.

Every time nmi_exc() is called, it increments a per_cpu counter
(nsp->idt_nmi_seq). At its exit, it also increments the same counter.
Looking at this counter, you can see how many times that function was
called (dividing by 2), and, if the function is still being executed, by
checking the idt_nmi_seq's last bit.

On the check side (nmi_backtrace_stall_check()), that variable is
queried to check if the NMI is still being executed, but, there is a
mistake in the bitwise operation. That code wants to check if the last
bit of the idt_nmi_seq is set or not, but, does the opposite, and check
for all the other bits, which will always be true after the first
exc_nmi() executed successfully.

This appends the misleading string to the dump "(CPU currently in NMI
handler function)"

Fix it by checking the last bit, and if it is set, append the string.

Fixes: 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are ignored")
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/x86/kernel/nmi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 17e955ab69fe..6e738ad474dc 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -639,7 +639,7 @@ void nmi_backtrace_stall_check(const struct cpumask *btp)
 			msgp = nmi_check_stall_msg[idx];
 			if (nsp->idt_ignored_snap != READ_ONCE(nsp->idt_ignored) && (idx & 0x1))
 				modp = ", but OK because ignore_nmis was set";
-			if (nmi_seq & ~0x1)
+			if (nmi_seq & 0x1)
 				msghp = " (CPU currently in NMI handler function)";
 			else if (nsp->idt_nmi_seq_snap + 1 == nmi_seq)
 				msghp = " (CPU exited one NMI handler function)";
-- 
2.39.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/nmi: Fix "in NMI handler" check
  2024-02-07 16:52 [PATCH] x86/nmi: Fix "in NMI handler" check Breno Leitao
@ 2024-02-07 18:30 ` Paul E. McKenney
  2024-02-07 18:44   ` Xin Li
  2024-02-26 22:50 ` [tip: x86/misc] x86/nmi: Fix the inverse " tip-bot2 for Breno Leitao
  1 sibling, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2024-02-07 18:30 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, leit, Mark Rutland, Ingo Molnar,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)

On Wed, Feb 07, 2024 at 08:52:35AM -0800, Breno Leitao wrote:
> Commit 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are
> ignored") creates a super nice framework to diagnose NMIs.
> 
> Every time nmi_exc() is called, it increments a per_cpu counter
> (nsp->idt_nmi_seq). At its exit, it also increments the same counter.
> Looking at this counter, you can see how many times that function was
> called (dividing by 2), and, if the function is still being executed, by
> checking the idt_nmi_seq's last bit.
> 
> On the check side (nmi_backtrace_stall_check()), that variable is
> queried to check if the NMI is still being executed, but, there is a
> mistake in the bitwise operation. That code wants to check if the last
> bit of the idt_nmi_seq is set or not, but, does the opposite, and check
> for all the other bits, which will always be true after the first
> exc_nmi() executed successfully.
> 
> This appends the misleading string to the dump "(CPU currently in NMI
> handler function)"
> 
> Fix it by checking the last bit, and if it is set, append the string.
> 
> Fixes: 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are ignored")
> Signed-off-by: Breno Leitao <leitao@debian.org>

If someone else is taking this:

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

(I am queueing it for testing in any case.)

> ---
>  arch/x86/kernel/nmi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> index 17e955ab69fe..6e738ad474dc 100644
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> @@ -639,7 +639,7 @@ void nmi_backtrace_stall_check(const struct cpumask *btp)
>  			msgp = nmi_check_stall_msg[idx];
>  			if (nsp->idt_ignored_snap != READ_ONCE(nsp->idt_ignored) && (idx & 0x1))
>  				modp = ", but OK because ignore_nmis was set";
> -			if (nmi_seq & ~0x1)
> +			if (nmi_seq & 0x1)
>  				msghp = " (CPU currently in NMI handler function)";
>  			else if (nsp->idt_nmi_seq_snap + 1 == nmi_seq)
>  				msghp = " (CPU exited one NMI handler function)";
> -- 
> 2.39.3
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/nmi: Fix "in NMI handler" check
  2024-02-07 18:30 ` Paul E. McKenney
@ 2024-02-07 18:44   ` Xin Li
  2024-02-08 14:59     ` Breno Leitao
  0 siblings, 1 reply; 5+ messages in thread
From: Xin Li @ 2024-02-07 18:44 UTC (permalink / raw)
  To: paulmck, Breno Leitao
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, leit, Mark Rutland, Ingo Molnar,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)

On 2/7/2024 10:30 AM, Paul E. McKenney wrote:
> On Wed, Feb 07, 2024 at 08:52:35AM -0800, Breno Leitao wrote:
>> Commit 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are
>> ignored") creates a super nice framework to diagnose NMIs.
>>
>> Every time nmi_exc() is called, it increments a per_cpu counter
>> (nsp->idt_nmi_seq). At its exit, it also increments the same counter.
>> Looking at this counter, you can see how many times that function was
>> called (dividing by 2), and, if the function is still being executed, by
>> checking the idt_nmi_seq's last bit.
>>
>> On the check side (nmi_backtrace_stall_check()), that variable is
>> queried to check if the NMI is still being executed, but, there is a
>> mistake in the bitwise operation. That code wants to check if the last
>> bit of the idt_nmi_seq is set or not, but, does the opposite, and check
>> for all the other bits, which will always be true after the first
>> exc_nmi() executed successfully.
>>
>> This appends the misleading string to the dump "(CPU currently in NMI
>> handler function)"
>>
>> Fix it by checking the last bit, and if it is set, append the string.
>>
>> Fixes: 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are ignored")
>> Signed-off-by: Breno Leitao <leitao@debian.org>
> 
> If someone else is taking this:
> 
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> 
> (I am queueing it for testing in any case.)

Does this fix need to be backported?

Commit 344da544f177 has a date on Dec 16 2022.

> 
>> ---
>>   arch/x86/kernel/nmi.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
>> index 17e955ab69fe..6e738ad474dc 100644
>> --- a/arch/x86/kernel/nmi.c
>> +++ b/arch/x86/kernel/nmi.c
>> @@ -639,7 +639,7 @@ void nmi_backtrace_stall_check(const struct cpumask *btp)
>>   			msgp = nmi_check_stall_msg[idx];
>>   			if (nsp->idt_ignored_snap != READ_ONCE(nsp->idt_ignored) && (idx & 0x1))
>>   				modp = ", but OK because ignore_nmis was set";
>> -			if (nmi_seq & ~0x1)
>> +			if (nmi_seq & 0x1)
>>   				msghp = " (CPU currently in NMI handler function)";
>>   			else if (nsp->idt_nmi_seq_snap + 1 == nmi_seq)
>>   				msghp = " (CPU exited one NMI handler function)";
>> -- 
>> 2.39.3
>>
> 

-- 
Thanks!
     Xin


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] x86/nmi: Fix "in NMI handler" check
  2024-02-07 18:44   ` Xin Li
@ 2024-02-08 14:59     ` Breno Leitao
  0 siblings, 0 replies; 5+ messages in thread
From: Breno Leitao @ 2024-02-08 14:59 UTC (permalink / raw)
  To: Xin Li
  Cc: paulmck, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, leit, Mark Rutland,
	Ingo Molnar, open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)

On Wed, Feb 07, 2024 at 10:44:57AM -0800, Xin Li wrote:
> On 2/7/2024 10:30 AM, Paul E. McKenney wrote:
> > On Wed, Feb 07, 2024 at 08:52:35AM -0800, Breno Leitao wrote:
> > > Commit 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are
> > > ignored") creates a super nice framework to diagnose NMIs.
> > > 
> > > Every time nmi_exc() is called, it increments a per_cpu counter
> > > (nsp->idt_nmi_seq). At its exit, it also increments the same counter.
> > > Looking at this counter, you can see how many times that function was
> > > called (dividing by 2), and, if the function is still being executed, by
> > > checking the idt_nmi_seq's last bit.
> > > 
> > > On the check side (nmi_backtrace_stall_check()), that variable is
> > > queried to check if the NMI is still being executed, but, there is a
> > > mistake in the bitwise operation. That code wants to check if the last
> > > bit of the idt_nmi_seq is set or not, but, does the opposite, and check
> > > for all the other bits, which will always be true after the first
> > > exc_nmi() executed successfully.
> > > 
> > > This appends the misleading string to the dump "(CPU currently in NMI
> > > handler function)"
> > > 
> > > Fix it by checking the last bit, and if it is set, append the string.
> > > 
> > > Fixes: 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are ignored")
> > > Signed-off-by: Breno Leitao <leitao@debian.org>
> > 
> > If someone else is taking this:
> > 
> > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > (I am queueing it for testing in any case.)
> 
> Does this fix need to be backported?
> 
> Commit 344da544f177 has a date on Dec 16 2022.

I would say so, if users are using this detection mechanism and want to
trust the messages being displayed by the kernel.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [tip: x86/misc] x86/nmi: Fix the inverse "in NMI handler" check
  2024-02-07 16:52 [PATCH] x86/nmi: Fix "in NMI handler" check Breno Leitao
  2024-02-07 18:30 ` Paul E. McKenney
@ 2024-02-26 22:50 ` tip-bot2 for Breno Leitao
  1 sibling, 0 replies; 5+ messages in thread
From: tip-bot2 for Breno Leitao @ 2024-02-26 22:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Breno Leitao, Thomas Gleixner, Paul E. McKenney, stable, x86,
	linux-kernel

The following commit has been merged into the x86/misc branch of tip:

Commit-ID:     d54e56f31a34fa38fcb5e91df609f9633419a79a
Gitweb:        https://git.kernel.org/tip/d54e56f31a34fa38fcb5e91df609f9633419a79a
Author:        Breno Leitao <leitao@debian.org>
AuthorDate:    Wed, 07 Feb 2024 08:52:35 -08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Mon, 26 Feb 2024 23:41:30 +01:00

x86/nmi: Fix the inverse "in NMI handler" check

Commit 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are
ignored") creates a super nice framework to diagnose NMIs.

Every time nmi_exc() is called, it increments a per_cpu counter
(nsp->idt_nmi_seq). At its exit, it also increments the same counter.  By
reading this counter it can be seen how many times that function was called
(dividing by 2), and, if the function is still being executed, by checking
the idt_nmi_seq's least significant bit.

On the check side (nmi_backtrace_stall_check()), that variable is queried
to check if the NMI is still being executed, but, there is a mistake in the
bitwise operation. That code wants to check if the least significant bit of
the idt_nmi_seq is set or not, but does the opposite, and checks for all
the other bits, which will always be true after the first exc_nmi()
executed successfully.

This appends the misleading string to the dump "(CPU currently in NMI
handler function)"

Fix it by checking the least significant bit, and if it is set, append the
string.

Fixes: 344da544f177 ("x86/nmi: Print reasons why backtrace NMIs are ignored")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240207165237.1048837-1-leitao@debian.org

---
 arch/x86/kernel/nmi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index d238679..c95dc1b 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -639,7 +639,7 @@ void nmi_backtrace_stall_check(const struct cpumask *btp)
 			msgp = nmi_check_stall_msg[idx];
 			if (nsp->idt_ignored_snap != READ_ONCE(nsp->idt_ignored) && (idx & 0x1))
 				modp = ", but OK because ignore_nmis was set";
-			if (nmi_seq & ~0x1)
+			if (nmi_seq & 0x1)
 				msghp = " (CPU currently in NMI handler function)";
 			else if (nsp->idt_nmi_seq_snap + 1 == nmi_seq)
 				msghp = " (CPU exited one NMI handler function)";

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-02-26 22:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-07 16:52 [PATCH] x86/nmi: Fix "in NMI handler" check Breno Leitao
2024-02-07 18:30 ` Paul E. McKenney
2024-02-07 18:44   ` Xin Li
2024-02-08 14:59     ` Breno Leitao
2024-02-26 22:50 ` [tip: x86/misc] x86/nmi: Fix the inverse " tip-bot2 for Breno Leitao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).