linux-kernel.vger.kernel.org archive mirror
* [PATCH] printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
@ 2021-11-07  4:51 Nicholas Piggin
  2021-11-08  8:34 ` Petr Mladek
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Nicholas Piggin @ 2021-11-07  4:51 UTC
  To: Petr Mladek; +Cc: Nicholas Piggin, John Ogness, linux-kernel

printk from NMI context relies on irq work being raised on the local CPU
to print to console. This can be a problem if the NMI was raised by a
lockup detector to print lockup stack and regs, because the CPU may not
enable irqs (because it is locked up).

Introduce printk_trigger_flush() that can be called from another CPU to
try to get those messages to the console, and call it where
printk_safe_flush was previously called.

Fixes: 93d102f094be ("printk: remove safe buffers")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/watchdog.c | 6 ++++++
 include/linux/printk.h         | 4 ++++
 kernel/printk/printk.c         | 5 +++++
 lib/nmi_backtrace.c            | 6 ++++++
 4 files changed, 21 insertions(+)

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 5f69ba4de1f3..c8017bc23b00 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -227,6 +227,12 @@ static void watchdog_smp_panic(int cpu)
 		cpumask_clear(&wd_smp_cpus_ipi);
 	}
 
+	/*
+	 * Force flush any remote buffers that might be stuck in IRQ context
+	 * and therefore could not run their irq_work.
+	 */
+	printk_trigger_flush();
+
 	if (hardlockup_panic)
 		nmi_panic(NULL, "Hard LOCKUP");
 
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 85b656f82d75..9497f6b98339 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -198,6 +198,7 @@ void dump_stack_print_info(const char *log_lvl);
 void show_regs_print_info(const char *log_lvl);
 extern asmlinkage void dump_stack_lvl(const char *log_lvl) __cold;
 extern asmlinkage void dump_stack(void) __cold;
+void printk_trigger_flush(void);
 #else
 static inline __printf(1, 0)
 int vprintk(const char *s, va_list args)
@@ -274,6 +275,9 @@ static inline void dump_stack_lvl(const char *log_lvl)
 static inline void dump_stack(void)
 {
 }
+static inline void printk_trigger_flush(void)
+{
+}
 #endif
 
 #ifdef CONFIG_SMP
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index a8d0a58deebc..99221b016c68 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3252,6 +3252,11 @@ void defer_console_output(void)
 	preempt_enable();
 }
 
+void printk_trigger_flush(void)
+{
+	defer_console_output();
+}
+
 int vprintk_deferred(const char *fmt, va_list args)
 {
 	int r;
diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index f9e89001b52e..199ab201d501 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -75,6 +75,12 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
 		touch_softlockup_watchdog();
 	}
 
+	/*
+	 * Force flush any remote buffers that might be stuck in IRQ context
+	 * and therefore could not run their irq_work.
+	 */
+	printk_trigger_flush();
+
 	clear_bit_unlock(0, &backtrace_flag);
 	put_cpu();
 }
-- 
2.23.0


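Note: the failure mode described in the commit message above can be
modelled in plain userspace C. This is only a toy sketch under stated
assumptions, not kernel code. The names struct cpu_model, model_reset(),
model_nmi_printk(), model_trigger_flush() and model_run_irq_work(), as
well as the single shared record counter, are hypothetical and exist
only for illustration; the real mechanism lives in kernel/printk/printk.c
and the irq_work infrastructure.

```c
/* Userspace toy model, NOT kernel code. All names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

struct cpu_model {
	bool irqs_enabled;	/* false models a hard-locked-up CPU */
	bool irq_work_pending;	/* a console flush was requested on this CPU */
};

static struct cpu_model cpus[NR_CPUS];
static int log_stored;		/* records stored in the shared log */
static int log_printed;		/* records that reached the console */

/* Put the model into a known state: all CPUs healthy, log empty. */
static void model_reset(void)
{
	for (int i = 0; i < NR_CPUS; i++) {
		cpus[i].irqs_enabled = true;
		cpus[i].irq_work_pending = false;
	}
	log_stored = 0;
	log_printed = 0;
}

/* NMI-context print: store a record, request a flush on the local CPU. */
static void model_nmi_printk(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	log_stored++;
	cpus[cpu].irq_work_pending = true;
}

/* Stand-in for printk_trigger_flush(): request a flush on the caller. */
static void model_trigger_flush(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpus[cpu].irq_work_pending = true;
}

/*
 * Run the pending flush on @cpu. It can only run when that CPU has irqs
 * enabled. Returns the number of records printed by this call.
 */
static int model_run_irq_work(int cpu)
{
	int flushed;

	assert(cpu >= 0 && cpu < NR_CPUS);
	if (!cpus[cpu].irqs_enabled || !cpus[cpu].irq_work_pending)
		return 0;

	cpus[cpu].irq_work_pending = false;
	flushed = log_stored - log_printed;
	log_printed = log_stored;
	return flushed;
}
```

In this model, if CPU 1 is marked as locked up (irqs_enabled = false),
records stored by model_nmi_printk(1) stay unprinted because
model_run_irq_work(1) can never make progress. Calling
model_trigger_flush(0) and then model_run_irq_work(0) prints them from
the healthy CPU, which is the role the patch gives to the CPU that
requested the backtrace.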

* Re: [PATCH] printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
  2021-11-07  4:51 [PATCH] printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces Nicholas Piggin
@ 2021-11-08  8:34 ` Petr Mladek
  2021-11-08 13:43   ` Nicholas Piggin
  2021-11-08 10:40 ` John Ogness
  2021-11-10 15:25 ` Petr Mladek
  2 siblings, 1 reply; 5+ messages in thread
From: Petr Mladek @ 2021-11-08  8:34 UTC
  To: Nicholas Piggin; +Cc: John Ogness, linux-kernel

On Sun 2021-11-07 14:51:16, Nicholas Piggin wrote:
> printk from NMI context relies on irq work being raised on the local CPU
> to print to console. This can be a problem if the NMI was raised by a
> lockup detector to print lockup stack and regs, because the CPU may not
> enable irqs (because it is locked up).
> 
> Introduce printk_trigger_flush() that can be called from another CPU to
> try to get those messages to the console, and call it where
> printk_safe_flush was previously called.
> 
> Fixes: 93d102f094be ("printk: remove safe buffers")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Petr Mladek <pmladek@suse.com>

We should also add

     Cc: stable@vger.kernel.org # 5.15

No need to resend the patch. I can add it when pushing.

Plan: I am going to wait a day or more for potential feedback
and an ack from John. Then I am going to push this into printk/linux.git.
IMHO, it makes sense to get this into 5.16-rc1 or rc2.

Thank you both a lot for nailing this down and for the fix.

Best Regards,
Petr


* Re: [PATCH] printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
  2021-11-07  4:51 [PATCH] printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces Nicholas Piggin
  2021-11-08  8:34 ` Petr Mladek
@ 2021-11-08 10:40 ` John Ogness
  2021-11-10 15:25 ` Petr Mladek
  2 siblings, 0 replies; 5+ messages in thread
From: John Ogness @ 2021-11-08 10:40 UTC
  To: Nicholas Piggin, Petr Mladek; +Cc: Nicholas Piggin, linux-kernel

On 2021-11-07, Nicholas Piggin <npiggin@gmail.com> wrote:
> printk from NMI context relies on irq work being raised on the local CPU
> to print to console. This can be a problem if the NMI was raised by a
> lockup detector to print lockup stack and regs, because the CPU may not
> enable irqs (because it is locked up).
>
> Introduce printk_trigger_flush() that can be called from another CPU to
> try to get those messages to the console, and call it where
> printk_safe_flush was previously called.
>
> Fixes: 93d102f094be ("printk: remove safe buffers")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: John Ogness <john.ogness@linutronix.de>

> ---
>  arch/powerpc/kernel/watchdog.c | 6 ++++++
>  include/linux/printk.h         | 4 ++++
>  kernel/printk/printk.c         | 5 +++++
>  lib/nmi_backtrace.c            | 6 ++++++
>  4 files changed, 21 insertions(+)
>
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index 5f69ba4de1f3..c8017bc23b00 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -227,6 +227,12 @@ static void watchdog_smp_panic(int cpu)
>  		cpumask_clear(&wd_smp_cpus_ipi);
>  	}
>  
> +	/*
> +	 * Force flush any remote buffers that might be stuck in IRQ context
> +	 * and therefore could not run their irq_work.
> +	 */
> +	printk_trigger_flush();
> +
>  	if (hardlockup_panic)
>  		nmi_panic(NULL, "Hard LOCKUP");
>  
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 85b656f82d75..9497f6b98339 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -198,6 +198,7 @@ void dump_stack_print_info(const char *log_lvl);
>  void show_regs_print_info(const char *log_lvl);
>  extern asmlinkage void dump_stack_lvl(const char *log_lvl) __cold;
>  extern asmlinkage void dump_stack(void) __cold;
> +void printk_trigger_flush(void);
>  #else
>  static inline __printf(1, 0)
>  int vprintk(const char *s, va_list args)
> @@ -274,6 +275,9 @@ static inline void dump_stack_lvl(const char *log_lvl)
>  static inline void dump_stack(void)
>  {
>  }
> +static inline void printk_trigger_flush(void)
> +{
> +}
>  #endif
>  
>  #ifdef CONFIG_SMP
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index a8d0a58deebc..99221b016c68 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3252,6 +3252,11 @@ void defer_console_output(void)
>  	preempt_enable();
>  }
>  
> +void printk_trigger_flush(void)
> +{
> +	defer_console_output();
> +}
> +
>  int vprintk_deferred(const char *fmt, va_list args)
>  {
>  	int r;
> diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
> index f9e89001b52e..199ab201d501 100644
> --- a/lib/nmi_backtrace.c
> +++ b/lib/nmi_backtrace.c
> @@ -75,6 +75,12 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
>  		touch_softlockup_watchdog();
>  	}
>  
> +	/*
> +	 * Force flush any remote buffers that might be stuck in IRQ context
> +	 * and therefore could not run their irq_work.
> +	 */
> +	printk_trigger_flush();
> +
>  	clear_bit_unlock(0, &backtrace_flag);
>  	put_cpu();
>  }
> -- 
> 2.23.0


* Re: [PATCH] printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
  2021-11-08  8:34 ` Petr Mladek
@ 2021-11-08 13:43   ` Nicholas Piggin
  0 siblings, 0 replies; 5+ messages in thread
From: Nicholas Piggin @ 2021-11-08 13:43 UTC
  To: Petr Mladek; +Cc: John Ogness, linux-kernel

Excerpts from Petr Mladek's message of November 8, 2021 6:34 pm:
> On Sun 2021-11-07 14:51:16, Nicholas Piggin wrote:
>> printk from NMI context relies on irq work being raised on the local CPU
>> to print to console. This can be a problem if the NMI was raised by a
>> lockup detector to print lockup stack and regs, because the CPU may not
>> enable irqs (because it is locked up).
>> 
>> Introduce printk_trigger_flush() that can be called from another CPU to
>> try to get those messages to the console, and call it where
>> printk_safe_flush was previously called.
>> 
>> Fixes: 93d102f094be ("printk: remove safe buffers")
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> 
> Reviewed-by: Petr Mladek <pmladek@suse.com>
> 
> We should also add
> 
>      Cc: stable@vger.kernel.org # 5.15
> 
> No need to resend the patch. I can add it when pushing.
> 
> Plan: I am going to wait a day or more for potential feedback
> and an ack from John. Then I am going to push this into printk/linux.git.

That sounds good to me.

> IMHO, it makes sense to get this into 5.16-rc1 or rc2.

Agree.

Thanks,
Nick

> Thank you both a lot for nailing this down and for the fix.
> 
> Best Regards,
> Petr
> 


* Re: [PATCH] printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
  2021-11-07  4:51 [PATCH] printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces Nicholas Piggin
  2021-11-08  8:34 ` Petr Mladek
  2021-11-08 10:40 ` John Ogness
@ 2021-11-10 15:25 ` Petr Mladek
  2 siblings, 0 replies; 5+ messages in thread
From: Petr Mladek @ 2021-11-10 15:25 UTC
  To: Nicholas Piggin; +Cc: John Ogness, linux-kernel

On Sun 2021-11-07 14:51:16, Nicholas Piggin wrote:
> printk from NMI context relies on irq work being raised on the local CPU
> to print to console. This can be a problem if the NMI was raised by a
> lockup detector to print lockup stack and regs, because the CPU may not
> enable irqs (because it is locked up).
> 
> Introduce printk_trigger_flush() that can be called from another CPU to
> try to get those messages to the console, and call it where
> printk_safe_flush was previously called.
> 
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -227,6 +227,12 @@ static void watchdog_smp_panic(int cpu)
>  		cpumask_clear(&wd_smp_cpus_ipi);
>  	}

The above context did not apply. I guess that it depends on a pending
change that has not even reached linux-next yet.

The pushed code can be seen at
https://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git/commit/?h=rework/printk_safe-removal&id=5d5e4522a7f404d1a96fd6c703989d32a9c9568d

>  
> +	/*
> +	 * Force flush any remote buffers that might be stuck in IRQ context
> +	 * and therefore could not run their irq_work.
> +	 */
> +	printk_trigger_flush();
> +
>  	if (hardlockup_panic)
>  		nmi_panic(NULL, "Hard LOCKUP");
>  

The patch has been committed into printk/linux.git,
branch rework/printk_safe-removal.

I am going to add it to the pull request for 5.16-rc2 next week.

Best Regards,
Petr

