[PATCH v1] panic: push panic() messages to the console even from the MCE nmi handler

* [PATCH v1] panic: push panic() messages to the console even from the MCE nmi handler
@ 2021-01-04 21:15 “William Roche
  2021-01-06  4:35 ` Sergey Senozhatsky
  2021-01-08 12:47 ` Petr Mladek
  0 siblings, 2 replies; 6+ messages in thread
From: “William Roche @ 2021-01-04 21:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: John Ogness, Peter Zijlstra, Petr Mladek, Steven Rostedt,
	Andrew Morton, Sergey Senozhatsky, Thomas Gleixner,
	Borislav Petkov

From: William Roche <william.roche@oracle.com>

Force push panic messages to the console as panic() can be called from NMI
interrupt handler functions where printed messages can't always reach the
console without an explicit push provided by printk_safe_flush_on_panic()
and console_flush_on_panic().
This is the case with the MCE handler that can lead to a system panic
giving information on the fatal MCE root cause that must reach the console.

Signed-off-by: William Roche <william.roche@oracle.com>
---

Notes:
    	While testing MCE injection and kernel reaction, we discovered a bug
    in the way the kernel provides the panic reason information: When dealing
    with fatal MCE, the machine (physical or virtual) can reboot without
    leaving any message on the console.
    
    	This behavior can be reproduced on Intel with the mce-inject tool
    with a simple:
    	# modprobe mce-inject
    	# mce-inject test/uncorrected
    
    	The investigations showed that the MCE panic can be totally message-less
    or can give a small set of messages. This behavior depends on the use of the
    crash_kexec mechanism (using the "crashkernel" parameter). Not using this
    parameter, we get a partial [Hardware Error] information on panic, but some
    important notifications can be missing. And when using it, a fatal MCE can
    panic the system without leaving any information.
    
    . Without "crashkernel", a Fatal MCE injection shows:
    
    [  212.153928] mce: Machine check injector initialized
    [  236.730682] mce: Triggering MCE exception on CPU 0
    [  236.731304] Disabling lock debugging due to kernel taint
    [  236.731947] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 1: b000000000000000
    [  236.731948] mce: [Hardware Error]: TSC 78418fb4a83f
    [  236.731949] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1605312952 SOCKET 0 APIC 0 microcode 1
    [  236.731949] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
    [  236.731950] mce: [Hardware Error]: Machine check: MCIP not set in MCA handler
    [  236.731950] Kernel panic - not syncing: Fatal machine check
    [  236.732047] Kernel Offset: disabled
    
    	The system hangs 30 seconds without any additional message, and finally
    reboots.
    
    . With the use of "crashkernel", a Fatal MCE injection shows only the
    injection message
    
    [   80.811708] mce: Machine check injector initialized
    [   92.298755] mce: Triggering MCE exception on CPU 0
    [   92.299362] Disabling lock debugging due to kernel taint
    
    	No other messages is displayed and the system reboots immediately.
    
    The reason behind this defective behavior is the fact that
    do_machine_check() MCE handler can call mce_panic() to finish in panic()
    trying to print out messages on the console. So we end up trying to print
    out messages directly from this interrupt handler, which can require the
    help of additional function calls like printk_safe_flush_on_panic() and
    console_flush_on_panic().
    
    With the suggested fix here, the above output turns into the expected:
    
    . Without crashkernel:
    [   92.025010] mce: Machine check injector initialized
    [  113.006798] mce: Triggering MCE exception on CPU 0
    [  113.007391] Disabling lock debugging due to kernel taint
    [  113.008061] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 1: b000000000000000
    [  113.008062] mce: [Hardware Error]: TSC 2ec380a0f187
    [  113.008063] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1606223541 SOCKET 0 APIC 0 microcode 1
    [  113.008063] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
    [  113.008064] mce: [Hardware Error]: Machine check: MCIP not set in MCA handler
    [  113.008064] Kernel panic - not syncing: Fatal machine check
    [  113.013010] Kernel Offset: disabled
    [  113.013426] Rebooting in 30 seconds..
    
    	Note that we now have the important information about the Reboot in
    30 seconds.
    
    . With crashkernel:
    [   75.770143] mce: Machine check injector initialized
    [  100.923650] mce: Triggering MCE exception on CPU 0
    [  100.924415] Disabling lock debugging due to kernel taint
    [  100.925094] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 1: b000000000000000
    [  100.925095] mce: [Hardware Error]: TSC 2f3aa07b03de
    [  100.925096] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1606223692 SOCKET 0 APIC 0 microcode 1
    [  100.925097] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
    [  100.925098] mce: [Hardware Error]: Machine check: MCIP not set in MCA handler
    [  100.925100] Kernel panic - not syncing: Fatal machine check
    
    	Here the reboot is immediate but the MCE error message reaches the
    console.
    
    Could you please review this fix ?
    
    Thanks in advance for any feedback you could have.
    William.

 kernel/panic.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index 332736a..eb90cc0 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -166,6 +166,15 @@ static void panic_print_sys_info(void)
 		ftrace_dump(DUMP_ALL);
 }
 
+/*
+ * Force flush messages to the console.
+ */
+static void panic_flush_to_console(void)
+{
+	printk_safe_flush_on_panic();
+	console_flush_on_panic(CONSOLE_FLUSH_PENDING);
+}
+
 /**
  *	panic - halt the system
  *	@fmt: The text string to print
@@ -247,7 +256,7 @@ void panic(const char *fmt, ...)
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
 	if (!_crash_kexec_post_notifiers) {
-		printk_safe_flush_on_panic();
+		panic_flush_to_console();
 		__crash_kexec(NULL);
 
 		/*
@@ -271,9 +280,8 @@ void panic(const char *fmt, ...)
 	 */
 	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
 
-	/* Call flush even twice. It tries harder with a single online CPU */
-	printk_safe_flush_on_panic();
 	kmsg_dump(KMSG_DUMP_PANIC);
+	panic_flush_to_console();
 
 	/*
 	 * If you doubt kdump always works fine in any situation,
@@ -298,7 +306,7 @@ void panic(const char *fmt, ...)
 	 * buffer.  Try to acquire the lock then release it regardless of the
 	 * result.  The release will also print the buffers out.  Locks debug
 	 * should be disabled to avoid reporting bad unlock balance when
-	 * panic() is not being callled from OOPS.
+	 * panic() is not being called from OOPS.
 	 */
 	debug_locks_off();
 	console_flush_on_panic(CONSOLE_FLUSH_PENDING);
@@ -314,6 +322,7 @@ void panic(const char *fmt, ...)
 		 * We can't use the "normal" timers since we just panicked.
 		 */
 		pr_emerg("Rebooting in %d seconds..\n", panic_timeout);
+		panic_flush_to_console();
 
 		for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) {
 			touch_nmi_watchdog();
@@ -347,6 +356,7 @@ void panic(const char *fmt, ...)
 	disabled_wait();
 #endif
 	pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
+	panic_flush_to_console();
 
 	/* Do not scroll important messages printed above */
 	suppress_printk = 1;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread