[PATCH] panic: disable optimistic spin after halting CPUs

* [PATCH] panic: disable optimistic spin after halting CPUs
@ 2022-01-13  0:54 Stephen Brennan
  2022-01-13  7:03 ` kernel test robot
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Stephen Brennan @ 2022-01-13  0:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Petr Mladek, Sergey Senozhatsky,
	Steven Rostedt, John Ogness
  Cc: Stephen Brennan, Sergey Senozhatsky, linux-kernel

A CPU executing with console lock spinning enabled might be halted
during a panic. Before console_flush_on_panic(), it's possible for
console_trylock() to attempt optimistic spinning, deadlocking the panic
CPU:

CPU 0 (panic CPU)             CPU 1
-----------------             ------
                              printk() {
                                vprintk_func() {
                                  vprintk_default() {
                                    vprintk_emit() {
                                      console_unlock() {
                                        console_lock_spinning_enable();
                                        ... printing to console ...
panic() {
  crash_smp_send_stop() {
    NMI  -------------------> HALT
  }
  atomic_notifier_call_chain() {
    printk() {
      ...
      console_trylock_spinnning() {
        // optimistic spin infinitely

This hang during panic can be induced when a kdump kernel is loaded, and
crash_kexec_post_notifiers=1 is present on the kernel command line. The
following script which concurrently writes to /dev/kmsg, and triggers a
panic, can result in this hang:

    #!/bin/bash
    date
    # 991 chars (based on log buffer size):
    chars="$(printf 'a%.0s' {1..991})"
    while :; do
        echo $chars > /dev/kmsg
    done &
    echo c > /proc/sysrq-trigger &
    date
    exit

Below are the hang rates for the above test case, based on v5.16-rc8
before and after this patch:

before:  197 hangs / 340 trials - 57.9%
after :    0 hangs / 245 trials -  0.0%

Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes")

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
---
 include/linux/console.h |  1 +
 kernel/panic.c          |  7 +++++++
 kernel/printk/printk.c  | 17 +++++++++++++++++
 3 files changed, 25 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index a97f277cfdfa..4eeb46927d96 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -179,6 +179,7 @@ extern void console_unlock(void);
 extern void console_conditional_schedule(void);
 extern void console_unblank(void);
 extern void console_flush_on_panic(enum con_flush_mode mode);
+extern void console_lock_spinning_disable_on_panic(void);
 extern struct tty_driver *console_device(int *);
 extern void console_stop(struct console *);
 extern void console_start(struct console *);
diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..a9340e580b20 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -265,6 +265,13 @@ void panic(const char *fmt, ...)
 		crash_smp_send_stop();
 	}
 
+	/*
+	 * Now that we've halted other CPUs, disable optimistic spinning in
+	 * printk(). This avoids deadlocking in console_trylock(), until we call
+	 * console_flush_on_panic().
+	 */
+	console_lock_spinning_disable_on_panic();
+
 	/*
 	 * Run any panic handlers, including those that might need to
 	 * add information to the kmsg dump output.
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 57b132b658e1..d824fdf7d3fd 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1823,6 +1823,23 @@ static int console_lock_spinning_disable_and_check(void)
 	return 1;
 }
 
+/**
+ * console_lock_spinning_disable_on_panic - disable spinning so that
+ *	a panic CPU does not enter an infinite loop
+ *
+ * This is called once all CPUs are halted. A CPU halted during a section which
+ * allowed spinning, could trigger an infinite loop in console_trylock. To avoid
+ * this, mark console_owner as NULL.
+ */
+void console_lock_spinning_disable_on_panic(void)
+{
+	WRITE_ONCE(console_owner, NULL);
+	if (raw_spin_is_locked(&console_owner_lock)) {
+		debug_locks_off();
+		raw_spin_lock_init(&console_owner_lock);
+	}
+}
+
 /**
  * console_trylock_spinning - try to get console_lock by busy waiting
  *
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread