From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: [PATCH printk v4 11/14] printk: Disable passing console lock owner completely during panic()
Date: Wed,  7 Feb 2024 14:47:00 +0106
Message-ID: <20240207134103.1357162-12-john.ogness@linutronix.de>
In-Reply-To: <20240207134103.1357162-1-john.ogness@linutronix.de>

From: Petr Mladek <pmladek@suse.com>

Commit d51507098ff91 ("printk: disable optimistic spin
during panic") added checks to avoid becoming a console waiter
if a panic is in progress.

However, the transition to panic can occur while there is
already a waiter. The current owner should not pass the lock to
the waiter because the waiter might get stopped or blocked at
any time.

Also, the panic context might mistakenly pass console lock
ownership to an already stopped waiter. This can happen when
console_flush_on_panic() ignores the current lock owner, for
example:

CPU0                                CPU1
----                                ----
console_lock_spinning_enable()
                                    console_trylock_spinning()
                                      [CPU1 now console waiter]
NMI: panic()
  panic_other_cpus_shutdown()
                                    [stopped as console waiter]
  console_flush_on_panic()
    console_lock_spinning_enable()
    [print 1 record]
    console_lock_spinning_disable_and_check()
      [handover to stopped CPU1]

This results in panic() not flushing the panic messages.

Fix these problems by disabling all spinning operations
completely during panic().

Another advantage is that this prevents possible deadlocks
caused by "console_owner_lock". The panic() context no longer
needs to take it. The lockless checks are safe because the
functions become NOPs when they see the panic in progress. All
operations manipulating the state are still synchronized by the
lock even when non-panic CPUs would notice the panic
synchronously.

The current owner might stay spinning. But non-panic() CPUs
would get stopped anyway and the panic context will never start
spinning.

Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
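Note (not part of the commit message): the CPU0/CPU1 timeline above
can be replayed with a small single-threaded userspace model. This is
only a sketch under stated assumptions: struct console_model, the
model_*() helpers and the check_panic switch are hypothetical names
chosen for illustration, not kernel APIs. The real handover also
involves console_owner_lock and the waiter acknowledging the handover,
which the model leaves out.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct console_model {
	bool panic_in_progress; /* stands in for panic_in_progress() */
	bool owner_set;         /* stands in for console_owner != NULL */
	bool waiter;            /* stands in for console_waiter */
};

/* Models console_lock_spinning_enable(); check_panic selects the patched behaviour. */
static void model_spinning_enable(struct console_model *m, bool check_panic)
{
	if (check_panic && m->panic_in_progress)
		return; /* patched: do not advertise an owner during panic */
	m->owner_set = true;
}

/* Models console_lock_spinning_disable_and_check(); true means "lock handed over". */
static bool model_spinning_disable_and_check(struct console_model *m, bool check_panic)
{
	bool handover;

	if (check_panic && m->panic_in_progress)
		return false; /* patched: ignore any waiter during panic */

	handover = m->waiter;
	m->owner_set = false;
	return handover;
}

/*
 * Replays the timeline from the commit message and returns how many of
 * the `pending` records the panic CPU manages to print.
 */
static int model_flush_on_panic(bool check_panic, int pending)
{
	struct console_model m = { false, false, false };
	int printed = 0;

	/* CPU0 owns the console and enables spinning; CPU1 becomes the waiter. */
	model_spinning_enable(&m, check_panic);
	m.waiter = true;

	/* NMI panic on CPU0; CPU1 is stopped while still registered as waiter. */
	m.panic_in_progress = true;

	/* Simplified console_flush_on_panic() loop on the panic CPU. */
	while (printed < pending) {
		model_spinning_enable(&m, check_panic);
		printed++;
		if (model_spinning_disable_and_check(&m, check_panic))
			break; /* lock went to the stopped CPU1: flushing ends */
	}
	return printed;
}
```

With check_panic == false the replay stops after one record, matching
"[print 1 record]" in the timeline. With check_panic == true the panic
CPU keeps ownership and prints every pending record.
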
 kernel/printk/printk.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
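
Note on the lockdep annotations: both hunks below keep spin_acquire()
and spin_release() paired even on the panic path, while skipping
console_owner_lock there. The sketch below models that pairing with
plain counters. It is an illustration only: struct annot_model and the
annot_*() helpers are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping; none of these names exist in the kernel. */
struct annot_model {
	int dep_depth;        /* +1 per modelled spin_acquire(), -1 per spin_release() */
	int owner_lock_taken; /* how often console_owner_lock would be taken */
};

/* Models console_lock_spinning_enable() after the patch. */
static void annot_enable(struct annot_model *a, bool panic)
{
	if (!panic)
		a->owner_lock_taken++; /* lock taken to set console_owner */
	a->dep_depth++; /* the "lockdep:" label is reached on both paths */
}

/* Models console_lock_spinning_disable_and_check() after the patch. */
static void annot_disable(struct annot_model *a, bool panic)
{
	a->dep_depth--; /* the annotation is released on both paths */
	if (!panic)
		a->owner_lock_taken++; /* lock taken to read console_waiter */
}

/* Runs one enable/disable pair and returns the resulting model state. */
static struct annot_model annot_pair(bool panic_at_enable, bool panic_at_disable)
{
	struct annot_model a = { 0, 0 };

	annot_enable(&a, panic_at_enable);
	annot_disable(&a, panic_at_disable);
	return a;
}
```

In this model the annotation depth returns to zero for every
combination, including a panic that starts between enable and disable,
and the panic path never touches the modelled console_owner_lock.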

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index f3a7f5a6f6f8..cb99c854a648 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1869,10 +1869,23 @@ static bool console_waiter;
  */
 static void console_lock_spinning_enable(void)
 {
+	/*
+	 * Do not use spinning in panic(). The panic CPU wants to keep the lock.
+	 * Non-panic CPUs abandon the flush anyway.
+	 *
+	 * Just keep the lockdep annotation. The panic-CPU should avoid
+	 * taking console_owner_lock because it might cause a deadlock.
+	 * This looks like the easiest way to prevent false lockdep
+	 * reports without handling races in a lockless way.
+	 */
+	if (panic_in_progress())
+		goto lockdep;
+
 	raw_spin_lock(&console_owner_lock);
 	console_owner = current;
 	raw_spin_unlock(&console_owner_lock);
 
+lockdep:
 	/* The waiter may spin on us after setting console_owner */
 	spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
 }
@@ -1897,6 +1910,22 @@ static int console_lock_spinning_disable_and_check(int cookie)
 {
 	int waiter;
 
+	/*
+	 * Ignore spinning waiters during panic() because they might get stopped
+	 * or blocked at any time.
+	 *
+	 * It is safe because nobody is allowed to start spinning during panic
+	 * in the first place. If there has been a waiter then non-panic CPUs
+	 * might stay spinning. They would get stopped anyway. The panic context
+	 * will never start spinning and an interrupted spin on the panic CPU
+	 * will never continue.
+	 */
+	if (panic_in_progress()) {
+		/* Keep lockdep happy. */
+		spin_release(&console_owner_dep_map, _THIS_IP_);
+		return 0;
+	}
+
 	raw_spin_lock(&console_owner_lock);
 	waiter = READ_ONCE(console_waiter);
 	console_owner = NULL;
-- 
2.39.2

