From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Konstantin Khlebnikov <koct9i@gmail.com>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Steven Rostedt <rostedt@goodmis.org>,
Peter Zijlstra <peterz@infradead.org>,
x86@kernel.org, John Ogness <john.ogness@linutronix.de>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
Petr Tesarik <ptesarik@suse.cz>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe
Date: Wed, 31 Jul 2019 15:08:39 +0900 [thread overview]
Message-ID: <20190731060839.GA1756@jagdpanzerIV> (raw)
In-Reply-To: <20190724122711.qquevkcuge24bhdd@pathway.suse.cz>
Sorry for the delayed response.
On (07/24/19 14:27), Petr Mladek wrote:
[..]
> > And this is where the idea of "disconnecting" those CPUs from main
> > logbuf come from.
> >
> > So what we can do:
> > - smp_send_stop()
> > - disconnect all-but-self from logbuf (via printk-safe)
>
> printk_safe is not really necessary. As you wrote, nobody
> is interested into messages from CPUs that are supposed
> to be stopped.
OK.
Doing it in printk.c makes it easier to handle console_owner/waiter,
which are not exported.
> It might be enough to set some global variable, for example,
> with the CPU number that is calling panic() and is the only
> one allowed to print messages from this point on.
Sounds good.
> It might even be used to force console lock owner to leave
> the cycle immediately.
Yes, makes sense.
[..]
> > So, shall we try one more time with the "disconnect" misbehaving CPUs
> > approach? I can send an RFC patch.
>
> IMHO, it will be acceptable only when it is reasonably simple and
> straightforward. The panic() code is full of special hacks and
> it is already complicated to follow all the twists.
When you have a chance, mind to take a look at the patch below?
Doesn't look very difficult (half of it are white-spaces and
comments, I believe).
> Especially because this scenario came up from a theoretical
> discussion. Note that my original, real bug report, was
> with logbuf_lock NMI, enabled kdump notifiers, ...
I understand. The idea is that this patch should handle your real
scenario + theoretical scenario.
---
include/linux/printk.h | 2 ++
kernel/panic.c | 9 ++++++-
kernel/printk/printk.c | 53 ++++++++++++++++++++++++++++++++++++++++++
3 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 57c9473f4a81..016f5ba06e94 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -322,6 +322,8 @@ static inline void printk_safe_flush_on_panic(void)
}
#endif
+void printk_enter_panic_mode(int cpu);
+
extern int kptr_restrict;
#ifndef pr_fmt
diff --git a/kernel/panic.c b/kernel/panic.c
index d1ece4c363b9..85fac975a90f 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -254,13 +254,20 @@ void panic(const char *fmt, ...)
crash_smp_send_stop();
}
+ /* Misbehaving secondary CPUs cannot printk() to the main logbuf now */
+ printk_enter_panic_mode(this_cpu);
+
/*
* Run any panic handlers, including those that might need to
* add information to the kmsg dump output.
*/
atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
- /* Call flush even twice. It tries harder with a single online CPU */
+ /*
+ * Call flush even twice. It tries harder with a single online CPU.
+ * Even if we failed to stop some of secondary CPUs we have printk
+ * locks re-initialized and keep secondary CPUs off printk().
+ */
printk_safe_flush_on_panic();
kmsg_dump(KMSG_DUMP_PANIC);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index f0bc37a511a7..750f83c3b589 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -87,6 +87,8 @@ static DEFINE_SEMAPHORE(console_sem);
struct console *console_drivers;
EXPORT_SYMBOL_GPL(console_drivers);
+static atomic_t __read_mostly printk_panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
+
/*
* System may need to suppress printk message under certain
* circumstances, like after kernel panic happens.
@@ -1644,6 +1646,49 @@ static void console_lock_spinning_enable(void)
spin_acquire(&console_owner_dep_map, 0, 0, _THIS_IP_);
}
+static int panic_in_progress_on_other_cpu(void)
+{
+ int cpu = atomic_read(&printk_panic_cpu);
+
+ return cpu != PANIC_CPU_INVALID && cpu != smp_processor_id();
+}
+
+void printk_enter_panic_mode(int cpu)
+{
+ unsigned long timeout;
+
+ cpu = atomic_cmpxchg(&printk_panic_cpu, PANIC_CPU_INVALID, cpu);
+ /* printk can enter panic mode only once */
+ if (cpu != PANIC_CPU_INVALID)
+ return;
+ /*
+ * Wait for active secondary CPUs (if there are any) to leave
+ * console_unlock() printing loop (for up to one second).
+ */
+ if (num_online_cpus() > 1) {
+ timeout = USEC_PER_SEC;
+ while (num_online_cpus() > 1 && timeout--)
+ udelay(1);
+ }
+
+ debug_locks_off();
+ /*
+ * On some platforms crash_smp_send_stop() can kill CPUs via NMI
+ * vector. Re-init printk() locks just in case if any of those killed
+ * CPUs held any of printk() locks. On platforms which don't support
+ * NMI stop, misbehaving secondary CPUs will be handled by
+ * panic_in_progress_on_other_cpu() test.
+ *
+ * We re-init only printk() locks here. oops_in_progress is expected
+ * to be set by now, so good console drivers are in lockless mode,
+ * bad console drivers, however, can deadlock.
+ */
+ raw_spin_lock_init(&logbuf_lock);
+ sema_init(&console_sem, 1);
+ WRITE_ONCE(console_waiter, false);
+ raw_spin_lock_init(&console_owner_lock);
+}
+
/**
* console_lock_spinning_disable_and_check - mark end of code where another
* thread was able to busy wait and check if there is a waiter
@@ -1900,6 +1945,9 @@ int vprintk_store(int facility, int level,
size_t text_len;
enum log_flags lflags = 0;
+ if (panic_in_progress_on_other_cpu())
+ return 0;
+
/*
* The printf needs to come first; we need the syslog
* prefix which might be passed-in as a parameter.
@@ -2468,6 +2516,11 @@ void console_unlock(void)
return;
}
+ if (panic_in_progress_on_other_cpu()) {
+ printk_safe_exit_irqrestore(flags);
+ return;
+ }
+
printk_safe_exit_irqrestore(flags);
if (do_cond_resched)
--
2.22.0
next prev parent reply other threads:[~2019-07-31 6:08 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-16 7:28 [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() Petr Mladek
2019-07-16 7:28 ` [PATCH 1/2] printk/panic: Access the main printk log in panic() only when safe Petr Mladek
2019-07-17 9:56 ` Sergey Senozhatsky
2019-07-18 8:36 ` Petr Mladek
2019-07-18 9:49 ` Sergey Senozhatsky
2019-07-19 12:57 ` Petr Mladek
2019-07-23 3:13 ` Sergey Senozhatsky
2019-07-24 12:27 ` Petr Mladek
2019-07-31 6:08 ` Sergey Senozhatsky [this message]
2019-07-31 6:59 ` Sergey Senozhatsky
2019-07-18 10:11 ` Sergey Senozhatsky
2019-07-16 7:28 ` [PATCH 2/2] printk/panic/x86: Allow to access printk log buffer after crash_smp_send_stop() Petr Mladek
2019-07-18 10:47 ` Sergey Senozhatsky
2019-07-18 11:07 ` Konstantin Khlebnikov
2019-07-18 11:29 ` Sergey Senozhatsky
2019-07-19 12:19 ` Petr Mladek
2019-07-18 9:59 ` [PATCH 0/2] panic/printk/x86: Prevent some more printk-related deadlocks in panic() Konstantin Khlebnikov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190731060839.GA1756@jagdpanzerIV \
--to=sergey.senozhatsky.work@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=john.ogness@linutronix.de \
--cc=koct9i@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=ptesarik@suse.cz \
--cc=rostedt@goodmis.org \
--cc=sergey.senozhatsky@gmail.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).