All of lore.kernel.org
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Dave Anderson <anderson@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Kay Sievers <kay@vrfy.org>, Jiri Kosina <jkosina@suse.cz>,
	Michal Hocko <mhocko@suse.cz>, Jan Kara <jack@suse.cz>,
	linux-kernel@vger.kernel.org,
	Wang Long <long.wanglong@huawei.com>,
	peifeiyue@huawei.com, dzickus@redhat.com, morgan.wang@huawei.com,
	sasha.levin@oracle.com, Petr Mladek <pmladek@suse.cz>
Subject: [PATCH 01/10] printk: Avoid deadlock in NMI context
Date: Mon, 25 May 2015 14:46:24 +0200	[thread overview]
Message-ID: <1432557993-20458-2-git-send-email-pmladek@suse.cz> (raw)
In-Reply-To: <1432557993-20458-1-git-send-email-pmladek@suse.cz>

printk() cannot be used in NMI context safely because it uses an internal
lock and thus could cause a deadlock. This is fine because NMI context
is very special. The handlers should be short, effective, and do not
use printk().

But developers tend to print warnings even from NMI code. They are pretty
hard to debug when they lockup the machine and nothing appears in the logs.

This patch prevents from the deadlock on logbuf_lock by using trylock
rather than spin_lock. If the lock can not be taken, the message is
ignored and some warning is printed later on.

We also must not try to get console from NMI context. It needs
even more locks and there is even higher chance to hung up.

Unfortunately, we could not print more details about the lost message.
We could not alloc a buffer in NMI. Therefore we would need some
lockless mechanism to share a buffer between NMI and normal context.
But this would make printk() code much more complicated and
it is not worth it. There has already been an attempt to do so
and it has been rejected, see https://lkml.org/lkml/2014/6/10/388
This is also the reason why we use the atomic counter.

Hint: If anybody wants to know what NMI message is being lost,
trace_dump_stack(0) can be called when try_lock() fails. Then
trace_printk() could be used on that place.

This is only the basic logic. There will be some improvements in
the followup patches.

Signed-off-by: Petr Mladek <pmladek@suse.cz>
---
 kernel/printk/printk.c | 42 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index c099b082cd02..94fcf6f0b542 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1615,6 +1615,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 			    const char *fmt, va_list args)
 {
 	static int recursion_bug;
+	static atomic_t nmi_message_lost;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
 	size_t text_len = 0;
@@ -1641,7 +1642,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	/*
 	 * Ouch, printk recursed into itself!
 	 */
-	if (unlikely(logbuf_cpu == this_cpu)) {
+	if (unlikely(logbuf_cpu == this_cpu) && !in_nmi()) {
 		/*
 		 * If a crash is occurring during printk() on this CPU,
 		 * then try to get the crash message out but make sure
@@ -1658,7 +1659,26 @@ asmlinkage int vprintk_emit(int facility, int level,
 	}
 
 	lockdep_off();
-	raw_spin_lock(&logbuf_lock);
+
+	/*
+	 * printk() should not be used in NMI context because the interrupt
+	 * might came when lockbuf_lock is taken. Such situation is hard
+	 * to debug. Therefore, we try to avoid the deadlock, use trylock
+	 * in NMI context, and later warn about lost messages.
+	 *
+	 * Hint: If you get into troubles, use trace_dump_stack(0)
+	 * to get location of the lost message and use trace_printk()
+	 * there.
+	 */
+	if (!in_nmi()) {
+		raw_spin_lock(&logbuf_lock);
+	} else if (!raw_spin_trylock(&logbuf_lock)) {
+		atomic_inc(&nmi_message_lost);
+		lockdep_on();
+		local_irq_restore(flags);
+		return 0;
+	}
+
 	logbuf_cpu = this_cpu;
 
 	if (unlikely(recursion_bug)) {
@@ -1672,6 +1692,17 @@ asmlinkage int vprintk_emit(int facility, int level,
 					 strlen(recursion_msg));
 	}
 
+	if (unlikely(atomic_read(&nmi_message_lost))) {
+		int lost = atomic_xchg(&nmi_message_lost, 0);
+
+		text_len = scnprintf(text, sizeof(textbuf),
+				     "BAD LUCK: lost %d message(s) from NMI context!",
+				     lost);
+		/* emit KERN_CRIT message */
+		printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
+					 NULL, 0, text, text_len);
+	}
+
 	/*
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
@@ -1759,8 +1790,11 @@ asmlinkage int vprintk_emit(int facility, int level,
 	lockdep_on();
 	local_irq_restore(flags);
 
-	/* If called from the scheduler, we can not call up(). */
-	if (!in_sched) {
+	/*
+	 * If called from the scheduler or NMI context, we can not get console
+	 * without a possible deadlock.
+	 */
+	if (!in_sched && !in_nmi()) {
 		lockdep_off();
 		/*
 		 * Disable preemption to avoid being preempted while holding
-- 
1.8.5.6


  reply	other threads:[~2015-05-25 12:46 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-25 12:46 [PATCH 00/10] printk: Avoid deadlock in NMI + vprintk_emit() cleanup Petr Mladek
2015-05-25 12:46 ` Petr Mladek [this message]
2015-05-27 23:13   ` [PATCH 01/10] printk: Avoid deadlock in NMI context Andrew Morton
2015-05-28 12:00     ` Petr Mladek
2015-05-25 12:46 ` [PATCH 02/10] printk: Try harder to get logbuf_lock on NMI Petr Mladek
2015-05-27 23:14   ` Andrew Morton
2015-05-28  7:54     ` Jiri Kosina
2015-05-28 13:50     ` Petr Mladek
2015-05-28 20:09       ` Andrew Morton
2015-05-29 10:56         ` Petr Mladek
2015-05-29 20:46           ` Andrew Morton
2015-05-25 12:46 ` [PATCH 03/10] printk: Move the deferred printk stuff Petr Mladek
2015-05-25 12:46 ` [PATCH 04/10] printk: Merge and flush NMI buffer predictably via IRQ work Petr Mladek
2015-05-27 23:14   ` Andrew Morton
2015-05-28 13:12     ` Petr Mladek
2015-05-25 12:46 ` [PATCH 05/10] printk: Try hard to print Oops message in NMI context Petr Mladek
2015-05-25 12:46 ` [PATCH 06/10] printk: Split delayed printing of warnings from vprintk_emit() Petr Mladek
2015-05-25 12:46 ` [PATCH 07/10] printk: Split text formatting and analyze " Petr Mladek
2015-05-25 12:46 ` [PATCH 08/10] printk: Detect scheduler messages in vprintk_format_and_analyze() Petr Mladek
2015-05-25 12:46 ` [PATCH 09/10] printk: Split text storing logic from vprintk_emit() Petr Mladek
2015-05-25 12:46 ` [PATCH 10/10] printk: Split console call " Petr Mladek
2015-05-29 20:50 ` [PATCH 00/10] printk: Avoid deadlock in NMI + vprintk_emit() cleanup Andrew Morton
2015-06-01 13:06   ` Jan Kara
2015-06-02  9:46     ` long.wanglong
2015-06-02  9:52       ` Jiri Kosina

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1432557993-20458-2-git-send-email-pmladek@suse.cz \
    --to=pmladek@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=anderson@redhat.com \
    --cc=dzickus@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=jack@suse.cz \
    --cc=jkosina@suse.cz \
    --cc=kay@vrfy.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=long.wanglong@huawei.com \
    --cc=mhocko@suse.cz \
    --cc=morgan.wang@huawei.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peifeiyue@huawei.com \
    --cc=rostedt@goodmis.org \
    --cc=sasha.levin@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.