All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>, LKML <linux-kernel@vger.kernel.org>,
	jslaby@suse.cz, Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v2] printk: Avoid softlockups in console_unlock()
Date: Tue, 05 Feb 2013 19:42:13 -0500	[thread overview]
Message-ID: <1360111333.2621.22.camel@gandalf.local.home> (raw)
In-Reply-To: <CAFTL4hxGwUXmHs0venWANAkYs1CSwjDR1KEFF+waxqzSwup5TQ@mail.gmail.com>

On Tue, 2013-02-05 at 23:56 +0100, Frederic Weisbecker wrote:
> 2013/2/5 Andrew Morton <akpm@linux-foundation.org>:
>  callers to printk() or by a queued work.
> >
> > I still hate the patch :(

I hate it too.

> >
> >> ...
> >>
> >> +void console_unlock(void)
> >> +{
> >> +     if (__console_unlock()) {
> >> +             /* Let worker do the rest of printing */
> >> +             schedule_work(&printk_work);
> >> +     }
> >>  }
> >
> > This creates another place from where we cannot call printk(): anywhere
> > where worker_pool.lock is held.
> >
> > And as schedule_work() can do a wakeup it creates a third reason why
> > the sched code cannot call printk (along with rq->lock taken by
> > wake_up(klogd) and rq->lock taken by up(&console_sem).  Hence
> > printk_sched().  See the lkml thread "[GIT PULL] printk: Support for
> > full dynticks mode".
> 
> Agreed, I really wish we avoid that workqueue solution.

Yes, please do not use a workqueue here.

> 
> >
> > We already have machinery for doing async tickling in printk: the
> > printk_pending stuff.  Did you consider adding another
> > PRINTK_PENDING_foo in some fashion?
> 
> Yeah, that would delay until the next timer tick (small exception
> after my patchset, if tick is stopped this will trigger as soon as
> irqs are re-enabled through a self IPI) but we can probably improve
> that behaviour. And that won't mess up with locking scenarios. The
> printk tick (or irq work after my patchset) can also re-trigger itself
> until the next tick if the batch to send to the console driver is too
> big.

Actually, we could have printk_sched() do something similar. Right now,
printk_sched() loads a per cpu buffer (wasting space, as it is seldom
used), and it sets the PRINTK_PENDING_SCHED flag. Then the next timer
interrupt, it calls printk() with the contents in the buffer.

What we could do, is have printk_sched() instead load the printk buffer.
This requires a little tweaking of the logbuf_lock to keep it from being
held when doing any wake ups, otherwise we have the AB BA deadlock issue
between logbuf_lock and the rq locks.

This patch and printk_sched() can use the same flag. If the
PRINTK_PENDING_SCHED is set (we probably should rename that to
PRINTK_PENDING_OUTPUT) it will do the console_trylock()/unlock() to
finish the print to console.

The below patch makes printk_sched() use the printk buffer instead of
its own (and removing the 512 bytes per cpu that it uses), and does a
console_trylock()/console_unlock() instead. Jan's patch could easily be
based off of this one with very little to change.

-- Steve


printk: Remove separate printk_sched buffers and use printk buf instead

To prevent deadlocks with doing a printk inside the scheduler,
printk_sched() was created. The issue is that printk has a console_sem
that it can grab and release. The release does a wake up if there's a
task pending on the sem, and this wake up grabs the rq locks that is
held in the scheduler. This leads to a possible deadlock if the wake up
uses the same rq as the one with the rq lock held already.

What printk_sched() does is to save the printk write in a per cpu buffer
and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
set, the printk() is done against the buffer.

There's a couple of issues with this approach.

1) If two printk_sched()s are called before the tick, the second one
will overwrite the first one.

2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
bit of space wasted for something that is seldom used.

In order to remove this, the printk_sched() can instead use the printk
buffer instead, and delay the console_trylock()/console_unlock() to the
tick.

Because printk_sched() would then be taking the logbuf_lock, the
logbuf_lock must not be held while doing anything that may call into the
scheduler functions, which includes wake ups. Unfortunately, printk()
also has a console_sem that it uses, and on release, the
up(&console_sem) may do a wake up of any pending waiters. This must be
avoided while holding the logbuf_lock.

Luckily, there's not many places that do the unlock, or hold the
logbuf_lock. By moving things around a little, the console_sem can be
released without ever holding the logbuf_lock, and we can safely have
printk_sched() use the printk buffer directly.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/kernel/printk.c b/kernel/printk.c
index 357f714..ed630fa 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -219,6 +219,9 @@ struct log {
 /*
  * The logbuf_lock protects kmsg buffer, indices, counters. It is also
  * used in interesting ways to provide interlocking in console_unlock();
+ * This can be taken within the scheduler's rq lock. It must be released
+ * before calling console_unlock() or anything else that might wake up
+ * a process.
  */
 static DEFINE_RAW_SPINLOCK(logbuf_lock);
 
@@ -1343,30 +1346,46 @@ static inline int can_use_console(unsigned int cpu)
  * interrupts disabled. It should return with 'lockbuf_lock'
  * released but interrupts still disabled.
  */
-static int console_trylock_for_printk(unsigned int cpu)
+static int console_trylock_for_printk(unsigned int cpu, bool in_sched)
 	__releases(&logbuf_lock)
 {
 	int retval = 0, wake = 0;
 
-	if (console_trylock()) {
-		retval = 1;
+	/* if called from the scheduler, we can not call up() */
+	if (in_sched)
+		goto out;
 
-		/*
-		 * If we can't use the console, we need to release
-		 * the console semaphore by hand to avoid flushing
-		 * the buffer. We need to hold the console semaphore
-		 * in order to do this test safely.
-		 */
-		if (!can_use_console(cpu)) {
-			console_locked = 0;
-			wake = 1;
-			retval = 0;
-		}
+	if (down_trylock(&console_sem))
+		goto out;
+
+	/*
+	 * If we can't use the console, we need to release
+	 * the console semaphore by hand to avoid flushing
+	 * the buffer. We need to hold the console semaphore
+	 * in order to do this test safely.
+	 */
+	if (console_suspended || !can_use_console(cpu)) {
+		wake = 1;
+		goto out;
 	}
+
+	/* console is now locked */
+
+	console_locked = 1;
+	console_may_schedule = 0;
+	mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_);
+
+	retval = 1;
+
+out:
 	logbuf_cpu = UINT_MAX;
+	/*
+	 * The logbuf_lock must not be held when doing a wake up,
+	 * which the up(&console_sem) can do.
+	 */
+	raw_spin_unlock(&logbuf_lock);
 	if (wake)
 		up(&console_sem);
-	raw_spin_unlock(&logbuf_lock);
 	return retval;
 }
 
@@ -1495,11 +1514,17 @@ asmlinkage int vprintk_emit(int facility, int level,
 	static int recursion_bug;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
-	size_t text_len;
+	size_t text_len = 0;
 	enum log_flags lflags = 0;
 	unsigned long flags;
 	int this_cpu;
 	int printed_len = 0;
+	bool in_sched = false;
+
+	if (level == -2) {
+		level = -1;
+		in_sched = true;
+	}
 
 	boot_delay_msec(level);
 	printk_delay();
@@ -1545,7 +1570,12 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
 	 */
-	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
+	if (in_sched)
+		text_len = scnprintf(text, sizeof(textbuf),
+				     KERN_WARNING "[sched_delayed] ");
+
+	text_len += vscnprintf(text + text_len,
+			       sizeof(textbuf) - text_len, fmt, args);
 
 	/* mark and strip a trailing newline */
 	if (text_len && text[text_len-1] == '\n') {
@@ -1620,7 +1650,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * The console_trylock_for_printk() function will release 'logbuf_lock'
 	 * regardless of whether it actually gets the console semaphore or not.
 	 */
-	if (console_trylock_for_printk(this_cpu))
+	if (console_trylock_for_printk(this_cpu, in_sched))
 		console_unlock();
 
 	lockdep_on();
@@ -1962,18 +1992,20 @@ int is_console_locked(void)
 #define PRINTK_BUF_SIZE		512
 
 #define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_SCHED	0x02
+#define PRINTK_PENDING_OUTPUT	0x02
 
 static DEFINE_PER_CPU(int, printk_pending);
-static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
 
 void printk_tick(void)
 {
 	if (__this_cpu_read(printk_pending)) {
 		int pending = __this_cpu_xchg(printk_pending, 0);
-		if (pending & PRINTK_PENDING_SCHED) {
-			char *buf = __get_cpu_var(printk_sched_buf);
-			printk(KERN_WARNING "[sched_delayed] %s", buf);
+		if (pending & PRINTK_PENDING_OUTPUT) {
+			if (console_trylock())
+				console_unlock();
+			else
+				/* Try again later */
+				__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
 		}
 		if (pending & PRINTK_PENDING_WAKEUP)
 			wake_up_interruptible(&log_wait);
@@ -2458,20 +2490,14 @@ late_initcall(printk_late_init);
 
 int printk_sched(const char *fmt, ...)
 {
-	unsigned long flags;
 	va_list args;
-	char *buf;
 	int r;
 
-	local_irq_save(flags);
-	buf = __get_cpu_var(printk_sched_buf);
-
 	va_start(args, fmt);
-	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
+	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
-	local_irq_restore(flags);
+	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
 
 	return r;
 }



  reply	other threads:[~2013-02-06  0:42 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-04 22:17 [PATCH v2] printk: Avoid softlockups in console_unlock() Jan Kara
2013-02-05 20:38 ` Andrew Morton
2013-02-05 22:56   ` Frederic Weisbecker
2013-02-06  0:42     ` Steven Rostedt [this message]
2013-02-06 14:23   ` Jan Kara
2013-02-06 17:58     ` David Rientjes
2013-02-06 18:52       ` Jan Kara
2013-02-06 19:26         ` David Rientjes
2013-02-06 19:51           ` Jan Kara
2013-02-06 20:19         ` Hugh Dickins
2013-02-06 21:29           ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1360111333.2621.22.camel@gandalf.local.home \
    --to=rostedt@goodmis.org \
    --cc=akpm@linux-foundation.org \
    --cc=fweisbec@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jack@suse.cz \
    --cc=jslaby@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.