All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
@ 2013-02-06  1:05 Steven Rostedt
  2013-02-06 14:32 ` Jan Kara
  2013-02-06 23:02 ` Jan Kara
  0 siblings, 2 replies; 12+ messages in thread
From: Steven Rostedt @ 2013-02-06  1:05 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Andrew Morton, Jan Kara, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

[ I sent this in a reply to another thread, but wanted a bit more attention to it ]

To prevent deadlocks with doing a printk inside the scheduler,
printk_sched() was created. The issue is that printk has a console_sem
that it can grab and release. The release does a wake up if there's a
task pending on the sem, and this wake up grabs the rq locks that is
held in the scheduler. This leads to a possible deadlock if the wake up
uses the same rq as the one with the rq lock held already.

What printk_sched() does is to save the printk write in a per cpu buffer
and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
set, the printk() is done against the buffer.

There's a couple of issues with this approach.

1) If two printk_sched()s are called before the tick, the second one
will overwrite the first one.

2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
bit of space wasted for something that is seldom used.

In order to remove this, the printk_sched() can instead use the printk
buffer instead, and delay the console_trylock()/console_unlock() to the
tick.

Because printk_sched() would then be taking the logbuf_lock, the
logbuf_lock must not be held while doing anything that may call into the
scheduler functions, which includes wake ups. Unfortunately, printk()
also has a console_sem that it uses, and on release, the
up(&console_sem) may do a wake up of any pending waiters. This must be
avoided while holding the logbuf_lock.

Luckily, there's not many places that do the unlock, or hold the
logbuf_lock. By moving things around a little, the console_sem can be
released without ever holding the logbuf_lock, and we can safely have
printk_sched() use the printk buffer directly.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/kernel/printk.c b/kernel/printk.c
index 357f714..ed630fa 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -219,6 +219,9 @@ struct log {
 /*
  * The logbuf_lock protects kmsg buffer, indices, counters. It is also
  * used in interesting ways to provide interlocking in console_unlock();
+ * This can be taken within the scheduler's rq lock. It must be released
+ * before calling console_unlock() or anything else that might wake up
+ * a process.
  */
 static DEFINE_RAW_SPINLOCK(logbuf_lock);
 
@@ -1343,30 +1346,46 @@ static inline int can_use_console(unsigned int cpu)
  * interrupts disabled. It should return with 'lockbuf_lock'
  * released but interrupts still disabled.
  */
-static int console_trylock_for_printk(unsigned int cpu)
+static int console_trylock_for_printk(unsigned int cpu, bool in_sched)
 	__releases(&logbuf_lock)
 {
 	int retval = 0, wake = 0;
 
-	if (console_trylock()) {
-		retval = 1;
+	/* if called from the scheduler, we can not call up() */
+	if (in_sched)
+		goto out;
 
-		/*
-		 * If we can't use the console, we need to release
-		 * the console semaphore by hand to avoid flushing
-		 * the buffer. We need to hold the console semaphore
-		 * in order to do this test safely.
-		 */
-		if (!can_use_console(cpu)) {
-			console_locked = 0;
-			wake = 1;
-			retval = 0;
-		}
+	if (down_trylock(&console_sem))
+		goto out;
+
+	/*
+	 * If we can't use the console, we need to release
+	 * the console semaphore by hand to avoid flushing
+	 * the buffer. We need to hold the console semaphore
+	 * in order to do this test safely.
+	 */
+	if (console_suspended || !can_use_console(cpu)) {
+		wake = 1;
+		goto out;
 	}
+
+	/* console is now locked */
+
+	console_locked = 1;
+	console_may_schedule = 0;
+	mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_);
+
+	retval = 1;
+
+out:
 	logbuf_cpu = UINT_MAX;
+	/*
+	 * The logbuf_lock must not be held when doing a wake up,
+	 * which the up(&console_sem) can do.
+	 */
+	raw_spin_unlock(&logbuf_lock);
 	if (wake)
 		up(&console_sem);
-	raw_spin_unlock(&logbuf_lock);
 	return retval;
 }
 
@@ -1495,11 +1514,17 @@ asmlinkage int vprintk_emit(int facility, int level,
 	static int recursion_bug;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
-	size_t text_len;
+	size_t text_len = 0;
 	enum log_flags lflags = 0;
 	unsigned long flags;
 	int this_cpu;
 	int printed_len = 0;
+	bool in_sched = false;
+
+	if (level == -2) {
+		level = -1;
+		in_sched = true;
+	}
 
 	boot_delay_msec(level);
 	printk_delay();
@@ -1545,7 +1570,12 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
 	 */
-	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
+	if (in_sched)
+		text_len = scnprintf(text, sizeof(textbuf),
+				     KERN_WARNING "[sched_delayed] ");
+
+	text_len += vscnprintf(text + text_len,
+			       sizeof(textbuf) - text_len, fmt, args);
 
 	/* mark and strip a trailing newline */
 	if (text_len && text[text_len-1] == '\n') {
@@ -1620,7 +1650,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * The console_trylock_for_printk() function will release 'logbuf_lock'
 	 * regardless of whether it actually gets the console semaphore or not.
 	 */
-	if (console_trylock_for_printk(this_cpu))
+	if (console_trylock_for_printk(this_cpu, in_sched))
 		console_unlock();
 
 	lockdep_on();
@@ -1962,18 +1992,20 @@ int is_console_locked(void)
 #define PRINTK_BUF_SIZE		512
 
 #define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_SCHED	0x02
+#define PRINTK_PENDING_OUTPUT	0x02
 
 static DEFINE_PER_CPU(int, printk_pending);
-static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
 
 void printk_tick(void)
 {
 	if (__this_cpu_read(printk_pending)) {
 		int pending = __this_cpu_xchg(printk_pending, 0);
-		if (pending & PRINTK_PENDING_SCHED) {
-			char *buf = __get_cpu_var(printk_sched_buf);
-			printk(KERN_WARNING "[sched_delayed] %s", buf);
+		if (pending & PRINTK_PENDING_OUTPUT) {
+			if (console_trylock())
+				console_unlock();
+			else
+				/* Try again later */
+				__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
 		}
 		if (pending & PRINTK_PENDING_WAKEUP)
 			wake_up_interruptible(&log_wait);
@@ -2458,20 +2490,14 @@ late_initcall(printk_late_init);
 
 int printk_sched(const char *fmt, ...)
 {
-	unsigned long flags;
 	va_list args;
-	char *buf;
 	int r;
 
-	local_irq_save(flags);
-	buf = __get_cpu_var(printk_sched_buf);
-
 	va_start(args, fmt);
-	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
+	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
-	local_irq_restore(flags);
+	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
 
 	return r;
 }



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2013-02-06  1:05 [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead Steven Rostedt
@ 2013-02-06 14:32 ` Jan Kara
  2013-02-06 23:02 ` Jan Kara
  1 sibling, 0 replies; 12+ messages in thread
From: Jan Kara @ 2013-02-06 14:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Frederic Weisbecker, Andrew Morton, Jan Kara, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

On Tue 05-02-13 20:05:48, Steven Rostedt wrote:
> [ I sent this in a reply to another thread, but wanted a bit more attention to it ]
> 
> To prevent deadlocks with doing a printk inside the scheduler,
> printk_sched() was created. The issue is that printk has a console_sem
> that it can grab and release. The release does a wake up if there's a
> task pending on the sem, and this wake up grabs the rq locks that is
> held in the scheduler. This leads to a possible deadlock if the wake up
> uses the same rq as the one with the rq lock held already.
> 
> What printk_sched() does is to save the printk write in a per cpu buffer
> and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> set, the printk() is done against the buffer.
> 
> There's a couple of issues with this approach.
> 
> 1) If two printk_sched()s are called before the tick, the second one
> will overwrite the first one.
> 
> 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> bit of space wasted for something that is seldom used.
> 
> In order to remove this, the printk_sched() can instead use the printk
> buffer instead, and delay the console_trylock()/console_unlock() to the
> tick.
> 
> Because printk_sched() would then be taking the logbuf_lock, the
> logbuf_lock must not be held while doing anything that may call into the
> scheduler functions, which includes wake ups. Unfortunately, printk()
> also has a console_sem that it uses, and on release, the
> up(&console_sem) may do a wake up of any pending waiters. This must be
> avoided while holding the logbuf_lock.
> 
> Luckily, there's not many places that do the unlock, or hold the
> logbuf_lock. By moving things around a little, the console_sem can be
> released without ever holding the logbuf_lock, and we can safely have
> printk_sched() use the printk buffer directly.
  I like this! It looks better than what I did with workqueue. I'll
experiment how things behave if I use PRINTK_PENDING_OUTPUT instead of my
workqueue solution. Thanks for the patch. BTW the patch looks OK to me but
I don't feel like an expert in that code so take that with a grain of salt.

								Honza
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> 
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 357f714..ed630fa 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -219,6 +219,9 @@ struct log {
>  /*
>   * The logbuf_lock protects kmsg buffer, indices, counters. It is also
>   * used in interesting ways to provide interlocking in console_unlock();
> + * This can be taken within the scheduler's rq lock. It must be released
> + * before calling console_unlock() or anything else that might wake up
> + * a process.
>   */
>  static DEFINE_RAW_SPINLOCK(logbuf_lock);
>  
> @@ -1343,30 +1346,46 @@ static inline int can_use_console(unsigned int cpu)
>   * interrupts disabled. It should return with 'lockbuf_lock'
>   * released but interrupts still disabled.
>   */
> -static int console_trylock_for_printk(unsigned int cpu)
> +static int console_trylock_for_printk(unsigned int cpu, bool in_sched)
>  	__releases(&logbuf_lock)
>  {
>  	int retval = 0, wake = 0;
>  
> -	if (console_trylock()) {
> -		retval = 1;
> +	/* if called from the scheduler, we can not call up() */
> +	if (in_sched)
> +		goto out;
>  
> -		/*
> -		 * If we can't use the console, we need to release
> -		 * the console semaphore by hand to avoid flushing
> -		 * the buffer. We need to hold the console semaphore
> -		 * in order to do this test safely.
> -		 */
> -		if (!can_use_console(cpu)) {
> -			console_locked = 0;
> -			wake = 1;
> -			retval = 0;
> -		}
> +	if (down_trylock(&console_sem))
> +		goto out;
> +
> +	/*
> +	 * If we can't use the console, we need to release
> +	 * the console semaphore by hand to avoid flushing
> +	 * the buffer. We need to hold the console semaphore
> +	 * in order to do this test safely.
> +	 */
> +	if (console_suspended || !can_use_console(cpu)) {
> +		wake = 1;
> +		goto out;
>  	}
> +
> +	/* console is now locked */
> +
> +	console_locked = 1;
> +	console_may_schedule = 0;
> +	mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_);
> +
> +	retval = 1;
> +
> +out:
>  	logbuf_cpu = UINT_MAX;
> +	/*
> +	 * The logbuf_lock must not be held when doing a wake up,
> +	 * which the up(&console_sem) can do.
> +	 */
> +	raw_spin_unlock(&logbuf_lock);
>  	if (wake)
>  		up(&console_sem);
> -	raw_spin_unlock(&logbuf_lock);
>  	return retval;
>  }
>  
> @@ -1495,11 +1514,17 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	static int recursion_bug;
>  	static char textbuf[LOG_LINE_MAX];
>  	char *text = textbuf;
> -	size_t text_len;
> +	size_t text_len = 0;
>  	enum log_flags lflags = 0;
>  	unsigned long flags;
>  	int this_cpu;
>  	int printed_len = 0;
> +	bool in_sched = false;
> +
> +	if (level == -2) {
> +		level = -1;
> +		in_sched = true;
> +	}
>  
>  	boot_delay_msec(level);
>  	printk_delay();
> @@ -1545,7 +1570,12 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	 * The printf needs to come first; we need the syslog
>  	 * prefix which might be passed-in as a parameter.
>  	 */
> -	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
> +	if (in_sched)
> +		text_len = scnprintf(text, sizeof(textbuf),
> +				     KERN_WARNING "[sched_delayed] ");
> +
> +	text_len += vscnprintf(text + text_len,
> +			       sizeof(textbuf) - text_len, fmt, args);
>  
>  	/* mark and strip a trailing newline */
>  	if (text_len && text[text_len-1] == '\n') {
> @@ -1620,7 +1650,7 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	 * The console_trylock_for_printk() function will release 'logbuf_lock'
>  	 * regardless of whether it actually gets the console semaphore or not.
>  	 */
> -	if (console_trylock_for_printk(this_cpu))
> +	if (console_trylock_for_printk(this_cpu, in_sched))
>  		console_unlock();
>  
>  	lockdep_on();
> @@ -1962,18 +1992,20 @@ int is_console_locked(void)
>  #define PRINTK_BUF_SIZE		512
>  
>  #define PRINTK_PENDING_WAKEUP	0x01
> -#define PRINTK_PENDING_SCHED	0x02
> +#define PRINTK_PENDING_OUTPUT	0x02
>  
>  static DEFINE_PER_CPU(int, printk_pending);
> -static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
>  
>  void printk_tick(void)
>  {
>  	if (__this_cpu_read(printk_pending)) {
>  		int pending = __this_cpu_xchg(printk_pending, 0);
> -		if (pending & PRINTK_PENDING_SCHED) {
> -			char *buf = __get_cpu_var(printk_sched_buf);
> -			printk(KERN_WARNING "[sched_delayed] %s", buf);
> +		if (pending & PRINTK_PENDING_OUTPUT) {
> +			if (console_trylock())
> +				console_unlock();
> +			else
> +				/* Try again later */
> +				__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
>  		}
>  		if (pending & PRINTK_PENDING_WAKEUP)
>  			wake_up_interruptible(&log_wait);
> @@ -2458,20 +2490,14 @@ late_initcall(printk_late_init);
>  
>  int printk_sched(const char *fmt, ...)
>  {
> -	unsigned long flags;
>  	va_list args;
> -	char *buf;
>  	int r;
>  
> -	local_irq_save(flags);
> -	buf = __get_cpu_var(printk_sched_buf);
> -
>  	va_start(args, fmt);
> -	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
> +	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
>  	va_end(args);
>  
> -	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
> -	local_irq_restore(flags);
> +	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
>  
>  	return r;
>  }
> 
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2013-02-06  1:05 [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead Steven Rostedt
  2013-02-06 14:32 ` Jan Kara
@ 2013-02-06 23:02 ` Jan Kara
  2013-02-12 12:22   ` Jan Kara
                     ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Jan Kara @ 2013-02-06 23:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Frederic Weisbecker, Andrew Morton, Jan Kara, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

[-- Attachment #1: Type: text/plain, Size: 2754 bytes --]

On Tue 05-02-13 20:05:48, Steven Rostedt wrote:
> [ I sent this in a reply to another thread, but wanted a bit more attention to it ]
> 
> To prevent deadlocks with doing a printk inside the scheduler,
> printk_sched() was created. The issue is that printk has a console_sem
> that it can grab and release. The release does a wake up if there's a
> task pending on the sem, and this wake up grabs the rq locks that is
> held in the scheduler. This leads to a possible deadlock if the wake up
> uses the same rq as the one with the rq lock held already.
> 
> What printk_sched() does is to save the printk write in a per cpu buffer
> and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> set, the printk() is done against the buffer.
> 
> There's a couple of issues with this approach.
> 
> 1) If two printk_sched()s are called before the tick, the second one
> will overwrite the first one.
> 
> 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> bit of space wasted for something that is seldom used.
> 
> In order to remove this, the printk_sched() can instead use the printk
> buffer instead, and delay the console_trylock()/console_unlock() to the
> tick.
> 
> Because printk_sched() would then be taking the logbuf_lock, the
> logbuf_lock must not be held while doing anything that may call into the
> scheduler functions, which includes wake ups. Unfortunately, printk()
> also has a console_sem that it uses, and on release, the
> up(&console_sem) may do a wake up of any pending waiters. This must be
> avoided while holding the logbuf_lock.
> 
> Luckily, there's not many places that do the unlock, or hold the
> logbuf_lock. By moving things around a little, the console_sem can be
> released without ever holding the logbuf_lock, and we can safely have
> printk_sched() use the printk buffer directly.
  So after quite some experiments and some hair tearing I have a patch that
uses PRINTK_PENDING_OUTPUT and makes the machine survive my heavy-printk
test. The first patch I attach is actually a small improvement of your
patch which I think can be folded in it. I was also wondering whether we
still need printk_needs_cpu(). I left it in since I don't know about a
better way of keeping at least one CPU ticking. But maybe others do?

The second patch then makes use of PRINTK_PENDING_OUTPUT to handle the
printing when console_unlock() would take too long. If you wonder whether
the last_printing_cpu in printk_tick() is necessary - it is... Without it
we keep printing on one CPU and the machine complains, looses drives,
etc... (I guess I should add this comment somewhere to the code).

Anyway, what do you guys think about this version?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

[-- Attachment #2: 0002-printk-Remove-per-cpu-printk-flags.patch --]
[-- Type: text/x-patch, Size: 2363 bytes --]

>From bda93cfa4133c41e8f10dcd371d1f025d94b85dc Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 6 Feb 2013 19:15:30 +0100
Subject: [PATCH 2/3] printk: Remove per-cpu printk flags

There's no need to have printk flags percpu. Just make a single variable
operated by atomic operations from it. It also has an advantage that any
cpu can do the printing / wakeup work.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/printk.c |   28 ++++++++++++----------------
 1 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index ed630fa..2fe917d 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1986,28 +1986,24 @@ int is_console_locked(void)
 	return console_locked;
 }
 
-/*
- * Delayed printk version, for scheduler-internal messages:
- */
-#define PRINTK_BUF_SIZE		512
+#define PRINTK_PENDING_WAKEUP	1
+#define PRINTK_PENDING_OUTPUT	2
 
-#define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_OUTPUT	0x02
-
-static DEFINE_PER_CPU(int, printk_pending);
+static unsigned long printk_pending;
 
 void printk_tick(void)
 {
-	if (__this_cpu_read(printk_pending)) {
-		int pending = __this_cpu_xchg(printk_pending, 0);
-		if (pending & PRINTK_PENDING_OUTPUT) {
+	if (printk_pending) {
+		unsigned long pending = xchg(&printk_pending, 0);
+
+		if (test_bit(PRINTK_PENDING_OUTPUT, &pending)) {
 			if (console_trylock())
 				console_unlock();
 			else
 				/* Try again later */
-				__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
+				set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
 		}
-		if (pending & PRINTK_PENDING_WAKEUP)
+		if (test_bit(PRINTK_PENDING_WAKEUP, &pending))
 			wake_up_interruptible(&log_wait);
 	}
 }
@@ -2016,13 +2012,13 @@ int printk_needs_cpu(int cpu)
 {
 	if (cpu_is_offline(cpu))
 		printk_tick();
-	return __this_cpu_read(printk_pending);
+	return printk_pending;
 }
 
 void wake_up_klogd(void)
 {
 	if (waitqueue_active(&log_wait))
-		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
+		set_bit(PRINTK_PENDING_WAKEUP, &printk_pending);
 }
 
 static void console_cont_flush(char *text, size_t size)
@@ -2497,7 +2493,7 @@ int printk_sched(const char *fmt, ...)
 	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
+	set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
 
 	return r;
 }
-- 
1.7.1


[-- Attachment #3: 0003-printk-Avoid-softlockups-in-console_unlock.patch --]
[-- Type: text/x-patch, Size: 6237 bytes --]

>From 4d44b91477c73a29dd16a032bc431227d159f1bf Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Tue, 15 Jan 2013 15:15:06 +0100
Subject: [PATCH 3/3] printk: Avoid softlockups in console_unlock()

A CPU can be caught in console_unlock() for a long time (tens of seconds are
reported by our customers) when other CPUs are using printk heavily and serial
console makes printing slow. Despite serial console drivers are calling
touch_nmi_watchdog() this triggers softlockup warnings because
interrupts are disabled for the whole time console_unlock() runs (e.g.
vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
cannot be processed and other CPUs get stuck spinning in calls like
smp_call_function_many(). Also RCU eventually starts reporting lockups.

In my artifical testing I also managed to trigger a situation when disk
disappeared from the system apparently because commands to / from it
could not be delivered for long enough. This is why just silencing
watchdogs isn't a reliable solution to the problem and we simply have to
avoid spending too long in console_unlock().

We fix the issue by limiting the time we spend in console_unlock() to
watchdog_thresh() / 4 (unless we are in an early boot stage or oops is
happening). The rest of the buffer will be printed either by further
callers to printk() or during next timer tick.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 include/linux/nmi.h |   18 +++++++++++++
 kernel/printk.c     |   70 +++++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 75 insertions(+), 13 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index db50840..ce32959 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -51,6 +51,24 @@ extern int watchdog_thresh;
 struct ctl_table;
 extern int proc_dowatchdog(struct ctl_table *, int ,
 			   void __user *, size_t *, loff_t *);
+/*
+ * Return the maximum number of nanosecond for which interrupts may be disabled
+ * on the current CPU
+ */
+static inline u64 max_interrupt_disabled_duration(void)
+{
+	/*
+	 * We give us some headroom because timers need time to fire before the
+	 * watchdog period expires.
+	 */
+	return ((u64)watchdog_thresh) * NSEC_PER_SEC / 2;
+}
+#else
+static inline u64 max_interrupt_disabled_duration(void)
+{
+	/* About the value we'd get with the default watchdog setting */
+	return 5ULL * NSEC_PER_SEC;
+}
 #endif
 
 #endif
diff --git a/kernel/printk.c b/kernel/printk.c
index 2fe917d..c4f7f6f 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1990,17 +1990,31 @@ int is_console_locked(void)
 #define PRINTK_PENDING_OUTPUT	2
 
 static unsigned long printk_pending;
+static int last_printing_cpu = -1;
+
+static bool __console_unlock(void);
 
 void printk_tick(void)
 {
 	if (printk_pending) {
-		unsigned long pending = xchg(&printk_pending, 0);
+		unsigned long pending;
+		int cpu = smp_processor_id();
+
+		/*
+		 * Did we last print? Give other cpus a chance to do it but
+		 * reset the variable so that we do printing in the next tick
+		 * if noone else got to it.
+		 */
+		if (cmpxchg(&last_printing_cpu, cpu, -1))
+			return;
 
+		pending = xchg(&printk_pending, 0);
 		if (test_bit(PRINTK_PENDING_OUTPUT, &pending)) {
+			bool more_work = true;
+
 			if (console_trylock())
-				console_unlock();
-			else
-				/* Try again later */
+				more_work = __console_unlock();
+			if (more_work)
 				set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
 		}
 		if (test_bit(PRINTK_PENDING_WAKEUP, &pending))
@@ -2051,33 +2065,46 @@ out:
 }
 
 /**
- * console_unlock - unlock the console system
+ * __console_unlock - unlock the console system
  *
  * Releases the console_lock which the caller holds on the console system
  * and the console driver list.
  *
- * While the console_lock was held, console output may have been buffered
- * by printk().  If this is the case, console_unlock(); emits
- * the output prior to releasing the lock.
+ * While the console_lock was held, console output may have been buffered by
+ * printk(). If this is the case, console_unlock() emits the output prior to
+ * releasing the lock. However we need not write all the data in the buffer if
+ * we would hog the CPU for too long. Function returns true, if there's more
+ * data that needs printing in the buffer.
  *
  * If there is output waiting, we wake /dev/kmsg and syslog() users.
  *
- * console_unlock(); may be called from any context.
+ * __console_unlock() may be called from any context.
  */
-void console_unlock(void)
+static bool __console_unlock(void)
 {
 	static char text[LOG_LINE_MAX + PREFIX_MAX];
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
 	bool retry;
+	u64 end_time, now;
+	int cur_cpu;
 
 	if (console_suspended) {
 		up(&console_sem);
-		return;
+		return false;
 	}
 
 	console_may_schedule = 0;
+	preempt_disable();
+	cur_cpu = smp_processor_id();
+	/*
+	 * We give us some headroom because we check the time only after
+	 * printing the whole message
+	 */
+	end_time = cpu_clock(cur_cpu) +
+				max_interrupt_disabled_duration() / 2;
+	preempt_enable();
 
 	/* flush buffered message fragment immediately to console */
 	console_cont_flush(text, sizeof(text));
@@ -2100,7 +2127,9 @@ again:
 			console_prev = 0;
 		}
 skip:
-		if (console_seq == log_next_seq)
+		now = sched_clock_cpu(cur_cpu);
+		if (console_seq == log_next_seq ||
+		    (now > end_time && !oops_in_progress))
 			break;
 
 		msg = log_from_idx(console_idx);
@@ -2143,6 +2172,7 @@ skip:
 
 	raw_spin_unlock(&logbuf_lock);
 
+	last_printing_cpu = cur_cpu;
 	up(&console_sem);
 
 	/*
@@ -2155,11 +2185,25 @@ skip:
 	retry = console_seq != log_next_seq;
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
-	if (retry && console_trylock())
+	if (retry && (now <= end_time || oops_in_progress) &&
+	    console_trylock())
 		goto again;
 
 	if (wake_klogd)
 		wake_up_klogd();
+	return retry;
+}
+
+/*
+ * This is a wrapper against __console_unlock() that makes sure the rest of
+ * buffer is printed in future.
+ */
+void console_unlock(void)
+{
+	if (__console_unlock()) {
+		/* Leave the rest of printing for a timer tick */
+		set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
+	}
 }
 EXPORT_SYMBOL(console_unlock);
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2013-02-06 23:02 ` Jan Kara
@ 2013-02-12 12:22   ` Jan Kara
  2013-02-14 18:48     ` Steven Rostedt
  2013-02-15 16:54   ` Jan Kara
  2013-02-15 16:57   ` [PATCH 3/3] printk: Avoid softlockups in console_unlock() Jan Kara
  2 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2013-02-12 12:22 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Frederic Weisbecker, Andrew Morton, Jan Kara, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

On Thu 07-02-13 00:02:20, Jan Kara wrote:
> On Tue 05-02-13 20:05:48, Steven Rostedt wrote:
> > [ I sent this in a reply to another thread, but wanted a bit more attention to it ]
> > 
> > To prevent deadlocks with doing a printk inside the scheduler,
> > printk_sched() was created. The issue is that printk has a console_sem
> > that it can grab and release. The release does a wake up if there's a
> > task pending on the sem, and this wake up grabs the rq locks that is
> > held in the scheduler. This leads to a possible deadlock if the wake up
> > uses the same rq as the one with the rq lock held already.
> > 
> > What printk_sched() does is to save the printk write in a per cpu buffer
> > and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> > set, the printk() is done against the buffer.
> > 
> > There's a couple of issues with this approach.
> > 
> > 1) If two printk_sched()s are called before the tick, the second one
> > will overwrite the first one.
> > 
> > 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> > bit of space wasted for something that is seldom used.
> > 
> > In order to remove this, the printk_sched() can instead use the printk
> > buffer instead, and delay the console_trylock()/console_unlock() to the
> > tick.
> > 
> > Because printk_sched() would then be taking the logbuf_lock, the
> > logbuf_lock must not be held while doing anything that may call into the
> > scheduler functions, which includes wake ups. Unfortunately, printk()
> > also has a console_sem that it uses, and on release, the
> > up(&console_sem) may do a wake up of any pending waiters. This must be
> > avoided while holding the logbuf_lock.
> > 
> > Luckily, there's not many places that do the unlock, or hold the
> > logbuf_lock. By moving things around a little, the console_sem can be
> > released without ever holding the logbuf_lock, and we can safely have
> > printk_sched() use the printk buffer directly.
>   So after quite some experiments and some hair tearing I have a patch that
> uses PRINTK_PENDING_OUTPUT and makes the machine survive my heavy-printk
> test. The first patch I attach is actually a small improvement of your
> patch which I think can be folded in it. I was also wondering whether we
> still need printk_needs_cpu(). I left it in since I don't know about a
> better way of keeping at least one CPU ticking. But maybe others do?
> 
> The second patch then makes use of PRINTK_PENDING_OUTPUT to handle the
> printing when console_unlock() would take too long. If you wonder whether
> the last_printing_cpu in printk_tick() is necessary - it is... Without it
> we keep printing on one CPU and the machine complains, looses drives,
> etc... (I guess I should add this comment somewhere to the code).
> 
> Anyway, what do you guys think about this version?
  Steven, Andrew, did you have a chance to look at my patches?

								Honza

> From bda93cfa4133c41e8f10dcd371d1f025d94b85dc Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.cz>
> Date: Wed, 6 Feb 2013 19:15:30 +0100
> Subject: [PATCH 2/3] printk: Remove per-cpu printk flags
> 
> There's no need to have printk flags percpu. Just make a single variable
> operated by atomic operations from it. It also has an advantage that any
> cpu can do the printing / wakeup work.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  kernel/printk.c |   28 ++++++++++++----------------
>  1 files changed, 12 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/printk.c b/kernel/printk.c
> index ed630fa..2fe917d 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -1986,28 +1986,24 @@ int is_console_locked(void)
>  	return console_locked;
>  }
>  
> -/*
> - * Delayed printk version, for scheduler-internal messages:
> - */
> -#define PRINTK_BUF_SIZE		512
> +#define PRINTK_PENDING_WAKEUP	1
> +#define PRINTK_PENDING_OUTPUT	2
>  
> -#define PRINTK_PENDING_WAKEUP	0x01
> -#define PRINTK_PENDING_OUTPUT	0x02
> -
> -static DEFINE_PER_CPU(int, printk_pending);
> +static unsigned long printk_pending;
>  
>  void printk_tick(void)
>  {
> -	if (__this_cpu_read(printk_pending)) {
> -		int pending = __this_cpu_xchg(printk_pending, 0);
> -		if (pending & PRINTK_PENDING_OUTPUT) {
> +	if (printk_pending) {
> +		unsigned long pending = xchg(&printk_pending, 0);
> +
> +		if (test_bit(PRINTK_PENDING_OUTPUT, &pending)) {
>  			if (console_trylock())
>  				console_unlock();
>  			else
>  				/* Try again later */
> -				__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> +				set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
>  		}
> -		if (pending & PRINTK_PENDING_WAKEUP)
> +		if (test_bit(PRINTK_PENDING_WAKEUP, &pending))
>  			wake_up_interruptible(&log_wait);
>  	}
>  }
> @@ -2016,13 +2012,13 @@ int printk_needs_cpu(int cpu)
>  {
>  	if (cpu_is_offline(cpu))
>  		printk_tick();
> -	return __this_cpu_read(printk_pending);
> +	return printk_pending;
>  }
>  
>  void wake_up_klogd(void)
>  {
>  	if (waitqueue_active(&log_wait))
> -		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
> +		set_bit(PRINTK_PENDING_WAKEUP, &printk_pending);
>  }
>  
>  static void console_cont_flush(char *text, size_t size)
> @@ -2497,7 +2493,7 @@ int printk_sched(const char *fmt, ...)
>  	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
>  	va_end(args);
>  
> -	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> +	set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
>  
>  	return r;
>  }
> -- 
> 1.7.1
> 

> From 4d44b91477c73a29dd16a032bc431227d159f1bf Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.cz>
> Date: Tue, 15 Jan 2013 15:15:06 +0100
> Subject: [PATCH 3/3] printk: Avoid softlockups in console_unlock()
> 
> A CPU can be caught in console_unlock() for a long time (tens of seconds are
> reported by our customers) when other CPUs are using printk heavily and serial
> console makes printing slow. Despite serial console drivers are calling
> touch_nmi_watchdog() this triggers softlockup warnings because
> interrupts are disabled for the whole time console_unlock() runs (e.g.
> vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
> cannot be processed and other CPUs get stuck spinning in calls like
> smp_call_function_many(). Also RCU eventually starts reporting lockups.
> 
> In my artifical testing I also managed to trigger a situation when disk
> disappeared from the system apparently because commands to / from it
> could not be delivered for long enough. This is why just silencing
> watchdogs isn't a reliable solution to the problem and we simply have to
> avoid spending too long in console_unlock().
> 
> We fix the issue by limiting the time we spend in console_unlock() to
> watchdog_thresh() / 4 (unless we are in an early boot stage or oops is
> happening). The rest of the buffer will be printed either by further
> callers to printk() or during next timer tick.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  include/linux/nmi.h |   18 +++++++++++++
>  kernel/printk.c     |   70 +++++++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 75 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index db50840..ce32959 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -51,6 +51,24 @@ extern int watchdog_thresh;
>  struct ctl_table;
>  extern int proc_dowatchdog(struct ctl_table *, int ,
>  			   void __user *, size_t *, loff_t *);
> +/*
> + * Return the maximum number of nanosecond for which interrupts may be disabled
> + * on the current CPU
> + */
> +static inline u64 max_interrupt_disabled_duration(void)
> +{
> +	/*
> +	 * We give us some headroom because timers need time to fire before the
> +	 * watchdog period expires.
> +	 */
> +	return ((u64)watchdog_thresh) * NSEC_PER_SEC / 2;
> +}
> +#else
> +static inline u64 max_interrupt_disabled_duration(void)
> +{
> +	/* About the value we'd get with the default watchdog setting */
> +	return 5ULL * NSEC_PER_SEC;
> +}
>  #endif
>  
>  #endif
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 2fe917d..c4f7f6f 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -1990,17 +1990,31 @@ int is_console_locked(void)
>  #define PRINTK_PENDING_OUTPUT	2
>  
>  static unsigned long printk_pending;
> +static int last_printing_cpu = -1;
> +
> +static bool __console_unlock(void);
>  
>  void printk_tick(void)
>  {
>  	if (printk_pending) {
> -		unsigned long pending = xchg(&printk_pending, 0);
> +		unsigned long pending;
> +		int cpu = smp_processor_id();
> +
> +		/*
> +		 * Did we last print? Give other cpus a chance to do it but
> +		 * reset the variable so that we do printing in the next tick
> +		 * if noone else got to it.
> +		 */
> +		if (cmpxchg(&last_printing_cpu, cpu, -1))
> +			return;
>  
> +		pending = xchg(&printk_pending, 0);
>  		if (test_bit(PRINTK_PENDING_OUTPUT, &pending)) {
> +			bool more_work = true;
> +
>  			if (console_trylock())
> -				console_unlock();
> -			else
> -				/* Try again later */
> +				more_work = __console_unlock();
> +			if (more_work)
>  				set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
>  		}
>  		if (test_bit(PRINTK_PENDING_WAKEUP, &pending))
> @@ -2051,33 +2065,46 @@ out:
>  }
>  
>  /**
> - * console_unlock - unlock the console system
> + * __console_unlock - unlock the console system
>   *
>   * Releases the console_lock which the caller holds on the console system
>   * and the console driver list.
>   *
> - * While the console_lock was held, console output may have been buffered
> - * by printk().  If this is the case, console_unlock(); emits
> - * the output prior to releasing the lock.
> + * While the console_lock was held, console output may have been buffered by
> + * printk(). If this is the case, console_unlock() emits the output prior to
> + * releasing the lock. However we need not write all the data in the buffer if
> + * we would hog the CPU for too long. Function returns true, if there's more
> + * data that needs printing in the buffer.
>   *
>   * If there is output waiting, we wake /dev/kmsg and syslog() users.
>   *
> - * console_unlock(); may be called from any context.
> + * __console_unlock() may be called from any context.
>   */
> -void console_unlock(void)
> +static bool __console_unlock(void)
>  {
>  	static char text[LOG_LINE_MAX + PREFIX_MAX];
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
>  	bool retry;
> +	u64 end_time, now;
> +	int cur_cpu;
>  
>  	if (console_suspended) {
>  		up(&console_sem);
> -		return;
> +		return false;
>  	}
>  
>  	console_may_schedule = 0;
> +	preempt_disable();
> +	cur_cpu = smp_processor_id();
> +	/*
> +	 * We give us some headroom because we check the time only after
> +	 * printing the whole message
> +	 */
> +	end_time = cpu_clock(cur_cpu) +
> +				max_interrupt_disabled_duration() / 2;
> +	preempt_enable();
>  
>  	/* flush buffered message fragment immediately to console */
>  	console_cont_flush(text, sizeof(text));
> @@ -2100,7 +2127,9 @@ again:
>  			console_prev = 0;
>  		}
>  skip:
> -		if (console_seq == log_next_seq)
> +		now = sched_clock_cpu(cur_cpu);
> +		if (console_seq == log_next_seq ||
> +		    (now > end_time && !oops_in_progress))
>  			break;
>  
>  		msg = log_from_idx(console_idx);
> @@ -2143,6 +2172,7 @@ skip:
>  
>  	raw_spin_unlock(&logbuf_lock);
>  
> +	last_printing_cpu = cur_cpu;
>  	up(&console_sem);
>  
>  	/*
> @@ -2155,11 +2185,25 @@ skip:
>  	retry = console_seq != log_next_seq;
>  	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
>  
> -	if (retry && console_trylock())
> +	if (retry && (now <= end_time || oops_in_progress) &&
> +	    console_trylock())
>  		goto again;
>  
>  	if (wake_klogd)
>  		wake_up_klogd();
> +	return retry;
> +}
> +
> +/*
> + * This is a wrapper against __console_unlock() that makes sure the rest of
> + * buffer is printed in future.
> + */
> +void console_unlock(void)
> +{
> +	if (__console_unlock()) {
> +		/* Leave the rest of printing for a timer tick */
> +		set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
> +	}
>  }
>  EXPORT_SYMBOL(console_unlock);
>  
> -- 
> 1.7.1
> 

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2013-02-12 12:22   ` Jan Kara
@ 2013-02-14 18:48     ` Steven Rostedt
  2013-02-15 16:50       ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2013-02-14 18:48 UTC (permalink / raw)
  To: Jan Kara
  Cc: LKML, Frederic Weisbecker, Andrew Morton, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

On Tue, 2013-02-12 at 13:22 +0100, Jan Kara wrote:

> > Anyway, what do you guys think about this version?
>   Steven, Andrew, did you have a chance to look at my patches?

Can you resend without being attachments. Patches sent as attachments
seldom get reviewed.

-- Steve



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2013-02-14 18:48     ` Steven Rostedt
@ 2013-02-15 16:50       ` Jan Kara
  2013-02-15 16:53         ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2013-02-15 16:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jan Kara, LKML, Frederic Weisbecker, Andrew Morton, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

On Thu 14-02-13 13:48:11, Steven Rostedt wrote:
> On Tue, 2013-02-12 at 13:22 +0100, Jan Kara wrote:
> 
> > > Anyway, what do you guys think about this version?
> >   Steven, Andrew, did you have a chance to look at my patches?
> 
> Can you resend without being attachments. Patches sent as attachments
> seldom get reviewed.
  Well, for example I do review text/plain attachments but that may count
into 'seldom' ;) I'll send them inline.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2013-02-15 16:50       ` Jan Kara
@ 2013-02-15 16:53         ` Steven Rostedt
  2013-02-15 17:00           ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2013-02-15 16:53 UTC (permalink / raw)
  To: Jan Kara
  Cc: LKML, Frederic Weisbecker, Andrew Morton, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

On Fri, 2013-02-15 at 17:50 +0100, Jan Kara wrote:
> On Thu 14-02-13 13:48:11, Steven Rostedt wrote:
> > On Tue, 2013-02-12 at 13:22 +0100, Jan Kara wrote:
> > 
> > > > Anyway, what do you guys think about this version?
> > >   Steven, Andrew, did you have a chance to look at my patches?
> > 
> > Can you resend without being attachments. Patches sent as attachments
> > seldom get reviewed.
>   Well, for example I do review text/plain attachments but that may count
> into 'seldom' ;) I'll send them inline.
> 

The problem is that it makes it difficult to reply with comments. I
haven't looked at the patches yet so I may not have comments, but on my
email client, commenting attachments means that I need to cut and paste
the text to do the comment, which isn't the case for inline patches.

-- Steve



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2013-02-06 23:02 ` Jan Kara
  2013-02-12 12:22   ` Jan Kara
@ 2013-02-15 16:54   ` Jan Kara
  2013-02-15 16:57   ` [PATCH 3/3] printk: Avoid softlockups in console_unlock() Jan Kara
  2 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2013-02-15 16:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Frederic Weisbecker, Andrew Morton, Jan Kara, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

Here comes addition to Steven's patch from this thread...

>From bda93cfa4133c41e8f10dcd371d1f025d94b85dc Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 6 Feb 2013 19:15:30 +0100
Subject: [PATCH 2/3] printk: Remove per-cpu printk flags

There's no need to have printk flags percpu. Just make a single variable
operated by atomic operations from it. It also has an advantage that any
cpu can do the printing / wakeup work.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/printk.c |   28 ++++++++++++----------------
 1 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index ed630fa..2fe917d 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1986,28 +1986,24 @@ int is_console_locked(void)
 	return console_locked;
 }
 
-/*
- * Delayed printk version, for scheduler-internal messages:
- */
-#define PRINTK_BUF_SIZE		512
+#define PRINTK_PENDING_WAKEUP	1
+#define PRINTK_PENDING_OUTPUT	2
 
-#define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_OUTPUT	0x02
-
-static DEFINE_PER_CPU(int, printk_pending);
+static unsigned long printk_pending;
 
 void printk_tick(void)
 {
-	if (__this_cpu_read(printk_pending)) {
-		int pending = __this_cpu_xchg(printk_pending, 0);
-		if (pending & PRINTK_PENDING_OUTPUT) {
+	if (printk_pending) {
+		unsigned long pending = xchg(&printk_pending, 0);
+
+		if (test_bit(PRINTK_PENDING_OUTPUT, &pending)) {
 			if (console_trylock())
 				console_unlock();
 			else
 				/* Try again later */
-				__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
+				set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
 		}
-		if (pending & PRINTK_PENDING_WAKEUP)
+		if (test_bit(PRINTK_PENDING_WAKEUP, &pending))
 			wake_up_interruptible(&log_wait);
 	}
 }
@@ -2016,13 +2012,13 @@ int printk_needs_cpu(int cpu)
 {
 	if (cpu_is_offline(cpu))
 		printk_tick();
-	return __this_cpu_read(printk_pending);
+	return printk_pending;
 }
 
 void wake_up_klogd(void)
 {
 	if (waitqueue_active(&log_wait))
-		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
+		set_bit(PRINTK_PENDING_WAKEUP, &printk_pending);
 }
 
 static void console_cont_flush(char *text, size_t size)
@@ -2497,7 +2493,7 @@ int printk_sched(const char *fmt, ...)
 	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
+	set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
 
 	return r;
 }
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] printk: Avoid softlockups in console_unlock()
  2013-02-06 23:02 ` Jan Kara
  2013-02-12 12:22   ` Jan Kara
  2013-02-15 16:54   ` Jan Kara
@ 2013-02-15 16:57   ` Jan Kara
  2013-02-15 22:22     ` Andrew Morton
  2 siblings, 1 reply; 12+ messages in thread
From: Jan Kara @ 2013-02-15 16:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Frederic Weisbecker, Andrew Morton, Jan Kara, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

A CPU can be caught in console_unlock() for a long time (tens of seconds are
reported by our customers) when other CPUs are using printk heavily and serial
console makes printing slow. Despite serial console drivers are calling
touch_nmi_watchdog() this triggers softlockup warnings because
interrupts are disabled for the whole time console_unlock() runs (e.g.
vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
cannot be processed and other CPUs get stuck spinning in calls like
smp_call_function_many(). Also RCU eventually starts reporting lockups.

In my artifical testing I also managed to trigger a situation when disk
disappeared from the system apparently because commands to / from it
could not be delivered for long enough. This is why just silencing
watchdogs isn't a reliable solution to the problem and we simply have to
avoid spending too long in console_unlock().

We fix the issue by limiting the time we spend in console_unlock() to
watchdog_thresh() / 4 (unless we are in an early boot stage or oops is
happening). The rest of the buffer will be printed either by further
callers to printk() or during next timer tick.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 include/linux/nmi.h |   18 +++++++++++++
 kernel/printk.c     |   70 +++++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 75 insertions(+), 13 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index db50840..ce32959 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -51,6 +51,24 @@ extern int watchdog_thresh;
 struct ctl_table;
 extern int proc_dowatchdog(struct ctl_table *, int ,
 			   void __user *, size_t *, loff_t *);
+/*
+ * Return the maximum number of nanosecond for which interrupts may be disabled
+ * on the current CPU
+ */
+static inline u64 max_interrupt_disabled_duration(void)
+{
+	/*
+	 * We give us some headroom because timers need time to fire before the
+	 * watchdog period expires.
+	 */
+	return ((u64)watchdog_thresh) * NSEC_PER_SEC / 2;
+}
+#else
+static inline u64 max_interrupt_disabled_duration(void)
+{
+	/* About the value we'd get with the default watchdog setting */
+	return 5ULL * NSEC_PER_SEC;
+}
 #endif
 
 #endif
diff --git a/kernel/printk.c b/kernel/printk.c
index 2fe917d..c4f7f6f 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1990,17 +1990,31 @@ int is_console_locked(void)
 #define PRINTK_PENDING_OUTPUT	2
 
 static unsigned long printk_pending;
+static int last_printing_cpu = -1;
+
+static bool __console_unlock(void);
 
 void printk_tick(void)
 {
 	if (printk_pending) {
-		unsigned long pending = xchg(&printk_pending, 0);
+		unsigned long pending;
+		int cpu = smp_processor_id();
+
+		/*
+		 * Did we last print? Give other cpus a chance to do it but
+		 * reset the variable so that we do printing in the next tick
+		 * if noone else got to it.
+		 */
+		if (cmpxchg(&last_printing_cpu, cpu, -1))
+			return;
 
+		pending = xchg(&printk_pending, 0);
 		if (test_bit(PRINTK_PENDING_OUTPUT, &pending)) {
+			bool more_work = true;
+
 			if (console_trylock())
-				console_unlock();
-			else
-				/* Try again later */
+				more_work = __console_unlock();
+			if (more_work)
 				set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
 		}
 		if (test_bit(PRINTK_PENDING_WAKEUP, &pending))
@@ -2051,33 +2065,46 @@ out:
 }
 
 /**
- * console_unlock - unlock the console system
+ * __console_unlock - unlock the console system
  *
  * Releases the console_lock which the caller holds on the console system
  * and the console driver list.
  *
- * While the console_lock was held, console output may have been buffered
- * by printk().  If this is the case, console_unlock(); emits
- * the output prior to releasing the lock.
+ * While the console_lock was held, console output may have been buffered by
+ * printk(). If this is the case, console_unlock() emits the output prior to
+ * releasing the lock. However we need not write all the data in the buffer if
+ * we would hog the CPU for too long. Function returns true, if there's more
+ * data that needs printing in the buffer.
  *
  * If there is output waiting, we wake /dev/kmsg and syslog() users.
  *
- * console_unlock(); may be called from any context.
+ * __console_unlock() may be called from any context.
  */
-void console_unlock(void)
+static bool __console_unlock(void)
 {
 	static char text[LOG_LINE_MAX + PREFIX_MAX];
 	static u64 seen_seq;
 	unsigned long flags;
 	bool wake_klogd = false;
 	bool retry;
+	u64 end_time, now;
+	int cur_cpu;
 
 	if (console_suspended) {
 		up(&console_sem);
-		return;
+		return false;
 	}
 
 	console_may_schedule = 0;
+	preempt_disable();
+	cur_cpu = smp_processor_id();
+	/*
+	 * We give us some headroom because we check the time only after
+	 * printing the whole message
+	 */
+	end_time = cpu_clock(cur_cpu) +
+				max_interrupt_disabled_duration() / 2;
+	preempt_enable();
 
 	/* flush buffered message fragment immediately to console */
 	console_cont_flush(text, sizeof(text));
@@ -2100,7 +2127,9 @@ again:
 			console_prev = 0;
 		}
 skip:
-		if (console_seq == log_next_seq)
+		now = sched_clock_cpu(cur_cpu);
+		if (console_seq == log_next_seq ||
+		    (now > end_time && !oops_in_progress))
 			break;
 
 		msg = log_from_idx(console_idx);
@@ -2143,6 +2172,7 @@ skip:
 
 	raw_spin_unlock(&logbuf_lock);
 
+	last_printing_cpu = cur_cpu;
 	up(&console_sem);
 
 	/*
@@ -2155,11 +2185,25 @@ skip:
 	retry = console_seq != log_next_seq;
 	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
-	if (retry && console_trylock())
+	if (retry && (now <= end_time || oops_in_progress) &&
+	    console_trylock())
 		goto again;
 
 	if (wake_klogd)
 		wake_up_klogd();
+	return retry;
+}
+
+/*
+ * This is a wrapper against __console_unlock() that makes sure the rest of
+ * buffer is printed in future.
+ */
+void console_unlock(void)
+{
+	if (__console_unlock()) {
+		/* Leave the rest of printing for a timer tick */
+		set_bit(PRINTK_PENDING_OUTPUT, &printk_pending);
+	}
 }
 EXPORT_SYMBOL(console_unlock);
 
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2013-02-15 16:53         ` Steven Rostedt
@ 2013-02-15 17:00           ` Jan Kara
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2013-02-15 17:00 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jan Kara, LKML, Frederic Weisbecker, Andrew Morton, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

On Fri 15-02-13 11:53:37, Steven Rostedt wrote:
> On Fri, 2013-02-15 at 17:50 +0100, Jan Kara wrote:
> > On Thu 14-02-13 13:48:11, Steven Rostedt wrote:
> > > On Tue, 2013-02-12 at 13:22 +0100, Jan Kara wrote:
> > > 
> > > > > Anyway, what do you guys think about this version?
> > > >   Steven, Andrew, did you have a chance to look at my patches?
> > > 
> > > Can you resend without being attachments. Patches sent as attachments
> > > seldom get reviewed.
> >   Well, for example I do review text/plain attachments but that may count
> > into 'seldom' ;) I'll send them inline.
> > 
> 
> The problem is that it makes it difficult to reply with comments. I
> haven't looked at the patches yet so I may not have comments, but on my
> email client, commenting attachments means that I need to cut and paste
> the text to do the comment, which isn't the case for inline patches.
  Yeah, I understand. I use mutt and it automatically includes text/plain
attachments into reply for me so I barely notice a difference between
inline and attached patch. But with other clients it may be a hassle...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] printk: Avoid softlockups in console_unlock()
  2013-02-15 16:57   ` [PATCH 3/3] printk: Avoid softlockups in console_unlock() Jan Kara
@ 2013-02-15 22:22     ` Andrew Morton
  2013-02-18 16:31       ` Jan Kara
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2013-02-15 22:22 UTC (permalink / raw)
  To: Jan Kara
  Cc: Steven Rostedt, LKML, Frederic Weisbecker, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

On Fri, 15 Feb 2013 17:57:10 +0100
Jan Kara <jack@suse.cz> wrote:

> A CPU can be caught in console_unlock() for a long time (tens of seconds are
> reported by our customers) when other CPUs are using printk heavily and serial
> console makes printing slow. Despite serial console drivers are calling
> touch_nmi_watchdog() this triggers softlockup warnings because
> interrupts are disabled for the whole time console_unlock() runs (e.g.
> vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
> cannot be processed and other CPUs get stuck spinning in calls like
> smp_call_function_many(). Also RCU eventually starts reporting lockups.
> 
> In my artifical testing I also managed to trigger a situation when disk
> disappeared from the system apparently because commands to / from it
> could not be delivered for long enough. This is why just silencing
> watchdogs isn't a reliable solution to the problem and we simply have to
> avoid spending too long in console_unlock().
> 
> We fix the issue by limiting the time we spend in console_unlock() to
> watchdog_thresh() / 4 (unless we are in an early boot stage or oops is
> happening). The rest of the buffer will be printed either by further
> callers to printk() or during next timer tick.
> 

It still gives me tummy ache :(

The patch adds additional tests of oops_in_progress.  Some description
of your thinking on that matter would be appropriate?

> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -1990,17 +1990,31 @@ int is_console_locked(void)
>  #define PRINTK_PENDING_OUTPUT	2
>  
>  static unsigned long printk_pending;
> +static int last_printing_cpu = -1;
> +
> +static bool __console_unlock(void);
>  
>  void printk_tick(void)

printk_tick() no longer exists in linux-next.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] printk: Avoid softlockups in console_unlock()
  2013-02-15 22:22     ` Andrew Morton
@ 2013-02-18 16:31       ` Jan Kara
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Kara @ 2013-02-18 16:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, Steven Rostedt, LKML, Frederic Weisbecker, jslaby,
	Greg Kroah-Hartman, Ingo Molnar, Peter Zijlstra, kay.sievers

On Fri 15-02-13 14:22:19, Andrew Morton wrote:
> On Fri, 15 Feb 2013 17:57:10 +0100
> Jan Kara <jack@suse.cz> wrote:
> 
> > A CPU can be caught in console_unlock() for a long time (tens of seconds are
> > reported by our customers) when other CPUs are using printk heavily and serial
> > console makes printing slow. Despite serial console drivers are calling
> > touch_nmi_watchdog() this triggers softlockup warnings because
> > interrupts are disabled for the whole time console_unlock() runs (e.g.
> > vprintk() calls console_unlock() with interrupts disabled). Thus IPIs
> > cannot be processed and other CPUs get stuck spinning in calls like
> > smp_call_function_many(). Also RCU eventually starts reporting lockups.
> > 
> > In my artifical testing I also managed to trigger a situation when disk
> > disappeared from the system apparently because commands to / from it
> > could not be delivered for long enough. This is why just silencing
> > watchdogs isn't a reliable solution to the problem and we simply have to
> > avoid spending too long in console_unlock().
> > 
> > We fix the issue by limiting the time we spend in console_unlock() to
> > watchdog_thresh() / 4 (unless we are in an early boot stage or oops is
> > happening). The rest of the buffer will be printed either by further
> > callers to printk() or during next timer tick.
> > 
> 
> It still gives me tummy ache :(
  But it's better than it used to be, isn't it? At least I like this
version more than the one with postponing to worker thread since we only
depend on timer ticks to occur...

> The patch adds additional tests of oops_in_progress.  Some description
> of your thinking on that matter would be appropriate?
  Good point, I'll add that. My thinking was that when we are oopsing, all
bets are off and we want to get the messages to console as reliably as
possible and we don't care about soflockups anymore as we have bigger
trouble anyway.

> > --- a/kernel/printk.c
> > +++ b/kernel/printk.c
> > @@ -1990,17 +1990,31 @@ int is_console_locked(void)
> >  #define PRINTK_PENDING_OUTPUT	2
> >  
> >  static unsigned long printk_pending;
> > +static int last_printing_cpu = -1;
> > +
> > +static bool __console_unlock(void);
> >  
> >  void printk_tick(void)
> 
> printk_tick() no longer exists in linux-next.
  Thanks for notice, I'll rebase and fix this up.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-02-18 16:31 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-06  1:05 [RFC][PATCH] printk: Remove separate printk_sched buffers and use printk buf instead Steven Rostedt
2013-02-06 14:32 ` Jan Kara
2013-02-06 23:02 ` Jan Kara
2013-02-12 12:22   ` Jan Kara
2013-02-14 18:48     ` Steven Rostedt
2013-02-15 16:50       ` Jan Kara
2013-02-15 16:53         ` Steven Rostedt
2013-02-15 17:00           ` Jan Kara
2013-02-15 16:54   ` Jan Kara
2013-02-15 16:57   ` [PATCH 3/3] printk: Avoid softlockups in console_unlock() Jan Kara
2013-02-15 22:22     ` Andrew Morton
2013-02-18 16:31       ` Jan Kara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.