linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
@ 2014-05-05 23:18 Steven Rostedt
  2014-05-05 23:33 ` Joe Perches
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Steven Rostedt @ 2014-05-05 23:18 UTC (permalink / raw)
  To: LKML
  Cc: Andrew Morton, Frederic Weisbecker, pmladek, Jan Kara,
	Peter Zijlstra, John Stultz, Paul Gortmaker

To prevent deadlocks with doing a printk inside the scheduler,
printk_sched() was created. The issue is that printk has a console_sem
that it can grab and release. The release does a wake up if there's a
task pending on the sem, and this wake up grabs the rq locks that is
held in the scheduler. This leads to a possible deadlock if the wake up
uses the same rq as the one with the rq lock held already.

What printk_sched() does is to save the printk write in a per cpu buffer
and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
set, the printk() is done against the buffer.

There's a couple of issues with this approach.

1) If two printk_sched()s are called before the tick, the second one
will overwrite the first one.

2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
bit of space wasted for something that is seldom used.

In order to remove this, the printk_sched() can use the printk buffer
instead, and delay the console_trylock()/console_unlock() to the queued
work.

Because printk_sched() would then be taking the logbuf_lock, the
logbuf_lock must not be held while doing anything that may call into the
scheduler functions, which includes wake ups. Unfortunately, printk()
also has a console_sem that it uses, and on release, the
up(&console_sem) may do a wake up of any pending waiters. This must be
avoided while holding the logbuf_lock.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
This version has been forward ported to the 3.15-rc releases.
---
 kernel/printk/printk.c |   87 +++++++++++++++++++++++++++++++------------------
 1 file changed, 56 insertions(+), 31 deletions(-)

Index: linux-trace.git/kernel/printk/printk.c
===================================================================
--- linux-trace.git.orig/kernel/printk/printk.c	2014-05-05 16:46:17.280815365 -0400
+++ linux-trace.git/kernel/printk/printk.c	2014-05-05 16:52:23.398378897 -0400
@@ -208,6 +208,9 @@
 /*
  * The logbuf_lock protects kmsg buffer, indices, counters. It is also
  * used in interesting ways to provide interlocking in console_unlock();
+ * This can be taken within the scheduler's rq lock. It must be released
+ * before calling console_unlock() or anything else that might wake up
+ * a process.
  */
 static DEFINE_RAW_SPINLOCK(logbuf_lock);
 
@@ -1338,27 +1341,43 @@
  * interrupts disabled. It should return with 'lockbuf_lock'
  * released but interrupts still disabled.
  */
-static int console_trylock_for_printk(unsigned int cpu)
+static int console_trylock_for_printk(unsigned int cpu, bool in_sched)
 	__releases(&logbuf_lock)
 {
 	int retval = 0, wake = 0;
 
-	if (console_trylock()) {
-		retval = 1;
+	/* if called from the scheduler, we can not call up() */
+	if (in_sched)
+		goto out;
 
-		/*
-		 * If we can't use the console, we need to release
-		 * the console semaphore by hand to avoid flushing
-		 * the buffer. We need to hold the console semaphore
-		 * in order to do this test safely.
-		 */
-		if (!can_use_console(cpu)) {
-			console_locked = 0;
-			wake = 1;
-			retval = 0;
-		}
+	if (down_trylock(&console_sem))
+		goto out;
+
+	/*
+	 * If we can't use the console, we need to release
+	 * the console semaphore by hand to avoid flushing
+	 * the buffer. We need to hold the console semaphore
+	 * in order to do this test safely.
+	 */
+	if (console_suspended || !can_use_console(cpu)) {
+		wake = 1;
+		goto out;
 	}
+
+	/* console is now locked */
+
+	console_locked = 1;
+	console_may_schedule = 0;
+	mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_);
+
+	retval = 1;
+
+out:
 	logbuf_cpu = UINT_MAX;
+	/*
+	 * The logbuf_lock must not be held when doing a wake up,
+	 * which the up(&console_sem) can do.
+	 */
 	raw_spin_unlock(&logbuf_lock);
 	if (wake)
 		up(&console_sem);
@@ -1490,11 +1509,17 @@
 	static int recursion_bug;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
-	size_t text_len;
+	size_t text_len = 0;
 	enum log_flags lflags = 0;
 	unsigned long flags;
 	int this_cpu;
 	int printed_len = 0;
+	bool in_sched = false;
+
+	if (level == -2) {
+		level = -1;
+		in_sched = true;
+	}
 
 	boot_delay_msec(level);
 	printk_delay();
@@ -1540,7 +1565,12 @@
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
 	 */
-	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
+	if (in_sched)
+		text_len = scnprintf(text, sizeof(textbuf),
+				     KERN_WARNING "[sched_delayed] ");
+
+	text_len += vscnprintf(text + text_len,
+			       sizeof(textbuf) - text_len, fmt, args);
 
 	/* mark and strip a trailing newline */
 	if (text_len && text[text_len-1] == '\n') {
@@ -1621,7 +1651,7 @@
 	 * The console_trylock_for_printk() function will release 'logbuf_lock'
 	 * regardless of whether it actually gets the console semaphore or not.
 	 */
-	if (console_trylock_for_printk(this_cpu))
+	if (console_trylock_for_printk(this_cpu, in_sched))
 		console_unlock();
 
 	lockdep_on();
@@ -2440,18 +2470,20 @@
 #define PRINTK_BUF_SIZE		512
 
 #define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_SCHED	0x02
+#define PRINTK_PENDING_OUTPUT	0x02
 
 static DEFINE_PER_CPU(int, printk_pending);
-static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
 
 static void wake_up_klogd_work_func(struct irq_work *irq_work)
 {
 	int pending = __this_cpu_xchg(printk_pending, 0);
 
-	if (pending & PRINTK_PENDING_SCHED) {
-		char *buf = __get_cpu_var(printk_sched_buf);
-		pr_warn("[sched_delayed] %s", buf);
+	if (pending & PRINTK_PENDING_OUTPUT) {
+		if (console_trylock())
+			console_unlock();
+		else
+			/* Try again later */
+			__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
 	}
 
 	if (pending & PRINTK_PENDING_WAKEUP)
@@ -2475,21 +2507,14 @@
 
 int printk_sched(const char *fmt, ...)
 {
-	unsigned long flags;
 	va_list args;
-	char *buf;
 	int r;
 
-	local_irq_save(flags);
-	buf = __get_cpu_var(printk_sched_buf);
-
 	va_start(args, fmt);
-	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
+	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
-	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
-	local_irq_restore(flags);
+	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
 
 	return r;
 }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-05 23:18 [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead Steven Rostedt
@ 2014-05-05 23:33 ` Joe Perches
  2014-05-05 23:55   ` Steven Rostedt
  2014-05-06  9:45 ` Jan Kara
  2014-05-07  9:13 ` Petr Mládek
  2 siblings, 1 reply; 13+ messages in thread
From: Joe Perches @ 2014-05-05 23:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Andrew Morton, Frederic Weisbecker, pmladek, Jan Kara,
	Peter Zijlstra, John Stultz, Paul Gortmaker

On Mon, 2014-05-05 at 19:18 -0400, Steven Rostedt wrote:
> To prevent deadlocks with doing a printk inside the scheduler,
> printk_sched() was created. The issue is that printk has a console_sem
> that it can grab and release. The release does a wake up if there's a
> task pending on the sem, and this wake up grabs the rq locks that is
> held in the scheduler. This leads to a possible deadlock if the wake up
> uses the same rq as the one with the rq lock held already.
[]
>  kernel/printk/printk.c |   87 +++++++++++++++++++++++++++++++------------------
[]
> @@ -2440,18 +2470,20 @@
>  #define PRINTK_BUF_SIZE		512

Wouldn't this define be unused and shouldn't it be deleted too?



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-05 23:33 ` Joe Perches
@ 2014-05-05 23:55   ` Steven Rostedt
  0 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2014-05-05 23:55 UTC (permalink / raw)
  To: Joe Perches
  Cc: LKML, Andrew Morton, Frederic Weisbecker, pmladek, Jan Kara,
	Peter Zijlstra, John Stultz, Paul Gortmaker

On Mon, 05 May 2014 16:33:51 -0700
Joe Perches <joe@perches.com> wrote:

> On Mon, 2014-05-05 at 19:18 -0400, Steven Rostedt wrote:
> > To prevent deadlocks with doing a printk inside the scheduler,
> > printk_sched() was created. The issue is that printk has a console_sem
> > that it can grab and release. The release does a wake up if there's a
> > task pending on the sem, and this wake up grabs the rq locks that is
> > held in the scheduler. This leads to a possible deadlock if the wake up
> > uses the same rq as the one with the rq lock held already.
> []
> >  kernel/printk/printk.c |   87 +++++++++++++++++++++++++++++++------------------
> []
> > @@ -2440,18 +2470,20 @@
> >  #define PRINTK_BUF_SIZE		512
> 
> Wouldn't this define be unused and shouldn't it be deleted too?
> 

I guess it would.

-- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-05 23:18 [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead Steven Rostedt
  2014-05-05 23:33 ` Joe Perches
@ 2014-05-06  9:45 ` Jan Kara
  2014-05-06 11:04   ` Steven Rostedt
  2014-05-07  9:13 ` Petr Mládek
  2 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2014-05-06  9:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Andrew Morton, Frederic Weisbecker, pmladek, Jan Kara,
	Peter Zijlstra, John Stultz, Paul Gortmaker

On Mon 05-05-14 19:18:46, Steven Rostedt wrote:
> To prevent deadlocks with doing a printk inside the scheduler,
> printk_sched() was created. The issue is that printk has a console_sem
> that it can grab and release. The release does a wake up if there's a
> task pending on the sem, and this wake up grabs the rq locks that is
> held in the scheduler. This leads to a possible deadlock if the wake up
> uses the same rq as the one with the rq lock held already.
> 
> What printk_sched() does is to save the printk write in a per cpu buffer
> and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> set, the printk() is done against the buffer.
> 
> There's a couple of issues with this approach.
> 
> 1) If two printk_sched()s are called before the tick, the second one
> will overwrite the first one.
> 
> 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> bit of space wasted for something that is seldom used.
> 
> In order to remove this, the printk_sched() can use the printk buffer
> instead, and delay the console_trylock()/console_unlock() to the queued
> work.
> 
> Because printk_sched() would then be taking the logbuf_lock, the
> logbuf_lock must not be held while doing anything that may call into the
> scheduler functions, which includes wake ups. Unfortunately, printk()
> also has a console_sem that it uses, and on release, the
> up(&console_sem) may do a wake up of any pending waiters. This must be
> avoided while holding the logbuf_lock.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  Andrew already has this patch in -mm tree AFAIK...

								Honza

> ---
> This version has been forward ported to the 3.15-rc releases.
> ---
>  kernel/printk/printk.c |   87 +++++++++++++++++++++++++++++++------------------
>  1 file changed, 56 insertions(+), 31 deletions(-)
> 
> Index: linux-trace.git/kernel/printk/printk.c
> ===================================================================
> --- linux-trace.git.orig/kernel/printk/printk.c	2014-05-05 16:46:17.280815365 -0400
> +++ linux-trace.git/kernel/printk/printk.c	2014-05-05 16:52:23.398378897 -0400
> @@ -208,6 +208,9 @@
>  /*
>   * The logbuf_lock protects kmsg buffer, indices, counters. It is also
>   * used in interesting ways to provide interlocking in console_unlock();
> + * This can be taken within the scheduler's rq lock. It must be released
> + * before calling console_unlock() or anything else that might wake up
> + * a process.
>   */
>  static DEFINE_RAW_SPINLOCK(logbuf_lock);
>  
> @@ -1338,27 +1341,43 @@
>   * interrupts disabled. It should return with 'lockbuf_lock'
>   * released but interrupts still disabled.
>   */
> -static int console_trylock_for_printk(unsigned int cpu)
> +static int console_trylock_for_printk(unsigned int cpu, bool in_sched)
>  	__releases(&logbuf_lock)
>  {
>  	int retval = 0, wake = 0;
>  
> -	if (console_trylock()) {
> -		retval = 1;
> +	/* if called from the scheduler, we can not call up() */
> +	if (in_sched)
> +		goto out;
>  
> -		/*
> -		 * If we can't use the console, we need to release
> -		 * the console semaphore by hand to avoid flushing
> -		 * the buffer. We need to hold the console semaphore
> -		 * in order to do this test safely.
> -		 */
> -		if (!can_use_console(cpu)) {
> -			console_locked = 0;
> -			wake = 1;
> -			retval = 0;
> -		}
> +	if (down_trylock(&console_sem))
> +		goto out;
> +
> +	/*
> +	 * If we can't use the console, we need to release
> +	 * the console semaphore by hand to avoid flushing
> +	 * the buffer. We need to hold the console semaphore
> +	 * in order to do this test safely.
> +	 */
> +	if (console_suspended || !can_use_console(cpu)) {
> +		wake = 1;
> +		goto out;
>  	}
> +
> +	/* console is now locked */
> +
> +	console_locked = 1;
> +	console_may_schedule = 0;
> +	mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_);
> +
> +	retval = 1;
> +
> +out:
>  	logbuf_cpu = UINT_MAX;
> +	/*
> +	 * The logbuf_lock must not be held when doing a wake up,
> +	 * which the up(&console_sem) can do.
> +	 */
>  	raw_spin_unlock(&logbuf_lock);
>  	if (wake)
>  		up(&console_sem);
> @@ -1490,11 +1509,17 @@
>  	static int recursion_bug;
>  	static char textbuf[LOG_LINE_MAX];
>  	char *text = textbuf;
> -	size_t text_len;
> +	size_t text_len = 0;
>  	enum log_flags lflags = 0;
>  	unsigned long flags;
>  	int this_cpu;
>  	int printed_len = 0;
> +	bool in_sched = false;
> +
> +	if (level == -2) {
> +		level = -1;
> +		in_sched = true;
> +	}
>  
>  	boot_delay_msec(level);
>  	printk_delay();
> @@ -1540,7 +1565,12 @@
>  	 * The printf needs to come first; we need the syslog
>  	 * prefix which might be passed-in as a parameter.
>  	 */
> -	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
> +	if (in_sched)
> +		text_len = scnprintf(text, sizeof(textbuf),
> +				     KERN_WARNING "[sched_delayed] ");
> +
> +	text_len += vscnprintf(text + text_len,
> +			       sizeof(textbuf) - text_len, fmt, args);
>  
>  	/* mark and strip a trailing newline */
>  	if (text_len && text[text_len-1] == '\n') {
> @@ -1621,7 +1651,7 @@
>  	 * The console_trylock_for_printk() function will release 'logbuf_lock'
>  	 * regardless of whether it actually gets the console semaphore or not.
>  	 */
> -	if (console_trylock_for_printk(this_cpu))
> +	if (console_trylock_for_printk(this_cpu, in_sched))
>  		console_unlock();
>  
>  	lockdep_on();
> @@ -2440,18 +2470,20 @@
>  #define PRINTK_BUF_SIZE		512
>  
>  #define PRINTK_PENDING_WAKEUP	0x01
> -#define PRINTK_PENDING_SCHED	0x02
> +#define PRINTK_PENDING_OUTPUT	0x02
>  
>  static DEFINE_PER_CPU(int, printk_pending);
> -static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
>  
>  static void wake_up_klogd_work_func(struct irq_work *irq_work)
>  {
>  	int pending = __this_cpu_xchg(printk_pending, 0);
>  
> -	if (pending & PRINTK_PENDING_SCHED) {
> -		char *buf = __get_cpu_var(printk_sched_buf);
> -		pr_warn("[sched_delayed] %s", buf);
> +	if (pending & PRINTK_PENDING_OUTPUT) {
> +		if (console_trylock())
> +			console_unlock();
> +		else
> +			/* Try again later */
> +			__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
>  	}
>  
>  	if (pending & PRINTK_PENDING_WAKEUP)
> @@ -2475,21 +2507,14 @@
>  
>  int printk_sched(const char *fmt, ...)
>  {
> -	unsigned long flags;
>  	va_list args;
> -	char *buf;
>  	int r;
>  
> -	local_irq_save(flags);
> -	buf = __get_cpu_var(printk_sched_buf);
> -
>  	va_start(args, fmt);
> -	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
> +	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
>  	va_end(args);
>  
> -	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
> -	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
> -	local_irq_restore(flags);
> +	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
>  
>  	return r;
>  }
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-06  9:45 ` Jan Kara
@ 2014-05-06 11:04   ` Steven Rostedt
  2014-05-06 23:37     ` Andrew Morton
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2014-05-06 11:04 UTC (permalink / raw)
  To: Jan Kara
  Cc: LKML, Andrew Morton, Frederic Weisbecker, pmladek,
	Peter Zijlstra, John Stultz, Paul Gortmaker

On Tue, 6 May 2014 11:45:57 +0200
Jan Kara <jack@suse.cz> wrote:

> On Mon 05-05-14 19:18:46, Steven Rostedt wrote:
> > To prevent deadlocks with doing a printk inside the scheduler,
> > printk_sched() was created. The issue is that printk has a console_sem
> > that it can grab and release. The release does a wake up if there's a
> > task pending on the sem, and this wake up grabs the rq locks that is
> > held in the scheduler. This leads to a possible deadlock if the wake up
> > uses the same rq as the one with the rq lock held already.
> > 
> > What printk_sched() does is to save the printk write in a per cpu buffer
> > and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> > set, the printk() is done against the buffer.
> > 
> > There's a couple of issues with this approach.
> > 
> > 1) If two printk_sched()s are called before the tick, the second one
> > will overwrite the first one.
> > 
> > 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> > bit of space wasted for something that is seldom used.
> > 
> > In order to remove this, the printk_sched() can use the printk buffer
> > instead, and delay the console_trylock()/console_unlock() to the queued
> > work.
> > 
> > Because printk_sched() would then be taking the logbuf_lock, the
> > logbuf_lock must not be held while doing anything that may call into the
> > scheduler functions, which includes wake ups. Unfortunately, printk()
> > also has a console_sem that it uses, and on release, the
> > up(&console_sem) may do a wake up of any pending waiters. This must be
> > avoided while holding the logbuf_lock.
> > 
> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
>   Andrew already has this patch in -mm tree AFAIK...

Ah, and somebody told me that it wasn't in linux-next. But looking at
it now, it seems to be.

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-06 11:04   ` Steven Rostedt
@ 2014-05-06 23:37     ` Andrew Morton
  2014-05-06 23:46       ` Steven Rostedt
  2014-05-07  0:05       ` Steven Rostedt
  0 siblings, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2014-05-06 23:37 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jan Kara, LKML, Frederic Weisbecker, pmladek, Peter Zijlstra,
	John Stultz, Paul Gortmaker

On Tue, 6 May 2014 07:04:46 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 6 May 2014 11:45:57 +0200
> Jan Kara <jack@suse.cz> wrote:
> 
> > On Mon 05-05-14 19:18:46, Steven Rostedt wrote:
> > > To prevent deadlocks with doing a printk inside the scheduler,
> > > printk_sched() was created. The issue is that printk has a console_sem
> > > that it can grab and release. The release does a wake up if there's a
> > > task pending on the sem, and this wake up grabs the rq locks that is
> > > held in the scheduler. This leads to a possible deadlock if the wake up
> > > uses the same rq as the one with the rq lock held already.
> > > 
> > > What printk_sched() does is to save the printk write in a per cpu buffer
> > > and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> > > set, the printk() is done against the buffer.
> > > 
> > > There's a couple of issues with this approach.
> > > 
> > > 1) If two printk_sched()s are called before the tick, the second one
> > > will overwrite the first one.
> > > 
> > > 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> > > bit of space wasted for something that is seldom used.
> > > 
> > > In order to remove this, the printk_sched() can use the printk buffer
> > > instead, and delay the console_trylock()/console_unlock() to the queued
> > > work.
> > > 
> > > Because printk_sched() would then be taking the logbuf_lock, the
> > > logbuf_lock must not be held while doing anything that may call into the
> > > scheduler functions, which includes wake ups. Unfortunately, printk()
> > > also has a console_sem that it uses, and on release, the
> > > up(&console_sem) may do a wake up of any pending waiters. This must be
> > > avoided while holding the logbuf_lock.
> > > 
> > > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> >   Andrew already has this patch in -mm tree AFAIK...
> 
> Ah, and somebody told me that it wasn't in linux-next. But looking at
> it now, it seems to be.
> 

What I have in -next is very different from this version of the patch. 
What's happening?


From: Steven Rostedt <rostedt@goodmis.org>
Subject: printk: remove separate printk_sched buffers and use printk buf instead

To prevent deadlocks with doing a printk inside the scheduler,
printk_sched() was created.  The issue is that printk has a console_sem
that it can grab and release.  The release does a wake up if there's a
task pending on the sem, and this wake up grabs the rq locks that is held
in the scheduler.  This leads to a possible deadlock if the wake up uses
the same rq as the one with the rq lock held already.

What printk_sched() does is to save the printk write in a per cpu buffer
and sets the PRINTK_PENDING_SCHED flag.  On a timer tick, if this flag is
set, the printk() is done against the buffer.

There's a couple of issues with this approach.

1) If two printk_sched()s are called before the tick, the second one
   will overwrite the first one.

2) The temporary buffer is 512 bytes and is per cpu.  This is a quite a
   bit of space wasted for something that is seldom used.

In order to remove this, the printk_sched() can use the printk buffer
instead, and delay the console_trylock()/console_unlock() to the queued
work.

Because printk_sched() would then be taking the logbuf_lock, the
logbuf_lock must not be held while doing anything that may call into the
scheduler functions, which includes wake ups.  Unfortunately, printk()
also has a console_sem that it uses, and on release, the up(&console_sem)
may do a wake up of any pending waiters.  This must be avoided while
holding the logbuf_lock.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/printk/printk.c |   47 ++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 18 deletions(-)

diff -puN kernel/printk/printk.c~printk-remove-separate-printk_sched-buffers-and-use-printk-buf-instead kernel/printk/printk.c
--- a/kernel/printk/printk.c~printk-remove-separate-printk_sched-buffers-and-use-printk-buf-instead
+++ a/kernel/printk/printk.c
@@ -68,6 +68,9 @@ int console_printk[4] = {
 	DEFAULT_CONSOLE_LOGLEVEL,	/* default_console_loglevel */
 };
 
+/* Deferred messaged from sched code are marked by this special level */
+#define SCHED_MESSAGE_LOGLEVEL -2
+
 /*
  * Low level drivers may need that to know if they can schedule in
  * their unblank() callback or not. So let's export it.
@@ -229,7 +232,9 @@ struct printk_log {
 };
 
 /*
- * The logbuf_lock protects kmsg buffer, indices, counters.
+ * The logbuf_lock protects kmsg buffer, indices, counters.  This can be taken
+ * within the scheduler's rq lock. It must be released before calling
+ * console_unlock() or anything else that might wake up a process.
  */
 static DEFINE_RAW_SPINLOCK(logbuf_lock);
 
@@ -1577,14 +1582,19 @@ asmlinkage int vprintk_emit(int facility
 	static int recursion_bug;
 	static char textbuf[LOG_LINE_MAX];
 	char *text = textbuf;
-	size_t text_len;
+	size_t text_len = 0;
 	enum log_flags lflags = 0;
 	unsigned long flags;
 	int this_cpu;
 	int printed_len = 0;
+	bool in_sched = false;
 	/* cpu currently holding logbuf_lock in this function */
 	static volatile unsigned int logbuf_cpu = UINT_MAX;
 
+	if (level == SCHED_MESSAGE_LOGLEVEL) {
+		level = -1;
+		in_sched = true;
+	}
 
 	boot_delay_msec(level);
 	printk_delay();
@@ -1631,7 +1641,12 @@ asmlinkage int vprintk_emit(int facility
 	 * The printf needs to come first; we need the syslog
 	 * prefix which might be passed-in as a parameter.
 	 */
-	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
+	if (in_sched)
+		text_len = scnprintf(text, sizeof(textbuf),
+				     KERN_WARNING "[sched_delayed] ");
+
+	text_len += vscnprintf(text + text_len,
+			       sizeof(textbuf) - text_len, fmt, args);
 
 	/* mark and strip a trailing newline */
 	if (text_len && text[text_len-1] == '\n') {
@@ -1713,6 +1728,10 @@ asmlinkage int vprintk_emit(int facility
 	lockdep_on();
 	local_irq_restore(flags);
 
+	/* If called from the scheduler, we can not call up(). */
+	if (in_sched)
+		return printed_len;
+
 	/*
 	 * Disable preemption to avoid being preempted while holding
 	 * console_sem which would prevent anyone from printing to console
@@ -2531,21 +2550,19 @@ late_initcall(printk_late_init);
 /*
  * Delayed printk version, for scheduler-internal messages:
  */
-#define PRINTK_BUF_SIZE		512
-
 #define PRINTK_PENDING_WAKEUP	0x01
-#define PRINTK_PENDING_SCHED	0x02
+#define PRINTK_PENDING_OUTPUT	0x02
 
 static DEFINE_PER_CPU(int, printk_pending);
-static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
 
 static void wake_up_klogd_work_func(struct irq_work *irq_work)
 {
 	int pending = __this_cpu_xchg(printk_pending, 0);
 
-	if (pending & PRINTK_PENDING_SCHED) {
-		char *buf = __get_cpu_var(printk_sched_buf);
-		pr_warn("[sched_delayed] %s", buf);
+	if (pending & PRINTK_PENDING_OUTPUT) {
+		/* If trylock fails, someone else is doing the printing */
+		if (console_trylock())
+			console_unlock();
 	}
 
 	if (pending & PRINTK_PENDING_WAKEUP)
@@ -2569,21 +2586,15 @@ void wake_up_klogd(void)
 
 int printk_sched(const char *fmt, ...)
 {
-	unsigned long flags;
 	va_list args;
-	char *buf;
 	int r;
 
-	local_irq_save(flags);
-	buf = __get_cpu_var(printk_sched_buf);
-
 	va_start(args, fmt);
-	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
+	r = vprintk_emit(0, SCHED_MESSAGE_LOGLEVEL, NULL, 0, fmt, args);
 	va_end(args);
 
-	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
+	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
 	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
-	local_irq_restore(flags);
 
 	return r;
 }
_

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-06 23:37     ` Andrew Morton
@ 2014-05-06 23:46       ` Steven Rostedt
  2014-05-07  0:05       ` Steven Rostedt
  1 sibling, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2014-05-06 23:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, LKML, Frederic Weisbecker, pmladek, Peter Zijlstra,
	John Stultz, Paul Gortmaker

On Tue, 6 May 2014 16:37:14 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> What I have in -next is very different from this version of the patch. 
> What's happening?

Hmm, good question.

Let me review this one. The one I sent recently was one I pulled from
my "sent" folder. I thought it was the latest one, but perhaps I sent
another one after that.


I'll go investigate.

-- Steve

> 
> 
> From: Steven Rostedt <rostedt@goodmis.org>
> Subject: printk: remove separate printk_sched buffers and use printk buf instead
> 
> To prevent deadlocks with doing a printk inside the scheduler,
> printk_sched() was created.  The issue is that printk has a console_sem
> that it can grab and release.  The release does a wake up if there's a
> task pending on the sem, and this wake up grabs the rq locks that is held
> in the scheduler.  This leads to a possible deadlock if the wake up uses
> the same rq as the one with the rq lock held already.
> 
> What printk_sched() does is to save the printk write in a per cpu buffer
> and sets the PRINTK_PENDING_SCHED flag.  On a timer tick, if this flag is
> set, the printk() is done against the buffer.
> 
> There's a couple of issues with this approach.
> 
> 1) If two printk_sched()s are called before the tick, the second one
>    will overwrite the first one.
> 
> 2) The temporary buffer is 512 bytes and is per cpu.  This is a quite a
>    bit of space wasted for something that is seldom used.
> 
> In order to remove this, the printk_sched() can use the printk buffer
> instead, and delay the console_trylock()/console_unlock() to the queued
> work.
> 
> Because printk_sched() would then be taking the logbuf_lock, the
> logbuf_lock must not be held while doing anything that may call into the
> scheduler functions, which includes wake ups.  Unfortunately, printk()
> also has a console_sem that it uses, and on release, the up(&console_sem)
> may do a wake up of any pending waiters.  This must be avoided while
> holding the logbuf_lock.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  kernel/printk/printk.c |   47 ++++++++++++++++++++++++---------------
>  1 file changed, 29 insertions(+), 18 deletions(-)
> 
> diff -puN kernel/printk/printk.c~printk-remove-separate-printk_sched-buffers-and-use-printk-buf-instead kernel/printk/printk.c
> --- a/kernel/printk/printk.c~printk-remove-separate-printk_sched-buffers-and-use-printk-buf-instead
> +++ a/kernel/printk/printk.c
> @@ -68,6 +68,9 @@ int console_printk[4] = {
>  	DEFAULT_CONSOLE_LOGLEVEL,	/* default_console_loglevel */
>  };
>  
> +/* Deferred messaged from sched code are marked by this special level */
> +#define SCHED_MESSAGE_LOGLEVEL -2
> +
>  /*
>   * Low level drivers may need that to know if they can schedule in
>   * their unblank() callback or not. So let's export it.
> @@ -229,7 +232,9 @@ struct printk_log {
>  };
>  
>  /*
> - * The logbuf_lock protects kmsg buffer, indices, counters.
> + * The logbuf_lock protects kmsg buffer, indices, counters.  This can be taken
> + * within the scheduler's rq lock. It must be released before calling
> + * console_unlock() or anything else that might wake up a process.
>   */
>  static DEFINE_RAW_SPINLOCK(logbuf_lock);
>  
> @@ -1577,14 +1582,19 @@ asmlinkage int vprintk_emit(int facility
>  	static int recursion_bug;
>  	static char textbuf[LOG_LINE_MAX];
>  	char *text = textbuf;
> -	size_t text_len;
> +	size_t text_len = 0;
>  	enum log_flags lflags = 0;
>  	unsigned long flags;
>  	int this_cpu;
>  	int printed_len = 0;
> +	bool in_sched = false;
>  	/* cpu currently holding logbuf_lock in this function */
>  	static volatile unsigned int logbuf_cpu = UINT_MAX;
>  
> +	if (level == SCHED_MESSAGE_LOGLEVEL) {
> +		level = -1;
> +		in_sched = true;
> +	}
>  
>  	boot_delay_msec(level);
>  	printk_delay();
> @@ -1631,7 +1641,12 @@ asmlinkage int vprintk_emit(int facility
>  	 * The printf needs to come first; we need the syslog
>  	 * prefix which might be passed-in as a parameter.
>  	 */
> -	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
> +	if (in_sched)
> +		text_len = scnprintf(text, sizeof(textbuf),
> +				     KERN_WARNING "[sched_delayed] ");
> +
> +	text_len += vscnprintf(text + text_len,
> +			       sizeof(textbuf) - text_len, fmt, args);
>  
>  	/* mark and strip a trailing newline */
>  	if (text_len && text[text_len-1] == '\n') {
> @@ -1713,6 +1728,10 @@ asmlinkage int vprintk_emit(int facility
>  	lockdep_on();
>  	local_irq_restore(flags);
>  
> +	/* If called from the scheduler, we can not call up(). */
> +	if (in_sched)
> +		return printed_len;
> +
>  	/*
>  	 * Disable preemption to avoid being preempted while holding
>  	 * console_sem which would prevent anyone from printing to console
> @@ -2531,21 +2550,19 @@ late_initcall(printk_late_init);
>  /*
>   * Delayed printk version, for scheduler-internal messages:
>   */
> -#define PRINTK_BUF_SIZE		512
> -
>  #define PRINTK_PENDING_WAKEUP	0x01
> -#define PRINTK_PENDING_SCHED	0x02
> +#define PRINTK_PENDING_OUTPUT	0x02
>  
>  static DEFINE_PER_CPU(int, printk_pending);
> -static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
>  
>  static void wake_up_klogd_work_func(struct irq_work *irq_work)
>  {
>  	int pending = __this_cpu_xchg(printk_pending, 0);
>  
> -	if (pending & PRINTK_PENDING_SCHED) {
> -		char *buf = __get_cpu_var(printk_sched_buf);
> -		pr_warn("[sched_delayed] %s", buf);
> +	if (pending & PRINTK_PENDING_OUTPUT) {
> +		/* If trylock fails, someone else is doing the printing */
> +		if (console_trylock())
> +			console_unlock();
>  	}
>  
>  	if (pending & PRINTK_PENDING_WAKEUP)
> @@ -2569,21 +2586,15 @@ void wake_up_klogd(void)
>  
>  int printk_sched(const char *fmt, ...)
>  {
> -	unsigned long flags;
>  	va_list args;
> -	char *buf;
>  	int r;
>  
> -	local_irq_save(flags);
> -	buf = __get_cpu_var(printk_sched_buf);
> -
>  	va_start(args, fmt);
> -	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
> +	r = vprintk_emit(0, SCHED_MESSAGE_LOGLEVEL, NULL, 0, fmt, args);
>  	va_end(args);
>  
> -	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
> +	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
>  	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
> -	local_irq_restore(flags);
>  
>  	return r;
>  }
> _


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-06 23:37     ` Andrew Morton
  2014-05-06 23:46       ` Steven Rostedt
@ 2014-05-07  0:05       ` Steven Rostedt
  2014-05-07 14:20         ` Jan Kara
  1 sibling, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2014-05-07  0:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jan Kara, LKML, Frederic Weisbecker, pmladek, Peter Zijlstra,
	John Stultz, Paul Gortmaker

On Tue, 6 May 2014 16:37:14 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:


> What I have in -next is very different from this version of the patch. 
> What's happening?

Hmm, it looks massaged from what I originally sent in order to be
placed after the other patches in the series. A quick review of the
patch seems to be mostly the same logic, but it is a bit different.

Jan, did you update my patch?

> 
> 
> From: Steven Rostedt <rostedt@goodmis.org>
> Subject: printk: remove separate printk_sched buffers and use printk buf instead
> 
> To prevent deadlocks with doing a printk inside the scheduler,
> printk_sched() was created.  The issue is that printk has a console_sem
> that it can grab and release.  The release does a wake up if there's a
> task pending on the sem, and this wake up grabs the rq locks that is held
> in the scheduler.  This leads to a possible deadlock if the wake up uses
> the same rq as the one with the rq lock held already.
> 
> What printk_sched() does is to save the printk write in a per cpu buffer
> and sets the PRINTK_PENDING_SCHED flag.  On a timer tick, if this flag is
> set, the printk() is done against the buffer.
> 
> There's a couple of issues with this approach.
> 
> 1) If two printk_sched()s are called before the tick, the second one
>    will overwrite the first one.
> 
> 2) The temporary buffer is 512 bytes and is per cpu.  This is a quite a
>    bit of space wasted for something that is seldom used.
> 
> In order to remove this, the printk_sched() can use the printk buffer
> instead, and delay the console_trylock()/console_unlock() to the queued
> work.
> 
> Because printk_sched() would then be taking the logbuf_lock, the
> logbuf_lock must not be held while doing anything that may call into the
> scheduler functions, which includes wake ups.  Unfortunately, printk()
> also has a console_sem that it uses, and on release, the up(&console_sem)
> may do a wake up of any pending waiters.  This must be avoided while
> holding the logbuf_lock.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Jan, if you did massage my patch, you should add here what you did.
Usually in [brackets]. Otherwise it puts the blame on me if something
breaks, or adds confusion if I happen to send out another patch like I
just did.

-- Steve

> Signed-off-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-05 23:18 [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead Steven Rostedt
  2014-05-05 23:33 ` Joe Perches
  2014-05-06  9:45 ` Jan Kara
@ 2014-05-07  9:13 ` Petr Mládek
  2014-05-07 14:33   ` Jan Kara
  2 siblings, 1 reply; 13+ messages in thread
From: Petr Mládek @ 2014-05-07  9:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Andrew Morton, Frederic Weisbecker, Jan Kara,
	Peter Zijlstra, John Stultz, Paul Gortmaker

On Mon 2014-05-05 19:18:46, Steven Rostedt wrote:
> To prevent deadlocks with doing a printk inside the scheduler,
> printk_sched() was created. The issue is that printk has a console_sem
> that it can grab and release. The release does a wake up if there's a
> task pending on the sem, and this wake up grabs the rq locks that is
> held in the scheduler. This leads to a possible deadlock if the wake up
> uses the same rq as the one with the rq lock held already.
> 
> What printk_sched() does is to save the printk write in a per cpu buffer
> and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> set, the printk() is done against the buffer.
> 
> There's a couple of issues with this approach.
> 
> 1) If two printk_sched()s are called before the tick, the second one
> will overwrite the first one.
> 
> 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> bit of space wasted for something that is seldom used.
> 
> In order to remove this, the printk_sched() can use the printk buffer
> instead, and delay the console_trylock()/console_unlock() to the queued
> work.
> 
> Because printk_sched() would then be taking the logbuf_lock, the
> logbuf_lock must not be held while doing anything that may call into the
> scheduler functions, which includes wake ups. Unfortunately, printk()
> also has a console_sem that it uses, and on release, the
> up(&console_sem) may do a wake up of any pending waiters. This must be
> avoided while holding the logbuf_lock.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
> This version has been forward ported to the 3.15-rc releases.
> ---
>  kernel/printk/printk.c |   87 +++++++++++++++++++++++++++++++------------------
>  1 file changed, 56 insertions(+), 31 deletions(-)
> 
> Index: linux-trace.git/kernel/printk/printk.c
> ===================================================================
> --- linux-trace.git.orig/kernel/printk/printk.c	2014-05-05 16:46:17.280815365 -0400
> +++ linux-trace.git/kernel/printk/printk.c	2014-05-05 16:52:23.398378897 -0400
> @@ -208,6 +208,9 @@
>  /*
>   * The logbuf_lock protects kmsg buffer, indices, counters. It is also
>   * used in interesting ways to provide interlocking in console_unlock();
> + * This can be taken within the scheduler's rq lock. It must be released
> + * before calling console_unlock() or anything else that might wake up
> + * a process.
>   */
>  static DEFINE_RAW_SPINLOCK(logbuf_lock);
>  
> @@ -1338,27 +1341,43 @@
>   * interrupts disabled. It should return with 'lockbuf_lock'
>   * released but interrupts still disabled.
>   */
> -static int console_trylock_for_printk(unsigned int cpu)
> +static int console_trylock_for_printk(unsigned int cpu, bool in_sched)
>  	__releases(&logbuf_lock)
>  {
>  	int retval = 0, wake = 0;
>  
> -	if (console_trylock()) {
> -		retval = 1;
> +	/* if called from the scheduler, we can not call up() */
> +	if (in_sched)
> +		goto out;
>  
> -		/*
> -		 * If we can't use the console, we need to release
> -		 * the console semaphore by hand to avoid flushing
> -		 * the buffer. We need to hold the console semaphore
> -		 * in order to do this test safely.
> -		 */
> -		if (!can_use_console(cpu)) {
> -			console_locked = 0;
> -			wake = 1;
> -			retval = 0;
> -		}
> +	if (down_trylock(&console_sem))
> +		goto out;
> +
> +	/*
> +	 * If we can't use the console, we need to release
> +	 * the console semaphore by hand to avoid flushing
> +	 * the buffer. We need to hold the console semaphore
> +	 * in order to do this test safely.
> +	 */
> +	if (console_suspended || !can_use_console(cpu)) {
> +		wake = 1;
> +		goto out;
>  	}
> +
> +	/* console is now locked */
> +
> +	console_locked = 1;
> +	console_may_schedule = 0;
> +	mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_);
> +
> +	retval = 1;
> +
> +out:
>  	logbuf_cpu = UINT_MAX;
> +	/*
> +	 * The logbuf_lock must not be held when doing a wake up,
> +	 * which the up(&console_sem) can do.
> +	 */
>  	raw_spin_unlock(&logbuf_lock);
>  	if (wake)
>  		up(&console_sem);
> @@ -1490,11 +1509,17 @@
>  	static int recursion_bug;
>  	static char textbuf[LOG_LINE_MAX];
>  	char *text = textbuf;
> -	size_t text_len;
> +	size_t text_len = 0;
>  	enum log_flags lflags = 0;
>  	unsigned long flags;
>  	int this_cpu;
>  	int printed_len = 0;
> +	bool in_sched = false;
> +
> +	if (level == -2) {
> +		level = -1;
> +		in_sched = true;
> +	}
>  
>  	boot_delay_msec(level);
>  	printk_delay();
> @@ -1540,7 +1565,12 @@
>  	 * The printf needs to come first; we need the syslog
>  	 * prefix which might be passed-in as a parameter.
>  	 */
> -	text_len = vscnprintf(text, sizeof(textbuf), fmt, args);
> +	if (in_sched)
> +		text_len = scnprintf(text, sizeof(textbuf),
> +				     KERN_WARNING "[sched_delayed] ");
> +
> +	text_len += vscnprintf(text + text_len,
> +			       sizeof(textbuf) - text_len, fmt, args);
>  
>  	/* mark and strip a trailing newline */
>  	if (text_len && text[text_len-1] == '\n') {
> @@ -1621,7 +1651,7 @@
>  	 * The console_trylock_for_printk() function will release 'logbuf_lock'
>  	 * regardless of whether it actually gets the console semaphore or not.
>  	 */
> -	if (console_trylock_for_printk(this_cpu))
> +	if (console_trylock_for_printk(this_cpu, in_sched))
>  		console_unlock();
>  
>  	lockdep_on();
> @@ -2440,18 +2470,20 @@
>  #define PRINTK_BUF_SIZE		512
>  
>  #define PRINTK_PENDING_WAKEUP	0x01
> -#define PRINTK_PENDING_SCHED	0x02
> +#define PRINTK_PENDING_OUTPUT	0x02
>  
>  static DEFINE_PER_CPU(int, printk_pending);
> -static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
>  
>  static void wake_up_klogd_work_func(struct irq_work *irq_work)
>  {
>  	int pending = __this_cpu_xchg(printk_pending, 0);
>  
> -	if (pending & PRINTK_PENDING_SCHED) {
> -		char *buf = __get_cpu_var(printk_sched_buf);
> -		pr_warn("[sched_delayed] %s", buf);
> +	if (pending & PRINTK_PENDING_OUTPUT) {
> +		if (console_trylock())
> +			console_unlock();

I wonder if we should call here console_trylock_for_printk() which checks
whether the console is really usable.

The check for usable console was introduced in the commit
76a8ad293912cd2f (Make printk work for really early debugging).
I think that this IRQ work could get called during early boot,
so the check would make sense here as well. Or have I missed something?

Best Regards,
Petr

> +		else
> +			/* Try again later */
> +			__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
>  	}
>  
>  	if (pending & PRINTK_PENDING_WAKEUP)
> @@ -2475,21 +2507,14 @@
>  
>  int printk_sched(const char *fmt, ...)
>  {
> -	unsigned long flags;
>  	va_list args;
> -	char *buf;
>  	int r;
>  
> -	local_irq_save(flags);
> -	buf = __get_cpu_var(printk_sched_buf);
> -
>  	va_start(args, fmt);
> -	r = vsnprintf(buf, PRINTK_BUF_SIZE, fmt, args);
> +	r = vprintk_emit(0, -2, NULL, 0, fmt, args);
>  	va_end(args);
>  
> -	__this_cpu_or(printk_pending, PRINTK_PENDING_SCHED);
> -	irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
> -	local_irq_restore(flags);
> +	__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
>  
>  	return r;
>  }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-07  0:05       ` Steven Rostedt
@ 2014-05-07 14:20         ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2014-05-07 14:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrew Morton, Jan Kara, LKML, Frederic Weisbecker, pmladek,
	Peter Zijlstra, John Stultz, Paul Gortmaker

On Tue 06-05-14 20:05:19, Steven Rostedt wrote:
> On Tue, 6 May 2014 16:37:14 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> 
> > What I have in -next is very different from this version of the patch. 
> > What's happening?
> 
> Hmm, it looks massaged from what I originally sent in order to be
> placed after the other patches in the series. A quick review of the
> patch seems to be mostly the same logic, but it is a bit different.
> 
> Jan, did you update my patch?
  Yes, I was carrying it for quite long and with each rebase and rewrite of
the patch set it got massaged a bit. So in the end it's probably somewhat
different from what you originally submitted.

> > From: Steven Rostedt <rostedt@goodmis.org>
> > Subject: printk: remove separate printk_sched buffers and use printk buf instead
> > 
> > To prevent deadlocks with doing a printk inside the scheduler,
> > printk_sched() was created.  The issue is that printk has a console_sem
> > that it can grab and release.  The release does a wake up if there's a
> > task pending on the sem, and this wake up grabs the rq locks that is held
> > in the scheduler.  This leads to a possible deadlock if the wake up uses
> > the same rq as the one with the rq lock held already.
> > 
> > What printk_sched() does is to save the printk write in a per cpu buffer
> > and sets the PRINTK_PENDING_SCHED flag.  On a timer tick, if this flag is
> > set, the printk() is done against the buffer.
> > 
> > There's a couple of issues with this approach.
> > 
> > 1) If two printk_sched()s are called before the tick, the second one
> >    will overwrite the first one.
> > 
> > 2) The temporary buffer is 512 bytes and is per cpu.  This is a quite a
> >    bit of space wasted for something that is seldom used.
> > 
> > In order to remove this, the printk_sched() can use the printk buffer
> > instead, and delay the console_trylock()/console_unlock() to the queued
> > work.
> > 
> > Because printk_sched() would then be taking the logbuf_lock, the
> > logbuf_lock must not be held while doing anything that may call into the
> > scheduler functions, which includes wake ups.  Unfortunately, printk()
> > also has a console_sem that it uses, and on release, the up(&console_sem)
> > may do a wake up of any pending waiters.  This must be avoided while
> > holding the logbuf_lock.
> > 
> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> 
> Jan, if you did massage my patch, you should add here what you did.
> Usually in [brackets]. Otherwise it puts the blame on me if something
> breaks, or adds confusion if I happen to send out another patch like I
> just did.
  Yeah, sorry for that. I should have been updating the changelog with what
I did...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-07  9:13 ` Petr Mládek
@ 2014-05-07 14:33   ` Jan Kara
  2014-05-07 15:57     ` Petr Mládek
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2014-05-07 14:33 UTC (permalink / raw)
  To: Petr Mládek
  Cc: Steven Rostedt, LKML, Andrew Morton, Frederic Weisbecker,
	Jan Kara, Peter Zijlstra, John Stultz, Paul Gortmaker

On Wed 07-05-14 11:13:56, Petr Mládek wrote:
> On Mon 2014-05-05 19:18:46, Steven Rostedt wrote:
> > To prevent deadlocks with doing a printk inside the scheduler,
> > printk_sched() was created. The issue is that printk has a console_sem
> > that it can grab and release. The release does a wake up if there's a
> > task pending on the sem, and this wake up grabs the rq locks that is
> > held in the scheduler. This leads to a possible deadlock if the wake up
> > uses the same rq as the one with the rq lock held already.
> > 
> > What printk_sched() does is to save the printk write in a per cpu buffer
> > and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> > set, the printk() is done against the buffer.
> > 
> > There's a couple of issues with this approach.
> > 
> > 1) If two printk_sched()s are called before the tick, the second one
> > will overwrite the first one.
> > 
> > 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> > bit of space wasted for something that is seldom used.
> > 
> > In order to remove this, the printk_sched() can use the printk buffer
> > instead, and delay the console_trylock()/console_unlock() to the queued
> > work.
> > 
> > Because printk_sched() would then be taking the logbuf_lock, the
> > logbuf_lock must not be held while doing anything that may call into the
> > scheduler functions, which includes wake ups. Unfortunately, printk()
> > also has a console_sem that it uses, and on release, the
> > up(&console_sem) may do a wake up of any pending waiters. This must be
> > avoided while holding the logbuf_lock.
> > 
> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> > ---
> > This version has been forward ported to the 3.15-rc releases.
> > ---
...
> > @@ -2440,18 +2470,20 @@
> >  #define PRINTK_BUF_SIZE		512
> >  
> >  #define PRINTK_PENDING_WAKEUP	0x01
> > -#define PRINTK_PENDING_SCHED	0x02
> > +#define PRINTK_PENDING_OUTPUT	0x02
> >  
> >  static DEFINE_PER_CPU(int, printk_pending);
> > -static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
> >  
> >  static void wake_up_klogd_work_func(struct irq_work *irq_work)
> >  {
> >  	int pending = __this_cpu_xchg(printk_pending, 0);
> >  
> > -	if (pending & PRINTK_PENDING_SCHED) {
> > -		char *buf = __get_cpu_var(printk_sched_buf);
> > -		pr_warn("[sched_delayed] %s", buf);
> > +	if (pending & PRINTK_PENDING_OUTPUT) {
> > +		if (console_trylock())
> > +			console_unlock();
> 
> I wonder if we should call here console_trylock_for_printk() which checks
> whether the console is really usable.
  So Stephen couldn't use console_trylock_for_printk() because that expects
logbuf_lock to be locked in vanilla kernel. Only after locking changes I
did it would be usable.
 
> The check for usable console was introduced in the commit
> 76a8ad293912cd2f (Make printk work for really early debugging).
> I think that this IRQ work could get called during early boot,
> so the check would make sense here as well. Or have I missed something?
  I'm not really sure if IRQ work can be run on CPU which is not online.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-07 14:33   ` Jan Kara
@ 2014-05-07 15:57     ` Petr Mládek
  2014-05-07 16:29       ` Petr Mládek
  0 siblings, 1 reply; 13+ messages in thread
From: Petr Mládek @ 2014-05-07 15:57 UTC (permalink / raw)
  To: Jan Kara
  Cc: Steven Rostedt, LKML, Andrew Morton, Frederic Weisbecker,
	Peter Zijlstra, John Stultz, Paul Gortmaker

On Wed 2014-05-07 16:33:20, Jan Kara wrote:
> On Wed 07-05-14 11:13:56, Petr Mládek wrote:
> > On Mon 2014-05-05 19:18:46, Steven Rostedt wrote:
> > > To prevent deadlocks with doing a printk inside the scheduler,
> > > printk_sched() was created. The issue is that printk has a console_sem
> > > that it can grab and release. The release does a wake up if there's a
> > > task pending on the sem, and this wake up grabs the rq locks that is
> > > held in the scheduler. This leads to a possible deadlock if the wake up
> > > uses the same rq as the one with the rq lock held already.
> > > 
> > > What printk_sched() does is to save the printk write in a per cpu buffer
> > > and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> > > set, the printk() is done against the buffer.
> > > 
> > > There's a couple of issues with this approach.
> > > 
> > > 1) If two printk_sched()s are called before the tick, the second one
> > > will overwrite the first one.
> > > 
> > > 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> > > bit of space wasted for something that is seldom used.
> > > 
> > > In order to remove this, the printk_sched() can use the printk buffer
> > > instead, and delay the console_trylock()/console_unlock() to the queued
> > > work.
> > > 
> > > Because printk_sched() would then be taking the logbuf_lock, the
> > > logbuf_lock must not be held while doing anything that may call into the
> > > scheduler functions, which includes wake ups. Unfortunately, printk()
> > > also has a console_sem that it uses, and on release, the
> > > up(&console_sem) may do a wake up of any pending waiters. This must be
> > > avoided while holding the logbuf_lock.
> > > 
> > > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> > > ---
> > > This version has been forward ported to the 3.15-rc releases.
> > > ---
> ...
> > > @@ -2440,18 +2470,20 @@
> > >  #define PRINTK_BUF_SIZE		512
> > >  
> > >  #define PRINTK_PENDING_WAKEUP	0x01
> > > -#define PRINTK_PENDING_SCHED	0x02
> > > +#define PRINTK_PENDING_OUTPUT	0x02
> > >  
> > >  static DEFINE_PER_CPU(int, printk_pending);
> > > -static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
> > >  
> > >  static void wake_up_klogd_work_func(struct irq_work *irq_work)
> > >  {
> > >  	int pending = __this_cpu_xchg(printk_pending, 0);
> > >  
> > > -	if (pending & PRINTK_PENDING_SCHED) {
> > > -		char *buf = __get_cpu_var(printk_sched_buf);
> > > -		pr_warn("[sched_delayed] %s", buf);
> > > +	if (pending & PRINTK_PENDING_OUTPUT) {
> > > +		if (console_trylock())
> > > +			console_unlock();
> > 
> > I wonder if we should call here console_trylock_for_printk() which checks
> > whether the console is really usable.
>   So Stephen couldn't use console_trylock_for_printk() because that expects
> logbuf_lock to be locked in vanilla kernel. Only after locking changes I
> did it would be usable.

Ah yes, I meant to use console_trylock_for_printk() from current
linux-next git tree. I am sorry, I should have been more explicit.


> > The check for usable console was introduced in the commit
> > 76a8ad293912cd2f (Make printk work for really early debugging).
> > I think that this IRQ work could get called during early boot,
> > so the check would make sense here as well. Or have I missed something?
>   I'm not really sure if IRQ work can be run on CPU which is not online.

It would make sense. I was just curious because
console_trylock_for_printk() was previously indirectly used via pr_warn().

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead
  2014-05-07 15:57     ` Petr Mládek
@ 2014-05-07 16:29       ` Petr Mládek
  0 siblings, 0 replies; 13+ messages in thread
From: Petr Mládek @ 2014-05-07 16:29 UTC (permalink / raw)
  To: Jan Kara
  Cc: Steven Rostedt, LKML, Andrew Morton, Frederic Weisbecker,
	Peter Zijlstra, John Stultz, Paul Gortmaker

On Wed 2014-05-07 17:57:03, Petr Mládek wrote:
> On Wed 2014-05-07 16:33:20, Jan Kara wrote:
> > On Wed 07-05-14 11:13:56, Petr Mládek wrote:
> > > On Mon 2014-05-05 19:18:46, Steven Rostedt wrote:
> > > > To prevent deadlocks with doing a printk inside the scheduler,
> > > > printk_sched() was created. The issue is that printk has a console_sem
> > > > that it can grab and release. The release does a wake up if there's a
> > > > task pending on the sem, and this wake up grabs the rq locks that is
> > > > held in the scheduler. This leads to a possible deadlock if the wake up
> > > > uses the same rq as the one with the rq lock held already.
> > > > 
> > > > What printk_sched() does is to save the printk write in a per cpu buffer
> > > > and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
> > > > set, the printk() is done against the buffer.
> > > > 
> > > > There's a couple of issues with this approach.
> > > > 
> > > > 1) If two printk_sched()s are called before the tick, the second one
> > > > will overwrite the first one.
> > > > 
> > > > 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
> > > > bit of space wasted for something that is seldom used.
> > > > 
> > > > In order to remove this, the printk_sched() can use the printk buffer
> > > > instead, and delay the console_trylock()/console_unlock() to the queued
> > > > work.
> > > > 
> > > > Because printk_sched() would then be taking the logbuf_lock, the
> > > > logbuf_lock must not be held while doing anything that may call into the
> > > > scheduler functions, which includes wake ups. Unfortunately, printk()
> > > > also has a console_sem that it uses, and on release, the
> > > > up(&console_sem) may do a wake up of any pending waiters. This must be
> > > > avoided while holding the logbuf_lock.
> > > > 
> > > > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> > > > ---
> > > > This version has been forward ported to the 3.15-rc releases.
> > > > ---
> > ...
> > > > @@ -2440,18 +2470,20 @@
> > > >  #define PRINTK_BUF_SIZE		512
> > > >  
> > > >  #define PRINTK_PENDING_WAKEUP	0x01
> > > > -#define PRINTK_PENDING_SCHED	0x02
> > > > +#define PRINTK_PENDING_OUTPUT	0x02
> > > >  
> > > >  static DEFINE_PER_CPU(int, printk_pending);
> > > > -static DEFINE_PER_CPU(char [PRINTK_BUF_SIZE], printk_sched_buf);
> > > >  
> > > >  static void wake_up_klogd_work_func(struct irq_work *irq_work)
> > > >  {
> > > >  	int pending = __this_cpu_xchg(printk_pending, 0);
> > > >  
> > > > -	if (pending & PRINTK_PENDING_SCHED) {
> > > > -		char *buf = __get_cpu_var(printk_sched_buf);
> > > > -		pr_warn("[sched_delayed] %s", buf);
> > > > +	if (pending & PRINTK_PENDING_OUTPUT) {
> > > > +		if (console_trylock())
> > > > +			console_unlock();
> > > 
> > > I wonder if we should call here console_trylock_for_printk() which checks
> > > whether the console is really usable.
> >   So Stephen couldn't use console_trylock_for_printk() because that expects
> > logbuf_lock to be locked in vanilla kernel. Only after locking changes I
> > did it would be usable.
> 
> Ah yes, I meant to use console_trylock_for_printk() from current
> linux-next git tree. I am sorry, I should have been more explicit.
> 
> 
> > > The check for usable console was introduced in the commit
> > > 76a8ad293912cd2f (Make printk work for really early debugging).
> > > I think that this IRQ work could get called during early boot,
> > > so the check would make sense here as well. Or have I missed something?
> >   I'm not really sure if IRQ work can be run on CPU which is not online.
> 
> It would make sense. I was just curious because
> console_trylock_for_printk() was previously indirectly used via pr_warn().

I replied too fast ;-) The function can_use_console() also checks
whether the console is callable. IMHO, it can fail during early boot.
So the check might make sense even in the IRQ work.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-05-07 16:29 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-05 23:18 [PATCH] printk: Remove separate printk_sched buffers and use printk buf instead Steven Rostedt
2014-05-05 23:33 ` Joe Perches
2014-05-05 23:55   ` Steven Rostedt
2014-05-06  9:45 ` Jan Kara
2014-05-06 11:04   ` Steven Rostedt
2014-05-06 23:37     ` Andrew Morton
2014-05-06 23:46       ` Steven Rostedt
2014-05-07  0:05       ` Steven Rostedt
2014-05-07 14:20         ` Jan Kara
2014-05-07  9:13 ` Petr Mládek
2014-05-07 14:33   ` Jan Kara
2014-05-07 15:57     ` Petr Mládek
2014-05-07 16:29       ` Petr Mládek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).