* [PATCH v2] printk: Correctly handle preemption in console_unlock()
From: Petr Mladek @ 2017-01-25 14:08 UTC
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Steven Rostedt, Peter Zijlstra, Andrew Morton,
	Greg Kroah-Hartman, Jiri Slaby, linux-fbdev, linux-kernel,
	Petr Mladek

Some console driver code calls console_conditional_schedule(),
which looks at @console_may_schedule. The value must be cleared
when the drivers are called from console_unlock() with
interrupts disabled. But rescheduling is fine when the same
code is called, for example, from tty operations where the
console semaphore is taken via console_lock().

This is why @console_may_schedule is cleared before calling console
drivers. The original value is saved to decide whether we can sleep
between lines.
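
The save-and-clear pattern described above can be sketched in user
space as follows. This is only a minimal model, not the kernel code:
model_cond_resched(), model_call_console_drivers(), flush_lines() and
driver_saw_flag_set are hypothetical names invented for the sketch;
only console_may_schedule and do_cond_resched come from the patch.

```c
/* User-space model of the save-and-clear pattern; not the real printk code. */
static int console_may_schedule;	/* stand-in for the kernel variable */
static int yields;			/* how often we "rescheduled" */
static int driver_saw_flag_set;		/* hypothetical probe for this sketch */

/* Hypothetical stand-in for cond_resched(): only counts the calls. */
static void model_cond_resched(void)
{
	yields++;
}

/* Hypothetical stand-in for call_console_drivers(). */
static void model_call_console_drivers(void)
{
	/* console_conditional_schedule() in a driver would look at this. */
	if (console_may_schedule)
		driver_saw_flag_set = 1;
}

/* Prints nlines "lines" and returns how many times we yielded in between. */
static int flush_lines(int initial_may_schedule, int nlines)
{
	int do_cond_resched;
	int i;

	console_may_schedule = initial_may_schedule;
	yields = 0;

	do_cond_resched = console_may_schedule;	/* remember the original value */
	console_may_schedule = 0;		/* drivers must not sleep */

	for (i = 0; i < nlines; i++) {
		model_call_console_drivers();	/* models the irq-disabled part */
		if (do_cond_resched)
			model_cond_resched();	/* safe point between lines */
	}
	return yields;
}
```

In this model, flush_lines(1, 3) yields after each of the three lines,
flush_lines(0, 3) never yields, and the modelled driver never sees the
flag set in either case.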

Currently, @console_may_schedule is not cleared when we call
console_trylock() and jump back to the "again" goto label.
This has become a problem since commit 6b97a20d3a7909daa066
("printk: set may_schedule for some of console_trylock() callers"),
because console_trylock() might now enable @console_may_schedule.

There is also the opposite problem. console_lock() can be called
only from preemptible context, so it can always enable scheduling in
the console code. But console_trylock() is not able to detect such
a context when CONFIG_PREEMPT_COUNT is disabled. Therefore we should
use the original @console_may_schedule value after re-acquiring
the console semaphore in console_unlock().

This patch solves both problems by moving the "again" goto label.
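
The effect of moving the label can be modelled in user space. The
sketch below is an illustration under simplifying assumptions, not the
kernel code: model_console_trylock(), model_console_unlock() and
struct retry_result are hypothetical names, and the retry is reduced
to exactly two passes.

```c
#include <stdbool.h>

/* User-space model of the retry path; not the real printk code. */
static int console_may_schedule;

/* Hypothetical stand-in: console_trylock() may set the flag on success. */
static void model_console_trylock(bool looks_preemptible)
{
	console_may_schedule = looks_preemptible ? 1 : 0;
}

struct retry_result {
	int seen_by_drivers;	/* flag value during the second pass */
	int do_cond_resched;	/* saved value used between lines */
};

/*
 * Models two passes through console_unlock(): the first one and one retry
 * after console_trylock(). clear_after_label selects the patched layout.
 */
static struct retry_result model_console_unlock(bool clear_after_label,
						int initial_may_schedule,
						bool trylock_looks_preemptible)
{
	struct retry_result res;
	int pass;

	console_may_schedule = initial_may_schedule;
	res.do_cond_resched = console_may_schedule;	/* saved only once */
	res.seen_by_drivers = -1;

	if (!clear_after_label)
		console_may_schedule = 0;	/* old layout: before "again:" */

	for (pass = 0; pass < 2; pass++) {
		/* The "again:" label sits here. */
		if (clear_after_label)
			console_may_schedule = 0;	/* new layout: every pass */

		res.seen_by_drivers = console_may_schedule;

		/* Semaphore released, more messages pending: retake and retry. */
		model_console_trylock(trylock_looks_preemptible);
	}
	return res;
}
```

With the old layout, model_console_unlock(false, 1, true) reports that
the drivers see the flag set during the retry. With the patched layout,
model_console_unlock(true, 1, true) reports it cleared, while the saved
do_cond_resched value is untouched by the retry.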

An alternative solution was to clear and restore the value around
call_console_drivers(). Then console_conditional_schedule() could
also be used inside console_unlock(). But there was a potential race
with console_flush_on_panic(), as reported by Sergey Senozhatsky.
That function should be called only when a single CPU is running
and interrupts are disabled. But it is better to be on the safe side
because stopping the other CPUs might fail.

Fixes: 6b97a20d3a7909 ("printk: set may_schedule for some of console_trylock() callers")
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
Changes against v1:

  + use conservative solution with the following rules:

    + always clear console_may_schedule after the "again" goto label

    + save and use the original value to decide about sleeping
      inside console_unlock()

    + do not set console_may_schedule using the saved value;
      it avoids potential race with console_flush_on_panic();
      also it avoids breaking the complex logic used in other
      functions manipulating this variable.

 kernel/printk/printk.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 7180088cbb23..cc90c0a5ae21 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2158,19 +2158,18 @@ void console_unlock(void)
 	}
 
 	/*
-	 * Console drivers are called under logbuf_lock, so
-	 * @console_may_schedule should be cleared before; however, we may
-	 * end up dumping a lot of lines, for example, if called from
-	 * console registration path, and should invoke cond_resched()
-	 * between lines if allowable.  Not doing so can cause a very long
-	 * scheduling stall on a slow console leading to RCU stall and
-	 * softlockup warnings which exacerbate the issue with more
-	 * messages practically incapacitating the system.
+	 * Console drivers are called with interrupts disabled, so
+	 * @console_may_schedule must be cleared before. The original
+	 * value must be stored so that we could schedule between lines.
+	 *
+	 * console_trylock() is not able to detect the preemtible
+	 * context reliably. Therefore the value must be stored before
+	 * and cleared after the the "again" goto label.
 	 */
 	do_cond_resched = console_may_schedule;
+again:
 	console_may_schedule = 0;
 
-again:
 	/*
 	 * We released the console_sem lock, so we need to recheck if
 	 * cpu is online and (if not) is there at least one CON_ANYTIME
-- 
1.8.5.6

* Re: [PATCH v2] printk: Correctly handle preemption in console_unlock()
From: Sergey Senozhatsky @ 2017-02-02 13:31 UTC
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Tetsuo Handa, Steven Rostedt, Peter Zijlstra,
	Andrew Morton, Greg Kroah-Hartman, Jiri Slaby, linux-fbdev,
	linux-kernel

On (01/25/17 15:08), Petr Mladek wrote:
[..]
> Fixes: 6b97a20d3a7909 ("printk: set may_schedule for some of console_trylock() callers")
> Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

	-ss

* Re: [PATCH v2] printk: Correctly handle preemption in console_unlock()
From: Steven Rostedt @ 2017-02-02 14:30 UTC
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Tetsuo Handa, Peter Zijlstra, Andrew Morton,
	Greg Kroah-Hartman, Jiri Slaby, linux-fbdev, linux-kernel

On Wed, 25 Jan 2017 15:08:45 +0100
Petr Mladek <pmladek@suse.com> wrote:

> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 7180088cbb23..cc90c0a5ae21 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2158,19 +2158,18 @@ void console_unlock(void)
>  	}
>  
>  	/*
> -	 * Console drivers are called under logbuf_lock, so
> -	 * @console_may_schedule should be cleared before; however, we may
> -	 * end up dumping a lot of lines, for example, if called from
> -	 * console registration path, and should invoke cond_resched()
> -	 * between lines if allowable.  Not doing so can cause a very long
> -	 * scheduling stall on a slow console leading to RCU stall and
> -	 * softlockup warnings which exacerbate the issue with more
> -	 * messages practically incapacitating the system.

Why did you remove the comment about invoking cond_resched()? It's
still pertinent to the code, as there still exists a:

	if (do_cond_resched)
		cond_resched();

And the rationale in the comment is still correct.

-- Steve

> +	 * Console drivers are called with interrupts disabled, so
> +	 * @console_may_schedule must be cleared before. The original
> +	 * value must be stored so that we could schedule between lines.
> +	 *
> +	 * console_trylock() is not able to detect the preemtible
> +	 * context reliably. Therefore the value must be stored before
> +	 * and cleared after the the "again" goto label.
>  	 */
>  	do_cond_resched = console_may_schedule;
> +again:
>  	console_may_schedule = 0;
>  
> -again:
>  	/*
>  	 * We released the console_sem lock, so we need to recheck if
>  	 * cpu is online and (if not) is there at least one CON_ANYTIME

* Re: [PATCH v2] printk: Correctly handle preemption in console_unlock()
From: Petr Mladek @ 2017-02-02 16:40 UTC
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Tetsuo Handa, Peter Zijlstra, Andrew Morton,
	Greg Kroah-Hartman, Jiri Slaby, linux-fbdev, linux-kernel

On Thu 2017-02-02 09:30:11, Steven Rostedt wrote:
> On Wed, 25 Jan 2017 15:08:45 +0100
> Petr Mladek <pmladek@suse.com> wrote:
> 
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index 7180088cbb23..cc90c0a5ae21 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -2158,19 +2158,18 @@ void console_unlock(void)
> >  	}
> >  
> >  	/*
> > -	 * Console drivers are called under logbuf_lock, so
> > -	 * @console_may_schedule should be cleared before; however, we may
> > -	 * end up dumping a lot of lines, for example, if called from
> > -	 * console registration path, and should invoke cond_resched()
> > -	 * between lines if allowable.  Not doing so can cause a very long
> > -	 * scheduling stall on a slow console leading to RCU stall and
> > -	 * softlockup warnings which exacerbate the issue with more
> > -	 * messages practically incapacitating the system.
> 
> Why did you remove the comment about invoking cond_resched()? It's
> still pertinent to the code, as there still exists a:
>
> 	if (do_cond_resched)
> 		cond_resched();
> 
> And the rationale in the comment is still correct.

The comment was above an assignment to a variable called
"do_cond_resched". The variable was later used in only one place.
I hoped that the purpose of cond_resched() was common knowledge.

The new comment still mentions scheduling between lines. But it
now also explains why it has to be done here and the relation to
the "again" label.

Would you prefer to keep the original text as is (modulo the context
in which the drivers are called) and just add the second paragraph?

Anyway, thanks a lot for the review.

Best Regards,
Petr

> -- Steve
> 
> > +	 * Console drivers are called with interrupts disabled, so
> > +	 * @console_may_schedule must be cleared before. The original
> > +	 * value must be stored so that we could schedule between lines.
> > +	 *
> > +	 * console_trylock() is not able to detect the preemtible
> > +	 * context reliably. Therefore the value must be stored before
> > +	 * and cleared after the the "again" goto label.
> >  	 */
> >  	do_cond_resched = console_may_schedule;
> > +again:
> >  	console_may_schedule = 0;
> >  
> > -again:
> >  	/*
> >  	 * We released the console_sem lock, so we need to recheck if
> >  	 * cpu is online and (if not) is there at least one CON_ANYTIME
> 

* [PATCH v3] printk: Correctly handle preemption in console_unlock()
From: Petr Mladek @ 2017-03-24 16:14 UTC
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Steven Rostedt, Peter Zijlstra, Andrew Morton,
	Greg Kroah-Hartman, Jiri Slaby, linux-fbdev, linux-kernel,
	Petr Mladek

Some console driver code calls console_conditional_schedule(),
which looks at @console_may_schedule. The value must be cleared
when the drivers are called from console_unlock() with
interrupts disabled. But rescheduling is fine when the same
code is called, for example, from tty operations where the
console semaphore is taken via console_lock().

This is why @console_may_schedule is cleared before calling console
drivers. The original value is saved to decide whether we can sleep
between lines.

Currently, @console_may_schedule is not cleared when we call
console_trylock() and jump back to the "again" goto label.
This has become a problem since commit 6b97a20d3a7909daa066
("printk: set may_schedule for some of console_trylock() callers"),
because console_trylock() might now enable @console_may_schedule.

There is also the opposite problem. console_lock() can be called
only from preemptible context, so it can always enable scheduling in
the console code. But console_trylock() is not able to detect such
a context when CONFIG_PREEMPT_COUNT is disabled. Therefore we should
use the original @console_may_schedule value after re-acquiring
the console semaphore in console_unlock().
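
Why console_trylock() cannot detect the context can be sketched as
follows. This is a user-space model under stated assumptions: it
assumes that console_trylock() derives the flag from a
preemptible()-style check that can only answer "no" when the kernel is
built without a preemption counter. struct model_ctx and all
model_*/may_schedule_* names are hypothetical, and the real
console_trylock() checks more conditions than shown here.

```c
#include <stdbool.h>

/* Hypothetical description of the calling context for this sketch. */
struct model_ctx {
	bool have_preempt_count;	/* stands in for CONFIG_PREEMPT_COUNT */
	int preempt_count;		/* 0 means no preempt_disable() active */
	bool irqs_disabled;
};

/*
 * Has the same shape as a preemptible()-style helper: without a
 * preemption counter it can only answer "no".
 */
static bool model_preemptible(const struct model_ctx *ctx)
{
	if (!ctx->have_preempt_count)
		return false;
	return ctx->preempt_count == 0 && !ctx->irqs_disabled;
}

/* console_lock() may sleep, so its callers are always allowed to schedule. */
static int may_schedule_after_console_lock(void)
{
	return 1;
}

/* Simplified: the real console_trylock() checks more than preemptible(). */
static int may_schedule_after_console_trylock(const struct model_ctx *ctx)
{
	return model_preemptible(ctx) ? 1 : 0;
}
```

In this model a caller that originally used console_lock() gets 1,
while a retry via the trylock path gets 0 whenever have_preempt_count
is false, even in a context that is really preemptible. This is why the
value saved before the "again" label is the one to trust.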

This patch solves both problems by moving the "again" goto label.

An alternative solution was to clear and restore the value around
call_console_drivers(). Then console_conditional_schedule() could
also be used inside console_unlock(). But there was a potential race
with console_flush_on_panic(), as reported by Sergey Senozhatsky.
That function should be called only when a single CPU is running
and interrupts are disabled. But it is better to be on the safe side
because stopping the other CPUs might fail.
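
The exact race is not spelled out in this thread. One plausible
interleaving, shown purely as an assumption, is that the panic path
forces @console_may_schedule to 0 on another CPU while the drivers are
running, and the restore then overwrites that 0 with the saved value.
model_panic_path() and flag_after_interleaving() are hypothetical names
for this single-threaded model.

```c
#include <stdbool.h>

/* User-space model of one assumed interleaving; not the real printk code. */
static int console_may_schedule;

/* Hypothetical stand-in for the panic path forcing the flag off. */
static void model_panic_path(void)
{
	console_may_schedule = 0;
}

/* Returns the flag value left behind after the modelled interleaving. */
static int flag_after_interleaving(bool restore_saved_value)
{
	int saved;

	console_may_schedule = 1;	/* e.g. semaphore taken via console_lock() */

	saved = console_may_schedule;
	console_may_schedule = 0;	/* cleared around the driver call */

	/* Drivers run here; assume the panic path interleaves right now. */
	model_panic_path();

	if (restore_saved_value)
		console_may_schedule = saved;	/* the rejected alternative */

	return console_may_schedule;
}
```

With the restore, the model ends with the flag set again although the
panic path cleared it. Without the restore, the flag stays 0, which
matches the conservative rule of never setting console_may_schedule
from the saved value.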

Fixes: 6b97a20d3a7909 ("printk: set may_schedule for some of console_trylock() callers")
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
Link to v2: https://lkml.kernel.org/r/1485353325-26591-1-git-send-email-pmladek@suse.com

Changes against v2:

  + do not remove useful details from the original comment
    as suggested by Steven and Sergey

  + fix typo in the new comment

Changes against v1:

  + use conservative solution with the following rules:

    + always clear console_may_schedule after the "again" goto label

    + save and use the original value to decide about sleeping
      inside console_unlock()

    + do not set console_may_schedule using the saved value;
      it avoids potential race with console_flush_on_panic();
      also it avoids breaking the complex logic used in other
      functions manipulating this variable.

 kernel/printk/printk.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2984fb0f0257..e5636fa04e66 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2161,7 +2161,7 @@ void console_unlock(void)
 	}
 
 	/*
-	 * Console drivers are called under logbuf_lock, so
+	 * Console drivers are called with interrupts disabled, so
 	 * @console_may_schedule should be cleared before; however, we may
 	 * end up dumping a lot of lines, for example, if called from
 	 * console registration path, and should invoke cond_resched()
@@ -2169,12 +2169,15 @@ void console_unlock(void)
 	 * scheduling stall on a slow console leading to RCU stall and
 	 * softlockup warnings which exacerbate the issue with more
 	 * messages practically incapacitating the system.
+	 *
+	 * console_trylock() is not able to detect the preemptive
+	 * context reliably. Therefore the value must be stored before
+	 * and cleared after the the "again" goto label.
 	 */
 	do_cond_resched = console_may_schedule;
+again:
 	console_may_schedule = 0;
 
-again:
 	/*
 	 * We released the console_sem lock, so we need to recheck if
 	 * cpu is online and (if not) is there at least one CON_ANYTIME
-- 
1.8.5.6

* Re: [PATCH v3] printk: Correctly handle preemption in console_unlock()
From: Sergey Senozhatsky @ 2017-03-24 23:57 UTC
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Tetsuo Handa, Steven Rostedt, Peter Zijlstra,
	Andrew Morton, Greg Kroah-Hartman, Jiri Slaby, linux-fbdev,
	linux-kernel

On (03/24/17 17:14), Petr Mladek wrote:
[..]
> Fixes: 6b97a20d3a7909 ("printk: set may_schedule for some of console_trylock() callers")
> Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

	-ss
