linux-kernel.vger.kernel.org archive mirror
* [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
@ 2016-01-29 12:43 Byungchul Park
  2016-01-29 12:54 ` Byungchul Park
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Byungchul Park @ 2016-01-29 12:43 UTC (permalink / raw)
  To: akpm
  Cc: mingo, linux-kernel, akinobu.mita, jack, sergey.senozhatsky.work,
	peter, torvalds

changes from v4 to v5
- found a clear scenario that breaks the system (deadlock or infinite
  recursion). at the very least, the debug code should not be the cause.

changes from v3 to v4
- reuse existing code as much as possible to prevent an infinite
  recursive cycle.

changes from v2 to v3
- avoid printk() only in the "lockup suspected" case, not in a real
  lockup, where avoiding it does not help at all.
- consider not only console_sem.lock but also logbuf_lock, which is also
  used by printk().

changes from v1 to v2
- only change the comment and commit message, esp. replacing "deadlock"
  with "infinite recursive cycle", since it is not an actual deadlock.

thanks,
byungchul

-----8<-----
From eed077240e0b0d9f14d91037ef1915feab85aa4d Mon Sep 17 00:00:00 2001
From: Byungchul Park <byungchul.park@lge.com>
Date: Fri, 29 Jan 2016 21:23:24 +0900
Subject: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the
 debug code

With CONFIG_DEBUG_SPINLOCK enabled, spin_dump() can cause an infinite
recursive cycle. The backtrace shows printk() -> console_trylock() ->
do_raw_spin_lock() -> spin_dump() -> printk()... repeating infinitely.

When spin_dump() is called from within printk(), we should prevent the
debug spinlock code from calling printk() again in that context. It is
reasonable to skip printing "lockup suspected": it is only a warning
message, yet printing it would definitely cause a real lockup.

The scenario is:

cpu0
====
printk
  console_trylock
  console_unlock
    up_console_sem
      up
        raw_spin_lock_irqsave(&sem->lock, flags)
        __up
          wake_up_process
            try_to_wake_up
              raw_spin_lock_irqsave(&p->pi_lock)
                __spin_lock_debug
                  spin_dump <=== the problem point!
                    printk
                      console_trylock
                        raw_spin_lock_irqsave(&sem->lock, flags)

                        <=== DEADLOCK

cpu1
====
printk
  console_trylock
    raw_spin_lock_irqsave(&sem->lock, flags)
    __spin_lock_debug
      spin_dump
        printk
          ...

          <=== repeat the recursive cycle infinitely

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 kernel/locking/spinlock_debug.c | 16 +++++++++++++---
 kernel/printk/printk.c          |  5 +++++
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 0374a59..cf7bc96 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -103,6 +103,8 @@ static inline void debug_spin_unlock(raw_spinlock_t *lock)
 	lock->owner_cpu = -1;
 }
 
+extern int is_console_lock(raw_spinlock_t *lock);
+
 static void __spin_lock_debug(raw_spinlock_t *lock)
 {
 	u64 i;
@@ -113,11 +115,19 @@ static void __spin_lock_debug(raw_spinlock_t *lock)
 			return;
 		__delay(1);
 	}
-	/* lockup suspected: */
-	spin_dump(lock, "lockup suspected");
+
+	/*
+	 * If this function is called from printk(), then we should
+	 * not call printk() more. Or it will cause an infinite
+	 * recursive cycle!
+	 */
+	if (likely(!is_console_lock(lock))) {
+		/* lockup suspected: */
+		spin_dump(lock, "lockup suspected");
 #ifdef CONFIG_SMP
-	trigger_all_cpu_backtrace();
+		trigger_all_cpu_backtrace();
 #endif
+	}
 
 	/*
 	 * The trylock above was causing a livelock.  Give the lower level arch
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2ce8826..568ab11 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1981,6 +1981,11 @@ asmlinkage __visible void early_printk(const char *fmt, ...)
 }
 #endif
 
+int is_console_lock(raw_spinlock_t *lock)
+{
+	return lock == &console_sem.lock;
+}
+
 static int __add_preferred_console(char *name, int idx, char *options,
 				   char *brl_options)
 {
-- 
1.9.1
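
The guard in the patch above can be approximated outside the kernel. The
following is a minimal user-space sketch, not kernel code: struct fake_lock,
fake_printk(), fake_spin_lock_debug(), count_reports() and MAX_DEPTH are all
hypothetical names, and the recursion is capped so that the unguarded case
terminates instead of overflowing the stack. Only the is_console_lock()
pointer comparison mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 8	/* hypothetical cap so the unguarded demo terminates */

/* Hypothetical stand-in for raw_spinlock_t; "held" means contended. */
struct fake_lock {
	bool held;
};

static struct fake_lock fake_console_lock = { .held = true };
static int reports;		/* number of "lockup suspected" reports issued */
static int depth;		/* current recursion depth */
static bool guard_enabled;	/* true: behave like the patched code */

/* Same idea as the patch: identify the console lock by address. */
static bool is_console_lock(const struct fake_lock *lock)
{
	return lock == &fake_console_lock;
}

static void fake_printk(void);

/* Simplified model of __spin_lock_debug() after the spin loop timed out. */
static void fake_spin_lock_debug(struct fake_lock *lock)
{
	assert(lock != NULL);

	if (!lock->held)
		return;		/* lock acquired, nothing to report */

	/* The guard: never report on the lock that printing itself needs. */
	if (guard_enabled && is_console_lock(lock))
		return;

	if (depth >= MAX_DEPTH)
		return;		/* demo-only cap; the real cycle has no such limit */

	depth++;
	reports++;
	fake_printk();		/* the report re-enters the printing path */
	depth--;
}

/* Simplified model of printk(): it needs the (contended) console lock. */
static void fake_printk(void)
{
	fake_spin_lock_debug(&fake_console_lock);
}

/* Run one printing attempt and return how many reports it triggered. */
int count_reports(bool guarded)
{
	guard_enabled = guarded;
	reports = 0;
	depth = 0;
	fake_printk();
	return reports;
}
```

With the cap set to 8, count_reports(false) returns 8 because every level
re-enters the report path, while count_reports(true) returns 0 because the
report on the console lock is skipped.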


* Re: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
  2016-01-29 12:43 [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code Byungchul Park
@ 2016-01-29 12:54 ` Byungchul Park
  2016-01-31 12:40   ` Sergey Senozhatsky
  2016-01-30  9:27 ` Ingo Molnar
  2016-02-01  2:31 ` Sergey Senozhatsky
  2 siblings, 1 reply; 9+ messages in thread
From: Byungchul Park @ 2016-01-29 12:54 UTC (permalink / raw)
  To: akpm
  Cc: mingo, linux-kernel, akinobu.mita, jack, sergey.senozhatsky.work,
	peter, torvalds

On Fri, Jan 29, 2016 at 09:43:37PM +0900, Byungchul Park wrote:
> changes from v4 to v5
> - found a clear scenario that breaks the system (deadlock or infinite
>   recursion). at the very least, the debug code should not be the cause.

Hello, Andrew

Please take this v5 patch instead of the v2 patch that you took, or let me
know your opinion.

Thanks,
Byungchul



* Re: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
  2016-01-29 12:43 [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code Byungchul Park
  2016-01-29 12:54 ` Byungchul Park
@ 2016-01-30  9:27 ` Ingo Molnar
  2016-02-02  2:34   ` Byungchul Park
  2016-02-01  2:31 ` Sergey Senozhatsky
  2 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2016-01-30  9:27 UTC (permalink / raw)
  To: Byungchul Park
  Cc: akpm, linux-kernel, akinobu.mita, jack, sergey.senozhatsky.work,
	peter, torvalds


* Byungchul Park <byungchul.park@lge.com> wrote:

> +
> +	/*
> +	 * If this function is called from printk(), then we should
> +	 * not call printk() more. Or it will cause an infinite
> +	 * recursive cycle!

This should be something like:

> +	 * If this function is called from within printk() then we
> +	 * should not call printk() again, or it will recurse
> +	 * infinitely.

Thanks,

	Ingo


* Re: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
  2016-01-29 12:54 ` Byungchul Park
@ 2016-01-31 12:40   ` Sergey Senozhatsky
  2016-02-01  1:45     ` Byungchul Park
  0 siblings, 1 reply; 9+ messages in thread
From: Sergey Senozhatsky @ 2016-01-31 12:40 UTC (permalink / raw)
  To: Byungchul Park
  Cc: akpm, mingo, linux-kernel, akinobu.mita, jack,
	sergey.senozhatsky.work, peter, torvalds

On (01/29/16 21:54), Byungchul Park wrote:
> Hello, Andrew
> 
> Please take this v5 patch instead of the v2 patch that you took, or let me
> know your opinion.
> 
> > With CONFIG_DEBUG_SPINLOCK enabled, spin_dump() can cause an infinite
> > recursive cycle. The backtrace shows printk() -> console_trylock() ->
> > do_raw_spin_lock() -> spin_dump() -> printk()... repeating infinitely.
> > 
> > When spin_dump() is called from within printk(), we should prevent the
> > debug spinlock code from calling printk() again in that context. It is
> > reasonable to skip printing "lockup suspected": it is only a warning
> > message, yet printing it would definitely cause a real lockup.


Hello Byungchul,

thanks for the patch and thanks for bringing this topic up for discussion.
let's not rush, if you don't mind, and step back for a bit. there are
some serious cases (where we really would want to see spin_dump output)
that are broken.

	-ss


* Re: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
  2016-01-31 12:40   ` Sergey Senozhatsky
@ 2016-02-01  1:45     ` Byungchul Park
  2016-02-01  2:13       ` Sergey Senozhatsky
  0 siblings, 1 reply; 9+ messages in thread
From: Byungchul Park @ 2016-02-01  1:45 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: akpm, mingo, linux-kernel, akinobu.mita, jack, peter, torvalds

On Sun, Jan 31, 2016 at 09:40:08PM +0900, Sergey Senozhatsky wrote:
> On (01/29/16 21:54), Byungchul Park wrote:
> > Hello, Andrew
> > 
> > Please take this v5 patch instead of the v2 patch that you took, or let me
> > know your opinion.
> > 
> > > With CONFIG_DEBUG_SPINLOCK enabled, spin_dump() can cause an infinite
> > > recursive cycle. The backtrace shows printk() -> console_trylock() ->
> > > do_raw_spin_lock() -> spin_dump() -> printk()... repeating infinitely.
> > > 
> > > When spin_dump() is called from within printk(), we should prevent the
> > > debug spinlock code from calling printk() again in that context. It is
> > > reasonable to skip printing "lockup suspected": it is only a warning
> > > message, yet printing it would definitely cause a real lockup.
> 
> 
> Hello Byungchul,
> 
> thanks for the patch and thanks for bringing this topic up for discussion.
> let's not rush, if you don't mind, and step back for a bit. there are
> some serious cases (where we really would want to see spin_dump output)
> that are broken.

Hello Sergey,

I reviewed your patch, and I hope your proposal gets merged so that the
problematic recursive cycle of printk() can be handled properly, e.g. by
panic(). But avoiding an unnecessary recursive cycle is better than
panic(). What I handle in this patch is the warning case, which causes an
unnecessary lockup that does not need to happen. I think resetting or
zapping the lock just to print a warning is not a good option, even though
your suggestion looks good for the unavoidable lockup case.

Thanks,
Byungchul

> 
> 	-ss


* Re: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
  2016-02-01  1:45     ` Byungchul Park
@ 2016-02-01  2:13       ` Sergey Senozhatsky
  0 siblings, 0 replies; 9+ messages in thread
From: Sergey Senozhatsky @ 2016-02-01  2:13 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Sergey Senozhatsky, akpm, mingo, linux-kernel, akinobu.mita,
	jack, peter, torvalds

On (02/01/16 10:45), Byungchul Park wrote:
> But avoiding an unnecessary recursive cycle is better than panic(). What I handle
> in this patch is the warning case, which causes an unnecessary lockup that does not
> need to happen.

Hello,

correct, that was one of the reasons why I proposed to
return to the discussion. it's a bit hard to tell if
we have any chance to survive a "lockup suspected"
spin_dump() recursion; even if we have one, it's a race:
spin_unlock on CPUA vs. stack overflow on CPUB. we can
be more certain with a ->magic mismatch, for example, but
"lockup suspected" is tricky.

	-ss


* Re: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
  2016-01-29 12:43 [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code Byungchul Park
  2016-01-29 12:54 ` Byungchul Park
  2016-01-30  9:27 ` Ingo Molnar
@ 2016-02-01  2:31 ` Sergey Senozhatsky
  2016-02-01  6:28   ` Byungchul Park
  2 siblings, 1 reply; 9+ messages in thread
From: Sergey Senozhatsky @ 2016-02-01  2:31 UTC (permalink / raw)
  To: Byungchul Park
  Cc: akpm, mingo, linux-kernel, akinobu.mita, jack,
	sergey.senozhatsky.work, peter, torvalds

On (01/29/16 21:43), Byungchul Park wrote:
[..]
> +extern int is_console_lock(raw_spinlock_t *lock);
> +
>  static void __spin_lock_debug(raw_spinlock_t *lock)
>  {
>  	u64 i;
> @@ -113,11 +115,19 @@ static void __spin_lock_debug(raw_spinlock_t *lock)
>  			return;
>  		__delay(1);
>  	}
> -	/* lockup suspected: */
> -	spin_dump(lock, "lockup suspected");
> +
> +	/*
> +	 * If this function is called from printk(), then we should
> +	 * not call printk() more. Or it will cause an infinite
> +	 * recursive cycle!
> +	 */
> +	if (likely(!is_console_lock(lock))) {
> +		/* lockup suspected: */
> +		spin_dump(lock, "lockup suspected");
>  #ifdef CONFIG_SMP
> -	trigger_all_cpu_backtrace();
> +		trigger_all_cpu_backtrace();
>  #endif
> +	}

/* speaking in a context of printk locks only */

... maybe for a recoverable "lockup suspected" spin_dump()
we can switch to deferred printk of the messages, which is a
bit better than nothing. but for an unrecoverable "lockup suspected"
spin_dump() -- an actual bug (the spin lock owner is not going to
release the lock any more) -- we need something else, I think.
otherwise the bug will neither be reported nor fixed.

	-ss
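
The deferred-printk idea above could look roughly like this in user space.
This is only a sketch under assumptions: defer_message(), flush_deferred(),
dropped_messages() and the fixed-size buffer are hypothetical and are not the
kernel's deferred printk implementation. The point it illustrates is that the
reporting path only copies the message into a buffer without touching any
console lock, and a later, safe context emits it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEFER_SLOTS 4	/* hypothetical buffer size */
#define DEFER_LEN   64	/* hypothetical maximum message length */

static char deferred[DEFER_SLOTS][DEFER_LEN];
static int deferred_count;	/* messages currently queued */
static int dropped;		/* messages lost because the buffer was full */

/* Queue a message without taking any lock the console path needs. */
void defer_message(const char *msg)
{
	assert(msg != NULL);

	if (deferred_count == DEFER_SLOTS) {
		dropped++;	/* buffer full: count the loss, do not block */
		return;
	}
	/* snprintf() truncates safely if msg is longer than DEFER_LEN - 1. */
	snprintf(deferred[deferred_count], DEFER_LEN, "%s", msg);
	deferred_count++;
}

/* Emit queued messages from a safe context; returns how many were emitted. */
int flush_deferred(FILE *out)
{
	int emitted = deferred_count;

	for (int i = 0; i < emitted; i++)
		fprintf(out, "%s\n", deferred[i]);
	deferred_count = 0;
	return emitted;
}

/* Number of messages dropped so far because the buffer was full. */
int dropped_messages(void)
{
	return dropped;
}
```

As the message above notes, this only helps in the recoverable case: if the
lock owner never releases the lock, nothing ever reaches a context where
flush_deferred() can run.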

>  	/*
>  	 * The trylock above was causing a livelock.  Give the lower level arch
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 2ce8826..568ab11 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1981,6 +1981,11 @@ asmlinkage __visible void early_printk(const char *fmt, ...)
>  }
>  #endif
>  
> +int is_console_lock(raw_spinlock_t *lock)
> +{
> +	return lock == &console_sem.lock;
> +}
> +
>  static int __add_preferred_console(char *name, int idx, char *options,
>  				   char *brl_options)
>  {
> -- 
> 1.9.1
> 


* Re: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
  2016-02-01  2:31 ` Sergey Senozhatsky
@ 2016-02-01  6:28   ` Byungchul Park
  0 siblings, 0 replies; 9+ messages in thread
From: Byungchul Park @ 2016-02-01  6:28 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: akpm, mingo, linux-kernel, akinobu.mita, jack, peter, torvalds

On Mon, Feb 01, 2016 at 11:31:12AM +0900, Sergey Senozhatsky wrote:
> On (01/29/16 21:43), Byungchul Park wrote:
> [..]
> > +extern int is_console_lock(raw_spinlock_t *lock);
> > +
> >  static void __spin_lock_debug(raw_spinlock_t *lock)
> >  {
> >  	u64 i;
> > @@ -113,11 +115,19 @@ static void __spin_lock_debug(raw_spinlock_t *lock)
> >  			return;
> >  		__delay(1);
> >  	}
> > -	/* lockup suspected: */
> > -	spin_dump(lock, "lockup suspected");
> > +
> > +	/*
> > +	 * If this function is called from printk(), then we should
> > +	 * not call printk() more. Or it will cause an infinite
> > +	 * recursive cycle!
> > +	 */
> > +	if (likely(!is_console_lock(lock))) {
> > +		/* lockup suspected: */
> > +		spin_dump(lock, "lockup suspected");
> >  #ifdef CONFIG_SMP
> > -	trigger_all_cpu_backtrace();
> > +		trigger_all_cpu_backtrace();
> >  #endif
> > +	}
> 
> /* speaking in a context of printk locks only */
> 
> ... maybe for a recoverable "lockup suspected" spin_dump()
> we can switch to deferred printk of the messages, which is a
> bit better than nothing. but for an unrecoverable "lockup suspected"
> spin_dump() -- an actual bug (the spin lock owner is not going to
> release the lock any more) -- we need something else, I think.
> otherwise the bug will neither be reported nor fixed.

Agree.

> 
> 	-ss
> 
> >  	/*
> >  	 * The trylock above was causing a livelock.  Give the lower level arch
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index 2ce8826..568ab11 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -1981,6 +1981,11 @@ asmlinkage __visible void early_printk(const char *fmt, ...)
> >  }
> >  #endif
> >  
> > +int is_console_lock(raw_spinlock_t *lock)
> > +{
> > +	return lock == &console_sem.lock;
> > +}
> > +
> >  static int __add_preferred_console(char *name, int idx, char *options,
> >  				   char *brl_options)
> >  {
> > -- 
> > 1.9.1
> > 


* Re: [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code
  2016-01-30  9:27 ` Ingo Molnar
@ 2016-02-02  2:34   ` Byungchul Park
  0 siblings, 0 replies; 9+ messages in thread
From: Byungchul Park @ 2016-02-02  2:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, linux-kernel, akinobu.mita, jack, sergey.senozhatsky.work,
	peter, torvalds

On Sat, Jan 30, 2016 at 10:27:43AM +0100, Ingo Molnar wrote:
> 
> * Byungchul Park <byungchul.park@lge.com> wrote:
> 
> > +
> > +	/*
> > +	 * If this function is called from printk(), then we should
> > +	 * not call printk() more. Or it will cause an infinite
> > +	 * recursive cycle!
> 
> This should be something like:
> 
> > +	 * If this function is called from within printk() then we
> > +	 * should not call printk() again, or it will recurse
> > +	 * infinitely.
> 
> Thanks,

Thank you very much. This is not easy for me.

> 
> 	Ingo


Thread overview: 9+ messages
2016-01-29 12:43 [PATCH v5] lib/spinlock_debug.c: prevent a recursive cycle in the debug code Byungchul Park
2016-01-29 12:54 ` Byungchul Park
2016-01-31 12:40   ` Sergey Senozhatsky
2016-02-01  1:45     ` Byungchul Park
2016-02-01  2:13       ` Sergey Senozhatsky
2016-01-30  9:27 ` Ingo Molnar
2016-02-02  2:34   ` Byungchul Park
2016-02-01  2:31 ` Sergey Senozhatsky
2016-02-01  6:28   ` Byungchul Park
