All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()
@ 2016-02-03  6:02 Byungchul Park
  2016-02-03  7:28 ` Ingo Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Byungchul Park @ 2016-02-03  6:02 UTC (permalink / raw)
  To: willy, akpm, mingo
  Cc: linux-kernel, akinobu.mita, jack, sergey.senozhatsky.work, peter

change from v1 to v2
- remove unnecessary overhead by the redundant spin(un)lock.

Since I faced a infinite recursive printk() bug, I've tried to propose
patches the title of which is "lib/spinlock_debug.c: prevent a recursive
cycle in the debug code". But I noticed the root problem cannot be fixed
by that, through some discussion thanks to Sergey and Peter. So I focused
on preventing the deadlock.

-----8<-----
>From 56ce4a9c4e9a089eff798fd17015f395751abb62 Mon Sep 17 00:00:00 2001
From: Byungchul Park <byungchul.park@lge.com>
Date: Wed, 3 Feb 2016 14:44:52 +0900
Subject: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()

wake_up_process() is currently protected by spinlock though it's not
necessary. Furthermore, it can cause a deadlock when it's hit from within
printk() since the wake_up_process() can printk() again.

The scenario the bad thing can happen is,

printk
  console_trylock
  console_unlock
    up_console_sem
      up
        raw_spin_lock_irqsave(&sem->lock, flags)
        __up
          wake_up_process
            try_to_wake_up
              raw_spin_lock_irqsave(&p->pi_lock)
                __spin_lock_debug
                  spin_dump
                    printk
                      console_trylock
                        raw_spin_lock_irqsave(&sem->lock, flags)

                        *** DEADLOCK ***

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 kernel/locking/semaphore.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index b8120ab..14d0aca 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -37,7 +37,7 @@ static noinline void __down(struct semaphore *sem);
 static noinline int __down_interruptible(struct semaphore *sem);
 static noinline int __down_killable(struct semaphore *sem);
 static noinline int __down_timeout(struct semaphore *sem, long timeout);
-static noinline void __up(struct semaphore *sem);
+static noinline struct task_struct *__up(struct semaphore *sem);
 
 /**
  * down - acquire the semaphore
@@ -178,13 +178,23 @@ EXPORT_SYMBOL(down_timeout);
 void up(struct semaphore *sem)
 {
 	unsigned long flags;
+	struct task_struct *p = NULL;
 
 	raw_spin_lock_irqsave(&sem->lock, flags);
 	if (likely(list_empty(&sem->wait_list)))
 		sem->count++;
 	else
-		__up(sem);
+		p = __up(sem);
 	raw_spin_unlock_irqrestore(&sem->lock, flags);
+
+	/*
+	 * wake_up_process() needs not to be protected by a spinlock.
+	 * Thus move it from the protected region to here. What is
+	 * worse, this unnecessary protection can cause a deadlock by
+	 * acquiring the same sem->lock within wake_up_process().
+	 */
+	if (unlikely(p))
+		wake_up_process(p);
 }
 EXPORT_SYMBOL(up);
 
@@ -253,11 +263,11 @@ static noinline int __sched __down_timeout(struct semaphore *sem, long timeout)
 	return __down_common(sem, TASK_UNINTERRUPTIBLE, timeout);
 }
 
-static noinline void __sched __up(struct semaphore *sem)
+static noinline struct task_struct *__sched __up(struct semaphore *sem)
 {
 	struct semaphore_waiter *waiter = list_first_entry(&sem->wait_list,
 						struct semaphore_waiter, list);
 	list_del(&waiter->list);
 	waiter->up = true;
-	wake_up_process(waiter->task);
+	return waiter->task;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()
  2016-02-03  6:02 [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up() Byungchul Park
@ 2016-02-03  7:28 ` Ingo Molnar
  2016-02-03  7:42   ` Sergey Senozhatsky
  0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2016-02-03  7:28 UTC (permalink / raw)
  To: Byungchul Park
  Cc: willy, akpm, linux-kernel, akinobu.mita, jack,
	sergey.senozhatsky.work, peter, Linus Torvalds


* Byungchul Park <byungchul.park@lge.com> wrote:

>  void up(struct semaphore *sem)
>  {
>  	unsigned long flags;
> +	struct task_struct *p = NULL;
>  
>  	raw_spin_lock_irqsave(&sem->lock, flags);
>  	if (likely(list_empty(&sem->wait_list)))
>  		sem->count++;
>  	else
> -		__up(sem);
> +		p = __up(sem);
>  	raw_spin_unlock_irqrestore(&sem->lock, flags);
> +
> +	/*
> +	 * wake_up_process() needs not to be protected by a spinlock.
> +	 * Thus move it from the protected region to here. What is
> +	 * worse, this unnecessary protection can cause a deadlock by
> +	 * acquiring the same sem->lock within wake_up_process().
> +	 */
> +	if (unlikely(p))
> +		wake_up_process(p);

So I'm not sure this is completely race free, for cases where a semaphore is 
attached to a task and is managed/destroyed on task exit.

Since we don't have a guaranteed reference to 'p' here, the task might wake up 
(via a signal) and exit (and its task struct might be freed and the semaphore 
might be freed), after we unlocked the semaphore but before we wake the task up.

So why not move printk away from semaphores? Semaphores are classical constructs 
that have legacies and are somewhat non-obvious to use, compared to modern, 
simpler locking primitives. I'd not touch their implementation, unless we are 
absolutely sure this is a safe optimization.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()
  2016-02-03  7:28 ` Ingo Molnar
@ 2016-02-03  7:42   ` Sergey Senozhatsky
  2016-02-03  8:04     ` Ingo Molnar
  2016-02-03  8:12     ` Byungchul Park
  0 siblings, 2 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2016-02-03  7:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Byungchul Park, willy, akpm, linux-kernel, akinobu.mita, jack,
	sergey.senozhatsky.work, peter, Linus Torvalds

On (02/03/16 08:28), Ingo Molnar wrote:
[..]
> So why not move printk away from semaphores? Semaphores are classical constructs 
> that have legacies and are somewhat non-obvious to use, compared to modern, 
> simpler locking primitives. I'd not touch their implementation, unless we are 
> absolutely sure this is a safe optimization.

semaphore's spin_lock is not the only spin lock that printk acquires. it also takes the
logbuf_lock (and different locks in console drivers (up to console driver)).

Jan Kara posted a patch that offloads printing job (console_trylock()-console_unlock())
from printk() call (when printk can offload it). so semaphore and console driver's locks
will go away (mostly) with Jan's patch. logbug spin_lock, however, will stay.

	-ss

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()
  2016-02-03  7:42   ` Sergey Senozhatsky
@ 2016-02-03  8:04     ` Ingo Molnar
  2016-02-03  8:28       ` Sergey Senozhatsky
  2016-02-03  8:12     ` Byungchul Park
  1 sibling, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2016-02-03  8:04 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Byungchul Park, willy, akpm, linux-kernel, akinobu.mita, jack,
	peter, Linus Torvalds


* Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> On (02/03/16 08:28), Ingo Molnar wrote:
> [..]
> > So why not move printk away from semaphores? Semaphores are classical constructs 
> > that have legacies and are somewhat non-obvious to use, compared to modern, 
> > simpler locking primitives. I'd not touch their implementation, unless we are 
> > absolutely sure this is a safe optimization.
> 
> semaphore's spin_lock is not the only spin lock that printk acquires. it also 
> takes the logbuf_lock (and different locks in console drivers (up to console 
> driver)).
>
> Jan Kara posted a patch that offloads printing job 
> (console_trylock()-console_unlock()) from printk() call (when printk can offload 
> it). so semaphore and console driver's locks will go away (mostly) with Jan's 
> patch. logbug spin_lock, however, will stay.

Well, but this patch of yours only affects the semaphore code, so it does not 
change the logbuf_lock situation.

Furthermore, logbuf_lock already has recursion protection:

        /*
         * Ouch, printk recursed into itself!
         */
        if (unlikely(logbuf_cpu == this_cpu)) {

so it should not be possible to re-enter the printk() logbuf_lock critical section 
from the spinlock code. (There are other ways to get the logbuf_lock - if those 
are still triggerable then they should be fixed.)

In any case, recursion protection is generally done in the debugging facilities 
trying to behave lockless.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()
  2016-02-03  7:42   ` Sergey Senozhatsky
  2016-02-03  8:04     ` Ingo Molnar
@ 2016-02-03  8:12     ` Byungchul Park
  2016-02-03  8:30       ` Sergey Senozhatsky
  1 sibling, 1 reply; 8+ messages in thread
From: Byungchul Park @ 2016-02-03  8:12 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Ingo Molnar, willy, akpm, linux-kernel, akinobu.mita, jack,
	peter, Linus Torvalds

On Wed, Feb 03, 2016 at 04:42:23PM +0900, Sergey Senozhatsky wrote:
> On (02/03/16 08:28), Ingo Molnar wrote:
> [..]
> > So why not move printk away from semaphores? Semaphores are classical constructs 
> > that have legacies and are somewhat non-obvious to use, compared to modern, 
> > simpler locking primitives. I'd not touch their implementation, unless we are 
> > absolutely sure this is a safe optimization.
> 
> semaphore's spin_lock is not the only spin lock that printk acquires. it also takes the
> logbuf_lock (and different locks in console drivers (up to console driver)).
> 
> Jan Kara posted a patch that offloads printing job (console_trylock()-console_unlock())
> from printk() call (when printk can offload it). so semaphore and console driver's locks
> will go away (mostly) with Jan's patch. logbug spin_lock, however, will stay.

It sounds good. Could you teach me how to see the patch by Jan?

> 
> 	-ss

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()
  2016-02-03  8:04     ` Ingo Molnar
@ 2016-02-03  8:28       ` Sergey Senozhatsky
  2016-02-03  9:02         ` Ingo Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Sergey Senozhatsky @ 2016-02-03  8:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Sergey Senozhatsky, Byungchul Park, willy, akpm, linux-kernel,
	akinobu.mita, jack, peter, Linus Torvalds

On (02/03/16 09:04), Ingo Molnar wrote:
> * Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:
> 
> > On (02/03/16 08:28), Ingo Molnar wrote:
> > [..]
> > > So why not move printk away from semaphores? Semaphores are classical constructs 
> > > that have legacies and are somewhat non-obvious to use, compared to modern, 
> > > simpler locking primitives. I'd not touch their implementation, unless we are 
> > > absolutely sure this is a safe optimization.
> > 
> > semaphore's spin_lock is not the only spin lock that printk acquires. it also 
> > takes the logbuf_lock (and different locks in console drivers (up to console 
> > driver)).
> >
> > Jan Kara posted a patch that offloads printing job 
> > (console_trylock()-console_unlock()) from printk() call (when printk can offload 
> > it). so semaphore and console driver's locks will go away (mostly) with Jan's 
> > patch. logbug spin_lock, however, will stay.
> 
> Well, but this patch of yours only affects the semaphore code, so it does not 
> change the logbuf_lock situation.

yes, correct. I just said for the info that there is already 'move printk away from
console_sem' work in progress. Well, the reason for that work is entirely different,
though, but this console_sem recursion and console driver's lock recursion can be
'fixed as a side effect'.

> Furthermore, logbuf_lock already has recursion protection:
> 
>         /*
>          * Ouch, printk recursed into itself!
>          */
>         if (unlikely(logbuf_cpu == this_cpu)) {

it's good, no doubt. but it doesn't work in all of the cases. a simple one is:

vprintk_emit()
...
	raw_spin_lock(&logbuf_lock);
	logbuf_cpu = this_cpu;
	...
	logbuf_cpu = UINT_MAX;
	raw_spin_unlock(&logbuf_lock);      <<  SPIN_BUG_ON
...

if raw_spin_unlock() calls SPIN_BUG_ON, then logbuf_lock recursion detection can't
help. we recurse into vprintk_emit() with logbuf_lock locked and logbuf_cpu != this_cpu.

Peter Hurley also posted the following case (I'll quote):

  serial8250_do_set_termios()
    spin_lock_irqsave()  ** claim port lock **
    ...
    serial_port_out(port, UART_LCR, ....);
      dw8250_serial_out()
        dev_err()
          vprintk_emit()
            console_trylock()
              call_console_drivers()
                serial8250_console_write()
                  spin_lock_irqsave()  ** port lock **
                  ** DEADLOCK **

	-ss

> so it should not be possible to re-enter the printk() logbuf_lock critical section 
> from the spinlock code. (There are other ways to get the logbuf_lock - if those 
> are still triggerable then they should be fixed.)
> 
> In any case, recursion protection is generally done in the debugging facilities 
> trying to behave lockless.
> 
> Thanks,
> 
> 	Ingo
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()
  2016-02-03  8:12     ` Byungchul Park
@ 2016-02-03  8:30       ` Sergey Senozhatsky
  0 siblings, 0 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2016-02-03  8:30 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Sergey Senozhatsky, Ingo Molnar, willy, akpm, linux-kernel,
	akinobu.mita, jack, peter, Linus Torvalds

On (02/03/16 17:12), Byungchul Park wrote:
> On Wed, Feb 03, 2016 at 04:42:23PM +0900, Sergey Senozhatsky wrote:
> > On (02/03/16 08:28), Ingo Molnar wrote:
> > [..]
> > > So why not move printk away from semaphores? Semaphores are classical constructs 
> > > that have legacies and are somewhat non-obvious to use, compared to modern, 
> > > simpler locking primitives. I'd not touch their implementation, unless we are 
> > > absolutely sure this is a safe optimization.
> > 
> > semaphore's spin_lock is not the only spin lock that printk acquires. it also takes the
> > logbuf_lock (and different locks in console drivers (up to console driver)).
> > 
> > Jan Kara posted a patch that offloads printing job (console_trylock()-console_unlock())
> > from printk() call (when printk can offload it). so semaphore and console driver's locks
> > will go away (mostly) with Jan's patch. logbug spin_lock, however, will stay.
> 
> It sounds good. Could you teach me how to see the patch by Jan?

sure,

	https://lkml.org/lkml/2015/10/26/16
and
	http://marc.info/?l=linux-kernel&m=145079206822115

	-ss

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up()
  2016-02-03  8:28       ` Sergey Senozhatsky
@ 2016-02-03  9:02         ` Ingo Molnar
  0 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2016-02-03  9:02 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Byungchul Park, willy, akpm, linux-kernel, akinobu.mita, jack,
	peter, Linus Torvalds


* Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> On (02/03/16 09:04), Ingo Molnar wrote:
> > * Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:
> > 
> > > On (02/03/16 08:28), Ingo Molnar wrote:
> > > [..]
> > > > So why not move printk away from semaphores? Semaphores are classical constructs 
> > > > that have legacies and are somewhat non-obvious to use, compared to modern, 
> > > > simpler locking primitives. I'd not touch their implementation, unless we are 
> > > > absolutely sure this is a safe optimization.
> > > 
> > > semaphore's spin_lock is not the only spin lock that printk acquires. it also 
> > > takes the logbuf_lock (and different locks in console drivers (up to console 
> > > driver)).
> > >
> > > Jan Kara posted a patch that offloads printing job 
> > > (console_trylock()-console_unlock()) from printk() call (when printk can offload 
> > > it). so semaphore and console driver's locks will go away (mostly) with Jan's 
> > > patch. logbug spin_lock, however, will stay.
> > 
> > Well, but this patch of yours only affects the semaphore code, so it does not 
> > change the logbuf_lock situation.
> 
> yes, correct. I just said for the info that there is already 'move printk away from
> console_sem' work in progress. Well, the reason for that work is entirely different,
> though, but this console_sem recursion and console driver's lock recursion can be
> 'fixed as a side effect'.
> 
> > Furthermore, logbuf_lock already has recursion protection:
> > 
> >         /*
> >          * Ouch, printk recursed into itself!
> >          */
> >         if (unlikely(logbuf_cpu == this_cpu)) {
> 
> it's good, no doubt. but it doesn't work in all of the cases. a simple one is:
> 
> vprintk_emit()
> ...
> 	raw_spin_lock(&logbuf_lock);
> 	logbuf_cpu = this_cpu;

Yes, I'm aware of that, and as I said:

> > (There are other ways to get the logbuf_lock - if those are still triggerable 
> > then they should be fixed.)

The proper way to fix it would be to factor out the recursion-safe logbuf_lock 
taking code into logbuf_lock()/logbuf_unlock() primitives and use those 
consistently in printk.c.

> 	...
> 	logbuf_cpu = UINT_MAX;
> 	raw_spin_unlock(&logbuf_lock);      <<  SPIN_BUG_ON
> ...
> 
> if raw_spin_unlock() calls SPIN_BUG_ON, then logbuf_lock recursion detection can't
> help. we recurse into vprintk_emit() with logbuf_lock locked and logbuf_cpu != this_cpu.

If recursion-safe logbuf_lock taking is factored out and extended to other 
functions in printk.c then this problem is solved.

> Peter Hurley also posted the following case (I'll quote):
> 
>   serial8250_do_set_termios()
>     spin_lock_irqsave()  ** claim port lock **
>     ...
>     serial_port_out(port, UART_LCR, ....);
>       dw8250_serial_out()
>         dev_err()
>           vprintk_emit()
>             console_trylock()
>               call_console_drivers()
>                 serial8250_console_write()
>                   spin_lock_irqsave()  ** port lock **
>                   ** DEADLOCK **

That too seems to be avoided if vprintk_emit() does not take logbuf_lock naively.

So I repeat my point:

> > In any case, recursion protection is generally done in the debugging facilities 
> > trying to behave lockless.

I.e. to address these deadlocks, printk() should be made fundamentally more 
robust, extending its already existing recursion protection logic to remaining 
parts of printk.c.

That was my expectation when I committed the first variant of printk() recursion 
protection code a few years ago, it just never happened.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-02-03  9:02 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-03  6:02 [PATCH v2] lock/semaphore: Avoid an unnecessary deadlock within up() Byungchul Park
2016-02-03  7:28 ` Ingo Molnar
2016-02-03  7:42   ` Sergey Senozhatsky
2016-02-03  8:04     ` Ingo Molnar
2016-02-03  8:28       ` Sergey Senozhatsky
2016-02-03  9:02         ` Ingo Molnar
2016-02-03  8:12     ` Byungchul Park
2016-02-03  8:30       ` Sergey Senozhatsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.