* [PATCH 0/2] signal: simplify/document lock_task_sighand() logic
@ 2014-09-22 16:44 Oleg Nesterov
  2014-09-22 16:44 ` [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand() Oleg Nesterov
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Oleg Nesterov @ 2014-09-22 16:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul E. McKenney, Peter Zijlstra, Rik van Riel, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On 06/03, Steven Rostedt wrote:
>
> You know, this code could use some comments. I may send you a patch,
> because that __lock_task_sighand() is doing a lot of subtle things and
> there's not a single comment explaining it :-(

Sorry for the delay.

Does 2/2 look clear enough?

Oleg.

 kernel/fork.c   |    5 ++++-
 kernel/signal.c |   43 ++++++++++++++++++++++++-------------------
 2 files changed, 28 insertions(+), 20 deletions(-)



* [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-22 16:44 [PATCH 0/2] signal: simplify/document lock_task_sighand() logic Oleg Nesterov
@ 2014-09-22 16:44 ` Oleg Nesterov
  2014-09-22 18:58   ` Steven Rostedt
  2014-09-23 15:55   ` Peter Zijlstra
  2014-09-22 16:44 ` [PATCH 2/2] signal: document the RCU protection of ->sighand Oleg Nesterov
  2014-09-28 21:43 ` [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks Oleg Nesterov
  2 siblings, 2 replies; 19+ messages in thread
From: Oleg Nesterov @ 2014-09-22 16:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul E. McKenney, Peter Zijlstra, Rik van Riel, Steven Rostedt,
	Thomas Gleixner, linux-kernel

__lock_task_sighand() does local_irq_save() to prevent the potential
deadlock; we can use preempt_disable() with the same effect. And in
this case we can do preempt_disable/enable + rcu_read_lock/unlock only
once, outside of the main loop, and simplify the code. This also shaves
112 bytes from signal.o.

With this patch the main loop runs with preemption disabled, but this
should be fine because a restart is very unlikely: it can only happen if
we race with de_thread() and ->sighand is shared. And the latter is only
possible if CLONE_SIGHAND was used without CLONE_THREAD; most probably
nobody does this nowadays.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/signal.c |   31 +++++++++++++------------------
 1 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..61a1f55 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1261,30 +1261,25 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
 					   unsigned long *flags)
 {
 	struct sighand_struct *sighand;
-
+	/*
+	 * We are going to do rcu_read_unlock() under spin_lock_irqsave().
+	 * Make sure we can not be preempted after rcu_read_lock(), see
+	 * rcu_read_unlock() comment header for details.
+	 */
+	preempt_disable();
+	rcu_read_lock();
 	for (;;) {
-		/*
-		 * Disable interrupts early to avoid deadlocks.
-		 * See rcu_read_unlock() comment header for details.
-		 */
-		local_irq_save(*flags);
-		rcu_read_lock();
 		sighand = rcu_dereference(tsk->sighand);
-		if (unlikely(sighand == NULL)) {
-			rcu_read_unlock();
-			local_irq_restore(*flags);
+		if (unlikely(sighand == NULL))
 			break;
-		}
 
-		spin_lock(&sighand->siglock);
-		if (likely(sighand == tsk->sighand)) {
-			rcu_read_unlock();
+		spin_lock_irqsave(&sighand->siglock, *flags);
+		if (likely(sighand == tsk->sighand))
 			break;
-		}
-		spin_unlock(&sighand->siglock);
-		rcu_read_unlock();
-		local_irq_restore(*flags);
+		spin_unlock_irqrestore(&sighand->siglock, *flags);
 	}
+	rcu_read_unlock();
+	preempt_enable();
 
 	return sighand;
 }
-- 
1.5.5.1
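
As a side illustration of the rare case the changelog mentions, below is a
minimal userspace sketch (not part of the patch; the stack size and the
helper name child_fn are arbitrary) of a clone() call that shares ->sighand
via CLONE_SIGHAND without CLONE_THREAD. Note that CLONE_SIGHAND also
requires CLONE_VM:

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

static int child_fn(void *arg)
{
	/* Shares the parent's signal handlers (and thus ->sighand), but is
	 * not in the parent's thread group: CLONE_THREAD is not set. */
	return 0;
}

int main(void)
{
	const size_t stack_size = 64 * 1024;
	char *stack = malloc(stack_size);

	if (!stack)
		return 1;

	/* CLONE_SIGHAND requires CLONE_VM; CLONE_THREAD is deliberately absent. */
	pid_t pid = clone(child_fn, stack + stack_size,
			  CLONE_VM | CLONE_SIGHAND | SIGCHLD, NULL);
	if (pid < 0) {
		perror("clone");
		return 1;
	}
	waitpid(pid, NULL, 0);
	free(stack);
	return 0;
}

Roughly speaking, this is the only setup in which the changelog's
de_thread() race applies: when the exec'ing task's sighand is shared with
such a non-thread, de_thread() must switch it to a freshly allocated one,
and a concurrent __lock_task_sighand() on that task can see ->sighand
change under it and retry.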



* [PATCH 2/2] signal: document the RCU protection of ->sighand
  2014-09-22 16:44 [PATCH 0/2] signal: simplify/document lock_task_sighand() logic Oleg Nesterov
  2014-09-22 16:44 ` [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand() Oleg Nesterov
@ 2014-09-22 16:44 ` Oleg Nesterov
  2014-09-22 19:00   ` Steven Rostedt
  2014-09-23 11:50   ` Rik van Riel
  2014-09-28 21:43 ` [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks Oleg Nesterov
  2 siblings, 2 replies; 19+ messages in thread
From: Oleg Nesterov @ 2014-09-22 16:44 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul E. McKenney, Peter Zijlstra, Rik van Riel, Steven Rostedt,
	Thomas Gleixner, linux-kernel

__cleanup_sighand() frees sighand without an RCU grace period. This is
correct, but it looks "obviously buggy" and constantly confuses
readers; add comments to explain how this works.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/fork.c   |    5 ++++-
 kernel/signal.c |   12 +++++++++++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 1380d8a..2dd9f1d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1017,11 +1017,14 @@ void __cleanup_sighand(struct sighand_struct *sighand)
 {
 	if (atomic_dec_and_test(&sighand->count)) {
 		signalfd_cleanup(sighand);
+		/*
+		 * sighand_cachep is SLAB_DESTROY_BY_RCU so we can free it
+		 * without an RCU grace period, see __lock_task_sighand().
+		 */
 		kmem_cache_free(sighand_cachep, sighand);
 	}
 }
 
-
 /*
  * Initialize POSIX timer handling for a thread group.
  */
diff --git a/kernel/signal.c b/kernel/signal.c
index 61a1f55..9562a4f 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1272,7 +1272,17 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
 		sighand = rcu_dereference(tsk->sighand);
 		if (unlikely(sighand == NULL))
 			break;
-
+		/*
+		 * This sighand can be already freed and even reused, but
+		 * we rely on SLAB_DESTROY_BY_RCU and sighand_ctor() which
+		 * initializes ->siglock: this slab can't go away, it has
+		 * the same object type, ->siglock can't be reinitialized.
+		 *
+		 * We need to ensure that tsk->sighand is still the same
+		 * after we take the lock, we can race with de_thread() or
+		 * __exit_signal(). In the latter case the next iteration
+		 * must see ->sighand == NULL.
+		 */
 		spin_lock_irqsave(&sighand->siglock, *flags);
 		if (likely(sighand == tsk->sighand))
 			break;
-- 
1.5.5.1
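
For readers unfamiliar with SLAB_DESTROY_BY_RCU, the comment added above
relies on a general pattern: the cache constructor initializes the lock once
per object, and a lookup locks the possibly-stale object and then re-checks
the pointer under the lock. Below is a minimal sketch of that pattern; obj,
obj_ctor and lock_obj are made-up names for illustration (not kernel APIs),
and the preempt/irq subtleties around rcu_read_unlock() discussed elsewhere
in this thread are deliberately left out:

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct obj {
	spinlock_t lock;	/* initialized by the constructor only */
};

static struct kmem_cache *obj_cachep;

static void obj_ctor(void *p)
{
	struct obj *o = p;

	/* Runs when a new slab page is populated, not on every allocation,
	 * so ->lock stays a valid, initialized spinlock across free/reuse. */
	spin_lock_init(&o->lock);
}

static int __init obj_cache_init(void)
{
	obj_cachep = kmem_cache_create("obj", sizeof(struct obj), 0,
				       SLAB_DESTROY_BY_RCU, obj_ctor);
	return obj_cachep ? 0 : -ENOMEM;
}
core_initcall(obj_cache_init);

/*
 * Lock the (possibly stale) object, then re-check the pointer under the
 * lock, just like __lock_task_sighand(). This assumes the free path clears
 * *slot under ->lock before calling kmem_cache_free().
 */
static struct obj *lock_obj(struct obj __rcu **slot)
{
	struct obj *o;

	rcu_read_lock();
	for (;;) {
		o = rcu_dereference(*slot);
		if (!o)
			break;
		spin_lock(&o->lock);
		if (likely(o == rcu_access_pointer(*slot)))
			break;			/* still ours, keep the lock */
		spin_unlock(&o->lock);		/* freed/reused under us, retry */
	}
	rcu_read_unlock();
	return o;
}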



* Re: [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-22 16:44 ` [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand() Oleg Nesterov
@ 2014-09-22 18:58   ` Steven Rostedt
  2014-09-22 19:11     ` Oleg Nesterov
  2014-09-23 15:55   ` Peter Zijlstra
  1 sibling, 1 reply; 19+ messages in thread
From: Steven Rostedt @ 2014-09-22 18:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Paul E. McKenney, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, linux-kernel

On Mon, 22 Sep 2014 18:44:37 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

> __lock_task_sighand() does local_irq_save() to prevent the potential
> deadlock, we can use preempt_disable() with the same effect. And in
> this case we can do preempt_disable/enable + rcu_read_lock/unlock only
> once outside of the main loop and simplify the code. This also shaves
> 112 bytes from signal.o.
> 
> With this patch the main loop runs with preemption disabled, but this
> should be fine because restart is very unlikely: it can only happen if
> we race with de_thread() and ->sighand is shared. And the latter is only
> possible if CLONE_SIGHAND was used without CLONE_THREAD, most probably
> nobody does this nowadays.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  kernel/signal.c |   31 +++++++++++++------------------
>  1 files changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 8f0876f..61a1f55 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1261,30 +1261,25 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
>  					   unsigned long *flags)
>  {
>  	struct sighand_struct *sighand;
> -
> +	/*
> +	 * We are going to do rcu_read_unlock() under spin_lock_irqsave().
> +	 * Make sure we can not be preempted after rcu_read_lock(), see
> +	 * rcu_read_unlock() comment header for details.
> +	 */
> +	preempt_disable();

The sad part is, this is going to break -rt. Or is this something we
can use preempt_disable_nort() for (in the -rt kernel, that is)? That
is, is -rt susceptible to this deadlock as well?

-- Steve


> +	rcu_read_lock();
>  	for (;;) {
> -		/*
> -		 * Disable interrupts early to avoid deadlocks.
> -		 * See rcu_read_unlock() comment header for details.
> -		 */
> -		local_irq_save(*flags);
> -		rcu_read_lock();
>  		sighand = rcu_dereference(tsk->sighand);
> -		if (unlikely(sighand == NULL)) {
> -			rcu_read_unlock();
> -			local_irq_restore(*flags);
> +		if (unlikely(sighand == NULL))
>  			break;
> -		}
>  
> -		spin_lock(&sighand->siglock);
> -		if (likely(sighand == tsk->sighand)) {
> -			rcu_read_unlock();
> +		spin_lock_irqsave(&sighand->siglock, *flags);
> +		if (likely(sighand == tsk->sighand))
>  			break;
> -		}
> -		spin_unlock(&sighand->siglock);
> -		rcu_read_unlock();
> -		local_irq_restore(*flags);
> +		spin_unlock_irqrestore(&sighand->siglock, *flags);
>  	}
> +	rcu_read_unlock();
> +	preempt_enable();
>  
>  	return sighand;
>  }



* Re: [PATCH 2/2] signal: document the RCU protection of ->sighand
  2014-09-22 16:44 ` [PATCH 2/2] signal: document the RCU protection of ->sighand Oleg Nesterov
@ 2014-09-22 19:00   ` Steven Rostedt
  2014-09-23 11:50   ` Rik van Riel
  1 sibling, 0 replies; 19+ messages in thread
From: Steven Rostedt @ 2014-09-22 19:00 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Paul E. McKenney, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, linux-kernel

On Mon, 22 Sep 2014 18:44:40 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

> __cleanup_sighand() frees sighand without RCU grace period. This is
> correct but this looks "obviously buggy" and constantly confuses the
> readers, add the comments to explain how this works.

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  kernel/fork.c   |    5 ++++-
>  kernel/signal.c |   12 +++++++++++-
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 1380d8a..2dd9f1d 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1017,11 +1017,14 @@ void __cleanup_sighand(struct sighand_struct *sighand)
>  {
>  	if (atomic_dec_and_test(&sighand->count)) {
>  		signalfd_cleanup(sighand);
> +		/*
> +		 * sighand_cachep is SLAB_DESTROY_BY_RCU so we can free it
> +		 * without an RCU grace period, see __lock_task_sighand().
> +		 */
>  		kmem_cache_free(sighand_cachep, sighand);
>  	}
>  }
>  
> -
>  /*
>   * Initialize POSIX timer handling for a thread group.
>   */
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 61a1f55..9562a4f 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1272,7 +1272,17 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
>  		sighand = rcu_dereference(tsk->sighand);
>  		if (unlikely(sighand == NULL))
>  			break;
> -
> +		/*
> +		 * This sighand can be already freed and even reused, but
> +		 * we rely on SLAB_DESTROY_BY_RCU and sighand_ctor() which
> +		 * initializes ->siglock: this slab can't go away, it has
> +		 * the same object type, ->siglock can't be reinitialized.
> +		 *
> +		 * We need to ensure that tsk->sighand is still the same
> +		 * after we take the lock, we can race with de_thread() or
> +		 * __exit_signal(). In the latter case the next iteration
> +		 * must see ->sighand == NULL.
> +		 */
>  		spin_lock_irqsave(&sighand->siglock, *flags);
>  		if (likely(sighand == tsk->sighand))
>  			break;



* Re: [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-22 18:58   ` Steven Rostedt
@ 2014-09-22 19:11     ` Oleg Nesterov
  2014-09-22 21:24       ` Steven Rostedt
  0 siblings, 1 reply; 19+ messages in thread
From: Oleg Nesterov @ 2014-09-22 19:11 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrew Morton, Paul E. McKenney, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, linux-kernel

On 09/22, Steven Rostedt wrote:
>
> On Mon, 22 Sep 2014 18:44:37 +0200
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > __lock_task_sighand() does local_irq_save() to prevent the potential
> > deadlock, we can use preempt_disable() with the same effect. And in
> > this case we can do preempt_disable/enable + rcu_read_lock/unlock only
> > once outside of the main loop and simplify the code. This also shaves
> > 112 bytes from signal.o.
> >
> > With this patch the main loop runs with preemption disabled, but this
> > should be fine because restart is very unlikely: it can only happen if
> > we race with de_thread() and ->sighand is shared. And the latter is only
> > possible if CLONE_SIGHAND was used without CLONE_THREAD, most probably
> > nobody does this nowadays.
> >
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> > ---
> >  kernel/signal.c |   31 +++++++++++++------------------
> >  1 files changed, 13 insertions(+), 18 deletions(-)
> >
> > diff --git a/kernel/signal.c b/kernel/signal.c
> > index 8f0876f..61a1f55 100644
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -1261,30 +1261,25 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
> >  					   unsigned long *flags)
> >  {
> >  	struct sighand_struct *sighand;
> > -
> > +	/*
> > +	 * We are going to do rcu_read_unlock() under spin_lock_irqsave().
> > +	 * Make sure we can not be preempted after rcu_read_lock(), see
> > +	 * rcu_read_unlock() comment header for details.
> > +	 */
> > +	preempt_disable();
>
> The sad part is, this is going to break -rt.

Hmm, why??

> That
> is, is -rt susceptible to this deadlock as well?

In fact this deadlock is not really possible in any case; scheduler locks
should be fine under ->siglock (for example, signal_wake_up() is called
under this lock).

But, the comment above rcu_read_unlock() says:

	Given that the set of locks acquired by rt_mutex_unlock() might change
	at any time, a somewhat more future-proofed approach is to make sure
	that that preemption never happens ...

so this patch doesn't try to change the rules.

But perhaps we can simply remove this preempt_disable/enable?

Or. We can shift rcu_read_unlock() from lock_task_sighand() to
unlock_task_sighand(). This way we can avoid preempt_disable too, but
I'd prefer to not do this.

Oleg.
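
A rough sketch of the second alternative mentioned above, shifting
rcu_read_unlock() into unlock_task_sighand(), might look like the following
(nothing like this was posted in the thread, and the handling of the NULL
case here is an assumption):

struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
					   unsigned long *flags)
{
	struct sighand_struct *sighand;

	rcu_read_lock();
	for (;;) {
		sighand = rcu_dereference(tsk->sighand);
		if (unlikely(sighand == NULL)) {
			rcu_read_unlock();	/* no lock taken, drop RCU here */
			break;
		}

		spin_lock_irqsave(&sighand->siglock, *flags);
		if (likely(sighand == tsk->sighand))
			break;	/* keep rcu_read_lock() held, see unlock below */
		spin_unlock_irqrestore(&sighand->siglock, *flags);
	}
	return sighand;
}

static inline void unlock_task_sighand(struct task_struct *tsk,
				       unsigned long *flags)
{
	spin_unlock_irqrestore(&tsk->sighand->siglock, *flags);
	rcu_read_unlock();	/* only after ->siglock is dropped and irqs restored */
}

The point is that rcu_read_unlock() then never runs under ->siglock or with
irqs disabled, so no preempt_disable() is needed; the cost is that every
caller keeps the RCU read-side critical section open until
unlock_task_sighand().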



* Re: [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-22 19:11     ` Oleg Nesterov
@ 2014-09-22 21:24       ` Steven Rostedt
  2014-09-23 11:45         ` Rik van Riel
  2014-09-23 19:03         ` Oleg Nesterov
  0 siblings, 2 replies; 19+ messages in thread
From: Steven Rostedt @ 2014-09-22 21:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Paul E. McKenney, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, linux-kernel

On Mon, 22 Sep 2014 21:11:30 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

> On 09/22, Steven Rostedt wrote:
> >
> > On Mon, 22 Sep 2014 18:44:37 +0200
> > Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > __lock_task_sighand() does local_irq_save() to prevent the potential
> > > deadlock, we can use preempt_disable() with the same effect. And in
> > > this case we can do preempt_disable/enable + rcu_read_lock/unlock only
> > > once outside of the main loop and simplify the code. This also shaves
> > > 112 bytes from signal.o.
> > >
> > > With this patch the main loop runs with preemption disabled, but this
> > > should be fine because restart is very unlikely: it can only happen if
> > > we race with de_thread() and ->sighand is shared. And the latter is only
> > > possible if CLONE_SIGHAND was used without CLONE_THREAD, most probably
> > > nobody does this nowadays.
> > >
> > > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> > > ---
> > >  kernel/signal.c |   31 +++++++++++++------------------
> > >  1 files changed, 13 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/kernel/signal.c b/kernel/signal.c
> > > index 8f0876f..61a1f55 100644
> > > --- a/kernel/signal.c
> > > +++ b/kernel/signal.c
> > > @@ -1261,30 +1261,25 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
> > >  					   unsigned long *flags)
> > >  {
> > >  	struct sighand_struct *sighand;
> > > -
> > > +	/*
> > > +	 * We are going to do rcu_read_unlock() under spin_lock_irqsave().
> > > +	 * Make sure we can not be preempted after rcu_read_lock(), see
> > > +	 * rcu_read_unlock() comment header for details.
> > > +	 */
> > > +	preempt_disable();
> >
> > The sad part is, this is going to break -rt.
> 
> Hmm, why??

Because in -rt, siglock is a mutex.

> 
> > That
> > is, is -rt susceptible to this deadlock as well?

As siglock is a mutex, this shouldn't be a problem.

> 
> In fact this deadlock is not really possible in any case, scheduler locks
> should be fine under ->siglock (for example, signal_wake_up() is called
> under this lock).
> 
> But, the comment above rcu_read_unlock() says:
> 
> 	Given that the set of locks acquired by rt_mutex_unlock() might change
> 	at any time, a somewhat more future-proofed approach is to make sure
> 	that that preemption never happens ...

Hmm, I'm not sure we need to worry about this. As in -rt siglock is a
mutex, which is an rt_mutex itself, I highly doubt we will have
rt_mutex_unlock() grab siglock; otherwise that would cause havoc in -rt.

> 
> so this patch doesn't try to change the rules.
> 
> But perhaps we can simply remove this preempt_disable/enable?
> 
> Or. We can shift rcu_read_unlock() from lock_task_sighand() to
> unlock_task_sighand(). This way we can avoid preempt_disable too, but
> I'd prefer to not do this.

I really think the preempt_disable/enable is not needed.

Paul, Thomas, care to comment?

-- Steve


* Re: [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-22 21:24       ` Steven Rostedt
@ 2014-09-23 11:45         ` Rik van Riel
  2014-09-23 14:20           ` Peter Zijlstra
  2014-09-23 19:03         ` Oleg Nesterov
  1 sibling, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2014-09-23 11:45 UTC (permalink / raw)
  To: Steven Rostedt, Oleg Nesterov
  Cc: Andrew Morton, Paul E. McKenney, Peter Zijlstra, Thomas Gleixner,
	linux-kernel

On 09/22/2014 05:24 PM, Steven Rostedt wrote:
> On Mon, 22 Sep 2014 21:11:30 +0200 Oleg Nesterov <oleg@redhat.com>
> wrote:

>> so this patch doesn't try to change the rules.
>> 
>> But perhaps we can simply remove this preempt_disable/enable?
>> 
>> Or. We can shift rcu_read_unlock() from lock_task_sighand() to 
>> unlock_task_sighand(). This way we can avoid preempt_disable too,
>> but I'd prefer to not do this.
> 
> I really thing the preempt_disable/enable is not needed.
> 
> Paul, Thomas, care to comment?

I suspect you are right.  On normal kernels, rcu_read_lock() will
ensure preemption is disabled.

On -rt, the locks within are all sleepable mutexes.

Either way, things should be ok.

-- 
All rights reversed


* Re: [PATCH 2/2] signal: document the RCU protection of ->sighand
  2014-09-22 16:44 ` [PATCH 2/2] signal: document the RCU protection of ->sighand Oleg Nesterov
  2014-09-22 19:00   ` Steven Rostedt
@ 2014-09-23 11:50   ` Rik van Riel
  1 sibling, 0 replies; 19+ messages in thread
From: Rik van Riel @ 2014-09-23 11:50 UTC (permalink / raw)
  To: Oleg Nesterov, Andrew Morton
  Cc: Paul E. McKenney, Peter Zijlstra, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On 09/22/2014 12:44 PM, Oleg Nesterov wrote:
> __cleanup_sighand() frees sighand without RCU grace period. This
> is correct but this looks "obviously buggy" and constantly confuses
> the readers, add the comments to explain how this works.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed


* Re: [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-23 11:45         ` Rik van Riel
@ 2014-09-23 14:20           ` Peter Zijlstra
  2014-09-23 14:30             ` Steven Rostedt
  0 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2014-09-23 14:20 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Steven Rostedt, Oleg Nesterov, Andrew Morton, Paul E. McKenney,
	Thomas Gleixner, linux-kernel

On Tue, Sep 23, 2014 at 07:45:22AM -0400, Rik van Riel wrote:

> > I really thing the preempt_disable/enable is not needed.
> > 
> > Paul, Thomas, care to comment?
> 
> I suspect you are right.  On normal kernels, rcu_read_lock() will
> ensure preemption is disabled.
> 
> On -rt, the locks within are all sleepable mutexes.
> 
> Either way, things should be ok.

But with CONFIG_PREEMPT we get preemptible RCU but not the
spinlock->rt_mutex conversion.


* Re: [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-23 14:20           ` Peter Zijlstra
@ 2014-09-23 14:30             ` Steven Rostedt
  0 siblings, 0 replies; 19+ messages in thread
From: Steven Rostedt @ 2014-09-23 14:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rik van Riel, Oleg Nesterov, Andrew Morton, Paul E. McKenney,
	Thomas Gleixner, linux-kernel

On Tue, 23 Sep 2014 16:20:37 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Sep 23, 2014 at 07:45:22AM -0400, Rik van Riel wrote:
> 
> > > I really thing the preempt_disable/enable is not needed.
> > > 
> > > Paul, Thomas, care to comment?
> > 
> > I suspect you are right.  On normal kernels, rcu_read_lock() will
> > ensure preemption is disabled.
> > 
> > On -rt, the locks within are all sleepable mutexes.
> > 
> > Either way, things should be ok.
> 
> But with CONFIG_PREEMPT we get preemptible RCU but not the
> spinlock->rt_mutex conversion.

Right, I was about to say that. That is, rcu_read_lock() in mainline
does not always disable preemption (see CONFIG_PREEMPT_RCU).

But I still think this should be safe. I don't see the siglock ever
being taken in the scheduler or by rt_mutex_unlock().

-- Steve


* Re: [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-22 16:44 ` [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand() Oleg Nesterov
  2014-09-22 18:58   ` Steven Rostedt
@ 2014-09-23 15:55   ` Peter Zijlstra
  1 sibling, 0 replies; 19+ messages in thread
From: Peter Zijlstra @ 2014-09-23 15:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Paul E. McKenney, Rik van Riel, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On Mon, Sep 22, 2014 at 06:44:37PM +0200, Oleg Nesterov wrote:
> __lock_task_sighand() does local_irq_save() to prevent the potential
> deadlock, we can use preempt_disable() with the same effect. And in
> this case we can do preempt_disable/enable + rcu_read_lock/unlock only
> once outside of the main loop and simplify the code. This also shaves
> 112 bytes from signal.o.
> 
> With this patch the main loop runs with preemption disabled, but this
> should be fine because restart is very unlikely: it can only happen if
> we race with de_thread() and ->sighand is shared. And the latter is only
> possible if CLONE_SIGHAND was used without CLONE_THREAD, most probably
> nobody does this nowadays.

If it's unlikely to repeat, it shouldn't matter either way, and we can
keep the preempt_disable() inside the loop to guarantee better worst-case
behaviour.

And I suppose we can fix -rt, like Steve already mentioned, there's the
preempt_disable_nort() thing for just such cases.
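
For reference, the -rt patch set of that era provides those helpers roughly
as below; this is an approximation, so treat the exact config symbol and
definitions as assumptions rather than a quote of the -rt tree:

#ifdef CONFIG_PREEMPT_RT_FULL
# define preempt_disable_nort()		barrier()
# define preempt_enable_nort()		barrier()
#else
# define preempt_disable_nort()		preempt_disable()
# define preempt_enable_nort()		preempt_enable()
#endif

So the annotation would mean "disable preemption only on non-rt kernels",
matching the point above that -rt, where siglock is a sleeping lock, should
not need it.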


* Re: [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-22 21:24       ` Steven Rostedt
  2014-09-23 11:45         ` Rik van Riel
@ 2014-09-23 19:03         ` Oleg Nesterov
  2014-09-24  8:36           ` Paul E. McKenney
  1 sibling, 1 reply; 19+ messages in thread
From: Oleg Nesterov @ 2014-09-23 19:03 UTC (permalink / raw)
  To: Steven Rostedt, Paul E. McKenney
  Cc: Andrew Morton, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	linux-kernel

On 09/22, Steven Rostedt wrote:
>
> On Mon, 22 Sep 2014 21:11:30 +0200
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > > > @@ -1261,30 +1261,25 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
> > > >  					   unsigned long *flags)
> > > >  {
> > > >  	struct sighand_struct *sighand;
> > > > -
> > > > +	/*
> > > > +	 * We are going to do rcu_read_unlock() under spin_lock_irqsave().
> > > > +	 * Make sure we can not be preempted after rcu_read_lock(), see
> > > > +	 * rcu_read_unlock() comment header for details.
> > > > +	 */
> > > > +	preempt_disable();
> > >
> > > The sad part is, this is going to break -rt.
> >
> > Hmm, why??
>
> Because in -rt, siglock is a mutex.

Yes, thanks... I thought that -rt should handle this somehow; we have
more examples of preempt_disable() + spin_lock().

OK, let's forget this patch. It was supposed to be a cleanup; it should
not disturb -rt.

> > In fact this deadlock is not really possible in any case, scheduler locks
> > should be fine under ->siglock (for example, signal_wake_up() is called
> > under this lock).
> >
> > But, the comment above rcu_read_unlock() says:
> >
> > 	Given that the set of locks acquired by rt_mutex_unlock() might change
> > 	at any time, a somewhat more future-proofed approach is to make sure
> > 	that that preemption never happens ...
>
> Hmm, I'm not sure we need to worry about this. As in -rt siglock is a
> mutex, which is rt_mutex() itself, I highly doubt we will have
> rt_mutex_unlock() grab siglock, otherwise that would cause havoc in -rt.

Yes. And, the changelog in a841796f "signal: align __lock_task_sighand() irq
disabling and RCU" says:

	It is therefore possible that this RCU read-side critical
	section will be preempted and later RCU priority boosted, which means
	that rcu_read_unlock() will call rt_mutex_unlock() in order to deboost
	itself, but with interrupts disabled. This results in lockdep splats
	...
	It is quite possible that a better long-term fix is to make rt_mutex_unlock()
	disable irqs when acquiring the rt_mutex structure's ->wait_lock.

but this doesn't look right: raw_spin_lock(&lock->wait_lock) should be
fine with irqs disabled, or I am totally confused. rt_mutex_adjust_prio()
does _irqsave/irqrestore, so this can't enable interrupts.

Paul, will you agree if we turn it into

struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
					   unsigned long *flags)
{
	struct sighand_struct *sighand;

	rcu_read_lock();
	for (;;) {
		sighand = rcu_dereference(tsk->sighand);
		if (unlikely(sighand == NULL))
			break;

		spin_lock_irqsave(&sighand->siglock, *flags);
		if (likely(sighand == tsk->sighand))
			break;
		spin_unlock_irqrestore(&sighand->siglock, *flags);
	}
	/*
	 * On the successful return we hold ->siglock. According to the comment
	 * above rcu_read_unlock() this is against the rules, but scheduler
	 * locks are fine under this lock, signal_wake_up() takes them too.
	 */
	rcu_read_unlock();

	return sighand;
}

?

Or I can leave this code alone; this is a minor cleanup. It is just that,
to me, this sequence

	local_irq_save();
	rcu_read_lock();
	spin_lock();

looks a bit confusing/annoying even with the comment.

Oleg.



* Re: [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand()
  2014-09-23 19:03         ` Oleg Nesterov
@ 2014-09-24  8:36           ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2014-09-24  8:36 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Steven Rostedt, Andrew Morton, Peter Zijlstra, Rik van Riel,
	Thomas Gleixner, linux-kernel

On Tue, Sep 23, 2014 at 09:03:48PM +0200, Oleg Nesterov wrote:
> On 09/22, Steven Rostedt wrote:
> >
> > On Mon, 22 Sep 2014 21:11:30 +0200
> > Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > > > @@ -1261,30 +1261,25 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
> > > > >  					   unsigned long *flags)
> > > > >  {
> > > > >  	struct sighand_struct *sighand;
> > > > > -
> > > > > +	/*
> > > > > +	 * We are going to do rcu_read_unlock() under spin_lock_irqsave().
> > > > > +	 * Make sure we can not be preempted after rcu_read_lock(), see
> > > > > +	 * rcu_read_unlock() comment header for details.
> > > > > +	 */
> > > > > +	preempt_disable();
> > > >
> > > > The sad part is, this is going to break -rt.
> > >
> > > Hmm, why??
> >
> > Because in -rt, siglock is a mutex.
> 
> Yes, thanks... I thougt that -rt should handle this somehow, we have
> more examples of preempt_disable() + spin_lock().
> 
> OK, let's forger this patch. It was supposed to be a cleanup, it should
> not disturb -rt.
> 
> > > In fact this deadlock is not really possible in any case, scheduler locks
> > > should be fine under ->siglock (for example, signal_wake_up() is called
> > > under this lock).
> > >
> > > But, the comment above rcu_read_unlock() says:
> > >
> > > 	Given that the set of locks acquired by rt_mutex_unlock() might change
> > > 	at any time, a somewhat more future-proofed approach is to make sure
> > > 	that that preemption never happens ...
> >
> > Hmm, I'm not sure we need to worry about this. As in -rt siglock is a
> > mutex, which is rt_mutex() itself, I highly doubt we will have
> > rt_mutex_unlock() grab siglock, otherwise that would cause havoc in -rt.
> 
> Yes. And, the changelog in a841796f "signal: align __lock_task_sighand() irq
> disabling and RCU" says:
> 
> 	It is therefore possible that this RCU read-side critical
> 	section will be preempted and later RCU priority boosted, which means
> 	that rcu_read_unlock() will call rt_mutex_unlock() in order to deboost
> 	itself, but with interrupts disabled. This results in lockdep splats
> 	...
> 	It is quite possible that a better long-term fix is to make rt_mutex_unlock()
> 	disable irqs when acquiring the rt_mutex structure's ->wait_lock.
> 
> but this doesn't look right, raw_spin_lock(&lock->wait_lock) should be
> fine with irqs disabled or I am totally confused. rt_mutex_adjust_prio()
> does _irqsave/irqrestore, so this can't enable interrupts.
> 
> Paul, will you agree if we turn it into

If you guys continue the guarantee of no deadlock, I am OK with this change.

							Thanx, Paul

> struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
> 					   unsigned long *flags)
> {
> 	struct sighand_struct *sighand;
> 
> 	rcu_read_lock();
> 	for (;;) {
> 		sighand = rcu_dereference(tsk->sighand);
> 		if (unlikely(sighand == NULL))
> 			break;
> 
> 		spin_lock_irqsave(&sighand->siglock, *flags);
> 		if (likely(sighand == tsk->sighand))
> 			break;
> 		spin_unlock_irqrestore(&sighand->siglock, *flags);
> 	}
> 	/*
> 	 * On the succesfull return we hold ->siglock. According to comment
> 	 * above rcu_read_unlock() this is against the rules, but scheduler
> 	 * locks are fine under this lock, signal_wake_up() takes them too.
> 	 */
> 	rcu_read_unlock();
> 
> 	return sighand;
> }
> 
> ?
> 
> Or I can leave this code alone, this is the minor cleanup. Just to me this
> sequence
> 
> 	local_irq_save();
> 	rcu_read_lock();
> 	spin_lock();
> 
> looks a bit confusing/annoying even with the comment.
> 
> Oleg.
> 



* [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks
  2014-09-22 16:44 [PATCH 0/2] signal: simplify/document lock_task_sighand() logic Oleg Nesterov
  2014-09-22 16:44 ` [PATCH 1/2] signal: simplify deadlock-avoidance in lock_task_sighand() Oleg Nesterov
  2014-09-22 16:44 ` [PATCH 2/2] signal: document the RCU protection of ->sighand Oleg Nesterov
@ 2014-09-28 21:43 ` Oleg Nesterov
  2014-09-28 21:44   ` [PATCH v2 1/2] signal: document the RCU protection of ->sighand Oleg Nesterov
                     ` (2 more replies)
  2 siblings, 3 replies; 19+ messages in thread
From: Oleg Nesterov @ 2014-09-28 21:43 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Andrew Morton, Peter Zijlstra, Rik van Riel, Steven Rostedt,
	Thomas Gleixner, linux-kernel

Paul, could you take these 2 doc patches? Assuming that you agree
with the comments, of course.

On 09/24, Paul E. McKenney wrote:
>
> On Tue, Sep 23, 2014 at 09:03:48PM +0200, Oleg Nesterov wrote:
> >
> > Paul, will you agree if we turn it into
> > ...
> >      /*
> >       * On the succesfull return we hold ->siglock. According to comment
> >       * above rcu_read_unlock() this is against the rules, but scheduler
> >       * locks are fine under this lock, signal_wake_up() takes them too.
> >       */
> >      rcu_read_unlock();
>
> If you guys continue the guarantee of no deadlock, I am OK with this change.

Heh. Contrary to what I said (and you all agreed ;), this deadlock
is actually possible, so we cannot remove the deadlock-avoidance from
__lock_task_sighand(). And I do not see how we can clean up this code
because preempt_disable() + spin_lock() is not -rt friendly.

I think this deserves a bit of documentation; see 2/2. Perhaps this is
just me, but IMO the current comment is a bit misleading.

"if the caller of rcu_read_unlock() already holds one of these locks ..."
is not in fact a problem. I mean, pi_lock or rq->lock are special enough
that nobody should ever call the outermost rcu_read_unlock() with these
locks held. rt_mutex->wait_lock should be fine too, also because ->boost_mtx
is private to rcu_boost() and rcu_read_unlock_special().

But. They can race with each other, and that is why rcu_read_unlock()
under (say) ->siglock can actually lead to deadlock. And only because
rt_mutex->wait_lock doesn't disable irqs. Or I am totally confused.

Perhaps we can change rtmutex.c to use raw_spin_lock_irqsave(), or do
something else...

Oleg.

 include/linux/rcupdate.h |    4 +++-
 kernel/fork.c            |    5 ++++-
 kernel/signal.c          |   12 +++++++++++-
 3 files changed, 18 insertions(+), 3 deletions(-)



* [PATCH v2 1/2] signal: document the RCU protection of ->sighand
  2014-09-28 21:43 ` [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks Oleg Nesterov
@ 2014-09-28 21:44   ` Oleg Nesterov
  2014-09-28 21:44   ` [PATCH v2 2/2] rcu: more info about potential deadlocks with rcu_read_unlock() Oleg Nesterov
  2014-10-23 19:56   ` [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks Oleg Nesterov
  2 siblings, 0 replies; 19+ messages in thread
From: Oleg Nesterov @ 2014-09-28 21:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Andrew Morton, Peter Zijlstra, Rik van Riel, Steven Rostedt,
	Thomas Gleixner, linux-kernel

__cleanup_sighand() frees sighand without an RCU grace period. This is
correct, but it looks "obviously buggy" and constantly confuses
readers; add comments to explain how this works.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
---
 kernel/fork.c   |    5 ++++-
 kernel/signal.c |   12 +++++++++++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 1380d8a..2dd9f1d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1017,11 +1017,14 @@ void __cleanup_sighand(struct sighand_struct *sighand)
 {
 	if (atomic_dec_and_test(&sighand->count)) {
 		signalfd_cleanup(sighand);
+		/*
+		 * sighand_cachep is SLAB_DESTROY_BY_RCU so we can free it
+		 * without an RCU grace period, see __lock_task_sighand().
+		 */
 		kmem_cache_free(sighand_cachep, sighand);
 	}
 }
 
-
 /*
  * Initialize POSIX timer handling for a thread group.
  */
diff --git a/kernel/signal.c b/kernel/signal.c
index 8f0876f..c187133 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1275,7 +1275,17 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
 			local_irq_restore(*flags);
 			break;
 		}
-
+		/*
+		 * This sighand can be already freed and even reused, but
+		 * we rely on SLAB_DESTROY_BY_RCU and sighand_ctor() which
+		 * initializes ->siglock: this slab can't go away, it has
+		 * the same object type, ->siglock can't be reinitialized.
+		 *
+		 * We need to ensure that tsk->sighand is still the same
+		 * after we take the lock, we can race with de_thread() or
+		 * __exit_signal(). In the latter case the next iteration
+		 * must see ->sighand == NULL.
+		 */
 		spin_lock(&sighand->siglock);
 		if (likely(sighand == tsk->sighand)) {
 			rcu_read_unlock();
-- 
1.5.5.1



* [PATCH v2 2/2] rcu: more info about potential deadlocks with rcu_read_unlock()
  2014-09-28 21:43 ` [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks Oleg Nesterov
  2014-09-28 21:44   ` [PATCH v2 1/2] signal: document the RCU protection of ->sighand Oleg Nesterov
@ 2014-09-28 21:44   ` Oleg Nesterov
  2014-10-23 19:56   ` [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks Oleg Nesterov
  2 siblings, 0 replies; 19+ messages in thread
From: Oleg Nesterov @ 2014-09-28 21:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Andrew Morton, Peter Zijlstra, Rik van Riel, Steven Rostedt,
	Thomas Gleixner, linux-kernel

The comment above rcu_read_unlock() explains the potential deadlock
if the caller holds one of the locks taken by the rt_mutex_unlock() paths,
but it is not clear from this documentation that any lock which can
be taken from interrupt context can lead to deadlock as well, and that
we need to take rt_mutex_lock() into account too.

The problem is that rt_mutex_lock() takes wait_lock without disabling
irqs, and thus an interrupt taking some LOCK can obviously race with
rcu_read_unlock_special() called with the same LOCK held.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/rcupdate.h |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index d231aa1..5abcfda 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -863,7 +863,9 @@ static inline void rcu_read_lock(void)
  * Unfortunately, this function acquires the scheduler's runqueue and
  * priority-inheritance spinlocks.  This means that deadlock could result
  * if the caller of rcu_read_unlock() already holds one of these locks or
- * any lock that is ever acquired while holding them.
+ * any lock that is ever acquired while holding them; or any lock which
+ * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
+ * does not disable irqs while taking ->wait_lock.
  *
  * That said, RCU readers are never priority boosted unless they were
  * preempted.  Therefore, one way to avoid deadlock is to make sure
-- 
1.5.5.1
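
To make the scenario concrete, here is an assumed interleaving behind the
added text (an illustration, not a reproduced trace; LOCK stands for any
lock that can also be taken from interrupt context, ->siglock being the
example elsewhere in this thread):

/*
 * task A: was preempted inside an RCU read-side section, then RCU-boosted
 *	spin_lock_irqsave(&LOCK, flags);
 *	rcu_read_unlock();
 *	  rcu_read_unlock_special();
 *	    rt_mutex_unlock(&boost_mtx);
 *	      raw_spin_lock(&boost_mtx.wait_lock);	// spins, irqs off
 *
 * task B: the booster, rcu_boost() on another CPU
 *	rt_mutex_lock(&boost_mtx);
 *	  raw_spin_lock(&boost_mtx.wait_lock);		// taken, irqs still on
 *	    <interrupt>
 *	      spin_lock(&LOCK);				// spins on A's LOCK
 *
 * A spins on ->wait_lock held by B, while B's CPU is stuck in the interrupt
 * handler spinning on LOCK held by A: neither can make progress.
 */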



* Re: [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks
  2014-09-28 21:43 ` [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks Oleg Nesterov
  2014-09-28 21:44   ` [PATCH v2 1/2] signal: document the RCU protection of ->sighand Oleg Nesterov
  2014-09-28 21:44   ` [PATCH v2 2/2] rcu: more info about potential deadlocks with rcu_read_unlock() Oleg Nesterov
@ 2014-10-23 19:56   ` Oleg Nesterov
  2014-10-23 20:29     ` Paul E. McKenney
  2 siblings, 1 reply; 19+ messages in thread
From: Oleg Nesterov @ 2014-10-23 19:56 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Andrew Morton, Peter Zijlstra, Rik van Riel, Steven Rostedt,
	Thomas Gleixner, linux-kernel

Ping ;)

Paul, should I resend, or do you not think this can go via your rcu
tree?

On 09/28, Oleg Nesterov wrote:
>
> Paul, could you take these 2 doc patches? Assuming that you agree
> with the comments, of course.
>
> On 09/24, Paul E. McKenney wrote:
> >
> > On Tue, Sep 23, 2014 at 09:03:48PM +0200, Oleg Nesterov wrote:
> > >
> > > Paul, will you agree if we turn it into
> > > ...
> > >      /*
> > >       * On the succesfull return we hold ->siglock. According to comment
> > >       * above rcu_read_unlock() this is against the rules, but scheduler
> > >       * locks are fine under this lock, signal_wake_up() takes them too.
> > >       */
> > >      rcu_read_unlock();
> >
> > If you guys continue the guarantee of no deadlock, I am OK with this change.
>
> Heh. Contrary to what I said (and you all were agree ;), this deadlock
> is actually possible, so we can not remove the deadlock-avoidance from
> __lock_task_sighand(). And I do not see how we can cleanup this code
> because preempt_disable() + spin_lock() is not -rt friendly.
>
> I think this deserves a bit of documentation, see 2/2. Perhaps this is
> just me, but imo the current comment is a bit misleading.
>
> "if the caller of rcu_read_unlock() already holds one of these locks ..."
> is not a problem in fact. I mean, pi_lock or rq->lock are special enough,
> nobody should ever call the outermost rcu_read_unlock() with these locks
> held. rt_mutex->wait_lock should be fine too, also because ->boost_mtx
> is private to rcu_boost() and rcu_read_unlock_special().
>
> But. They can race with each other, and that is why rcu_read_unlock()
> under (say) ->siglock can actually lead to deadlock. And only because
> rt_mutex->wait_lock doesn't disable irqs. Or I am totally confused.
>
> Perhaps we can change rtmutex.c to use raw_spin_lock_irqsave(), or do
> something else...
>
> Oleg.
>
>  include/linux/rcupdate.h |    4 +++-
>  kernel/fork.c            |    5 ++++-
>  kernel/signal.c          |   12 +++++++++++-
>  3 files changed, 18 insertions(+), 3 deletions(-)



* Re: [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks
  2014-10-23 19:56   ` [PATCH v2 0/2] document ->sighand protection, rcu_read_unlock() deadlocks Oleg Nesterov
@ 2014-10-23 20:29     ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2014-10-23 20:29 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Peter Zijlstra, Rik van Riel, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On Thu, Oct 23, 2014 at 09:56:04PM +0200, Oleg Nesterov wrote:
> Ping ;)
> 
> Paul, should I resend or you do not think this can go via your rcu
> tree?

Please accept my apologies for missing these!

They are now queued for 3.19.

							Thanx, Paul

> On 09/28, Oleg Nesterov wrote:
> >
> > Paul, could you take these 2 doc patches? Assuming that you agree
> > with the comments, of course.
> >
> > On 09/24, Paul E. McKenney wrote:
> > >
> > > On Tue, Sep 23, 2014 at 09:03:48PM +0200, Oleg Nesterov wrote:
> > > >
> > > > Paul, will you agree if we turn it into
> > > > ...
> > > >      /*
> > > >       * On the succesfull return we hold ->siglock. According to comment
> > > >       * above rcu_read_unlock() this is against the rules, but scheduler
> > > >       * locks are fine under this lock, signal_wake_up() takes them too.
> > > >       */
> > > >      rcu_read_unlock();
> > >
> > > If you guys continue the guarantee of no deadlock, I am OK with this change.
> >
> > Heh. Contrary to what I said (and you all were agree ;), this deadlock
> > is actually possible, so we can not remove the deadlock-avoidance from
> > __lock_task_sighand(). And I do not see how we can cleanup this code
> > because preempt_disable() + spin_lock() is not -rt friendly.
> >
> > I think this deserves a bit of documentation, see 2/2. Perhaps this is
> > just me, but imo the current comment is a bit misleading.
> >
> > "if the caller of rcu_read_unlock() already holds one of these locks ..."
> > is not a problem in fact. I mean, pi_lock or rq->lock are special enough,
> > nobody should ever call the outermost rcu_read_unlock() with these locks
> > held. rt_mutex->wait_lock should be fine too, also because ->boost_mtx
> > is private to rcu_boost() and rcu_read_unlock_special().
> >
> > But. They can race with each other, and that is why rcu_read_unlock()
> > under (say) ->siglock can actually lead to deadlock. And only because
> > rt_mutex->wait_lock doesn't disable irqs. Or I am totally confused.
> >
> > Perhaps we can change rtmutex.c to use raw_spin_lock_irqsave(), or do
> > something else...
> >
> > Oleg.
> >
> >  include/linux/rcupdate.h |    4 +++-
> >  kernel/fork.c            |    5 ++++-
> >  kernel/signal.c          |   12 +++++++++++-
> >  3 files changed, 18 insertions(+), 3 deletions(-)
> 


