linux-kernel.vger.kernel.org archive mirror
* lock_task_sighand() && rcu_boost()
@ 2014-05-03 16:11 Oleg Nesterov
  2014-05-04 18:01 ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2014-05-03 16:11 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

Paul,

I just noticed by accident that __lock_task_sighand() looks ugly and
mysterious ;) And I am puzzled.

a841796f11c90d53 "signal: align __lock_task_sighand() irq disabling and RCU"
says:

	The __lock_task_sighand() function calls rcu_read_lock() with interrupts
	and preemption enabled, but later calls rcu_read_unlock() with interrupts
	disabled.  It is therefore possible that this RCU read-side critical
	section will be preempted and later RCU priority boosted, which means that
	rcu_read_unlock() will call rt_mutex_unlock() in order to deboost itself, but
	with interrupts disabled. This results in lockdep splats ...

OK, if we can't rcu_read_unlock() with irqs disabled, then we can at least
clean it up (and document the problem). Say,

	struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
						   unsigned long *flags)
	{
		struct sighand_struct *sighand;

		rcu_read_lock();
		for (;;) {
			sighand = rcu_dereference(tsk->sighand);
			if (unlikely(sighand == NULL))
				break;

			spin_lock_irqsave(&sighand->siglock, *flags);
			/*
			 * We delay rcu_read_unlock() till unlock_task_sighand()
			 * to avoid rt_mutex_unlock(current->rcu_boost_mutex) with
			 * irqs disabled.
			 */
			if (likely(sighand == tsk->sighand))
				return sighand;
			spin_unlock_irqrestore(&sighand->siglock, *flags);
		}
		rcu_read_unlock();

		return sighand;	/* NULL */
	}

and add rcu_read_unlock() into unlock_task_sighand().
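For completeness, the matching unlock side would then look something like
this (a sketch only, not an actual patch; today the helper just drops
->siglock):

```c
/*
 * Sketch (not the actual patch): the unlock side matching the
 * __lock_task_sighand() variant above, which returns with
 * rcu_read_lock() still held on success.
 */
static void unlock_task_sighand(struct task_struct *tsk,
				unsigned long *flags)
{
	spin_unlock_irqrestore(&tsk->sighand->siglock, *flags);
	/*
	 * Irqs are enabled again here, so if rcu_read_unlock() has to
	 * deboost via rt_mutex_unlock(), it does so with irqs on.
	 */
	rcu_read_unlock();
}
```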

But I simply can't understand why lockdep should complain. Why is it bad
to lock/unlock ->wait_lock with irqs disabled?

wakeup_next_waiter() and rt_mutex_adjust_prio() should be fine; they start
with _irqsave().

The changelog also says:

	It is quite possible that a better long-term fix is to make rt_mutex_unlock()
	disable irqs when acquiring the rt_mutex structure's ->wait_lock.

and if it is actually bad, then how can the change above fix the problem?

Help!

Oleg.



* Re: lock_task_sighand() && rcu_boost()
  2014-05-03 16:11 lock_task_sighand() && rcu_boost() Oleg Nesterov
@ 2014-05-04 18:01 ` Paul E. McKenney
  2014-05-04 19:17   ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2014-05-04 18:01 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Sat, May 03, 2014 at 06:11:33PM +0200, Oleg Nesterov wrote:
> Paul,
> 
> I just noticed by accident that __lock_task_sighand() looks ugly and
> mysterious ;) And I am puzzled.
> 
> a841796f11c90d53 "signal: align __lock_task_sighand() irq disabling and RCU"
> says:
> 
> 	The __lock_task_sighand() function calls rcu_read_lock() with interrupts
> 	and preemption enabled, but later calls rcu_read_unlock() with interrupts
> 	disabled.  It is therefore possible that this RCU read-side critical
> 	section will be preempted and later RCU priority boosted, which means that
> 	rcu_read_unlock() will call rt_mutex_unlock() in order to deboost itself, but
> 	with interrupts disabled. This results in lockdep splats ...
> 
> OK, if we can't rcu_read_unlock() with irqs disabled, then we can at least
> clean it up (and document the problem).

Just to clarify (probably unnecessarily), it is OK to invoke rcu_read_unlock()
with irqs disabled, but only if preemption has been disabled throughout
the entire RCU read-side critical section.
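In code, the safe combination described here might look like this (an
illustrative sketch, not from the thread):

```c
/*
 * Sketch: rcu_read_unlock() with irqs disabled is fine here because
 * preemption is disabled across the whole read-side critical section,
 * so this reader can never be preempted and thus never boosted.
 */
preempt_disable();
rcu_read_lock();
/* ... read-side work ... */
local_irq_save(flags);
/* ... */
rcu_read_unlock();	/* never needs rt_mutex_unlock() to deboost */
local_irq_restore(flags);
preempt_enable();
```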

>                                        Say,
> 
> 	struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
> 						   unsigned long *flags)
> 	{
> 		struct sighand_struct *sighand;
> 
> 		rcu_read_lock();
> 		for (;;) {
> 			sighand = rcu_dereference(tsk->sighand);
> 			if (unlikely(sighand == NULL))
> 				break;
> 
> 			spin_lock_irqsave(&sighand->siglock, *flags);
> 			/*
> 			 * We delay rcu_read_unlock() till unlock_task_sighand()
> 			 * to avoid rt_mutex_unlock(current->rcu_boost_mutex) with
> 			 * irqs disabled.
> 			 */
> 			if (likely(sighand == tsk->sighand))
> 				return sighand;
> 			spin_unlock_irqrestore(&sighand->siglock, *flags);
> 		}
> 		rcu_read_unlock();
> 
> 		return sighand;	/* NULL */
> 	}
> 
> and add rcu_read_unlock() into unlock_task_sighand().

That should also work.

> But I simply can't understand why lockdep should complain. Why is it bad
> to lock/unlock ->wait_lock with irqs disabled?

Well, lockdep doesn't -always- complain, and some cases are OK.

The problem is that if the RCU read-side critical section has been
preempted, and if this task gets RCU priority-boosted in the meantime,
then the task will need to acquire scheduler rq and pi locks at
rcu_read_unlock() time.  If the reason that interrupts are disabled at
rcu_read_unlock() time is that either rq or pi locks are held (or some
other locks are held that are normally acquired while holding rq or
pi locks), then we can deadlock.  And lockdep will of course complain.

If I recall correctly, at one point, the ->siglock lock was acquired
while holding the rq locks, which would have resulted in lockdep
complaints.
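Such a deadlock would look roughly like this (hypothetical interleaving
for illustration; lock names abbreviated):

```c
/*
 * Hypothetical CONFIG_RCU_BOOST deadlock:
 *
 *   rcu_read_lock();
 *   ...                        <- preempted here, then priority
 *                                 boosted: this task now owns the
 *                                 boost rt_mutex
 *   raw_spin_lock(&rq->lock);  <- irqs off, rq lock held
 *   rcu_read_unlock();
 *     -> rt_mutex_unlock()     (deboost)
 *       -> wake up the booster
 *         -> needs rq->lock    => deadlock on the already-held lock
 */
```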

> wakeup_next_waiter() and rt_mutex_adjust_prio() should be fine; they start
> with _irqsave().

Hmmm...  A better description of the bad case might be as follows:

	Deadlock can occur if you have an RCU read-side critical
	section that is anywhere preemptible, and where the outermost
	rcu_read_unlock() is invoked while holding any lock acquired
	by either wakeup_next_waiter() or rt_mutex_adjust_prio(),
	or while holding any lock that is ever acquired while holding
	one of those locks.

Does that help?

Avoiding this bad case could be a bit ugly, as it is a dynamic set
of locks that is acquired while holding any lock acquired by either
wakeup_next_waiter() or rt_mutex_adjust_prio().  So I simplified the
rule by prohibiting invoking rcu_read_unlock() with irqs disabled
if the RCU read-side critical section had ever been preemptible.

Perhaps it is time to start using the more complex but more accurate
rule, though keeping it all straight could be tough.  Plus RCU priority
boosting happens rarely, so there is no guarantee that lockdep will
catch problems in a reasonable timeframe.

> The changelog also says:
> 
> 	It is quite possible that a better long-term fix is to make rt_mutex_unlock()
> 	disable irqs when acquiring the rt_mutex structure's ->wait_lock.
> 
> and if it is actually bad, then how can the change above fix the problem?

At this point, I am not seeing how it would help.  Color me confused.  :-/

							Thanx, Paul

> Help!
> 
> Oleg.
> 



* Re: lock_task_sighand() && rcu_boost()
  2014-05-04 18:01 ` Paul E. McKenney
@ 2014-05-04 19:17   ` Oleg Nesterov
  2014-05-04 22:38     ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2014-05-04 19:17 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On 05/04, Paul E. McKenney wrote:
>
> On Sat, May 03, 2014 at 06:11:33PM +0200, Oleg Nesterov wrote:
> >
> > OK, if we can't rcu_read_unlock() with irqs disabled, then we can at least
> > clean it up (and document the problem).
>
> Just to clarify (probably unnecessarily), it is OK to invoke rcu_read_unlock()
> with irqs disabled, but only if preemption has been disabled throughout
> the entire RCU read-side critical section.

Yes, yes, I understand, thanks.

> > and add rcu_read_unlock() into unlock_task_sighand().
>
> That should also work.

OK.

> > But I simply can't understand why lockdep should complain. Why is it bad
> > to lock/unlock ->wait_lock with irqs disabled?
>
> Well, lockdep doesn't -always- complain, and some cases are OK.
>
> The problem is that if the RCU read-side critical section has been
> preempted, and if this task gets RCU priority-boosted in the meantime,
> then the task will need to acquire scheduler rq and pi locks at
> rcu_read_unlock() time.

Yes,

> If the reason that interrupts are disabled at
> rcu_read_unlock() time is that either rq or pi locks are held (or some
> other locks are held that are normally acquired while holding rq or
> pi locks), then we can deadlock.  And lockdep will of course complain.

Yes, but not in this case?

> If I recall correctly, at one point, the ->siglock lock was acquired
> while holding the rq locks, which would have resulted in lockdep
> complaints.

No, this is not possible. signal_wake_up_state() was always called
under ->siglock and it does wake_up_state() which takes rq/pi locks.

And if lock_task_sighand() is preempted after rcu_read_lock(), then the
caller doesn't hold any lock.

So perhaps we can revert a841796f11c90d53 ?

Otherwise please see below.

> Hmmm...  A better description of the bad case might be as follows:
>
> 	Deadlock can occur if you have an RCU read-side critical
> 	section that is anywhere preemptible, and where the outermost
> 	rcu_read_unlock() is invoked while holding any lock acquired
> 	by either wakeup_next_waiter() or rt_mutex_adjust_prio(),
> 	or while holding any lock that is ever acquired while holding
> 	one of those locks.
>
> Does that help?
>
> Avoiding this bad case could be a bit ugly, as it is a dynamic set
> of locks that is acquired while holding any lock acquired by either
> wakeup_next_waiter() or rt_mutex_adjust_prio().  So I simplified the
> rule by prohibiting invoking rcu_read_unlock() with irqs disabled
> if the RCU read-side critical section had ever been preemptible.

OK, if you prefer to enforce this rule even if (say) lock_task_sighand()
is fine, then it needs the comment. And a cleanup ;)

We can move rcu_read_unlock() into unlock_task_sighand() as I suggested
before, or we can simply add preempt_disable/enable into lock_(),

	struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
						   unsigned long *flags)
	{
		struct sighand_struct *sighand;
		/*
		 * COMMENT TO EXPLAIN WHY
		 */
		preempt_disable();
		rcu_read_lock();
		for (;;) {
			sighand = rcu_dereference(tsk->sighand);
			if (unlikely(sighand == NULL))
				break;

			spin_lock_irqsave(&sighand->siglock, *flags);
			if (likely(sighand == tsk->sighand))
				break;
			spin_unlock_irqrestore(&sighand->siglock, *flags);
		}
		rcu_read_unlock();
		preempt_enable();

		return sighand;
	}

The only problem is the "COMMENT" above. Perhaps the "prohibit invoking
rcu_read_unlock() with irqs disabled if ..." rule should be documented
near/above rcu_read_unlock()? In this case that COMMENT could simply
say "see the comment above rcu_read_unlock()".

What do you think?

Oleg.



* Re: lock_task_sighand() && rcu_boost()
  2014-05-04 19:17   ` Oleg Nesterov
@ 2014-05-04 22:38     ` Paul E. McKenney
  2014-05-05 13:26       ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2014-05-04 22:38 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Sun, May 04, 2014 at 09:17:57PM +0200, Oleg Nesterov wrote:
> On 05/04, Paul E. McKenney wrote:
> >
> > On Sat, May 03, 2014 at 06:11:33PM +0200, Oleg Nesterov wrote:
> > >
> > > OK, if we can't rcu_read_unlock() with irqs disabled, then we can at least
> > > clean it up (and document the problem).
> >
> > Just to clarify (probably unnecessarily), it is OK to invoke rcu_read_unlock()
> > with irqs disabled, but only if preemption has been disabled throughout
> > the entire RCU read-side critical section.
> 
> Yes, yes, I understand, thanks.
> 
> > > and add rcu_read_unlock() into unlock_task_sighand().
> >
> > That should also work.
> 
> OK.
> 
> > > But I simply can't understand why lockdep should complain. Why is it bad
> > > to lock/unlock ->wait_lock with irqs disabled?
> >
> > Well, lockdep doesn't -always- complain, and some cases are OK.
> >
> > The problem is that if the RCU read-side critical section has been
> > preempted, and if this task gets RCU priority-boosted in the meantime,
> > then the task will need to acquire scheduler rq and pi locks at
> > rcu_read_unlock() time.
> 
> Yes,
> 
> > If the reason that interrupts are disabled at
> > rcu_read_unlock() time is that either rq or pi locks are held (or some
> > other locks are held that are normally acquired while holding rq or
> > pi locks), then we can deadlock.  And lockdep will of course complain.
> 
> Yes, but not in this case?
> 
> > If I recall correctly, at one point, the ->siglock lock was acquired
> > while holding the rq locks, which would have resulted in lockdep
> > complaints.
> 
> No, this is not possible. signal_wake_up_state() was always called
> under ->siglock and it does wake_up_state() which takes rq/pi locks.
> 
> And if lock_task_sighand() is preempted after rcu_read_lock(), then the
> caller doesn't hold any lock.
> 
> So perhaps we can revert a841796f11c90d53 ?

Or just update it, your choice.

> Otherwise please see below.
> 
> > Hmmm...  A better description of the bad case might be as follows:
> >
> > 	Deadlock can occur if you have an RCU read-side critical
> > 	section that is anywhere preemptible, and where the outermost
> > 	rcu_read_unlock() is invoked while holding any lock acquired
> > 	by either wakeup_next_waiter() or rt_mutex_adjust_prio(),
> > 	or while holding any lock that is ever acquired while holding
> > 	one of those locks.
> >
> > Does that help?
> >
> > Avoiding this bad case could be a bit ugly, as it is a dynamic set
> > of locks that is acquired while holding any lock acquired by either
> > wakeup_next_waiter() or rt_mutex_adjust_prio().  So I simplified the
> > rule by prohibiting invoking rcu_read_unlock() with irqs disabled
> > if the RCU read-side critical section had ever been preemptible.
> 
> OK, if you prefer to enforce this rule even if (say) lock_task_sighand()
> is fine, then it needs the comment. And a cleanup ;)

Please see below for a proposed comment.  Thinking more about it, I list
both rules and leave the choice to the caller.  Please see the end of
this email for a patch adding a comment to rcu_read_unlock().

> We can move rcu_read_unlock() into unlock_task_sighand() as I suggested
> before, or we can simply add preempt_disable/enable into lock_(),
> 
> 	struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
> 						   unsigned long *flags)
> 	{
> 		struct sighand_struct *sighand;
> 		/*
> 		 * COMMENT TO EXPLAIN WHY
> 		 */
> 		preempt_disable();
> 		rcu_read_lock();
> 		for (;;) {
> 			sighand = rcu_dereference(tsk->sighand);
> 			if (unlikely(sighand == NULL))
> 				break;
> 
> 			spin_lock_irqsave(&sighand->siglock, *flags);
> 			if (likely(sighand == tsk->sighand))
> 				break;
> 			spin_unlock_irqrestore(&sighand->siglock, *flags);
> 		}
> 		rcu_read_unlock();
> 		preempt_enable();
> 
> 		return sighand;
> 	}
> 
> The only problem is the "COMMENT" above. Perhaps the "prohibit invoking
> rcu_read_unlock() with irqs disabled if ..." rule should documented
> near/above rcu_read_unlock() ? In this case that COMMENT could simply
> say "see the comment above rcu_read_unlock()".
> 
> What do you think?

Looks good to me!

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index ca6fe55913b7..17ac3c63415f 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -884,6 +884,27 @@ static inline void rcu_read_lock(void)
 /**
  * rcu_read_unlock() - marks the end of an RCU read-side critical section.
  *
+ * In most situations, rcu_read_unlock() is immune from deadlock.
+ * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
+ * is responsible for deboosting, which it does via rt_mutex_unlock().
+ * Unfortunately, this function acquires the scheduler's runqueue and
+ * priority-inheritance spinlocks.  Thus, deadlock could result if the
+ * caller of rcu_read_unlock() already held one of these locks or any lock
+ * acquired while holding them.
+ *
+ * That said, RCU readers are never priority boosted unless they were
+ * preempted.  Therefore, one way to avoid deadlock is to make sure
+ * that preemption never happens within any RCU read-side critical
+ * section whose outermost rcu_read_unlock() is called with one of
+ * rt_mutex_unlock()'s locks held.
+ *
+ * Given that the set of locks acquired by rt_mutex_unlock() might change
+ * at any time, a somewhat more future-proofed approach is to make sure
+ * that preemption never happens within any RCU read-side critical
+ * section whose outermost rcu_read_unlock() is called with irqs
+ * disabled.  This approach relies on the fact that rt_mutex_unlock()
+ * currently only acquires irq-disabled locks.
+ *
  * See rcu_read_lock() for more information.
  */
 static inline void rcu_read_unlock(void)



* Re: lock_task_sighand() && rcu_boost()
  2014-05-04 22:38     ` Paul E. McKenney
@ 2014-05-05 13:26       ` Oleg Nesterov
  2014-05-05 15:26         ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2014-05-05 13:26 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On 05/04, Paul E. McKenney wrote:
>
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -884,6 +884,27 @@ static inline void rcu_read_lock(void)
>  /**
>   * rcu_read_unlock() - marks the end of an RCU read-side critical section.
>   *
> + * In most situations, rcu_read_unlock() is immune from deadlock.
> + * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
> + * is responsible for deboosting, which it does via rt_mutex_unlock().
> + * Unfortunately, this function acquires the scheduler's runqueue and
> + * priority-inheritance spinlocks.  Thus, deadlock could result if the
> + * caller of rcu_read_unlock() already held one of these locks or any lock
> + * acquired while holding them.
> + *
> + * That said, RCU readers are never priority boosted unless they were
> + * preempted.  Therefore, one way to avoid deadlock is to make sure
> + * that preemption never happens within any RCU read-side critical
> + * section whose outermost rcu_read_unlock() is called with one of
> + * rt_mutex_unlock()'s locks held.
> + *
> + * Given that the set of locks acquired by rt_mutex_unlock() might change
> + * at any time, a somewhat more future-proofed approach is to make sure
> + * that preemption never happens within any RCU read-side critical
> + * section whose outermost rcu_read_unlock() is called with irqs
> + * disabled.  This approach relies on the fact that rt_mutex_unlock()
> + * currently only acquires irq-disabled locks.
> + *
>   * See rcu_read_lock() for more information.
>   */
>  static inline void rcu_read_unlock(void)

Great! And I agree with "might change at any time" part.

I'll update lock_task_sighand() after you push this change (or please feel
free to do this yourself). Cleanup is not that important, of course, but a
short comment referring to the documentation above can help another reader
to understand the "unnecessary" local_irq_save/preempt_disable calls.

Thanks Paul.

Oleg.



* Re: lock_task_sighand() && rcu_boost()
  2014-05-05 13:26       ` Oleg Nesterov
@ 2014-05-05 15:26         ` Paul E. McKenney
  2014-05-05 16:47           ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2014-05-05 15:26 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Mon, May 05, 2014 at 03:26:59PM +0200, Oleg Nesterov wrote:
> On 05/04, Paul E. McKenney wrote:
> >
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -884,6 +884,27 @@ static inline void rcu_read_lock(void)
> >  /**
> >   * rcu_read_unlock() - marks the end of an RCU read-side critical section.
> >   *
> > + * In most situations, rcu_read_unlock() is immune from deadlock.
> > + * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
> > + * is responsible for deboosting, which it does via rt_mutex_unlock().
> > + * Unfortunately, this function acquires the scheduler's runqueue and
> > + * priority-inheritance spinlocks.  Thus, deadlock could result if the
> > + * caller of rcu_read_unlock() already held one of these locks or any lock
> > + * acquired while holding them.
> > + *
> > + * That said, RCU readers are never priority boosted unless they were
> > + * preempted.  Therefore, one way to avoid deadlock is to make sure
> > + * that preemption never happens within any RCU read-side critical
> > + * section whose outermost rcu_read_unlock() is called with one of
> > + * rt_mutex_unlock()'s locks held.
> > + *
> > + * Given that the set of locks acquired by rt_mutex_unlock() might change
> > + * at any time, a somewhat more future-proofed approach is to make sure
> > + * that preemption never happens within any RCU read-side critical
> > + * section whose outermost rcu_read_unlock() is called with irqs
> > + * disabled.  This approach relies on the fact that rt_mutex_unlock()
> > + * currently only acquires irq-disabled locks.
> > + *
> >   * See rcu_read_lock() for more information.
> >   */
> >  static inline void rcu_read_unlock(void)
> 
> Great! And I agree with "might change at any time" part.
> 
> I'll update lock_task_sighand() after you push this change (or please feel
> free to do this yourself). Cleanup is not that important, of course, but a
> short comment referring to the documentation above can help another reader
> to understand the "unnecessary" local_irq_save/preempt_disable calls.
> 
> Thanks Paul.

Does the patch below cover it?

							Thanx, Paul

------------------------------------------------------------------------

signal: Explain local_irq_save() call

The explicit local_irq_save() in __lock_task_sighand() is needed to avoid
a potential deadlock condition, as noted in a841796f11c90d53 (signal:
align __lock_task_sighand() irq disabling and RCU).  However, someone
reading the code might be forgiven for concluding that this separate
local_irq_save() was completely unnecessary.  This commit therefore adds
a comment referencing the shiny new block comment on rcu_read_unlock().

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/signal.c b/kernel/signal.c
index 6ea13c09ae56..513e8c252aa4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1288,6 +1288,10 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
 	struct sighand_struct *sighand;
 
 	for (;;) {
+		/*
+		 * Disable interrupts early to avoid deadlocks.
+		 * See rcu_read_unlock comment header for details.
+		 */
 		local_irq_save(*flags);
 		rcu_read_lock();
 		sighand = rcu_dereference(tsk->sighand);



* Re: lock_task_sighand() && rcu_boost()
  2014-05-05 15:26         ` Paul E. McKenney
@ 2014-05-05 16:47           ` Oleg Nesterov
  2014-05-05 18:53             ` [PATCH] signal: Simplify __lock_task_sighand() Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2014-05-05 16:47 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On 05/05, Paul E. McKenney wrote:
>
> Does the patch below cover it?

Yes, thanks.

Acked-by: Oleg Nesterov <oleg@redhat.com>

> signal: Explain local_irq_save() call
> 
> The explicit local_irq_save() in __lock_task_sighand() is needed to avoid
> a potential deadlock condition, as noted in a841796f11c90d53 (signal:
> align __lock_task_sighand() irq disabling and RCU).  However, someone
> reading the code might be forgiven for concluding that this separate
> local_irq_save() was completely unnecessary.  This commit therefore adds
> a comment referencing the shiny new block comment on rcu_read_unlock().
> 
> Reported-by: Oleg Nesterov <oleg@redhat.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 6ea13c09ae56..513e8c252aa4 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1288,6 +1288,10 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
>  	struct sighand_struct *sighand;
>  
>  	for (;;) {
> +		/*
> +		 * Disable interrupts early to avoid deadlocks.
> +		 * See rcu_read_unlock comment header for details.
> +		 */
>  		local_irq_save(*flags);
>  		rcu_read_lock();
>  		sighand = rcu_dereference(tsk->sighand);
> 



* [PATCH] signal: Simplify __lock_task_sighand()
  2014-05-05 16:47           ` Oleg Nesterov
@ 2014-05-05 18:53             ` Oleg Nesterov
  2014-05-05 19:55               ` Oleg Nesterov
  2014-05-05 20:56               ` Paul E. McKenney
  0 siblings, 2 replies; 10+ messages in thread
From: Oleg Nesterov @ 2014-05-05 18:53 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On 05/05, Oleg Nesterov wrote:
>
> On 05/05, Paul E. McKenney wrote:
> >
> > Does the patch below cover it?
>
> Yes, thanks.
>
> Acked-by: Oleg Nesterov <oleg@redhat.com>

Yes, but please consider the cleanup below, on top of your change.

This is subjective of course, but imho the code looks better without
the extra unlock/restore inside the loop.

-------------------------------------------------------------------------------
Subject: [PATCH] signal: Simplify __lock_task_sighand()

__lock_task_sighand() does local_irq_save() to prevent the potential
deadlock; we can use preempt_disable() with the same effect. And in
this case we can do preempt_disable/enable + rcu_read_lock/unlock only
once, outside of the main loop, and simplify the code. It also shaves
112 bytes from signal.o.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/signal.c |   31 +++++++++++++------------------
 1 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 4368370..03a0fd4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1260,30 +1260,25 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
 					   unsigned long *flags)
 {
 	struct sighand_struct *sighand;
-
+	/*
+	 * We are going to do rcu_read_unlock() under spin_lock_irqsave().
+	 * Make sure we can not be preempted after rcu_read_unlock(), see
+	 * rcu_read_unlock comment header for details.
+	 */
+	preempt_disable();
+	rcu_read_lock();
 	for (;;) {
-		/*
-		 * Disable interrupts early to avoid deadlocks.
-		 * See rcu_read_unlock comment header for details.
-		 */
-		local_irq_save(*flags);
-		rcu_read_lock();
 		sighand = rcu_dereference(tsk->sighand);
-		if (unlikely(sighand == NULL)) {
-			rcu_read_unlock();
-			local_irq_restore(*flags);
+		if (unlikely(sighand == NULL))
 			break;
-		}
 
-		spin_lock(&sighand->siglock);
-		if (likely(sighand == tsk->sighand)) {
-			rcu_read_unlock();
+		spin_lock_irqsave(&sighand->siglock, *flags);
+		if (likely(sighand == tsk->sighand))
 			break;
-		}
-		spin_unlock(&sighand->siglock);
-		rcu_read_unlock();
-		local_irq_restore(*flags);
+		spin_unlock_irqrestore(&sighand->siglock, *flags);
 	}
+	rcu_read_unlock();
+	preempt_enable();
 
 	return sighand;
 }
-- 
1.5.5.1




* Re: [PATCH] signal: Simplify __lock_task_sighand()
  2014-05-05 18:53             ` [PATCH] signal: Simplify __lock_task_sighand() Oleg Nesterov
@ 2014-05-05 19:55               ` Oleg Nesterov
  2014-05-05 20:56               ` Paul E. McKenney
  1 sibling, 0 replies; 10+ messages in thread
From: Oleg Nesterov @ 2014-05-05 19:55 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On 05/05, Oleg Nesterov wrote:
>
> Yes, but please consider the cleanup below, on top of your change.
>
> This is subjective of course, but imho the code looks better without
> the extra unlock/restore inside the loop.
>
> -------------------------------------------------------------------------------
> Subject: [PATCH] signal: Simplify __lock_task_sighand()
> 
> __lock_task_sighand() does local_irq_save() to prevent the potential
> deadlock; we can use preempt_disable() with the same effect. And in
> this case we can do preempt_disable/enable + rcu_read_lock/unlock only
> once, outside of the main loop, and simplify the code. It also shaves
> 112 bytes from signal.o.
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  kernel/signal.c |   31 +++++++++++++------------------
>  1 files changed, 13 insertions(+), 18 deletions(-)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 4368370..03a0fd4 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1260,30 +1260,25 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
>  					   unsigned long *flags)
>  {
>  	struct sighand_struct *sighand;
> -
> +	/*
> +	 * We are going to do rcu_read_unlock() under spin_lock_irqsave().
> +	 * Make sure we can not be preempted after rcu_read_unlock(), see
                                                   ^^^^^^^^^^^^^^^
Argh, typo in comment. I meant rcu_read_lock() of course.

I'll send v2 tomorrow unless you dislike this change.

Oleg.



* Re: [PATCH] signal: Simplify __lock_task_sighand()
  2014-05-05 18:53             ` [PATCH] signal: Simplify __lock_task_sighand() Oleg Nesterov
  2014-05-05 19:55               ` Oleg Nesterov
@ 2014-05-05 20:56               ` Paul E. McKenney
  1 sibling, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2014-05-05 20:56 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, rostedt, bigeasy, tglx

On Mon, May 05, 2014 at 08:53:08PM +0200, Oleg Nesterov wrote:
> On 05/05, Oleg Nesterov wrote:
> >
> > On 05/05, Paul E. McKenney wrote:
> > >
> > > Does the patch below cover it?
> >
> > Yes, thanks.
> >
> > Acked-by: Oleg Nesterov <oleg@redhat.com>
> 
> Yes, but please consider the cleanup below, on top of your change.
> 
> This is subjective of course, but imho the code looks better without
> the extra unlock/restore inside the loop.

My only concern is that this might degrade real-time latency, but that
might just be my paranoia speaking.  Adding Steven, Sebastian, and
Thomas on CC for their thoughts.

Other than that possible issue, I do agree that your change makes the
code simpler.

							Thanx, Paul

> -------------------------------------------------------------------------------
> Subject: [PATCH] signal: Simplify __lock_task_sighand()
> 
> __lock_task_sighand() does local_irq_save() to prevent the potential
> deadlock; we can use preempt_disable() with the same effect. And in
> this case we can do preempt_disable/enable + rcu_read_lock/unlock only
> once, outside of the main loop, and simplify the code. It also shaves
> 112 bytes from signal.o.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  kernel/signal.c |   31 +++++++++++++------------------
>  1 files changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 4368370..03a0fd4 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1260,30 +1260,25 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
>  					   unsigned long *flags)
>  {
>  	struct sighand_struct *sighand;
> -
> +	/*
> +	 * We are going to do rcu_read_unlock() under spin_lock_irqsave().
> +	 * Make sure we can not be preempted after rcu_read_unlock(), see
> +	 * rcu_read_unlock comment header for details.
> +	 */
> +	preempt_disable();
> +	rcu_read_lock();
>  	for (;;) {
> -		/*
> -		 * Disable interrupts early to avoid deadlocks.
> -		 * See rcu_read_unlock comment header for details.
> -		 */
> -		local_irq_save(*flags);
> -		rcu_read_lock();
>  		sighand = rcu_dereference(tsk->sighand);
> -		if (unlikely(sighand == NULL)) {
> -			rcu_read_unlock();
> -			local_irq_restore(*flags);
> +		if (unlikely(sighand == NULL))
>  			break;
> -		}
> 
> -		spin_lock(&sighand->siglock);
> -		if (likely(sighand == tsk->sighand)) {
> -			rcu_read_unlock();
> +		spin_lock_irqsave(&sighand->siglock, *flags);
> +		if (likely(sighand == tsk->sighand))
>  			break;
> -		}
> -		spin_unlock(&sighand->siglock);
> -		rcu_read_unlock();
> -		local_irq_restore(*flags);
> +		spin_unlock_irqrestore(&sighand->siglock, *flags);
>  	}
> +	rcu_read_unlock();
> +	preempt_enable();
> 
>  	return sighand;
>  }
> -- 
> 1.5.5.1
> 
> 


