* [PATCH] kernel/signal: Remove no longer required irqsave/restore
@ 2018-05-04 14:40 Sebastian Andrzej Siewior
  2018-05-04 16:59 ` Eric W. Biederman
  2018-06-07 20:21 ` [tip:core/urgent] signal: " tip-bot for Anna-Maria Gleixner
  0 siblings, 2 replies; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-05-04 14:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, Eric W. Biederman, Paul E . McKenney, Anna-Maria Gleixner,
	Sebastian Andrzej Siewior

From: Anna-Maria Gleixner <anna-maria@linutronix.de>

Commit a841796f11c9 ("signal: align __lock_task_sighand() irq disabling
and RCU") introduced an RCU read-side critical section with interrupts
disabled. The changelog suggested that a better long-term fix would be
"to make rt_mutex_unlock() disable irqs when acquiring the rt_mutex
structure's ->wait_lock".

This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make
wait_lock irq safe") for a different reason.

Therefore revert commit a841796f11c9 ("signal: align
__lock_task_sighand() irq disabling and RCU") as the interrupt disable
dance is no longer required.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/signal.c | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index c6e4c83dc090..16b87c54d027 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1244,19 +1244,12 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
 {
 	struct sighand_struct *sighand;
 
+	rcu_read_lock();
 	for (;;) {
-		/*
-		 * Disable interrupts early to avoid deadlocks.
-		 * See rcu_read_unlock() comment header for details.
-		 */
-		local_irq_save(*flags);
-		rcu_read_lock();
 		sighand = rcu_dereference(tsk->sighand);
-		if (unlikely(sighand == NULL)) {
-			rcu_read_unlock();
-			local_irq_restore(*flags);
+		if (unlikely(sighand == NULL))
 			break;
-		}
+
 		/*
 		 * This sighand can be already freed and even reused, but
 		 * we rely on SLAB_TYPESAFE_BY_RCU and sighand_ctor() which
@@ -1268,15 +1261,12 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
 		 * __exit_signal(). In the latter case the next iteration
 		 * must see ->sighand == NULL.
 		 */
-		spin_lock(&sighand->siglock);
-		if (likely(sighand == tsk->sighand)) {
-			rcu_read_unlock();
+		spin_lock_irqsave(&sighand->siglock, *flags);
+		if (likely(sighand == tsk->sighand))
 			break;
-		}
-		spin_unlock(&sighand->siglock);
-		rcu_read_unlock();
-		local_irq_restore(*flags);
+		spin_unlock_irqrestore(&sighand->siglock, *flags);
 	}
+	rcu_read_unlock();
 
 	return sighand;
 }
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread
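For readers skimming the thread, the function after this patch collapses to the following shape (a sketch assembled from the diff above, not a verbatim copy of kernel/signal.c):

```
struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
					   unsigned long *flags)
{
	struct sighand_struct *sighand;

	rcu_read_lock();
	for (;;) {
		sighand = rcu_dereference(tsk->sighand);
		if (unlikely(sighand == NULL))
			break;

		/*
		 * Lock, then re-check that ->sighand did not change
		 * underneath us (SLAB_TYPESAFE_BY_RCU allows the
		 * sighand to be freed and reused while we slept).
		 */
		spin_lock_irqsave(&sighand->siglock, *flags);
		if (likely(sighand == tsk->sighand))
			break;
		spin_unlock_irqrestore(&sighand->siglock, *flags);
	}
	rcu_read_unlock();

	return sighand;
}
```

The point of contention below is exactly the `spin_lock_irqsave()` now sitting inside the RCU read-side critical section instead of a `local_irq_save()` before `rcu_read_lock()`.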
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-04 14:40 [PATCH] kernel/signal: Remove no longer required irqsave/restore Sebastian Andrzej Siewior
@ 2018-05-04 16:59 ` Eric W. Biederman
  2018-05-04 17:04   ` Sebastian Andrzej Siewior
  2018-06-07 20:21 ` [tip:core/urgent] signal: " tip-bot for Anna-Maria Gleixner
  1 sibling, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2018-05-04 16:59 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, tglx, Paul E . McKenney, Anna-Maria Gleixner

Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:

> From: Anna-Maria Gleixner <anna-maria@linutronix.de>
>
> Commit a841796f11c9 ("signal: align __lock_task_sighand() irq disabling
> and RCU") introduced a rcu read side critical section with interrupts
> disabled. The changelog suggested that a better long-term fix would be
> "to make rt_mutex_unlock() disable irqs when acquiring the rt_mutex
> structure's ->wait_lock".
>
> This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make
> wait_lock irq safe") for different reason.

Which tree has this change been made in?  I am not finding the commit
you mention above in Linus's tree.

> Therefore revert commit a841796f11c9 ("signal: align
> __lock_task_sighand() irq disabling and RCU") as the interrupt disable
> dance is not longer required.

Which tree is this change aimed at?
Eric > Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > --- > kernel/signal.c | 24 +++++++----------------- > 1 file changed, 7 insertions(+), 17 deletions(-) > > diff --git a/kernel/signal.c b/kernel/signal.c > index c6e4c83dc090..16b87c54d027 100644 > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -1244,19 +1244,12 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk, > { > struct sighand_struct *sighand; > > + rcu_read_lock(); > for (;;) { > - /* > - * Disable interrupts early to avoid deadlocks. > - * See rcu_read_unlock() comment header for details. > - */ > - local_irq_save(*flags); > - rcu_read_lock(); > sighand = rcu_dereference(tsk->sighand); > - if (unlikely(sighand == NULL)) { > - rcu_read_unlock(); > - local_irq_restore(*flags); > + if (unlikely(sighand == NULL)) > break; > - } > + > /* > * This sighand can be already freed and even reused, but > * we rely on SLAB_TYPESAFE_BY_RCU and sighand_ctor() which > @@ -1268,15 +1261,12 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk, > * __exit_signal(). In the latter case the next iteration > * must see ->sighand == NULL. > */ > - spin_lock(&sighand->siglock); > - if (likely(sighand == tsk->sighand)) { > - rcu_read_unlock(); > + spin_lock_irqsave(&sighand->siglock, *flags); > + if (likely(sighand == tsk->sighand)) > break; > - } > - spin_unlock(&sighand->siglock); > - rcu_read_unlock(); > - local_irq_restore(*flags); > + spin_unlock_irqrestore(&sighand->siglock, *flags); > } > + rcu_read_unlock(); > > return sighand; > } ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-04 16:59 ` Eric W. Biederman
@ 2018-05-04 17:04   ` Sebastian Andrzej Siewior
  2018-05-04 17:17     ` Eric W. Biederman
  0 siblings, 1 reply; 17+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-05-04 17:04 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, tglx, Paul E . McKenney, Anna-Maria Gleixner

On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote:
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> > From: Anna-Maria Gleixner <anna-maria@linutronix.de>
…
> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make
> > wait_lock irq safe") for different reason.
>
> Which tree has this change been made in?  I am not finding the commit
> you mention above in Linus's tree.

I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make
wait_lock irq safe").

> > Therefore revert commit a841796f11c9 ("signal: align
> > __lock_task_sighand() irq disabling and RCU") as the interrupt disable
> > dance is not longer required.
>
> Which tree is change aimed at?

tip -> Linus' tree

> Eric

Sebastian

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-04 17:04   ` Sebastian Andrzej Siewior
@ 2018-05-04 17:17     ` Eric W. Biederman
  2018-05-04 17:52       ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2018-05-04 17:17 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, tglx, Paul E . McKenney, Anna-Maria Gleixner

Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:

> On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote:
>> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>> > From: Anna-Maria Gleixner <anna-maria@linutronix.de>
> …
>> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make
>> > wait_lock irq safe") for different reason.
>>
>> Which tree has this change been made in?  I am not finding the commit
>> you mention above in Linus's tree.
>
> I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make
> wait_lock irq safe").

Can you fix that in your patch description, and can you also update the
description of rcu_read_unlock?

If we don't need to jump through hoops it looks very reasonable to
remove this unnecessary logic.  But we should fix the description
in rcu_read_unlock that still says we need these hoops.

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-04 17:17     ` Eric W. Biederman
@ 2018-05-04 17:52       ` Paul E. McKenney
  2018-05-04 19:03         ` Eric W. Biederman
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2018-05-04 17:52 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Sebastian Andrzej Siewior, linux-kernel, tglx, Anna-Maria Gleixner

On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote:
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>
> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote:
> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> >> > From: Anna-Maria Gleixner <anna-maria@linutronix.de>
> > …
> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make
> >> > wait_lock irq safe") for different reason.
> >>
> >> Which tree has this change been made in?  I am not finding the commit
> >> you mention above in Linus's tree.
> >
> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make
> > wait_lock irq safe").
>
> Can you fix that in your patch description and can you also up the
> description of rcu_read_unlock?
>
> If we don't need to jump through hoops it looks very reasonable to
> remove this unnecessary logic.  But we should fix the description
> in rcu_read_unlock that still says we need these hoops.

The hoops are still required for rcu_read_lock(), otherwise you
get deadlocks between the scheduler and RCU in PREEMPT=y kernels.
What happens with this patch (if I understand it correctly) is that the
signal code now uses a different way of jumping through the hoops.
But the hoops are still jumped through.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore 2018-05-04 17:52 ` Paul E. McKenney @ 2018-05-04 19:03 ` Eric W. Biederman 2018-05-04 19:45 ` Paul E. McKenney 0 siblings, 1 reply; 17+ messages in thread From: Eric W. Biederman @ 2018-05-04 19:03 UTC (permalink / raw) To: Paul E. McKenney Cc: Sebastian Andrzej Siewior, linux-kernel, tglx, Anna-Maria Gleixner "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes: > On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote: >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: >> >> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote: >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: >> >> > From: Anna-Maria Gleixner <anna-maria@linutronix.de> >> > … >> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make > >> >> > wait_lock irq safe") for different reason. >> >> >> >> Which tree has this change been made in? I am not finding the commit >> >> you mention above in Linus's tree. >> > >> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make >> > wait_lock irq safe"). >> >> Can you fix that in your patch description and can you also up the >> description of rcu_read_unlock? >> >> If we don't need to jump through hoops it looks very reasonable to >> remove this unnecessary logic. But we should fix the description >> in rcu_read_unlock that still says we need these hoops. > > The hoops are still required for rcu_read_lock(), otherwise you > get deadlocks between the scheduler and RCU in PREEMPT=y kernels. > What happens with this patch (if I understand it correctly) is that the > signal code now uses a different way of jumping through the hoops. > But the hoops are still jumped through. 
The patch changes:

	local_irq_disable();
	rcu_read_lock();
	spin_lock();
	rcu_read_unlock();

to:

	rcu_read_lock();
	spin_lock_irq();
	rcu_read_unlock();

Now that I have a chance to reflect on it, the fact that the pattern
being restored did not work is scary.  The failure has nothing to do
with lock ordering, and people won't realize what is going on,
especially since the common RCU modes won't care.

So is it true that taking spin_lock_irq before calling rcu_read_unlock
is a problem because of rt_mutex_unlock()?  Or has b4abf91047cf
("rtmutex: Make wait_lock irq safe") actually fixed that, so that we
can correct the documentation of rcu_read_unlock() and fix
__lock_task_sighand?

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore 2018-05-04 19:03 ` Eric W. Biederman @ 2018-05-04 19:45 ` Paul E. McKenney 2018-05-04 20:08 ` Eric W. Biederman 0 siblings, 1 reply; 17+ messages in thread From: Paul E. McKenney @ 2018-05-04 19:45 UTC (permalink / raw) To: Eric W. Biederman Cc: Sebastian Andrzej Siewior, linux-kernel, tglx, Anna-Maria Gleixner On Fri, May 04, 2018 at 02:03:04PM -0500, Eric W. Biederman wrote: > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes: > > > On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote: > >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: > >> > >> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote: > >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: > >> >> > From: Anna-Maria Gleixner <anna-maria@linutronix.de> > >> > … > >> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make > > >> >> > wait_lock irq safe") for different reason. > >> >> > >> >> Which tree has this change been made in? I am not finding the commit > >> >> you mention above in Linus's tree. > >> > > >> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make > >> > wait_lock irq safe"). > >> > >> Can you fix that in your patch description and can you also up the > >> description of rcu_read_unlock? > >> > >> If we don't need to jump through hoops it looks very reasonable to > >> remove this unnecessary logic. But we should fix the description > >> in rcu_read_unlock that still says we need these hoops. > > > > The hoops are still required for rcu_read_lock(), otherwise you > > get deadlocks between the scheduler and RCU in PREEMPT=y kernels. > > What happens with this patch (if I understand it correctly) is that the > > signal code now uses a different way of jumping through the hoops. > > But the hoops are still jumped through. 
>
> The patch changes:
>
> 	local_irq_disable();
> 	rcu_read_lock();
> 	spin_lock();
> 	rcu_read_unlock();
>
> to:
>
> 	rcu_read_lock();
> 	spin_lock_irq();
> 	rcu_read_unlock();
>
> Now that I have a chance to relfect on it the fact that the patern
> that is being restored does not work is scary.  As the failure has
> nothing to do with lock ordering and people won't realize what is going
> on.  Especially since the common rcu modes won't care.
>
> So is it true that taking spin_lock_irq before calling rcu_read_unlock
> is a problem because of rt_mutex_unlock()?  Or has b4abf91047cf ("rtmutex: Make
> wait_lock irq safe") actually fixed that and we can correct the
> documentation of rcu_read_unlock() ?  And fix __lock_task_sighand?

The problem is that the thing taking the lock might be the scheduler,
or one of the locks taken while the scheduler's pi and rq locks are
held.  This occurs only with RCU-preempt.

Here is what can happen:

o	A task does rcu_read_lock().

o	That task is preempted.

o	That task stays preempted for a long time, and is therefore
	priority boosted.  This boosting involves a high-priority RCU
	kthread creating an rt_mutex, pretending that the preempted task
	already holds it, and then acquiring it.

o	The task awakens, acquires the scheduler's rq lock, and
	then does rcu_read_unlock().

o	Because the task has been priority boosted, __rcu_read_unlock()
	invokes the rcu_read_unlock_special() slowpath, which does
	(as you say) rt_mutex_unlock() to deboost.  The deboosting
	can cause the scheduler to acquire the rq and pi locks, which
	results in deadlock.

In contrast, holding these scheduler locks across the entirety of the
RCU-preempt read-side critical section is harmless because then the
critical section cannot be preempted, which means that priority boosting
cannot happen, which means that there will be no need to deboost at
rcu_read_unlock() time.

This restriction has not changed, and as far as I can see is inherent
in the fact that RCU uses the scheduler and the scheduler uses RCU.
There is going to be an odd corner case in there somewhere!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread
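Paul's scenario can be condensed into a non-compilable, kernel-flavored sketch of the call sequence that deadlocks under CONFIG_RCU_BOOST with RCU-preempt (the indentation shows calls made from within rcu_read_unlock()):

```
rcu_read_lock();
	/* ... preempted here for a long time; an RCU-boost kthread
	 * creates an rt_mutex on this task's behalf and acquires it,
	 * so the task is now priority boosted ... */
raw_spin_lock(&rq->lock);		/* e.g. from within the scheduler */
rcu_read_unlock();
	/* task was boosted -> slowpath: */
	rcu_read_unlock_special();
		rt_mutex_unlock();	/* deboost */
			/* the deboost wakeup path needs the scheduler's
			 * rq/pi locks ... */
			raw_spin_lock(&rq->lock);  /* already held: deadlock */
```

This is why holding the scheduler locks (or, before b4abf91047cf, any irq-disabled lock) across only the *end* of the read-side section is the dangerous case, while holding them across the whole section is fine: an unpreempted section is never boosted, so the unlock never enters the slowpath.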
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore 2018-05-04 19:45 ` Paul E. McKenney @ 2018-05-04 20:08 ` Eric W. Biederman 2018-05-04 20:37 ` Paul E. McKenney 0 siblings, 1 reply; 17+ messages in thread From: Eric W. Biederman @ 2018-05-04 20:08 UTC (permalink / raw) To: Paul E. McKenney Cc: Sebastian Andrzej Siewior, linux-kernel, tglx, Anna-Maria Gleixner "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes: > On Fri, May 04, 2018 at 02:03:04PM -0500, Eric W. Biederman wrote: >> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes: >> >> > On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote: >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: >> >> >> >> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote: >> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: >> >> >> > From: Anna-Maria Gleixner <anna-maria@linutronix.de> >> >> > … >> >> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make > >> >> >> > wait_lock irq safe") for different reason. >> >> >> >> >> >> Which tree has this change been made in? I am not finding the commit >> >> >> you mention above in Linus's tree. >> >> > >> >> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make >> >> > wait_lock irq safe"). >> >> >> >> Can you fix that in your patch description and can you also up the >> >> description of rcu_read_unlock? >> >> >> >> If we don't need to jump through hoops it looks very reasonable to >> >> remove this unnecessary logic. But we should fix the description >> >> in rcu_read_unlock that still says we need these hoops. >> > >> > The hoops are still required for rcu_read_lock(), otherwise you >> > get deadlocks between the scheduler and RCU in PREEMPT=y kernels. >> > What happens with this patch (if I understand it correctly) is that the >> > signal code now uses a different way of jumping through the hoops. >> > But the hoops are still jumped through. 
>> >> The patch changes: >> >> local_irq_disable(); >> rcu_read_lock(); >> spin_lock(); >> rcu_read_unlock(); >> >> to: >> >> rcu_read_lock(); >> spin_lock_irq(); >> rcu_read_unlock(); >> >> Now that I have a chance to relfect on it the fact that the patern >> that is being restored does not work is scary. As the failure has >> nothing to do with lock ordering and people won't realize what is going >> on. Especially since the common rcu modes won't care. >> >> So is it true that taking spin_lock_irq before calling rcu_read_unlock >> is a problem because of rt_mutex_unlock()? Or has b4abf91047cf ("rtmutex: Make >> wait_lock irq safe") actually fixed that and we can correct the >> documentation of rcu_read_unlock() ? And fix __lock_task_sighand? > > The problem is that the thing taking the lock might be the scheduler, > or one of the locks taken while the scheduler's pi and rq locks are > held. This occurs only with RCU-preempt. > > Here is what can happen: > > o A task does rcu_read_lock(). > > o That task is preempted. > > o That task stays preempted for a long time, and is therefore > priority boosted. This boosting involves a high-priority RCU > kthread creating an rt_mutex, pretending that the preempted task > already holds it, and then acquiring it. > > o The task awakens, acquires the scheduler's rq lock, and > then does rcu_read_unlock(). > > o Because the task has been priority boosted, __rcu_read_unlock() > invokes the rcu_read_unlock_special() slowpath, which does > (as you say) rt_mutex_unlock() to deboost. The deboosting > can cause the scheduler to acquire the rq and pi locks, which > results in deadlock. > > In contrast, holding these scheduler locks across the entirety of the > RCU-preempt read-side critical section is harmless because then the > critical section cannot be preempted, which means that priority boosting > cannot happen, which means that there will be no need to deboost at > rcu_read_unlock() time. 
>
> This restriction has not changed, and as far as I can see is inherent
> in the fact that RCU uses the scheduler and the scheduler uses RCU.
> There is going to be an odd corner case in there somewhere!

However if I read things correctly b4abf91047cf ("rtmutex: Make
wait_lock irq safe") did change this.

In particular it changed things so that it is only the scheduler locks
that matter, not any old lock that disabled interrupts.  This was done
by disabling interrupts when taking the wait_lock.

The rcu_read_unlock documentation states:

 * In most situations, rcu_read_unlock() is immune from deadlock.
 * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
 * is responsible for deboosting, which it does via rt_mutex_unlock().
 * Unfortunately, this function acquires the scheduler's runqueue and
 * priority-inheritance spinlocks.  This means that deadlock could result
 * if the caller of rcu_read_unlock() already holds one of these locks or
 * any lock that is ever acquired while holding them; or any lock which
 * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
 * does not disable irqs while taking ->wait_lock.

So we can now remove the clause:

 * ; or any lock which
 * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
 * does not disable irqs while taking ->wait_lock.

Without the "any lock that disabled interrupts" restriction it is now
safe to not worry about the issues with the scheduler locks and the
rt_mutex, which makes it safe to not worry about these crazy
complexities in lock_task_sighand.

Paul do you agree or is the patch unsafe?

Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread
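If Eric's reading holds, the comment above rcu_read_unlock() would shrink to something like the following (a sketch of the proposed wording, obtained by dropping the quoted clause; not a committed change):

```
/*
 * In most situations, rcu_read_unlock() is immune from deadlock.
 * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
 * is responsible for deboosting, which it does via rt_mutex_unlock().
 * Unfortunately, this function acquires the scheduler's runqueue and
 * priority-inheritance spinlocks.  This means that deadlock could result
 * if the caller of rcu_read_unlock() already holds one of these locks or
 * any lock that is ever acquired while holding them.
 */
```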
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore 2018-05-04 20:08 ` Eric W. Biederman @ 2018-05-04 20:37 ` Paul E. McKenney 2018-05-05 4:38 ` Eric W. Biederman 0 siblings, 1 reply; 17+ messages in thread From: Paul E. McKenney @ 2018-05-04 20:37 UTC (permalink / raw) To: Eric W. Biederman Cc: Sebastian Andrzej Siewior, linux-kernel, tglx, Anna-Maria Gleixner On Fri, May 04, 2018 at 03:08:40PM -0500, Eric W. Biederman wrote: > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes: > > > On Fri, May 04, 2018 at 02:03:04PM -0500, Eric W. Biederman wrote: > >> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes: > >> > >> > On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote: > >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: > >> >> > >> >> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote: > >> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: > >> >> >> > From: Anna-Maria Gleixner <anna-maria@linutronix.de> > >> >> > … > >> >> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make > > >> >> >> > wait_lock irq safe") for different reason. > >> >> >> > >> >> >> Which tree has this change been made in? I am not finding the commit > >> >> >> you mention above in Linus's tree. > >> >> > > >> >> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make > >> >> > wait_lock irq safe"). > >> >> > >> >> Can you fix that in your patch description and can you also up the > >> >> description of rcu_read_unlock? > >> >> > >> >> If we don't need to jump through hoops it looks very reasonable to > >> >> remove this unnecessary logic. But we should fix the description > >> >> in rcu_read_unlock that still says we need these hoops. > >> > > >> > The hoops are still required for rcu_read_lock(), otherwise you > >> > get deadlocks between the scheduler and RCU in PREEMPT=y kernels. 
> >> > What happens with this patch (if I understand it correctly) is that the > >> > signal code now uses a different way of jumping through the hoops. > >> > But the hoops are still jumped through. > >> > >> The patch changes: > >> > >> local_irq_disable(); > >> rcu_read_lock(); > >> spin_lock(); > >> rcu_read_unlock(); > >> > >> to: > >> > >> rcu_read_lock(); > >> spin_lock_irq(); > >> rcu_read_unlock(); > >> > >> Now that I have a chance to relfect on it the fact that the patern > >> that is being restored does not work is scary. As the failure has > >> nothing to do with lock ordering and people won't realize what is going > >> on. Especially since the common rcu modes won't care. > >> > >> So is it true that taking spin_lock_irq before calling rcu_read_unlock > >> is a problem because of rt_mutex_unlock()? Or has b4abf91047cf ("rtmutex: Make > >> wait_lock irq safe") actually fixed that and we can correct the > >> documentation of rcu_read_unlock() ? And fix __lock_task_sighand? > > > > The problem is that the thing taking the lock might be the scheduler, > > or one of the locks taken while the scheduler's pi and rq locks are > > held. This occurs only with RCU-preempt. > > > > Here is what can happen: > > > > o A task does rcu_read_lock(). > > > > o That task is preempted. > > > > o That task stays preempted for a long time, and is therefore > > priority boosted. This boosting involves a high-priority RCU > > kthread creating an rt_mutex, pretending that the preempted task > > already holds it, and then acquiring it. > > > > o The task awakens, acquires the scheduler's rq lock, and > > then does rcu_read_unlock(). > > > > o Because the task has been priority boosted, __rcu_read_unlock() > > invokes the rcu_read_unlock_special() slowpath, which does > > (as you say) rt_mutex_unlock() to deboost. The deboosting > > can cause the scheduler to acquire the rq and pi locks, which > > results in deadlock. 
> > > > In contrast, holding these scheduler locks across the entirety of the > > RCU-preempt read-side critical section is harmless because then the > > critical section cannot be preempted, which means that priority boosting > > cannot happen, which means that there will be no need to deboost at > > rcu_read_unlock() time. > > > > This restriction has not changed, and as far as I can see is inherent > > in the fact that RCU uses the scheduler and the scheduler uses RCU. > > There is going to be an odd corner case in there somewhere! > > However if I read things correctly b4abf91047cf ("rtmutex: Make > wait_lock irq safe") did change this. > > In particular it changed things so that it is only the scheduler locks > that matter, not any old lock that disabled interrupts. This was done > by disabling disabling interrupts when taking the wait_lock. > > The rcu_read_unlock documentation states: > > * In most situations, rcu_read_unlock() is immune from deadlock. > * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock() > * is responsible for deboosting, which it does via rt_mutex_unlock(). > * Unfortunately, this function acquires the scheduler's runqueue and > * priority-inheritance spinlocks. This means that deadlock could result > * if the caller of rcu_read_unlock() already holds one of these locks or > * any lock that is ever acquired while holding them; or any lock which > * can be taken from interrupt context because rcu_boost()->rt_mutex_lock() > * does not disable irqs while taking ->wait_lock. > > So we can now remove the clause: > * ; or any lock which > * can be taken from interrupt context because rcu_boost()->rt_mutex_lock() > * does not disable irqs while taking ->wait_lock. > > Without the any lock that disabled interrupts restriction it is now safe > to not worry about the issues with the scheduler locks and the rt_mutex > Which does make it safe to not worry about these crazy complexities in > lock_task_sighand. 
> Paul do you agree or is the patch unsafe?

Ah, I thought you were trying to get rid of all but the first line of
that paragraph, not just the final clause.  Apologies for my confusion!

It looks plausible, but the patch should be stress-tested on a
preemptible kernel with priority boosting enabled.  Has that been done?

(Me, I would run rcutorture scenario TREE03 for an extended time period
on b4abf91047cf with your patch applied.  But what testing have you
done already?)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore 2018-05-04 20:37 ` Paul E. McKenney @ 2018-05-05 4:38 ` Eric W. Biederman 2018-05-05 5:25 ` Paul E. McKenney 0 siblings, 1 reply; 17+ messages in thread From: Eric W. Biederman @ 2018-05-05 4:38 UTC (permalink / raw) To: Paul E. McKenney Cc: Sebastian Andrzej Siewior, linux-kernel, tglx, Anna-Maria Gleixner "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes: > On Fri, May 04, 2018 at 03:08:40PM -0500, Eric W. Biederman wrote: >> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes: >> >> > On Fri, May 04, 2018 at 02:03:04PM -0500, Eric W. Biederman wrote: >> >> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes: >> >> >> >> > On Fri, May 04, 2018 at 12:17:20PM -0500, Eric W. Biederman wrote: >> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: >> >> >> >> >> >> > On 2018-05-04 11:59:08 [-0500], Eric W. Biederman wrote: >> >> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: >> >> >> >> > From: Anna-Maria Gleixner <anna-maria@linutronix.de> >> >> >> > … >> >> >> >> > This long-term fix has been made in commit 4abf91047cf ("rtmutex: Make > >> >> >> >> > wait_lock irq safe") for different reason. >> >> >> >> >> >> >> >> Which tree has this change been made in? I am not finding the commit >> >> >> >> you mention above in Linus's tree. >> >> >> > >> >> >> > I'm sorry, it should have been commit b4abf91047cf ("rtmutex: Make >> >> >> > wait_lock irq safe"). >> >> >> >> >> >> Can you fix that in your patch description and can you also up the >> >> >> description of rcu_read_unlock? >> >> >> >> >> >> If we don't need to jump through hoops it looks very reasonable to >> >> >> remove this unnecessary logic. But we should fix the description >> >> >> in rcu_read_unlock that still says we need these hoops. >> >> > >> >> > The hoops are still required for rcu_read_lock(), otherwise you >> >> > get deadlocks between the scheduler and RCU in PREEMPT=y kernels. 
>> >> > What happens with this patch (if I understand it correctly) is that the >> >> > signal code now uses a different way of jumping through the hoops. >> >> > But the hoops are still jumped through. >> >> >> >> The patch changes: >> >> >> >> local_irq_disable(); >> >> rcu_read_lock(); >> >> spin_lock(); >> >> rcu_read_unlock(); >> >> >> >> to: >> >> >> >> rcu_read_lock(); >> >> spin_lock_irq(); >> >> rcu_read_unlock(); >> >> >> >> Now that I have a chance to relfect on it the fact that the patern >> >> that is being restored does not work is scary. As the failure has >> >> nothing to do with lock ordering and people won't realize what is going >> >> on. Especially since the common rcu modes won't care. >> >> >> >> So is it true that taking spin_lock_irq before calling rcu_read_unlock >> >> is a problem because of rt_mutex_unlock()? Or has b4abf91047cf ("rtmutex: Make >> >> wait_lock irq safe") actually fixed that and we can correct the >> >> documentation of rcu_read_unlock() ? And fix __lock_task_sighand? >> > >> > The problem is that the thing taking the lock might be the scheduler, >> > or one of the locks taken while the scheduler's pi and rq locks are >> > held. This occurs only with RCU-preempt. >> > >> > Here is what can happen: >> > >> > o A task does rcu_read_lock(). >> > >> > o That task is preempted. >> > >> > o That task stays preempted for a long time, and is therefore >> > priority boosted. This boosting involves a high-priority RCU >> > kthread creating an rt_mutex, pretending that the preempted task >> > already holds it, and then acquiring it. >> > >> > o The task awakens, acquires the scheduler's rq lock, and >> > then does rcu_read_unlock(). >> > >> > o Because the task has been priority boosted, __rcu_read_unlock() >> > invokes the rcu_read_unlock_special() slowpath, which does >> > (as you say) rt_mutex_unlock() to deboost. The deboosting >> > can cause the scheduler to acquire the rq and pi locks, which >> > results in deadlock. 
>> >
>> > In contrast, holding these scheduler locks across the entirety of the
>> > RCU-preempt read-side critical section is harmless because then the
>> > critical section cannot be preempted, which means that priority boosting
>> > cannot happen, which means that there will be no need to deboost at
>> > rcu_read_unlock() time.
>> >
>> > This restriction has not changed, and as far as I can see is inherent
>> > in the fact that RCU uses the scheduler and the scheduler uses RCU.
>> > There is going to be an odd corner case in there somewhere!
>>
>> However, if I read things correctly, b4abf91047cf ("rtmutex: Make
>> wait_lock irq safe") did change this.
>>
>> In particular it changed things so that it is only the scheduler locks
>> that matter, not any old lock that disabled interrupts.  This was done
>> by disabling interrupts when taking the wait_lock.
>>
>> The rcu_read_unlock documentation states:
>>
>>  * In most situations, rcu_read_unlock() is immune from deadlock.
>>  * However, in kernels built with CONFIG_RCU_BOOST, rcu_read_unlock()
>>  * is responsible for deboosting, which it does via rt_mutex_unlock().
>>  * Unfortunately, this function acquires the scheduler's runqueue and
>>  * priority-inheritance spinlocks.  This means that deadlock could result
>>  * if the caller of rcu_read_unlock() already holds one of these locks or
>>  * any lock that is ever acquired while holding them; or any lock which
>>  * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
>>  * does not disable irqs while taking ->wait_lock.
>>
>> So we can now remove the clause:
>>
>>  * ; or any lock which
>>  * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
>>  * does not disable irqs while taking ->wait_lock.
>>
>> Without the "any lock that disabled interrupts" restriction it is now safe
>> to not worry about the issues with the scheduler locks and the rt_mutex,
>> which does make it safe to not worry about these crazy complexities in
>> lock_task_sighand.
>>
>> Paul, do you agree, or is the patch unsafe?
>
> Ah, I thought you were trying to get rid of all but the first line of
> that paragraph, not just the final clause.  Apologies for my confusion!
>
> It looks plausible, but the patch should be stress-tested on a preemptible
> kernel with priority boosting enabled.  Has that been done?
>
> (Me, I would run rcutorture scenario TREE03 for an extended time period
> on b4abf91047cf with your patch applied.  But what testing have you
> done already?)

Not my patch.  I was just the reviewer asking for some obvious cleanups,
so people won't get confused.  Sebastian?  Can you tell Paul what
testing you have done?

Eric

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-05  4:38             ` Eric W. Biederman
@ 2018-05-05  5:25               ` Paul E. McKenney
  2018-05-05  5:56                 ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2018-05-05 5:25 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Sebastian Andrzej Siewior, linux-kernel, tglx, Anna-Maria Gleixner

On Fri, May 04, 2018 at 11:38:37PM -0500, Eric W. Biederman wrote:

[...]
> >>
> >> So we can now remove the clause:
> >>
> >>  * ; or any lock which
> >>  * can be taken from interrupt context because rcu_boost()->rt_mutex_lock()
> >>  * does not disable irqs while taking ->wait_lock.
> >>
> >> Without the any lock that disabled interrupts restriction it is now safe
> >> to not worry about the issues with the scheduler locks and the rt_mutex.
> >> Which does make it safe to not worry about these crazy complexities in
> >> lock_task_sighand.
> >>
> >> Paul do you agree or is the patch unsafe?
> >
> > Ah, I thought you were trying to get rid of all but the first line of
> > that paragraph, not just the final clause.  Apologies for my confusion!
> >
> > It looks plausible, but the patch should be stress-tested on a preemptible
> > kernel with priority boosting enabled.  Has that been done?
> >
> > (Me, I would run rcutorture scenario TREE03 for an extended time period
> > on b4abf91047cf with your patch applied.

And with lockdep enabled, which TREE03 does not do by default.

> > But what testing have you
> > done already?)
>
> Not my patch.  I was just the reviewer asking for some obvious cleanups.
> So people won't get confused.  Sebastian?  Can you tell Paul what
> testing you have done?

Ah, apologies for my confusion!

							Thanx, Paul

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-05  5:25               ` Paul E. McKenney
@ 2018-05-05  5:56                 ` Thomas Gleixner
  2018-05-08 13:42                   ` Anna-Maria Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2018-05-05 5:56 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Eric W. Biederman, Sebastian Andrzej Siewior, linux-kernel, Anna-Maria Gleixner

On Fri, 4 May 2018, Paul E. McKenney wrote:
> On Fri, May 04, 2018 at 11:38:37PM -0500, Eric W. Biederman wrote:
> > > (Me, I would run rcutorture scenario TREE03 for an extended time period
> > > on b4abf91047cf with your patch applied.
>
> And with lockdep enabled, which TREE03 does not do by default.

Will run that again to make sure.

Thanks,

	tglx

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-05  5:56                 ` Thomas Gleixner
@ 2018-05-08 13:42                   ` Anna-Maria Gleixner
  2018-05-08 14:53                     ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Anna-Maria Gleixner @ 2018-05-08 13:42 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Paul E. McKenney, Eric W. Biederman, Sebastian Andrzej Siewior, linux-kernel

On Sat, 5 May 2018, Thomas Gleixner wrote:

> On Fri, 4 May 2018, Paul E. McKenney wrote:
> > On Fri, May 04, 2018 at 11:38:37PM -0500, Eric W. Biederman wrote:
> > > > (Me, I would run rcutorture scenario TREE03 for an extended time period
> > > > on b4abf91047cf with your patch applied.
> >
> > And with lockdep enabled, which TREE03 does not do by default.
>
> Will run that again to make sure.

I ran the rcutorture scenario TREE03 for 4 hours with the above
described setup.  It was successful and without any lockdep splats.

Anna-Maria

^ permalink raw reply [flat|nested] 17+ messages in thread
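[For readers who want to repeat a run like the one described above: the invocation below is an assumption based on the kernel's in-tree rcutorture wrapper (tools/testing/selftests/rcutorture), not a command taken from this thread. TREE03 is selected explicitly, the duration is in minutes (240 = 4 hours), and lockdep is turned on via extra Kconfig options since TREE03 does not enable it by default.]

```shell
# Hypothetical reproduction of the 4-hour TREE03-with-lockdep run.
# Run from the top of a kernel tree checked out at b4abf91047cf with
# the signal patch applied; flags assumed from the kvm.sh helper.
tools/testing/selftests/rcutorture/bin/kvm.sh \
    --configs TREE03 \
    --duration 240 \
    --kconfig "CONFIG_PROVE_LOCKING=y CONFIG_DEBUG_LOCK_ALLOC=y"
```
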
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-08 13:42                   ` Anna-Maria Gleixner
@ 2018-05-08 14:53                     ` Paul E. McKenney
  2018-05-08 15:49                       ` Anna-Maria Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2018-05-08 14:53 UTC (permalink / raw)
To: Anna-Maria Gleixner
Cc: Thomas Gleixner, Eric W. Biederman, Sebastian Andrzej Siewior, linux-kernel

On Tue, May 08, 2018 at 03:42:25PM +0200, Anna-Maria Gleixner wrote:
> On Sat, 5 May 2018, Thomas Gleixner wrote:
>
> > On Fri, 4 May 2018, Paul E. McKenney wrote:
> > > On Fri, May 04, 2018 at 11:38:37PM -0500, Eric W. Biederman wrote:
> > > > > (Me, I would run rcutorture scenario TREE03 for an extended time period
> > > > > on b4abf91047cf with your patch applied.
> > >
> > > And with lockdep enabled, which TREE03 does not do by default.
> >
> > Will run that again to make sure.
>
> I ran the rcutorture scenario TREE03 for 4 hours with the above
> described setup.  It was successful and without any lockdep splats.

Thank you for the testing, Anna-Maria!  If you give them a Tested-by,
I will give them an ack.  ;-)

							Thanx, Paul

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-08 14:53                     ` Paul E. McKenney
@ 2018-05-08 15:49                       ` Anna-Maria Gleixner
  2018-05-08 16:53                         ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Anna-Maria Gleixner @ 2018-05-08 15:49 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Thomas Gleixner, Eric W. Biederman, Sebastian Andrzej Siewior, linux-kernel

On Tue, 8 May 2018, Paul E. McKenney wrote:

> On Tue, May 08, 2018 at 03:42:25PM +0200, Anna-Maria Gleixner wrote:
> > On Sat, 5 May 2018, Thomas Gleixner wrote:
> >
> > > On Fri, 4 May 2018, Paul E. McKenney wrote:
> > > > On Fri, May 04, 2018 at 11:38:37PM -0500, Eric W. Biederman wrote:
> > > > > > (Me, I would run rcutorture scenario TREE03 for an extended time period
> > > > > > on b4abf91047cf with your patch applied.
> > > >
> > > > And with lockdep enabled, which TREE03 does not do by default.
> > >
> > > Will run that again to make sure.
> >
> > I ran the rcutorture scenario TREE03 for 4 hours with the above
> > described setup.  It was successful and without any lockdep splats.
>
> Thank you for the testing, Anna-Maria!  If you give them a Tested-by,
> I will give them an ack.  ;-)

If it is ok to give a Tested-by to the patch I wrote, I will do this
to get your ack :)

Tested-by: Anna-Maria Gleixner <anna-maria@linutronix.de>

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] kernel/signal: Remove no longer required irqsave/restore
  2018-05-08 15:49                       ` Anna-Maria Gleixner
@ 2018-05-08 16:53                         ` Paul E. McKenney
  0 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2018-05-08 16:53 UTC (permalink / raw)
To: Anna-Maria Gleixner
Cc: Thomas Gleixner, Eric W. Biederman, Sebastian Andrzej Siewior, linux-kernel

On Tue, May 08, 2018 at 05:49:25PM +0200, Anna-Maria Gleixner wrote:
> On Tue, 8 May 2018, Paul E. McKenney wrote:
>
> > On Tue, May 08, 2018 at 03:42:25PM +0200, Anna-Maria Gleixner wrote:
> > > On Sat, 5 May 2018, Thomas Gleixner wrote:
> > >
> > > > On Fri, 4 May 2018, Paul E. McKenney wrote:
> > > > > On Fri, May 04, 2018 at 11:38:37PM -0500, Eric W. Biederman wrote:
> > > > > > > (Me, I would run rcutorture scenario TREE03 for an extended time period
> > > > > > > on b4abf91047cf with your patch applied.
> > > > >
> > > > > And with lockdep enabled, which TREE03 does not do by default.
> > > >
> > > > Will run that again to make sure.
> > >
> > > I ran the rcutorture scenario TREE03 for 4 hours with the above
> > > described setup.  It was successful and without any lockdep splats.
> >
> > Thank you for the testing, Anna-Maria!  If you give them a Tested-by,
> > I will give them an ack.  ;-)
>
> If it is ok to give a Tested-by to the patch I wrote, I will do this
> to get your ack :)
>
> Tested-by: Anna-Maria Gleixner <anna-maria@linutronix.de>

It wasn't even all that early in the morning for me, was it?  ;-)
Anyway:

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

I am sure that you have already considered adding your test results to
the commit log as an alternative to applying a Tested-by to your own
patch.

							Thanx, Paul

^ permalink raw reply [flat|nested] 17+ messages in thread
* [tip:core/urgent] signal: Remove no longer required irqsave/restore
  2018-05-04 14:40 [PATCH] kernel/signal: Remove no longer required irqsave/restore Sebastian Andrzej Siewior
  2018-05-04 16:59 ` Eric W. Biederman
@ 2018-06-07 20:21 ` tip-bot for Anna-Maria Gleixner
  1 sibling, 0 replies; 17+ messages in thread
From: tip-bot for Anna-Maria Gleixner @ 2018-06-07 20:21 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, ebiederm, bigeasy, paulmck, linux-kernel, tglx, anna-maria, hpa

Commit-ID:  e79e0f38083e607da5d7b493e7a0f78ba38d788e
Gitweb:     https://git.kernel.org/tip/e79e0f38083e607da5d7b493e7a0f78ba38d788e
Author:     Anna-Maria Gleixner <anna-maria@linutronix.de>
AuthorDate: Fri, 4 May 2018 16:40:14 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 7 Jun 2018 22:18:55 +0200

signal: Remove no longer required irqsave/restore

Commit a841796f11c9 ("signal: align __lock_task_sighand() irq disabling
and RCU") introduced a rcu read side critical section with interrupts
disabled. The changelog suggested that a better long-term fix would be "to
make rt_mutex_unlock() disable irqs when acquiring the rt_mutex structure's
->wait_lock".

This long-term fix has been made in commit b4abf91047cf ("rtmutex: Make
wait_lock irq safe") for a different reason.

Therefore revert commit a841796f11c9 ("signal: align
__lock_task_sighand() irq disabling and RCU") as the interrupt disable
dance is no longer required.

Testing was done over an extensive period with RCU torture, especially
with TREE03 as requested by Paul.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://lkml.kernel.org/r/20180504144014.5378-1-bigeasy@linutronix.de
---
 kernel/signal.c | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 0f865d67415d..8d8a940422a8 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1244,19 +1244,12 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
 {
 	struct sighand_struct *sighand;
 
+	rcu_read_lock();
 	for (;;) {
-		/*
-		 * Disable interrupts early to avoid deadlocks.
-		 * See rcu_read_unlock() comment header for details.
-		 */
-		local_irq_save(*flags);
-		rcu_read_lock();
 		sighand = rcu_dereference(tsk->sighand);
-		if (unlikely(sighand == NULL)) {
-			rcu_read_unlock();
-			local_irq_restore(*flags);
+		if (unlikely(sighand == NULL))
 			break;
-		}
+
 		/*
 		 * This sighand can be already freed and even reused, but
 		 * we rely on SLAB_TYPESAFE_BY_RCU and sighand_ctor() which
@@ -1268,15 +1261,12 @@ struct sighand_struct *__lock_task_sighand(struct task_struct *tsk,
 		 * __exit_signal(). In the latter case the next iteration
 		 * must see ->sighand == NULL.
 		 */
-		spin_lock(&sighand->siglock);
-		if (likely(sighand == tsk->sighand)) {
-			rcu_read_unlock();
+		spin_lock_irqsave(&sighand->siglock, *flags);
+		if (likely(sighand == tsk->sighand))
 			break;
-		}
-		spin_unlock(&sighand->siglock);
-		rcu_read_unlock();
-		local_irq_restore(*flags);
+		spin_unlock_irqrestore(&sighand->siglock, *flags);
 	}
+	rcu_read_unlock();
 
 	return sighand;
 }

^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2018-06-07 20:22 UTC | newest]

Thread overview: 17+ messages
2018-05-04 14:40 [PATCH] kernel/signal: Remove no longer required irqsave/restore Sebastian Andrzej Siewior
2018-05-04 16:59 ` Eric W. Biederman
2018-05-04 17:04   ` Sebastian Andrzej Siewior
2018-05-04 17:17     ` Eric W. Biederman
2018-05-04 17:52       ` Paul E. McKenney
2018-05-04 19:03         ` Eric W. Biederman
2018-05-04 19:45           ` Paul E. McKenney
2018-05-04 20:08             ` Eric W. Biederman
2018-05-04 20:37               ` Paul E. McKenney
2018-05-05  4:38                 ` Eric W. Biederman
2018-05-05  5:25                   ` Paul E. McKenney
2018-05-05  5:56                     ` Thomas Gleixner
2018-05-08 13:42                       ` Anna-Maria Gleixner
2018-05-08 14:53                         ` Paul E. McKenney
2018-05-08 15:49                           ` Anna-Maria Gleixner
2018-05-08 16:53                             ` Paul E. McKenney
2018-06-07 20:21 ` [tip:core/urgent] signal: Remove no longer required irqsave/restore tip-bot for Anna-Maria Gleixner