linux-kernel.vger.kernel.org archive mirror
* [PATCH RT] futex/rtmutex: Cure RT double blocking issue
@ 2017-05-09 15:11 Thomas Gleixner
  2017-05-09 15:49 ` Steven Rostedt
  2017-05-11  2:25 ` Wanpeng Li
  0 siblings, 2 replies; 7+ messages in thread
From: Thomas Gleixner @ 2017-05-09 15:11 UTC (permalink / raw)
  To: LKML
  Cc: Sebastian Sewior, Peter Zijlstra, Steven Rostedt, linux-rt-users,
	Engleder Gerhard

RT has a problem when the wait on a futex/rtmutex got interrupted by a
timeout or a signal. task->pi_blocked_on is still set when returning from
rt_mutex_wait_proxy_lock(). The task must acquire the hash bucket lock
after this.

If the hash bucket lock is contended then the
BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
task_blocks_on_rt_mutex() will trigger.

This can be avoided by clearing task->pi_blocked_on in the return path of
rt_mutex_wait_proxy_lock() which removes the task from the boosting chain
of the rtmutex. That's correct because the task is no longer blocked on
it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Engleder Gerhard <eg@keba.com>
---
 kernel/locking/rtmutex.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -2380,6 +2380,7 @@ int rt_mutex_wait_proxy_lock(struct rt_m
 			       struct hrtimer_sleeper *to,
 			       struct rt_mutex_waiter *waiter)
 {
+	struct task_struct *tsk = current;
 	int ret;
 
 	raw_spin_lock_irq(&lock->wait_lock);
@@ -2389,6 +2390,22 @@ int rt_mutex_wait_proxy_lock(struct rt_m
 	/* sleep on the mutex */
 	ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL);
 
+	/*
+	 * RT has a problem here when the wait got interrupted by a timeout
+	 * or a signal. task->pi_blocked_on is still set. The task must
+	 * acquire the hash bucket lock when returning from this function.
+	 *
+	 * If the hash bucket lock is contended then the
+	 * BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
+	 * task_blocks_on_rt_mutex() will trigger. This can be avoided by
+	 * clearing task->pi_blocked_on which removes the task from the
+	 * boosting chain of the rtmutex. That's correct because the task
+	 * is no longer blocked on it.
+	 */
+	raw_spin_lock(&tsk->pi_lock);
+	tsk->pi_blocked_on = NULL;
+	raw_spin_unlock(&tsk->pi_lock);
+
 	raw_spin_unlock_irq(&lock->wait_lock);
 
 	return ret;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RT] futex/rtmutex: Cure RT double blocking issue
  2017-05-09 15:11 [PATCH RT] futex/rtmutex: Cure RT double blocking issue Thomas Gleixner
@ 2017-05-09 15:49 ` Steven Rostedt
  2017-05-11  2:25 ` Wanpeng Li
  1 sibling, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2017-05-09 15:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Sebastian Sewior, Peter Zijlstra, linux-rt-users, Engleder Gerhard

On Tue, 9 May 2017 17:11:10 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> RT has a problem when the wait on a futex/rtmutex got interrupted by a
> timeout or a signal. task->pi_blocked_on is still set when returning from
> rt_mutex_wait_proxy_lock(). The task must acquire the hash bucket lock
> after this.
> 
> If the hash bucket lock is contended then the
> BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
> task_blocks_on_rt_mutex() will trigger.
> 
> This can be avoided by clearing task->pi_blocked_on in the return path of
> rt_mutex_wait_proxy_lock() which removes the task from the boosting chain
> of the rtmutex. That's correct because the task is not longer blocked on

 s/not/no/

> it.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reported-by: Engleder Gerhard <eg@keba.com>
> ---
>  kernel/locking/rtmutex.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -2380,6 +2380,7 @@ int rt_mutex_wait_proxy_lock(struct rt_m
>  			       struct hrtimer_sleeper *to,
>  			       struct rt_mutex_waiter *waiter)
>  {
> +	struct task_struct *tsk = current;
>  	int ret;
>  
>  	raw_spin_lock_irq(&lock->wait_lock);
> @@ -2389,6 +2390,22 @@ int rt_mutex_wait_proxy_lock(struct rt_m
>  	/* sleep on the mutex */
>  	ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL);
>  
> +	/*
> +	 * RT has a problem here when the wait got interrupted by a timeout
> +	 * or a signal. task->pi_blocked_on is still set. The task must
> +	 * acquire the hash bucket lock when returning from this function.
> +	 *
> +	 * If the hash bucket lock is contended then the
> +	 * BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
> +	 * task_blocks_on_rt_mutex() will trigger. This can be avoided by
> +	 * clearing task->pi_blocked_on which removes the task from the
> +	 * boosting chain of the rtmutex. That's correct because the task
> +	 * is not longer blocked on it.

  s/not/no/


I looked at the users of pi_blocked_on, and this appears to be fine.
I don't see it used by remove_waiter() where it clears it at the start.

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve


> +	 */
> +	raw_spin_lock(&tsk->pi_lock);
> +	tsk->pi_blocked_on = NULL;
> +	raw_spin_unlock(&tsk->pi_lock);
> +
>  	raw_spin_unlock_irq(&lock->wait_lock);
>  
>  	return ret;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RT] futex/rtmutex: Cure RT double blocking issue
  2017-05-09 15:11 [PATCH RT] futex/rtmutex: Cure RT double blocking issue Thomas Gleixner
  2017-05-09 15:49 ` Steven Rostedt
@ 2017-05-11  2:25 ` Wanpeng Li
  2017-05-11  7:31   ` Thomas Gleixner
  1 sibling, 1 reply; 7+ messages in thread
From: Wanpeng Li @ 2017-05-11  2:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Sebastian Sewior, Peter Zijlstra, Steven Rostedt,
	linux-rt-users, Engleder Gerhard

2017-05-09 23:11 GMT+08:00 Thomas Gleixner <tglx@linutronix.de>:
> RT has a problem when the wait on a futex/rtmutex got interrupted by a
> timeout or a signal. task->pi_blocked_on is still set when returning from
> rt_mutex_wait_proxy_lock(). The task must acquire the hash bucket lock
> after this.
>
> If the hash bucket lock is contended then the
> BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
> task_blocks_on_rt_mutex() will trigger.
>
> This can be avoided by clearing task->pi_blocked_on in the return path of
> rt_mutex_wait_proxy_lock() which removes the task from the boosting chain
> of the rtmutex. That's correct because the task is no longer blocked on
> it.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reported-by: Engleder Gerhard <eg@keba.com>
> ---
>  kernel/locking/rtmutex.c |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -2380,6 +2380,7 @@ int rt_mutex_wait_proxy_lock(struct rt_m
>                                struct hrtimer_sleeper *to,
>                                struct rt_mutex_waiter *waiter)
>  {
> +       struct task_struct *tsk = current;
>         int ret;
>
>         raw_spin_lock_irq(&lock->wait_lock);
> @@ -2389,6 +2390,22 @@ int rt_mutex_wait_proxy_lock(struct rt_m
>         /* sleep on the mutex */
>         ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL);

Why not check the ret value to avoid locking/unlocking tsk->pi_lock when
the rt_mutex is acquired successfully?

Regards,
Wanpeng Li

>
> +       /*
> +        * RT has a problem here when the wait got interrupted by a timeout
> +        * or a signal. task->pi_blocked_on is still set. The task must
> +        * acquire the hash bucket lock when returning from this function.
> +        *
> +        * If the hash bucket lock is contended then the
> +        * BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
> +        * task_blocks_on_rt_mutex() will trigger. This can be avoided by
> +        * clearing task->pi_blocked_on which removes the task from the
> +        * boosting chain of the rtmutex. That's correct because the task
> +        * is no longer blocked on it.
> +        */
> +       raw_spin_lock(&tsk->pi_lock);
> +       tsk->pi_blocked_on = NULL;
> +       raw_spin_unlock(&tsk->pi_lock);
> +
>         raw_spin_unlock_irq(&lock->wait_lock);
>
>         return ret;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RT] futex/rtmutex: Cure RT double blocking issue
  2017-05-11  2:25 ` Wanpeng Li
@ 2017-05-11  7:31   ` Thomas Gleixner
  2017-05-11 15:20     ` [PATCH RT v2] " Sebastian Sewior
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2017-05-11  7:31 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: LKML, Sebastian Sewior, Peter Zijlstra, Steven Rostedt,
	linux-rt-users, Engleder Gerhard

On Thu, 11 May 2017, Wanpeng Li wrote:
> 2017-05-09 23:11 GMT+08:00 Thomas Gleixner <tglx@linutronix.de>:
> > RT has a problem when the wait on a futex/rtmutex got interrupted by a
> > timeout or a signal. task->pi_blocked_on is still set when returning from
> > rt_mutex_wait_proxy_lock(). The task must acquire the hash bucket lock
> > after this.
> >
> > If the hash bucket lock is contended then the
> > BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
> > task_blocks_on_rt_mutex() will trigger.
> >
> > This can be avoided by clearing task->pi_blocked_on in the return path of
> > rt_mutex_wait_proxy_lock() which removes the task from the boosting chain
> > of the rtmutex. That's correct because the task is no longer blocked on
> > it.
> >
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Reported-by: Engleder Gerhard <eg@keba.com>
> > ---
> >  kernel/locking/rtmutex.c |   17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> >
> > --- a/kernel/locking/rtmutex.c
> > +++ b/kernel/locking/rtmutex.c
> > @@ -2380,6 +2380,7 @@ int rt_mutex_wait_proxy_lock(struct rt_m
> >                                struct hrtimer_sleeper *to,
> >                                struct rt_mutex_waiter *waiter)
> >  {
> > +       struct task_struct *tsk = current;
> >         int ret;
> >
> >         raw_spin_lock_irq(&lock->wait_lock);
> > @@ -2389,6 +2390,22 @@ int rt_mutex_wait_proxy_lock(struct rt_m
> >         /* sleep on the mutex */
> >         ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL);
> 
> Why not check the ret value to avoid locking/unlocking tsk->pi_lock when
> the rt_mutex is acquired successfully?

Makes sense.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH RT v2] futex/rtmutex: Cure RT double blocking issue
  2017-05-11  7:31   ` Thomas Gleixner
@ 2017-05-11 15:20     ` Sebastian Sewior
  2017-05-11 15:54       ` Steven Rostedt
  0 siblings, 1 reply; 7+ messages in thread
From: Sebastian Sewior @ 2017-05-11 15:20 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Wanpeng Li, LKML, Peter Zijlstra, Steven Rostedt, linux-rt-users,
	Engleder Gerhard

RT has a problem when the wait on a futex/rtmutex got interrupted by a
timeout or a signal. task->pi_blocked_on is still set when returning from
rt_mutex_wait_proxy_lock(). The task must acquire the hash bucket lock
after this.

If the hash bucket lock is contended then the
BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
task_blocks_on_rt_mutex() will trigger.

This can be avoided by clearing task->pi_blocked_on in the return path of
rt_mutex_wait_proxy_lock() which removes the task from the boosting chain
of the rtmutex. That's correct because the task is no longer blocked on
it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Engleder Gerhard <eg@keba.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 v1…v2: reset ->pi_blocked_on only in the error case.

 kernel/locking/rtmutex.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 314fc65a35b1..4675f1197f33 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -2400,6 +2400,7 @@ int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
 			       struct hrtimer_sleeper *to,
 			       struct rt_mutex_waiter *waiter)
 {
+	struct task_struct *tsk = current;
 	int ret;
 
 	raw_spin_lock_irq(&lock->wait_lock);
@@ -2409,6 +2410,24 @@ int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
 	/* sleep on the mutex */
 	ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL);
 
+	/*
+	 * RT has a problem here when the wait got interrupted by a timeout
+	 * or a signal. task->pi_blocked_on is still set. The task must
+	 * acquire the hash bucket lock when returning from this function.
+	 *
+	 * If the hash bucket lock is contended then the
+	 * BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
+	 * task_blocks_on_rt_mutex() will trigger. This can be avoided by
+	 * clearing task->pi_blocked_on which removes the task from the
+	 * boosting chain of the rtmutex. That's correct because the task
+	 * is no longer blocked on it.
+	 */
+	if (ret) {
+		raw_spin_lock(&tsk->pi_lock);
+		tsk->pi_blocked_on = NULL;
+		raw_spin_unlock(&tsk->pi_lock);
+	}
+
 	raw_spin_unlock_irq(&lock->wait_lock);
 
 	return ret;
-- 
2.11.0

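For readers following the v2 change, the return-path logic can be sketched
in userspace roughly as follows. This is an illustration only, not kernel
code: struct task, struct waiter, finish_proxy_wait() and can_block_again()
are hypothetical stand-ins invented for this sketch, and a pthread mutex is
assumed as a substitute for the raw spinlock tsk->pi_lock. The idea it shows
is the one from the patch: when the wait fails (timeout or signal), the
stale pi_blocked_on pointer is cleared under pi_lock so that a later attempt
to block on another lock does not trip the "already blocked" check.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct waiter {
	int dummy;
};

struct task {
	pthread_mutex_t pi_lock;	/* assumed substitute for the raw spinlock */
	struct waiter *pi_blocked_on;	/* non-NULL while blocked on a lock */
};

/*
 * Models the v2 return path: only when the wait failed (ret != 0) is the
 * stale pi_blocked_on pointer cleared, and only under pi_lock. On success
 * (ret == 0) this helper takes no lock and leaves the field alone.
 */
static int finish_proxy_wait(struct task *tsk, int ret)
{
	assert(tsk != NULL);

	if (ret) {
		pthread_mutex_lock(&tsk->pi_lock);
		tsk->pi_blocked_on = NULL;
		pthread_mutex_unlock(&tsk->pi_lock);
	}
	return ret;
}

/*
 * Models the precondition behind the BUG_ON mentioned in the patch: a task
 * may start blocking on another lock only if it is not recorded as blocked.
 * Returns nonzero when blocking again would be safe.
 */
static int can_block_again(struct task *tsk)
{
	int ok;

	pthread_mutex_lock(&tsk->pi_lock);
	ok = (tsk->pi_blocked_on == NULL);
	pthread_mutex_unlock(&tsk->pi_lock);
	return ok;
}
```

A caller would pass the result of the (failed or successful) wait to
finish_proxy_wait() and could then use can_block_again() before contending
on the next lock, which in the patch's scenario is the hash bucket lock.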
^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH RT v2] futex/rtmutex: Cure RT double blocking issue
  2017-05-11 15:20     ` [PATCH RT v2] " Sebastian Sewior
@ 2017-05-11 15:54       ` Steven Rostedt
  2017-05-11 16:05         ` Sebastian Sewior
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2017-05-11 15:54 UTC (permalink / raw)
  To: Sebastian Sewior
  Cc: Thomas Gleixner, Wanpeng Li, LKML, Peter Zijlstra,
	linux-rt-users, Engleder Gerhard

On Thu, 11 May 2017 17:20:54 +0200
Sebastian Sewior <bigeasy@linutronix.de> wrote:

This is the same patch that Thomas wrote, right? Shouldn't this start
with:

From: Thomas Gleixner <tglx@linutronix.de>

?

-- Steve


> RT has a problem when the wait on a futex/rtmutex got interrupted by a
> timeout or a signal. task->pi_blocked_on is still set when returning from
> rt_mutex_wait_proxy_lock(). The task must acquire the hash bucket lock
> after this.
> 
> If the hash bucket lock is contended then the
> BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
> task_blocks_on_rt_mutex() will trigger.
> 
> This can be avoided by clearing task->pi_blocked_on in the return path of
> rt_mutex_wait_proxy_lock() which removes the task from the boosting chain
> of the rtmutex. That's correct because the task is no longer blocked on
> it.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reported-by: Engleder Gerhard <eg@keba.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  v1…v2: reset ->pi_blocked_on only in the error case.
> 
>  kernel/locking/rtmutex.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> index 314fc65a35b1..4675f1197f33 100644
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -2400,6 +2400,7 @@ int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
>  			       struct hrtimer_sleeper *to,
>  			       struct rt_mutex_waiter *waiter)
>  {
> +	struct task_struct *tsk = current;
>  	int ret;
>  
>  	raw_spin_lock_irq(&lock->wait_lock);
> @@ -2409,6 +2410,24 @@ int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
>  	/* sleep on the mutex */
>  	ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL);
>  
> +	/*
> +	 * RT has a problem here when the wait got interrupted by a timeout
> +	 * or a signal. task->pi_blocked_on is still set. The task must
> +	 * acquire the hash bucket lock when returning from this function.
> +	 *
> +	 * If the hash bucket lock is contended then the
> +	 * BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
> +	 * task_blocks_on_rt_mutex() will trigger. This can be avoided by
> +	 * clearing task->pi_blocked_on which removes the task from the
> +	 * boosting chain of the rtmutex. That's correct because the task
> +	 * is no longer blocked on it.
> +	 */
> +	if (ret) {
> +		raw_spin_lock(&tsk->pi_lock);
> +		tsk->pi_blocked_on = NULL;
> +		raw_spin_unlock(&tsk->pi_lock);
> +	}
> +
>  	raw_spin_unlock_irq(&lock->wait_lock);
>  
>  	return ret;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RT v2] futex/rtmutex: Cure RT double blocking issue
  2017-05-11 15:54       ` Steven Rostedt
@ 2017-05-11 16:05         ` Sebastian Sewior
  0 siblings, 0 replies; 7+ messages in thread
From: Sebastian Sewior @ 2017-05-11 16:05 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, Wanpeng Li, LKML, Peter Zijlstra,
	linux-rt-users, Engleder Gerhard

On 2017-05-11 11:54:47 [-0400], Steven Rostedt wrote:
> This is the same patch that Thomas wrote, right? Shouldn't this start
> with:
> 
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> ?

Correct. It made its way properly into the patch queue; I just managed
to get it wrong while sending it to the list…

> -- Steve

Sebastian

^ permalink raw reply	[flat|nested] 7+ messages in thread
