linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
@ 2007-05-31  0:49 john stultz
  2007-05-31  1:29 ` Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: john stultz @ 2007-05-31  0:49 UTC (permalink / raw)
  To: lkml; +Cc: Ingo Molnar, Thomas Gleixner, Steven Rostedt, Sripathi Kodi

All,
	So we've been seeing PI mutex deadlocks with a few of our applications
using the -rt kernel. After narrowing things down, we were finding that
the applications were indirectly calling futex_unlock_pi(), which on
occasion would return -EFAULT, which is promptly ignored by glibc. This
would cause the application to continue on as if it has unlocked the
mutex, until it tried to reacquire it and deadlock.

In looking into why the kernel was returning -EFAULT, I found the
following:

...
retry_locked:
	/*
	 * To avoid races, try to do the TID -> 0 atomic transition
	 * again. If it succeeds then we can return without waking
	 * anyone else up:
	 */
	if (!(uval & FUTEX_OWNER_DIED)) {
		pagefault_disable();
		uval = futex_atomic_cmpxchg_inatomic(uaddr, current->pid, 0);
		pagefault_enable();
	}

	if (unlikely(uval == -EFAULT))
		goto pi_faulted;
...[snip]...
pi_faulted:
	/*
	 * We have to r/w  *(int __user *)uaddr, but we can't modify it
	 * non-atomically.  Therefore, if get_user below is not
	 * enough, we need to handle the fault ourselves, while
	 * still holding the mmap_sem.
	 */
	if (attempt++) {
		ret = futex_handle_fault((unsigned long)uaddr, fshared,
					 attempt);
		if (ret)
			goto out_unlock;
		goto retry_locked;
	}


Should we fault through normal causes, on the second round we call
futex_handle_fault, which faults in the address, and we then jump back
to retry_locked. However, since uval is -EFAULT from the last cmpxchg,
it &s w/ FUTEX_OWNER_DIED so we don't enter the first conditional to try
to cmpxchg again. So since uval is still -EFAULT, we loop back to
pi_faulted! This will loop until futex_handle_fault() bombs out because
attempt is too big and we return -EFAULT.

I *think* this is a possible quick fix here, but I'm no futex guru, so I
wanted to run it by folks for review.

Big thanks to Sripathi and Angela Lin for helping debug this, and Steven
for suggesting a cleaner fix then what I first tried.

thanks
-john

Avoid futex_unlock_pi returning -EFAULT (which results in deadlock), by
clearing uval before jumping to retry_locked.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
---
diff --git a/kernel/futex.c b/kernel/futex.c
index b7ce15c..9969b36 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2011,6 +2011,7 @@ pi_faulted:
 					 attempt);
 		if (ret)
 			goto out_unlock;
+		uval = 0;
 		goto retry_locked;
 	}
 



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-05-31  0:49 [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock john stultz
@ 2007-05-31  1:29 ` Steven Rostedt
  2007-05-31 14:53   ` Ulrich Drepper
  2007-05-31  2:52 ` [PATCH RT] fix faulting bomb in futex_unlock_pi64 Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2007-05-31  1:29 UTC (permalink / raw)
  To: john stultz; +Cc: lkml, Ingo Molnar, Thomas Gleixner, Sripathi Kodi

On Wed, 2007-05-30 at 17:49 -0700, john stultz wrote:

> ...
> retry_locked:
> 	/*
> 	 * To avoid races, try to do the TID -> 0 atomic transition
> 	 * again. If it succeeds then we can return without waking
> 	 * anyone else up:
> 	 */
> 	if (!(uval & FUTEX_OWNER_DIED)) {
> 		pagefault_disable();
> 		uval = futex_atomic_cmpxchg_inatomic(uaddr, current->pid, 0);
> 		pagefault_enable();
> 	}

My question is to all the futex gurus out there.

This code is in futex_unlock_pi.  Can the owner of the mutex really die?
Isn't the owner the one doing the unlock?

-- Steve



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH RT] fix faulting bomb in futex_unlock_pi64
  2007-05-31  0:49 [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock john stultz
  2007-05-31  1:29 ` Steven Rostedt
@ 2007-05-31  2:52 ` Steven Rostedt
  2007-05-31  3:18   ` Steven Rostedt
  2007-05-31 17:20   ` Ulrich Drepper
  2007-05-31 14:24 ` [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock Ingo Molnar
  2007-07-31 23:53 ` [RESEND] " john stultz
  3 siblings, 2 replies; 14+ messages in thread
From: Steven Rostedt @ 2007-05-31  2:52 UTC (permalink / raw)
  To: john stultz
  Cc: lkml, Ingo Molnar, Thomas Gleixner, Sripathi Kodi, Ulrich Drepper

[CC'ing Uli because John forgot to with the original patch]

As John found with futex_unlock_pi (which that patch should apply nicely
to the -rt tree), the code can get screwed up with faulting in at the
cmpxchg in futex_unlock_pi code.

>From John's original email:

----
In looking into why the kernel was returning -EFAULT, I found the
following:

...
retry_locked:
        /*
         * To avoid races, try to do the TID -> 0 atomic transition
         * again. If it succeeds then we can return without waking
         * anyone else up:
         */
        if (!(uval & FUTEX_OWNER_DIED)) {
                pagefault_disable();
                uval = futex_atomic_cmpxchg_inatomic(uaddr,
current->pid, 0);
                pagefault_enable();
        }

        if (unlikely(uval == -EFAULT))
                goto pi_faulted;
...[snip]...
pi_faulted:
        /*
         * We have to r/w  *(int __user *)uaddr, but we can't modify it
         * non-atomically.  Therefore, if get_user below is not
         * enough, we need to handle the fault ourselves, while
         * still holding the mmap_sem.
         */
        if (attempt++) {
                ret = futex_handle_fault((unsigned long)uaddr, fshared,
                                         attempt);
                if (ret)
                        goto out_unlock;
                goto retry_locked;
        }
----


We see that uval is -EFAULT when jumping to pi_faulted. But when it goes
back to retry_locked, the uval is still -EFAULT. Which just happens to
have the FUTEX_OWNER_DIED bit set.  So we don't do the
futex_atomic_cmpxchg_inatomic call, and go to the next conditional which
just so happens to be a check of uval for -EFAULT. And guess what? it
still is!!

Way to go John in finding this!!

But, the -rt kernel has pretty much the same code for the
futex_unlock_pi64, and it has the same bug. Hence this email.

This patch is the same fix that John posted, but simply for the
futex_unlock_pi64 that exists in the RT kernel and not the mainline.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Index: linux-2.6-rt-work/kernel/futex.c
===================================================================
--- linux-2.6-rt-work.orig/kernel/futex.c	2007-05-30 22:29:04.000000000 -0400
+++ linux-2.6-rt-work/kernel/futex.c	2007-05-30 22:32:01.000000000 -0400
@@ -3503,6 +3503,7 @@ pi_faulted:
 			ret = -EFAULT;
 			goto out_unlock;
 		}
+		uval = 0;
 		goto retry_locked;
 	}
 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RT] fix faulting bomb in futex_unlock_pi64
  2007-05-31  2:52 ` [PATCH RT] fix faulting bomb in futex_unlock_pi64 Steven Rostedt
@ 2007-05-31  3:18   ` Steven Rostedt
  2007-05-31 17:20   ` Ulrich Drepper
  1 sibling, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2007-05-31  3:18 UTC (permalink / raw)
  To: john stultz
  Cc: lkml, Ingo Molnar, Thomas Gleixner, Sripathi Kodi, Ulrich Drepper

Ingo,

This patch contains both John's and mine for -rt. Might as well make it
a single patch to pull in.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>

---
Index: linux-2.6.21-rt9/kernel/futex.c
===================================================================
--- linux-2.6.21-rt9.orig/kernel/futex.c	2007-05-30 23:12:34.000000000 -0400
+++ linux-2.6.21-rt9/kernel/futex.c	2007-05-30 23:17:05.000000000 -0400
@@ -1839,6 +1839,7 @@ static int futex_lock_pi(u32 __user *uad
 			ret = -EFAULT;
 			goto out_unlock_release_sem;
 		}
+		uval = 0;
 		goto retry_locked;
 	}
 
@@ -3503,6 +3504,7 @@ pi_faulted:
 			ret = -EFAULT;
 			goto out_unlock;
 		}
+		uval = 0;
 		goto retry_locked;
 	}
 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-05-31  0:49 [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock john stultz
  2007-05-31  1:29 ` Steven Rostedt
  2007-05-31  2:52 ` [PATCH RT] fix faulting bomb in futex_unlock_pi64 Steven Rostedt
@ 2007-05-31 14:24 ` Ingo Molnar
  2007-05-31 14:50   ` john stultz
  2007-07-31 23:53 ` [RESEND] " john stultz
  3 siblings, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2007-05-31 14:24 UTC (permalink / raw)
  To: john stultz
  Cc: lkml, Thomas Gleixner, Steven Rostedt, Sripathi Kodi, Oleg Nesterov


* john stultz <johnstul@us.ibm.com> wrote:

> 	So we've been seeing PI mutex deadlocks with a few of our 
> applications using the -rt kernel. After narrowing things down, we 
> were finding that the applications were indirectly calling 
> futex_unlock_pi(), which on occasion would return -EFAULT, which is 
> promptly ignored by glibc. This would cause the application to 
> continue on as if it has unlocked the mutex, until it tried to 
> reacquire it and deadlock.

> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -2011,6 +2011,7 @@ pi_faulted:
>  					 attempt);
>  		if (ret)
>  			goto out_unlock;
> +		uval = 0;
>  		goto retry_locked;

this looks good to me. Oleg has posted a few more fixes as well - have 
you tried those too?

	Ingo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-05-31 14:24 ` [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock Ingo Molnar
@ 2007-05-31 14:50   ` john stultz
  2007-05-31 14:55     ` Ingo Molnar
  0 siblings, 1 reply; 14+ messages in thread
From: john stultz @ 2007-05-31 14:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: lkml, Thomas Gleixner, Steven Rostedt, Sripathi Kodi, Oleg Nesterov

On Thu, 2007-05-31 at 16:24 +0200, Ingo Molnar wrote:
> * john stultz <johnstul@us.ibm.com> wrote:
> 
> > 	So we've been seeing PI mutex deadlocks with a few of our 
> > applications using the -rt kernel. After narrowing things down, we 
> > were finding that the applications were indirectly calling 
> > futex_unlock_pi(), which on occasion would return -EFAULT, which is 
> > promptly ignored by glibc. This would cause the application to 
> > continue on as if it has unlocked the mutex, until it tried to 
> > reacquire it and deadlock.
> 
> > --- a/kernel/futex.c
> > +++ b/kernel/futex.c
> > @@ -2011,6 +2011,7 @@ pi_faulted:
> >  					 attempt);
> >  		if (ret)
> >  			goto out_unlock;
> > +		uval = 0;
> >  		goto retry_locked;
> 
> this looks good to me. Oleg has posted a few more fixes as well - have 
> you tried those too?

Do you have a URL or subject line for the discussion? 
I'll take a look and give them a whirl.

thanks
-john


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-05-31  1:29 ` Steven Rostedt
@ 2007-05-31 14:53   ` Ulrich Drepper
  0 siblings, 0 replies; 14+ messages in thread
From: Ulrich Drepper @ 2007-05-31 14:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: john stultz, lkml, Ingo Molnar, Thomas Gleixner, Sripathi Kodi

On 5/30/07, Steven Rostedt <rostedt@goodmis.org> wrote:
> >       if (!(uval & FUTEX_OWNER_DIED)) {
> >               pagefault_disable();
> >               uval = futex_atomic_cmpxchg_inatomic(uaddr, current->pid, 0);
> >               pagefault_enable();
> >       }
> [...]
> This code is in futex_unlock_pi.  Can the owner of the mutex really die?
> Isn't the owner the one doing the unlock?

This is part of the implementation of robust PI futexes.  The
semantics is that if the owner of a robust futex dies the futex is
marked with FUTEX_OWNER_DIED.  The next locking thread then has to
clear that bit before calling pthread_mutex_unlock.  If the bit is
still set while unlocking the mutex is permanently marked unusable.  I
cannot say right now whether this is the semantics implemented above.
I'll have to check the glibc test suite whether we check for that
semantic.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-05-31 14:50   ` john stultz
@ 2007-05-31 14:55     ` Ingo Molnar
  2007-05-31 16:48       ` john stultz
  0 siblings, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2007-05-31 14:55 UTC (permalink / raw)
  To: john stultz
  Cc: lkml, Thomas Gleixner, Steven Rostedt, Sripathi Kodi,
	Oleg Nesterov, Alexey Kuznetsov


* john stultz <johnstul@us.ibm.com> wrote:

> > > --- a/kernel/futex.c
> > > +++ b/kernel/futex.c
> > > @@ -2011,6 +2011,7 @@ pi_faulted:
> > >  					 attempt);
> > >  		if (ret)
> > >  			goto out_unlock;
> > > +		uval = 0;
> > >  		goto retry_locked;
> > 
> > this looks good to me. Oleg has posted a few more fixes as well - 
> > have you tried those too?
> 
> Do you have a URL or subject line for the discussion? I'll take a look 
> and give them a whirl.

i meant to say Alexey, not Oleg :-) Here's the thread:

   http://lkml.org/lkml/2007/5/7/129

i've Cc:-ed Alexey too.

	Ingo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-05-31 14:55     ` Ingo Molnar
@ 2007-05-31 16:48       ` john stultz
  0 siblings, 0 replies; 14+ messages in thread
From: john stultz @ 2007-05-31 16:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: lkml, Thomas Gleixner, Steven Rostedt, Sripathi Kodi,
	Oleg Nesterov, Alexey Kuznetsov

On Thu, 2007-05-31 at 16:55 +0200, Ingo Molnar wrote:
> * john stultz <johnstul@us.ibm.com> wrote:
> 
> > > > --- a/kernel/futex.c
> > > > +++ b/kernel/futex.c
> > > > @@ -2011,6 +2011,7 @@ pi_faulted:
> > > >  					 attempt);
> > > >  		if (ret)
> > > >  			goto out_unlock;
> > > > +		uval = 0;
> > > >  		goto retry_locked;
> > > 
> > > this looks good to me. Oleg has posted a few more fixes as well - 
> > > have you tried those too?
> > 
> > Do you have a URL or subject line for the discussion? I'll take a look 
> > and give them a whirl.
> 
> i meant to say Alexey, not Oleg :-) Here's the thread:
> 
>    http://lkml.org/lkml/2007/5/7/129
> 
> i've Cc:-ed Alexey too.

I haven't had a chance to run it, but reviewing at Alexey's changes,
they do not look to be sufficient to the issue we were hitting. We still
need to clear uval to avoid  -EFAULT &'ing w/ FUTEX_OWNER_DIED.


Here's the patch against -rt9 (which also contains Alexey's fix)


diff -rup 2.6-rt/kernel/futex.c 2.6-rtnew/kernel/futex.c
--- 2.6-rt/kernel/futex.c	2007-05-31 06:38:00.000000000 -0500
+++ 2.6-rtnew/kernel/futex.c	2007-05-31 06:43:08.000000000 -0500
@@ -1954,6 +1954,7 @@ pi_faulted:
 			ret = -EFAULT;
 			goto out_unlock;
 		}
+		uval = 0;
 		goto retry_locked;
 	}
 
@@ -3503,6 +3504,7 @@ pi_faulted:
 			ret = -EFAULT;
 			goto out_unlock;
 		}
+		uval = 0;
 		goto retry_locked;
 	}
 




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH RT] fix faulting bomb in futex_unlock_pi64
  2007-05-31  2:52 ` [PATCH RT] fix faulting bomb in futex_unlock_pi64 Steven Rostedt
  2007-05-31  3:18   ` Steven Rostedt
@ 2007-05-31 17:20   ` Ulrich Drepper
  1 sibling, 0 replies; 14+ messages in thread
From: Ulrich Drepper @ 2007-05-31 17:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: john stultz, lkml, Ingo Molnar, Thomas Gleixner, Sripathi Kodi

On 5/30/07, Steven Rostedt <rostedt@goodmis.org> wrote:
> But, the -rt kernel has pretty much the same code for the
> futex_unlock_pi64, and it has the same bug.

Why would you use futex_unlock_pi64?

In fact, the whole 64-bit futex patch should be reviewed and cut down
to the bare minimum. I know, I was asking for it loudly but it became
to Jakub and me after adding the private futexes that all these
orthogonal extensions don't make sense.  So far we need 64-bit futexes
only for some rwlocks and here only a few operations (wait, wake).  We
don't need 64-bit extended to PI, we don't need requeue.

Yes, if we had started out with 64-bit futexes on 64-bit platforms,
that would be fine.  But as it is today making 64-bit a new dimension
is not needed.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RESEND] [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-05-31  0:49 [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock john stultz
                   ` (2 preceding siblings ...)
  2007-05-31 14:24 ` [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock Ingo Molnar
@ 2007-07-31 23:53 ` john stultz
  2007-08-01  0:00   ` Steven Rostedt
                     ` (2 more replies)
  3 siblings, 3 replies; 14+ messages in thread
From: john stultz @ 2007-07-31 23:53 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: lkml, Thomas Gleixner, Steven Rostedt, Sripathi Kodi

On Wed, 2007-05-30 at 17:49 -0700, john stultz wrote:
> All,
> 	So we've been seeing PI mutex deadlocks with a few of our applications
> using the -rt kernel. After narrowing things down, we were finding that
> the applications were indirectly calling futex_unlock_pi(), which on
> occasion would return -EFAULT, which is promptly ignored by glibc. This
> would cause the application to continue on as if it has unlocked the
> mutex, until it tried to reacquire it and deadlock.
> 
> In looking into why the kernel was returning -EFAULT, I found the
> following:
> 
> ...
> retry_locked:
> 	/*
> 	 * To avoid races, try to do the TID -> 0 atomic transition
> 	 * again. If it succeeds then we can return without waking
> 	 * anyone else up:
> 	 */
> 	if (!(uval & FUTEX_OWNER_DIED)) {
> 		pagefault_disable();
> 		uval = futex_atomic_cmpxchg_inatomic(uaddr, current->pid, 0);
> 		pagefault_enable();
> 	}
> 
> 	if (unlikely(uval == -EFAULT))
> 		goto pi_faulted;
> ...[snip]...
> pi_faulted:
> 	/*
> 	 * We have to r/w  *(int __user *)uaddr, but we can't modify it
> 	 * non-atomically.  Therefore, if get_user below is not
> 	 * enough, we need to handle the fault ourselves, while
> 	 * still holding the mmap_sem.
> 	 */
> 	if (attempt++) {
> 		ret = futex_handle_fault((unsigned long)uaddr, fshared,
> 					 attempt);
> 		if (ret)
> 			goto out_unlock;
> 		goto retry_locked;
> 	}
> 
> 
> Should we fault through normal causes, on the second round we call
> futex_handle_fault, which faults in the address, and we then jump back
> to retry_locked. However, since uval is -EFAULT from the last cmpxchg,
> it &s w/ FUTEX_OWNER_DIED so we don't enter the first conditional to try
> to cmpxchg again. So since uval is still -EFAULT, we loop back to
> pi_faulted! This will loop until futex_handle_fault() bombs out because
> attempt is too big and we return -EFAULT.
> 
> I *think* this is a possible quick fix here, but I'm no futex guru, so I
> wanted to run it by folks for review.
> 
> Big thanks to Sripathi and Angela Lin for helping debug this, and Steven
> for suggesting a cleaner fix then what I first tried.


Hey Ingo,
	I sent this fix for this back in the 2.6.21-rt9 era, and I thought to
picked it up, but I think it arrived amid some futex churn and
apparently has been dropped or was just lost.

Some of our testers have been reporting futex deadlocks and this fix
resolves it, so I believe it to still be needed.

thanks
-john

Avoid futex_unlock_pi returning -EFAULT (which results in deadlock), by
clearing uval before jumping to retry_locked.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Index: 2.6-rt/kernel/futex.c
===================================================================
--- 2.6-rt.orig/kernel/futex.c	2007-07-31 16:48:41.000000000 -0700
+++ 2.6-rt/kernel/futex.c	2007-07-31 16:48:48.000000000 -0700
@@ -1673,6 +1673,7 @@ pi_faulted:
 					 attempt);
 		if (ret)
 			goto out;
+		uval = 0;
 		goto retry_unlocked;
 	}
 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND] [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-07-31 23:53 ` [RESEND] " john stultz
@ 2007-08-01  0:00   ` Steven Rostedt
  2007-08-01  1:41   ` David Miller
  2007-08-06  7:20   ` Ingo Molnar
  2 siblings, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2007-08-01  0:00 UTC (permalink / raw)
  To: john stultz; +Cc: Ingo Molnar, lkml, Thomas Gleixner, Sripathi Kodi



--
On Tue, 31 Jul 2007, john stultz wrote:

> On Wed, 2007-05-30 at 17:49 -0700, john stultz wrote:
>
> Hey Ingo,
> 	I sent this fix for this back in the 2.6.21-rt9 era, and I thought to
> picked it up, but I think it arrived amid some futex churn and
> apparently has been dropped or was just lost.
>
> Some of our testers have been reporting futex deadlocks and this fix
> resolves it, so I believe it to still be needed.

Heh, I remember this bug.  Wasn't there another patch that modified this.
I almost definitly remember this getting into the patch queue, but it must
have been dropped later.

John, thanks for looking into this deeper!

Acked-by: Steven Rostedt <rostedt@goodmis.org>

>
> thanks
> -john
>
> Avoid futex_unlock_pi returning -EFAULT (which results in deadlock), by
> clearing uval before jumping to retry_locked.
>
> Signed-off-by: John Stultz <johnstul@us.ibm.com>
> Index: 2.6-rt/kernel/futex.c
> ===================================================================
> --- 2.6-rt.orig/kernel/futex.c	2007-07-31 16:48:41.000000000 -0700
> +++ 2.6-rt/kernel/futex.c	2007-07-31 16:48:48.000000000 -0700
> @@ -1673,6 +1673,7 @@ pi_faulted:
>  					 attempt);
>  		if (ret)
>  			goto out;
> +		uval = 0;
>  		goto retry_unlocked;
>  	}
>
>
>
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND] [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-07-31 23:53 ` [RESEND] " john stultz
  2007-08-01  0:00   ` Steven Rostedt
@ 2007-08-01  1:41   ` David Miller
  2007-08-06  7:20   ` Ingo Molnar
  2 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2007-08-01  1:41 UTC (permalink / raw)
  To: johnstul; +Cc: mingo, linux-kernel, tglx, rostedt, sripathik

From: john stultz <johnstul@us.ibm.com>
Date: Tue, 31 Jul 2007 16:53:39 -0700

> 	I sent this fix for this back in the 2.6.21-rt9 era, and I thought to
> picked it up, but I think it arrived amid some futex churn and
> apparently has been dropped or was just lost.
> 
> Some of our testers have been reporting futex deadlocks and this fix
> resolves it, so I believe it to still be needed.

This bug seems to exist in mainline too.

Sparc users have been seeing something that smells like this
bug, I will ask them to test this fix out.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RESEND] [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock
  2007-07-31 23:53 ` [RESEND] " john stultz
  2007-08-01  0:00   ` Steven Rostedt
  2007-08-01  1:41   ` David Miller
@ 2007-08-06  7:20   ` Ingo Molnar
  2 siblings, 0 replies; 14+ messages in thread
From: Ingo Molnar @ 2007-08-06  7:20 UTC (permalink / raw)
  To: john stultz
  Cc: lkml, Thomas Gleixner, Steven Rostedt, Sripathi Kodi, Alexey Kuznetsov


* john stultz <johnstul@us.ibm.com> wrote:

> Hey Ingo,
> 	I sent this fix for this back in the 2.6.21-rt9 era, and I thought to
> picked it up, but I think it arrived amid some futex churn and
> apparently has been dropped or was just lost.
> 
> Some of our testers have been reporting futex deadlocks and this fix 
> resolves it, so I believe it to still be needed.

hmm ... so the many rounds of pi-futex fixes from Alexey and Thomas did 
not resolve that deadlock?

	Ingo

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2007-08-06  7:21 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-05-31  0:49 [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock john stultz
2007-05-31  1:29 ` Steven Rostedt
2007-05-31 14:53   ` Ulrich Drepper
2007-05-31  2:52 ` [PATCH RT] fix faulting bomb in futex_unlock_pi64 Steven Rostedt
2007-05-31  3:18   ` Steven Rostedt
2007-05-31 17:20   ` Ulrich Drepper
2007-05-31 14:24 ` [BUG] futex_unlock_pi() hurts my brain and may cause application deadlock Ingo Molnar
2007-05-31 14:50   ` john stultz
2007-05-31 14:55     ` Ingo Molnar
2007-05-31 16:48       ` john stultz
2007-07-31 23:53 ` [RESEND] " john stultz
2007-08-01  0:00   ` Steven Rostedt
2007-08-01  1:41   ` David Miller
2007-08-06  7:20   ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).