All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
@ 2014-10-17 16:38 Catalin Marinas
  2014-10-18  6:54 ` Mike Galbraith
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Catalin Marinas @ 2014-10-17 16:38 UTC (permalink / raw)
  To: linux-kernel
  Cc: Matteo Franchin, Davidlohr Bueso, Linus Torvalds, Darren Hart,
	Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Paul E. McKenney

Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
nothing to wake up) changes the futex code to avoid taking a lock when
there are no waiters. This code has been subsequently fixed in commit
11d4616bd07f (futex: revert back to the explicit waiter counting code).
Both the original commit and the fix-up rely on get_futex_key_refs() to
always imply a barrier.

However, for private futexes, none of the cases in the switch statement
of get_futex_key_refs() would be hit and the function completes without
a memory barrier as required before checking the "waiters" in
futex_wake() -> hb_waiters_pending(). The consequence is a race with a
thread waiting on a futex on another CPU, allowing the waker thread to
read "waiters == 0" while the waiter thread to have read "futex_val ==
locked" (in kernel).

Without this fix, the problem (user space deadlocks) can be seen with
Android bionic's mutex implementation on an arm64 multi-cluster system.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Matteo Franchin <Matteo.Franchin@arm.com>
Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
Cc: <stable@vger.kernel.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/futex.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/futex.c b/kernel/futex.c
index 815d7af2ffe8..f3a3a071283c 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key)
 	case FUT_OFF_MMSHARED:
 		futex_get_mm(key); /* implies MB (B) */
 		break;
+	default:
+		smp_mb(); /* explicit MB (B) */
 	}
 }
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-17 16:38 [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier Catalin Marinas
@ 2014-10-18  6:54 ` Mike Galbraith
  2014-10-18  7:09   ` Mike Galbraith
  2014-10-18 15:28   ` Linus Torvalds
  2014-10-18  7:33 ` Davidlohr Bueso
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 20+ messages in thread
From: Mike Galbraith @ 2014-10-18  6:54 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-kernel, Matteo Franchin, Davidlohr Bueso, Linus Torvalds,
	Darren Hart, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney

On Fri, 2014-10-17 at 17:38 +0100, Catalin Marinas wrote: 
> Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
> nothing to wake up) changes the futex code to avoid taking a lock when
> there are no waiters. This code has been subsequently fixed in commit
> 11d4616bd07f (futex: revert back to the explicit waiter counting code).
> Both the original commit and the fix-up rely on get_futex_key_refs() to
> always imply a barrier.
> 
> However, for private futexes, none of the cases in the switch statement
> of get_futex_key_refs() would be hit and the function completes without
> a memory barrier as required before checking the "waiters" in
> futex_wake() -> hb_waiters_pending(). The consequence is a race with a
> thread waiting on a futex on another CPU, allowing the waker thread to
> read "waiters == 0" while the waiter thread to have read "futex_val ==
> locked" (in kernel).
> 
> Without this fix, the problem (user space deadlocks) can be seen with
> Android bionic's mutex implementation on an arm64 multi-cluster system.

How 'bout that, you just triggered my "watch this pot" alarm.

https://lkml.org/lkml/2014/10/8/406

The hang I encountered with stockfish only ever happened on one specific
box.  Linus/Thomas said it I was likely a problem with the futex usage,
but it suspiciously deterministic, so I put this on the "watch out for
further evidence" back burner.

The barrier fixing up my problematic box smells a lot like evidence.

> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Reported-by: Matteo Franchin <Matteo.Franchin@arm.com>
> Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
> Cc: <stable@vger.kernel.org>
> Cc: Davidlohr Bueso <davidlohr@hp.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Darren Hart <dvhart@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/futex.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 815d7af2ffe8..f3a3a071283c 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key)
>  	case FUT_OFF_MMSHARED:
>  		futex_get_mm(key); /* implies MB (B) */
>  		break;
> +	default:
> +		smp_mb(); /* explicit MB (B) */
>  	}
>  }
>  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-18  6:54 ` Mike Galbraith
@ 2014-10-18  7:09   ` Mike Galbraith
  2014-10-18 15:28   ` Linus Torvalds
  1 sibling, 0 replies; 20+ messages in thread
From: Mike Galbraith @ 2014-10-18  7:09 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-kernel, Matteo Franchin, Davidlohr Bueso, Linus Torvalds,
	Darren Hart, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney

(fixes Davidlohr bounce)

On Sat, 2014-10-18 at 08:54 +0200, Mike Galbraith wrote: 
> On Fri, 2014-10-17 at 17:38 +0100, Catalin Marinas wrote: 
> > Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
> > nothing to wake up) changes the futex code to avoid taking a lock when
> > there are no waiters. This code has been subsequently fixed in commit
> > 11d4616bd07f (futex: revert back to the explicit waiter counting code).
> > Both the original commit and the fix-up rely on get_futex_key_refs() to
> > always imply a barrier.
> > 
> > However, for private futexes, none of the cases in the switch statement
> > of get_futex_key_refs() would be hit and the function completes without
> > a memory barrier as required before checking the "waiters" in
> > futex_wake() -> hb_waiters_pending(). The consequence is a race with a
> > thread waiting on a futex on another CPU, allowing the waker thread to
> > read "waiters == 0" while the waiter thread to have read "futex_val ==
> > locked" (in kernel).
> > 
> > Without this fix, the problem (user space deadlocks) can be seen with
> > Android bionic's mutex implementation on an arm64 multi-cluster system.
> 
> How 'bout that, you just triggered my "watch this pot" alarm.
> 
> https://lkml.org/lkml/2014/10/8/406
> 
> The hang I encountered with stockfish only ever happened on one specific
> box.  Linus/Thomas said it I was likely a problem with the futex usage,
> but it suspiciously deterministic, so I put this on the "watch out for
> further evidence" back burner.
> 
> The barrier fixing up my problematic box smells a lot like evidence.
> 
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Reported-by: Matteo Franchin <Matteo.Franchin@arm.com>
> > Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
> > Cc: <stable@vger.kernel.org>
> > Cc: Davidlohr Bueso <davidlohr@hp.com>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Darren Hart <dvhart@linux.intel.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/futex.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/kernel/futex.c b/kernel/futex.c
> > index 815d7af2ffe8..f3a3a071283c 100644
> > --- a/kernel/futex.c
> > +++ b/kernel/futex.c
> > @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key)
> >  	case FUT_OFF_MMSHARED:
> >  		futex_get_mm(key); /* implies MB (B) */
> >  		break;
> > +	default:
> > +		smp_mb(); /* explicit MB (B) */
> >  	}
> >  }
> >  
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 
> 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-17 16:38 [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier Catalin Marinas
  2014-10-18  6:54 ` Mike Galbraith
@ 2014-10-18  7:33 ` Davidlohr Bueso
  2014-10-18 19:58   ` Davidlohr Bueso
  2014-10-18 19:32 ` Darren Hart
  2014-10-24  3:27 ` [PATCH] futex: Mention key referencing differences between shared and private futexes Davidlohr Bueso
  3 siblings, 1 reply; 20+ messages in thread
From: Davidlohr Bueso @ 2014-10-18  7:33 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-kernel, Matteo Franchin, Davidlohr Bueso, Linus Torvalds,
	Darren Hart, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney, Mike Galbraith

On Fri, 2014-10-17 at 17:38 +0100, Catalin Marinas wrote:
> Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
> nothing to wake up) changes the futex code to avoid taking a lock when
> there are no waiters. This code has been subsequently fixed in commit
> 11d4616bd07f (futex: revert back to the explicit waiter counting code).
> Both the original commit and the fix-up rely on get_futex_key_refs() to
> always imply a barrier.
> 
> However, for private futexes, none of the cases in the switch statement
> of get_futex_key_refs() would be hit and the function completes without
> a memory barrier as required before checking the "waiters" in
> futex_wake() -> hb_waiters_pending(). 

Good catch, glad I ran into this thread (my email recently changed).
Private process futex (PTHREAD_PROCESS_PRIVATE) have no reference on an
inode or mm so it would need the explicit barrier in those cases.

> The consequence is a race with a
> thread waiting on a futex on another CPU, allowing the waker thread to
> read "waiters == 0" while the waiter thread to have read "futex_val ==
> locked" (in kernel).

Yeah missing wakeups are a strong sign of a problem with the
hb_waiters_pending() side.

> Without this fix, the problem (user space deadlocks) can be seen with
> Android bionic's mutex implementation on an arm64 multi-cluster system.
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Reported-by: Matteo Franchin <Matteo.Franchin@arm.com>
> Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
> Cc: <stable@vger.kernel.org>
> Cc: Davidlohr Bueso <davidlohr@hp.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Darren Hart <dvhart@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/futex.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 815d7af2ffe8..f3a3a071283c 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key)
>  	case FUT_OFF_MMSHARED:
>  		futex_get_mm(key); /* implies MB (B) */
>  		break;
> +	default:
> +		smp_mb(); /* explicit MB (B) */
>  	}

Should we comment that this default is for the private futex case?
Otherwise:

Acked-by: Davidlohr Bueso <dave@stgolabs.net>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-18  6:54 ` Mike Galbraith
  2014-10-18  7:09   ` Mike Galbraith
@ 2014-10-18 15:28   ` Linus Torvalds
  2014-10-18 16:15     ` Mike Galbraith
  1 sibling, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2014-10-18 15:28 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Catalin Marinas, Linux Kernel Mailing List, Matteo Franchin,
	Davidlohr Bueso, Darren Hart, Thomas Gleixner, Peter Zijlstra,
	Ingo Molnar, Paul E. McKenney

On Fri, Oct 17, 2014 at 11:54 PM, Mike Galbraith
<umgwanakikbuti@gmail.com> wrote:
>
> The barrier fixing up my problematic box smells a lot like evidence.

Is this a "tested-by"? Did you actuallyu verify that the patch ends up
fixing the problem you saw?

            Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-18 15:28   ` Linus Torvalds
@ 2014-10-18 16:15     ` Mike Galbraith
  0 siblings, 0 replies; 20+ messages in thread
From: Mike Galbraith @ 2014-10-18 16:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Catalin Marinas, Linux Kernel Mailing List, Matteo Franchin,
	Davidlohr Bueso, Darren Hart, Thomas Gleixner, Peter Zijlstra,
	Ingo Molnar, Paul E. McKenney

On Sat, 2014-10-18 at 08:28 -0700, Linus Torvalds wrote: 
> On Fri, Oct 17, 2014 at 11:54 PM, Mike Galbraith
> <umgwanakikbuti@gmail.com> wrote:
> >
> > The barrier fixing up my problematic box smells a lot like evidence.
> 
> Is this a "tested-by"? Did you actuallyu verify that the patch ends up
> fixing the problem you saw?

Yup, it definitely fixed it up.

Weird that it didn't show at all on the 2 socket 20 core box the problem
I was looking into was reported on, but was 100% busted on the similar 2
socket 28 core box I borrowed to have a peek, and only that box.  My
(crippled/slow) 64 core DL980 was perfectly happy, as were my 3 home
boxen, not a peep from anywhere but that one 28 core box.

Hohum..  Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>

-Mike


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-17 16:38 [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier Catalin Marinas
  2014-10-18  6:54 ` Mike Galbraith
  2014-10-18  7:33 ` Davidlohr Bueso
@ 2014-10-18 19:32 ` Darren Hart
  2014-10-18 20:19   ` Davidlohr Bueso
  2014-10-24  3:27 ` [PATCH] futex: Mention key referencing differences between shared and private futexes Davidlohr Bueso
  3 siblings, 1 reply; 20+ messages in thread
From: Darren Hart @ 2014-10-18 19:32 UTC (permalink / raw)
  To: Catalin Marinas, linux-kernel
  Cc: Matteo Franchin, Davidlohr Bueso, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Paul E. McKenney,
	Darren Hart

On 10/17/14 11:38, Catalin Marinas wrote:
> Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
> nothing to wake up) changes the futex code to avoid taking a lock when
> there are no waiters. This code has been subsequently fixed in commit
> 11d4616bd07f (futex: revert back to the explicit waiter counting code).
> Both the original commit and the fix-up rely on get_futex_key_refs() to
> always imply a barrier.
> 
> However, for private futexes, none of the cases in the switch statement
> of get_futex_key_refs() would be hit and the function completes without
> a memory barrier as required before checking the "waiters" in
> futex_wake() -> hb_waiters_pending(). The consequence is a race with a
> thread waiting on a futex on another CPU, allowing the waker thread to
> read "waiters == 0" while the waiter thread to have read "futex_val ==
> locked" (in kernel).


Verified that this is:

a) how it is documented to work
b) not how it actually works currently

Nice catch indeed.

...

> diff --git a/kernel/futex.c b/kernel/futex.c
> index 815d7af2ffe8..f3a3a071283c 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -343,6 +343,8 @@ static void get_futex_key_refs(union futex_key *key)
>  	case FUT_OFF_MMSHARED:
>  		futex_get_mm(key); /* implies MB (B) */
>  		break;
> +	default:

A comment here indicating this covers the PROCESS_PRIVATE futex case
would be welcome, given the complexity involved.

> +		smp_mb(); /* explicit MB (B) */

Also, the "Basic" futex operation and ordering guarantees documentation
currently reads:

 * Where (A) orders the waiters increment and the futex value read through
 * atomic operations (see hb_waiters_inc) and where (B) orders the write
 * to futex and the waiters read -- this is done by the barriers in
 * get_futex_key_refs(), through either ihold or atomic_inc, depending
on the
 * futex type.

Which is not incomplete (lacking the explicit smp_mb()) added by this
patch. Perhaps the MB implementation of get_futex_key_refs() need not be
explicitly enumerated here?

-- 
Darren Hart
Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-18  7:33 ` Davidlohr Bueso
@ 2014-10-18 19:58   ` Davidlohr Bueso
  2014-10-18 20:50     ` Linus Torvalds
  0 siblings, 1 reply; 20+ messages in thread
From: Davidlohr Bueso @ 2014-10-18 19:58 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-kernel, Matteo Franchin, Linus Torvalds, Darren Hart,
	Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Paul E. McKenney,
	Mike Galbraith

On Sat, 2014-10-18 at 00:33 -0700, Davidlohr Bueso wrote:
> On Fri, 2014-10-17 at 17:38 +0100, Catalin Marinas wrote:
> > Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
> > nothing to wake up) changes the futex code to avoid taking a lock when
> > there are no waiters. This code has been subsequently fixed in commit
> > 11d4616bd07f (futex: revert back to the explicit waiter counting code).
> > Both the original commit and the fix-up rely on get_futex_key_refs() to
> > always imply a barrier.
> > 
> > However, for private futexes, none of the cases in the switch statement
> > of get_futex_key_refs() would be hit and the function completes without
> > a memory barrier as required before checking the "waiters" in
> > futex_wake() -> hb_waiters_pending(). 
> 
> Good catch, glad I ran into this thread (my email recently changed).
> Private process futex (PTHREAD_PROCESS_PRIVATE) have no reference on an
> inode or mm so it would need the explicit barrier in those cases.

And [get/put]_futex_keys() shouldn't even be called for private futexes.
The following patch had some very minor testing on a 60 core box last
night, but passes both Darren's and perf's tests. So I *think* this is
right, but lack of sleep and I overall just don't trust them futexes!

8<----------------------------------------------------------
From: Davidlohr Bueso <dave@stgolabs.net>
Date: Sat, 18 Oct 2014 12:30:37 -0700
Subject: [PATCH 2/1] futex: No key referencing for private futexes

Because private futexes do not hold references on either
an inode or mm, they should not be calling key referencing
operations (even though they are practically a nop). However,
we need to call the get part only because we need the barrier
in order to maintain correct ordering guarantees for the lockless
waiter checking. In addition, we can avoid calling the put part
for private futexes altogether, as it serves no purpose in the
ordering.

This patch 1) documents the situation, 2) explicitly avoids calling
drop_futex_key_refs() when calling put_futex_keys() for private futexes
and 3) changes the interface of the function to pass the 'fshared'
variable, similarly to get_futex_key_refs(). In theory this should apply
to all drop_futex_key_refs() callers, but just keep it simple and apply
it as the get/put alternatives when calling futex(2).

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 kernel/futex.c | 99 ++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 55 insertions(+), 44 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 815d7af..21f7e41 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -415,6 +415,11 @@ get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key, int rw)
 	if (!fshared) {
 		key->private.mm = mm;
 		key->private.address = address;
+		/*
+		 * Private futexes do not hold reference on an inode or
+		 * mm, therefore the only purpose of calling get_futex_key_refs
+		 * is because we need the barrier for the lockless waiter check.
+		 */
 		get_futex_key_refs(key);  /* implies MB (B) */
 		return 0;
 	}
@@ -530,9 +535,14 @@ out:
 	return err;
 }
 
-static inline void put_futex_key(union futex_key *key)
+static inline void put_futex_key(int fshared, union futex_key *key)
 {
-	drop_futex_key_refs(key);
+	/*
+	 * See comment in get_futex_key() about key
+	 * referencing when dealing with private futexes.
+	 */
+	if (fshared)
+		drop_futex_key_refs(key);
 }
 
 /**
@@ -1202,12 +1212,12 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
 	struct futex_hash_bucket *hb;
 	struct futex_q *this, *next;
 	union futex_key key = FUTEX_KEY_INIT;
-	int ret;
+	int ret, fshared = flags & FLAGS_SHARED;
 
 	if (!bitset)
 		return -EINVAL;
 
-	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key, VERIFY_READ);
+	ret = get_futex_key(uaddr, fshared, &key, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
 
@@ -1238,7 +1248,7 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
 
 	spin_unlock(&hb->lock);
 out_put_key:
-	put_futex_key(&key);
+	put_futex_key(fshared, &key);
 out:
 	return ret;
 }
@@ -1254,13 +1264,13 @@ futex_wake_op(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2,
 	union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT;
 	struct futex_hash_bucket *hb1, *hb2;
 	struct futex_q *this, *next;
-	int ret, op_ret;
+	int ret, op_ret, fshared = flags & FLAGS_SHARED;
 
 retry:
-	ret = get_futex_key(uaddr1, flags & FLAGS_SHARED, &key1, VERIFY_READ);
+	ret = get_futex_key(uaddr1, fshared, &key1, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
-	ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2, VERIFY_WRITE);
+	ret = get_futex_key(uaddr2, fshared, &key2, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out_put_key1;
 
@@ -1292,11 +1302,11 @@ retry_private:
 		if (ret)
 			goto out_put_keys;
 
-		if (!(flags & FLAGS_SHARED))
+		if (!fshared)
 			goto retry_private;
 
-		put_futex_key(&key2);
-		put_futex_key(&key1);
+		put_futex_key(fshared, &key2);
+		put_futex_key(fshared, &key1);
 		goto retry;
 	}
 
@@ -1331,9 +1341,9 @@ retry_private:
 out_unlock:
 	double_unlock_hb(hb1, hb2);
 out_put_keys:
-	put_futex_key(&key2);
+	put_futex_key(fshared, &key2);
 out_put_key1:
-	put_futex_key(&key1);
+	put_futex_key(fshared, &key1);
 out:
 	return ret;
 }
@@ -1485,10 +1495,11 @@ static int futex_requeue(u32 __user *uaddr1, unsigned int flags,
 			 u32 *cmpval, int requeue_pi)
 {
 	union futex_key key1 = FUTEX_KEY_INIT, key2 = FUTEX_KEY_INIT;
-	int drop_count = 0, task_count = 0, ret;
+	int drop_count = 0, task_count = 0;
 	struct futex_pi_state *pi_state = NULL;
 	struct futex_hash_bucket *hb1, *hb2;
 	struct futex_q *this, *next;
+	int ret, fshared = flags & FLAGS_SHARED;
 
 	if (requeue_pi) {
 		/*
@@ -1528,10 +1539,10 @@ retry:
 		pi_state = NULL;
 	}
 
-	ret = get_futex_key(uaddr1, flags & FLAGS_SHARED, &key1, VERIFY_READ);
+	ret = get_futex_key(uaddr1, fshared, &key1, VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out;
-	ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2,
+	ret = get_futex_key(uaddr2, fshared, &key2,
 			    requeue_pi ? VERIFY_WRITE : VERIFY_READ);
 	if (unlikely(ret != 0))
 		goto out_put_key1;
@@ -1565,11 +1576,11 @@ retry_private:
 			if (ret)
 				goto out_put_keys;
 
-			if (!(flags & FLAGS_SHARED))
+			if (!fshared)
 				goto retry_private;
 
-			put_futex_key(&key2);
-			put_futex_key(&key1);
+			put_futex_key(fshared, &key2);
+			put_futex_key(fshared, &key1);
 			goto retry;
 		}
 		if (curval != *cmpval) {
@@ -1619,8 +1630,8 @@ retry_private:
 		case -EFAULT:
 			double_unlock_hb(hb1, hb2);
 			hb_waiters_dec(hb2);
-			put_futex_key(&key2);
-			put_futex_key(&key1);
+			put_futex_key(fshared, &key2);
+			put_futex_key(fshared, &key1);
 			ret = fault_in_user_writeable(uaddr2);
 			if (!ret)
 				goto retry;
@@ -1634,8 +1645,8 @@ retry_private:
 			 */
 			double_unlock_hb(hb1, hb2);
 			hb_waiters_dec(hb2);
-			put_futex_key(&key2);
-			put_futex_key(&key1);
+			put_futex_key(fshared, &key2);
+			put_futex_key(fshared, &key1);
 			cond_resched();
 			goto retry;
 		default:
@@ -1721,9 +1732,9 @@ out_unlock:
 		drop_futex_key_refs(&key1);
 
 out_put_keys:
-	put_futex_key(&key2);
+	put_futex_key(fshared, &key2);
 out_put_key1:
-	put_futex_key(&key1);
+	put_futex_key(fshared, &key1);
 out:
 	if (pi_state != NULL)
 		free_pi_state(pi_state);
@@ -2098,7 +2109,7 @@ static int futex_wait_setup(u32 __user *uaddr, u32 val, unsigned int flags,
 			   struct futex_q *q, struct futex_hash_bucket **hb)
 {
 	u32 uval;
-	int ret;
+	int ret, fshared = flags & FLAGS_SHARED;
 
 	/*
 	 * Access the page AFTER the hash-bucket is locked.
@@ -2119,7 +2130,7 @@ static int futex_wait_setup(u32 __user *uaddr, u32 val, unsigned int flags,
 	 * while the syscall executes.
 	 */
 retry:
-	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &q->key, VERIFY_READ);
+	ret = get_futex_key(uaddr, fshared, &q->key, VERIFY_READ);
 	if (unlikely(ret != 0))
 		return ret;
 
@@ -2135,10 +2146,10 @@ retry_private:
 		if (ret)
 			goto out;
 
-		if (!(flags & FLAGS_SHARED))
+		if (!fshared)
 			goto retry_private;
 
-		put_futex_key(&q->key);
+		put_futex_key(fshared, &q->key);
 		goto retry;
 	}
 
@@ -2149,7 +2160,7 @@ retry_private:
 
 out:
 	if (ret)
-		put_futex_key(&q->key);
+		put_futex_key(fshared, &q->key);
 	return ret;
 }
 
@@ -2256,7 +2267,7 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, int detect,
 	struct hrtimer_sleeper timeout, *to = NULL;
 	struct futex_hash_bucket *hb;
 	struct futex_q q = futex_q_init;
-	int res, ret;
+	int res, ret, fshared = flags & FLAGS_SHARED;
 
 	if (refill_pi_state_cache())
 		return -ENOMEM;
@@ -2270,7 +2281,7 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, int detect,
 	}
 
 retry:
-	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &q.key, VERIFY_WRITE);
+	ret = get_futex_key(uaddr, fshared, &q.key, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out;
 
@@ -2294,7 +2305,7 @@ retry_private:
 			 * - The user space value changed.
 			 */
 			queue_unlock(hb);
-			put_futex_key(&q.key);
+			put_futex_key(fshared, &q.key);
 			cond_resched();
 			goto retry;
 		default:
@@ -2348,7 +2359,7 @@ out_unlock_put_key:
 	queue_unlock(hb);
 
 out_put_key:
-	put_futex_key(&q.key);
+	put_futex_key(fshared, &q.key);
 out:
 	if (to)
 		destroy_hrtimer_on_stack(&to->timer);
@@ -2361,10 +2372,10 @@ uaddr_faulted:
 	if (ret)
 		goto out_put_key;
 
-	if (!(flags & FLAGS_SHARED))
+	if (!fshared)
 		goto retry_private;
 
-	put_futex_key(&q.key);
+	put_futex_key(fshared, &q.key);
 	goto retry;
 }
 
@@ -2379,7 +2390,7 @@ static int futex_unlock_pi(u32 __user *uaddr, unsigned int flags)
 	union futex_key key = FUTEX_KEY_INIT;
 	struct futex_hash_bucket *hb;
 	struct futex_q *match;
-	int ret;
+	int ret, fshared = flags & FLAGS_SHARED;
 
 retry:
 	if (get_user(uval, uaddr))
@@ -2390,7 +2401,7 @@ retry:
 	if ((uval & FUTEX_TID_MASK) != vpid)
 		return -EPERM;
 
-	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key, VERIFY_WRITE);
+	ret = get_futex_key(uaddr, fshared, &key, VERIFY_WRITE);
 	if (ret)
 		return ret;
 
@@ -2431,12 +2442,12 @@ retry:
 
 out_unlock:
 	spin_unlock(&hb->lock);
-	put_futex_key(&key);
+	put_futex_key(fshared, &key);
 	return ret;
 
 pi_faulted:
 	spin_unlock(&hb->lock);
-	put_futex_key(&key);
+	put_futex_key(fshared, &key);
 
 	ret = fault_in_user_writeable(uaddr);
 	if (!ret)
@@ -2544,7 +2555,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
 	struct futex_hash_bucket *hb;
 	union futex_key key2 = FUTEX_KEY_INIT;
 	struct futex_q q = futex_q_init;
-	int res, ret;
+	int res, ret, fshared = flags & FLAGS_SHARED;
 
 	if (uaddr == uaddr2)
 		return -EINVAL;
@@ -2571,7 +2582,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
 	RB_CLEAR_NODE(&rt_waiter.tree_entry);
 	rt_waiter.task = NULL;
 
-	ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2, VERIFY_WRITE);
+	ret = get_futex_key(uaddr2, fshared, &key2, VERIFY_WRITE);
 	if (unlikely(ret != 0))
 		goto out;
 
@@ -2673,9 +2684,9 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
 	}
 
 out_put_keys:
-	put_futex_key(&q.key);
+	put_futex_key(fshared, &q.key);
 out_key2:
-	put_futex_key(&key2);
+	put_futex_key(fshared, &key2);
 
 out:
 	if (to) {
-- 
1.8.4.5




^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-18 19:32 ` Darren Hart
@ 2014-10-18 20:19   ` Davidlohr Bueso
  2014-10-18 20:25     ` Darren Hart
  2014-10-20 10:15     ` Catalin Marinas
  0 siblings, 2 replies; 20+ messages in thread
From: Davidlohr Bueso @ 2014-10-18 20:19 UTC (permalink / raw)
  To: Darren Hart
  Cc: Catalin Marinas, linux-kernel, Matteo Franchin, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Paul E. McKenney,
	Darren Hart, Mike Galbraith

On Sat, 2014-10-18 at 14:32 -0500, Darren Hart wrote:
> Which is not incomplete (lacking the explicit smp_mb()) added by this
> patch. Perhaps the MB implementation of get_futex_key_refs() need not be
> explicitly enumerated here?

Agreed, how about this:

diff --git a/kernel/futex.c b/kernel/futex.c
index 21f7e41..7a0805a 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -143,9 +143,8 @@
  *
  * Where (A) orders the waiters increment and the futex value read through
  * atomic operations (see hb_waiters_inc) and where (B) orders the write
- * to futex and the waiters read -- this is done by the barriers in
- * get_futex_key_refs(), through either ihold or atomic_inc, depending on the
- * futex type.
+ * to futex and the waiters read -- this is done by the barriers for both
+ * shared and private futexes in get_futex_key_refs().
  *
  * This yields the following case (where X:=waiters, Y:=futex):
  *




^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-18 20:19   ` Davidlohr Bueso
@ 2014-10-18 20:25     ` Darren Hart
  2014-10-20 10:15     ` Catalin Marinas
  1 sibling, 0 replies; 20+ messages in thread
From: Darren Hart @ 2014-10-18 20:25 UTC (permalink / raw)
  To: Davidlohr Bueso, Darren Hart
  Cc: Catalin Marinas, linux-kernel, Matteo Franchin, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Paul E. McKenney,
	Mike Galbraith



On October 18, 2014 3:19:50 PM CDT, Davidlohr Bueso <dave@stgolabs.net> wrote:
>On Sat, 2014-10-18 at 14:32 -0500, Darren Hart wrote:
>> Which is not incomplete (lacking the explicit smp_mb()) added by this
>> patch. Perhaps the MB implementation of get_futex_key_refs() need not
>be
>> explicitly enumerated here?
>
>Agreed, how about this:
>
>diff --git a/kernel/futex.c b/kernel/futex.c
>index 21f7e41..7a0805a 100644
>--- a/kernel/futex.c
>+++ b/kernel/futex.c
>@@ -143,9 +143,8 @@
>  *
>* Where (A) orders the waiters increment and the futex value read
>through
>* atomic operations (see hb_waiters_inc) and where (B) orders the write
>- * to futex and the waiters read -- this is done by the barriers in
>- * get_futex_key_refs(), through either ihold or atomic_inc, depending
>on the
>- * futex type.
>+ * to futex and the waiters read -- this is done by the barriers for
>both
>+ * shared and private futexes in get_futex_key_refs().

Works for me.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-18 19:58   ` Davidlohr Bueso
@ 2014-10-18 20:50     ` Linus Torvalds
  2014-10-19  2:16       ` Davidlohr Bueso
  0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2014-10-18 20:50 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Catalin Marinas, Linux Kernel Mailing List, Matteo Franchin,
	Darren Hart, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney, Mike Galbraith

On Sat, Oct 18, 2014 at 12:58 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
>
> And [get/put]_futex_keys() shouldn't even be called for private futexes.
> The following patch had some very minor testing on a 60 core box last
> night, but passes both Darren's and perf's tests. So I *think* this is
> right, but lack of sleep and I overall just don't trust them futexes!

Hmm. I don't see the advantage of making the code more complex in
order to avoid the functions that are no-ops for the !fshared case?

IOW, as far as I can tell, this patch doesn't actually really *change*
anything. Am I missing something?

         Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-18 20:50     ` Linus Torvalds
@ 2014-10-19  2:16       ` Davidlohr Bueso
  2014-10-20  9:11         ` Thomas Gleixner
  0 siblings, 1 reply; 20+ messages in thread
From: Davidlohr Bueso @ 2014-10-19  2:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Catalin Marinas, Linux Kernel Mailing List, Matteo Franchin,
	Darren Hart, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney, Mike Galbraith

On Sat, 2014-10-18 at 13:50 -0700, Linus Torvalds wrote:
> On Sat, Oct 18, 2014 at 12:58 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
> >
> > And [get/put]_futex_keys() shouldn't even be called for private futexes.
> > The following patch had some very minor testing on a 60 core box last
> > night, but passes both Darren's and perf's tests. So I *think* this is
> > right, but lack of sleep and I overall just don't trust them futexes!
> 
> Hmm. I don't see the advantage of making the code more complex in
> order to avoid the functions that are no-ops for the !fshared case?
> 
> IOW, as far as I can tell, this patch doesn't actually really *change*
> anything. Am I missing something?

Right, all we do is avoid a NOP, but I don't see how this patch makes
the code more complex. In fact, the whole idea is to make it easier to
read and makes the key referencing differences between shared and
private futexes crystal clear, hoping to mitigate future bugs. 

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-19  2:16       ` Davidlohr Bueso
@ 2014-10-20  9:11         ` Thomas Gleixner
  2014-10-20 10:49           ` Catalin Marinas
  0 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2014-10-20  9:11 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Linus Torvalds, Catalin Marinas, Linux Kernel Mailing List,
	Matteo Franchin, Darren Hart, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney, Mike Galbraith

On Sat, 18 Oct 2014, Davidlohr Bueso wrote:
> On Sat, 2014-10-18 at 13:50 -0700, Linus Torvalds wrote:
> > On Sat, Oct 18, 2014 at 12:58 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
> > >
> > > And [get/put]_futex_keys() shouldn't even be called for private futexes.
> > > The following patch had some very minor testing on a 60 core box last
> > > night, but passes both Darren's and perf's tests. So I *think* this is
> > > right, but lack of sleep and I overall just don't trust them futexes!
> > 
> > Hmm. I don't see the advantage of making the code more complex in
> > order to avoid the functions that are no-ops for the !fshared case?
> > 
> > IOW, as far as I can tell, this patch doesn't actually really *change*
> > anything. Am I missing something?
> 
> Right, all we do is avoid a NOP, but I don't see how this patch makes
> the code more complex. In fact, the whole idea is to make it easier to
> read and makes the key referencing differences between shared and
> private futexes crystal clear, hoping to mitigate future bugs. 

I tend to disagree. The current code is symetric versus get/drop and
you make it unsymetric by avoiding the drop call with a pointless
extra conditional in all call sites.

I really had to look twice to figure out that the patch is correct,
but I really cannot see any value and definitely have a hard time how
this makes the code clearer and would prevent future bugs.

I rather keep it symetric and document the NOP property for private
futexes in both get and drop.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-18 20:19   ` Davidlohr Bueso
  2014-10-18 20:25     ` Darren Hart
@ 2014-10-20 10:15     ` Catalin Marinas
  1 sibling, 0 replies; 20+ messages in thread
From: Catalin Marinas @ 2014-10-20 10:15 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Darren Hart, linux-kernel, Matteo Franchin, Linus Torvalds,
	Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Paul E. McKenney,
	Darren Hart, Mike Galbraith

On Sat, Oct 18, 2014 at 09:19:50PM +0100, Davidlohr Bueso wrote:
> On Sat, 2014-10-18 at 14:32 -0500, Darren Hart wrote:
> > Which is not incomplete (lacking the explicit smp_mb()) added by this
> > patch. Perhaps the MB implementation of get_futex_key_refs() need not be
> > explicitly enumerated here?
> 
> Agreed, how about this:
> 
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 21f7e41..7a0805a 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -143,9 +143,8 @@
>   *
>   * Where (A) orders the waiters increment and the futex value read through
>   * atomic operations (see hb_waiters_inc) and where (B) orders the write
> - * to futex and the waiters read -- this is done by the barriers in
> - * get_futex_key_refs(), through either ihold or atomic_inc, depending on the
> - * futex type.
> + * to futex and the waiters read -- this is done by the barriers for both
> + * shared and private futexes in get_futex_key_refs().
>   *
>   * This yields the following case (where X:=waiters, Y:=futex):

Looks fine to me. Since Linus already picked the original patch, if you
plan to send an update for the comments please also mention the
"explicit MB (B) for private futexes" in get_futex_key_refs().

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-20  9:11         ` Thomas Gleixner
@ 2014-10-20 10:49           ` Catalin Marinas
  2014-10-20 15:32             ` Linus Torvalds
  0 siblings, 1 reply; 20+ messages in thread
From: Catalin Marinas @ 2014-10-20 10:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Davidlohr Bueso, Linus Torvalds, Linux Kernel Mailing List,
	Matteo Franchin, Darren Hart, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney, Mike Galbraith

On Mon, Oct 20, 2014 at 10:11:40AM +0100, Thomas Gleixner wrote:
> On Sat, 18 Oct 2014, Davidlohr Bueso wrote:
> > On Sat, 2014-10-18 at 13:50 -0700, Linus Torvalds wrote:
> > > On Sat, Oct 18, 2014 at 12:58 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
> > > >
> > > > And [get/put]_futex_keys() shouldn't even be called for private futexes.
> > > > The following patch had some very minor testing on a 60 core box last
> > > > night, but passes both Darren's and perf's tests. So I *think* this is
> > > > right, but lack of sleep and I overall just don't trust them futexes!
> > > 
> > > Hmm. I don't see the advantage of making the code more complex in
> > > order to avoid the functions that are no-ops for the !fshared case?
> > > 
> > > IOW, as far as I can tell, this patch doesn't actually really *change*
> > > anything. Am I missing something?
> > 
> > Right, all we do is avoid a NOP, but I don't see how this patch makes
> > the code more complex. In fact, the whole idea is to make it easier to
> > read and makes the key referencing differences between shared and
> > private futexes crystal clear, hoping to mitigate future bugs. 
> 
> I tend to disagree. The current code is symetric versus get/drop and
> you make it unsymetric by avoiding the drop call with a pointless
> extra conditional in all call sites.

Since you mention symmetry, something like below makes the barriers more
explicit. However, it removes the implied barrier for get_futex_key_refs
and the other calling places need to be verified (requeue_futex and
requeue_pi_futex; if barrier is not required here, the result may be
slightly more optimal, not sure you would see the difference though).


diff --git a/kernel/futex.c b/kernel/futex.c
index f3a3a071283c..5b9d857d0816 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -143,9 +143,7 @@
  *
  * Where (A) orders the waiters increment and the futex value read through
  * atomic operations (see hb_waiters_inc) and where (B) orders the write
- * to futex and the waiters read -- this is done by the barriers in
- * get_futex_key_refs(), through either ihold or atomic_inc, depending on the
- * futex type.
+ * to futex and the waiters read (see hb_waiters_pending).
  *
  * This yields the following case (where X:=waiters, Y:=futex):
  *
@@ -262,12 +260,6 @@ static struct futex_hash_bucket *futex_queues;
 static inline void futex_get_mm(union futex_key *key)
 {
 	atomic_inc(&key->private.mm->mm_count);
-	/*
-	 * Ensure futex_get_mm() implies a full barrier such that
-	 * get_futex_key() implies a full barrier. This is relied upon
-	 * as full barrier (B), see the ordering comment above.
-	 */
-	smp_mb__after_atomic();
 }
 
 /*
@@ -297,6 +289,10 @@ static inline void hb_waiters_dec(struct futex_hash_bucket *hb)
 
 static inline int hb_waiters_pending(struct futex_hash_bucket *hb)
 {
+	/*
+	 * Full barrier (B), see the ordering comment above.
+	 */
+	smp_mb__before_atomic();
 #ifdef CONFIG_SMP
 	return atomic_read(&hb->waiters);
 #else
@@ -338,13 +334,11 @@ static void get_futex_key_refs(union futex_key *key)
 
 	switch (key->both.offset & (FUT_OFF_INODE|FUT_OFF_MMSHARED)) {
 	case FUT_OFF_INODE:
-		ihold(key->shared.inode); /* implies MB (B) */
+		__iget(key->shared.inode);
 		break;
 	case FUT_OFF_MMSHARED:
-		futex_get_mm(key); /* implies MB (B) */
+		futex_get_mm(key);
 		break;
-	default:
-		smp_mb(); /* explicit MB (B) */
 	}
 }
 
@@ -417,7 +411,7 @@ get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key, int rw)
 	if (!fshared) {
 		key->private.mm = mm;
 		key->private.address = address;
-		get_futex_key_refs(key);  /* implies MB (B) */
+		get_futex_key_refs(key);
 		return 0;
 	}
 
@@ -524,7 +518,7 @@ again:
 		key->shared.pgoff = basepage_index(page);
 	}
 
-	get_futex_key_refs(key); /* implies MB (B) */
+	get_futex_key_refs(key);
 
 out:
 	unlock_page(page_head);

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-20 10:49           ` Catalin Marinas
@ 2014-10-20 15:32             ` Linus Torvalds
  2014-10-20 16:00               ` Catalin Marinas
  0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2014-10-20 15:32 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Thomas Gleixner, Davidlohr Bueso, Linux Kernel Mailing List,
	Matteo Franchin, Darren Hart, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney, Mike Galbraith

On Mon, Oct 20, 2014 at 3:49 AM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
>
> Since you mention symmetry, something like below makes the barriers more
> explicit.

Borken, for two reasons:

> diff --git a/kernel/futex.c b/kernel/futex.c
> index f3a3a071283c..5b9d857d0816 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -143,9 +143,7 @@
>  static inline void futex_get_mm(union futex_key *key)
>  {
>         atomic_inc(&key->private.mm->mm_count);
> -       /*
> -        * Ensure futex_get_mm() implies a full barrier such that
> -        * get_futex_key() implies a full barrier. This is relied upon
> -        * as full barrier (B), see the ordering comment above.
> -        */
> -       smp_mb__after_atomic();
>  }

So the thing is, this means that we can't take advantage of the fact
that "atomic_inc" is already an atomic. So this is just a performance
breakage. But:

>
>  static inline int hb_waiters_pending(struct futex_hash_bucket *hb)
>  {
> +       /*
> +        * Full barrier (B), see the ordering comment above.
> +        */
> +       smp_mb__before_atomic();
>  #ifdef CONFIG_SMP
>         return atomic_read(&hb->waiters);

This is just entirely broken.

"atomic_read()" isn't really an "atomic op" at all. despite the name,
it's just a read that is basically ACCESS_ONCE.

So smp_mb__before_atomic() doesn't work for atomic_read(), and the
code is nonsensical and doesn't work. It would need to be a full
memory barrier.

                   Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier
  2014-10-20 15:32             ` Linus Torvalds
@ 2014-10-20 16:00               ` Catalin Marinas
  0 siblings, 0 replies; 20+ messages in thread
From: Catalin Marinas @ 2014-10-20 16:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Davidlohr Bueso, Linux Kernel Mailing List,
	Matteo Franchin, Darren Hart, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney, Mike Galbraith

On Mon, Oct 20, 2014 at 04:32:00PM +0100, Linus Torvalds wrote:
> On Mon, Oct 20, 2014 at 3:49 AM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > Since you mention symmetry, something like below makes the barriers more
> > explicit.
> 
> Borken, for two reasons:
> 
> > diff --git a/kernel/futex.c b/kernel/futex.c
> > index f3a3a071283c..5b9d857d0816 100644
> > --- a/kernel/futex.c
> > +++ b/kernel/futex.c
> > @@ -143,9 +143,7 @@
> >  static inline void futex_get_mm(union futex_key *key)
> >  {
> >         atomic_inc(&key->private.mm->mm_count);
> > -       /*
> > -        * Ensure futex_get_mm() implies a full barrier such that
> > -        * get_futex_key() implies a full barrier. This is relied upon
> > -        * as full barrier (B), see the ordering comment above.
> > -        */
> > -       smp_mb__after_atomic();
> >  }
> 
> So the thing is, this means that we can't take advantage of the fact
> that "atomic_inc" is already an atomic. So this is just a performance
> breakage. But:

OK, I looked at this from an ARM perspective only and it would not make
any difference. But it seems that MIPS makes a distinction between
"before" and "after" barriers with "before" defined as wmb in some
configuration, so the hunk below would break it.

> >  static inline int hb_waiters_pending(struct futex_hash_bucket *hb)
> >  {
> > +       /*
> > +        * Full barrier (B), see the ordering comment above.
> > +        */
> > +       smp_mb__before_atomic();
> >  #ifdef CONFIG_SMP
> >         return atomic_read(&hb->waiters);
> 
> This is just entirely broken.
> 
> "atomic_read()" isn't really an "atomic op" at all. despite the name,
> it's just a read that is basically ACCESS_ONCE.
> 
> So smp_mb__before_atomic() doesn't work for atomic_read(), and the
> code is nonsensical and doesn't work. It would need to be a full
> memory barrier.

Looking at the semantics of smp_mb__*_atomic(), it would indeed have to
be a full smp_mb() here.

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] futex: Mention key referencing differences between shared and private futexes
  2014-10-17 16:38 [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier Catalin Marinas
                   ` (2 preceding siblings ...)
  2014-10-18 19:32 ` Darren Hart
@ 2014-10-24  3:27 ` Davidlohr Bueso
  2014-10-24  9:11   ` Catalin Marinas
  2014-10-26 15:22   ` [tip:locking/urgent] " tip-bot for Davidlohr Bueso
  3 siblings, 2 replies; 20+ messages in thread
From: Davidlohr Bueso @ 2014-10-24  3:27 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-kernel, Matteo Franchin, Davidlohr Bueso, Linus Torvalds,
	Darren Hart, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney

From: Davidlohr Bueso <dave@stgolabs.net>

Update our documentation as of fix 76835b0ebf8 (futex: Ensure
get_futex_key_refs() always implies a barrier). Explicitly
state that we don't do key referencing for private futexes.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 kernel/futex.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index f3a3a07..bbf071f 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -143,9 +143,8 @@
  *
  * Where (A) orders the waiters increment and the futex value read through
  * atomic operations (see hb_waiters_inc) and where (B) orders the write
- * to futex and the waiters read -- this is done by the barriers in
- * get_futex_key_refs(), through either ihold or atomic_inc, depending on the
- * futex type.
+ * to futex and the waiters read -- this is done by the barriers for both
+ * shared and private futexes in get_futex_key_refs().
  *
  * This yields the following case (where X:=waiters, Y:=futex):
  *
@@ -344,13 +343,20 @@ static void get_futex_key_refs(union futex_key *key)
 		futex_get_mm(key); /* implies MB (B) */
 		break;
 	default:
+		/*
+		 * Private futexes do not hold reference on an inode or
+		 * mm, therefore the only purpose of calling get_futex_key_refs
+		 * is because we need the barrier for the lockless waiter check.
+		 */
 		smp_mb(); /* explicit MB (B) */
 	}
 }
 
 /*
  * Drop a reference to the resource addressed by a key.
- * The hash bucket spinlock must not be held.
+ * The hash bucket spinlock must not be held. This is
+ * a no-op for private futexes, see comment in the get
+ * counterpart.
  */
 static void drop_futex_key_refs(union futex_key *key)
 {
-- 
1.8.4.5




^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] futex: Mention key referencing differences between shared and private futexes
  2014-10-24  3:27 ` [PATCH] futex: Mention key referencing differences between shared and private futexes Davidlohr Bueso
@ 2014-10-24  9:11   ` Catalin Marinas
  2014-10-26 15:22   ` [tip:locking/urgent] " tip-bot for Davidlohr Bueso
  1 sibling, 0 replies; 20+ messages in thread
From: Catalin Marinas @ 2014-10-24  9:11 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: linux-kernel, Matteo Franchin, Davidlohr Bueso, Linus Torvalds,
	Darren Hart, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
	Paul E. McKenney

On Fri, Oct 24, 2014 at 04:27:00AM +0100, Davidlohr Bueso wrote:
> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Update our documentation as of fix 76835b0ebf8 (futex: Ensure
> get_futex_key_refs() always implies a barrier). Explicitly
> state that we don't do key referencing for private futexes.
> 
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

Looks fine to me.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [tip:locking/urgent] futex: Mention key referencing differences between shared and private futexes
  2014-10-24  3:27 ` [PATCH] futex: Mention key referencing differences between shared and private futexes Davidlohr Bueso
  2014-10-24  9:11   ` Catalin Marinas
@ 2014-10-26 15:22   ` tip-bot for Davidlohr Bueso
  1 sibling, 0 replies; 20+ messages in thread
From: tip-bot for Davidlohr Bueso @ 2014-10-26 15:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dbueso, peterz, hpa, Matteo.Franchin, linux-kernel, dave,
	catalin.marinas, torvalds, dvhart, paulmck, davidlohr, mingo,
	tglx

Commit-ID:  993b2ff221999066fcff231590593d0b98f45d32
Gitweb:     http://git.kernel.org/tip/993b2ff221999066fcff231590593d0b98f45d32
Author:     Davidlohr Bueso <dave@stgolabs.net>
AuthorDate: Thu, 23 Oct 2014 20:27:00 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sun, 26 Oct 2014 16:16:18 +0100

futex: Mention key referencing differences between shared and private futexes

Update our documentation as of fix 76835b0ebf8 (futex: Ensure
get_futex_key_refs() always implies a barrier). Explicitly
state that we don't do key referencing for private futexes.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Matteo Franchin <Matteo.Franchin@arm.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: http://lkml.kernel.org/r/1414121220.817.0.camel@linux-t7sj.site
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/futex.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index f3a3a07..bbf071f 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -143,9 +143,8 @@
  *
  * Where (A) orders the waiters increment and the futex value read through
  * atomic operations (see hb_waiters_inc) and where (B) orders the write
- * to futex and the waiters read -- this is done by the barriers in
- * get_futex_key_refs(), through either ihold or atomic_inc, depending on the
- * futex type.
+ * to futex and the waiters read -- this is done by the barriers for both
+ * shared and private futexes in get_futex_key_refs().
  *
  * This yields the following case (where X:=waiters, Y:=futex):
  *
@@ -344,13 +343,20 @@ static void get_futex_key_refs(union futex_key *key)
 		futex_get_mm(key); /* implies MB (B) */
 		break;
 	default:
+		/*
+		 * Private futexes do not hold reference on an inode or
+		 * mm, therefore the only purpose of calling get_futex_key_refs
+		 * is because we need the barrier for the lockless waiter check.
+		 */
 		smp_mb(); /* explicit MB (B) */
 	}
 }
 
 /*
  * Drop a reference to the resource addressed by a key.
- * The hash bucket spinlock must not be held.
+ * The hash bucket spinlock must not be held. This is
+ * a no-op for private futexes, see comment in the get
+ * counterpart.
  */
 static void drop_futex_key_refs(union futex_key *key)
 {

^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2014-10-26 15:24 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-17 16:38 [PATCH] futex: Ensure get_futex_key_refs() always implies a barrier Catalin Marinas
2014-10-18  6:54 ` Mike Galbraith
2014-10-18  7:09   ` Mike Galbraith
2014-10-18 15:28   ` Linus Torvalds
2014-10-18 16:15     ` Mike Galbraith
2014-10-18  7:33 ` Davidlohr Bueso
2014-10-18 19:58   ` Davidlohr Bueso
2014-10-18 20:50     ` Linus Torvalds
2014-10-19  2:16       ` Davidlohr Bueso
2014-10-20  9:11         ` Thomas Gleixner
2014-10-20 10:49           ` Catalin Marinas
2014-10-20 15:32             ` Linus Torvalds
2014-10-20 16:00               ` Catalin Marinas
2014-10-18 19:32 ` Darren Hart
2014-10-18 20:19   ` Davidlohr Bueso
2014-10-18 20:25     ` Darren Hart
2014-10-20 10:15     ` Catalin Marinas
2014-10-24  3:27 ` [PATCH] futex: Mention key referencing differences between shared and private futexes Davidlohr Bueso
2014-10-24  9:11   ` Catalin Marinas
2014-10-26 15:22   ` [tip:locking/urgent] " tip-bot for Davidlohr Bueso

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.