All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] futex:  replace bare barrier() with a READ_ONCE()
@ 2016-03-02 13:08 Jianyu Zhan
  2016-03-02 13:35 ` Christian Borntraeger
  2016-03-02 16:48 ` Darren Hart
  0 siblings, 2 replies; 5+ messages in thread
From: Jianyu Zhan @ 2016-03-02 13:08 UTC (permalink / raw)
  To: linux-kernel, paulmck, peterz, tglx, dave, akpm, mingo, linux,
	dvhart, bigeasy
  Cc: schwidefsky, heiko.carstens, nasa4836

Commit e91467ecd1ef ("bug in futex unqueue_me") introduces a barrier()
in unqueue_me(), to address below problem.

The scenario is like this:

====================
original code:

retry:
       lock_ptr = q->lock_ptr;
       if (lock_ptr != 0)  {
                  spin_lock(lock_ptr)
                  if (unlikely(lock_ptr != q->lock_ptr)) {
                        spin_unlock(lock_ptr);
                         goto retry;
                  }
                   ...
       }

====================
s390x generates code that is equivalent to:

retry:
       if (q->lock_ptr != 0)  {
                  spin_lock(q->lock_ptr)
                  if (unlikely(lock_ptr != q->lock_ptr)) {
                        spin_unlock(lock_ptr);
                         goto retry;
                  }
                   ...
       }

since q->lock_ptr might change between test of non-nullness and spin_lock(),
the double load may lead to problem. So that commit use a barrier() to prevent this.

This patch replaces this bare barrier() with a READ_ONCE().

The reasons are:

1) READ_ONCE() is a more weak form of barrier() that affect only the specific
   accesses, while barrier() is a more general compiler level memroy barrier.

2) READ_ONCE() which could be more informative by its name, while a bare barrier()
   without comment leads to quite a bit of perplexity.

3) READ_ONCE() _might_ prevent more _theoretical_ "optimizations" by the compiler:

   The above retry logic is effectively the same as:

	while (lock_ptr = READ_ONCE(q->lock_ptr)) {
                   spin_lock(lock_ptr)
                   if (unlikely(lock_ptr != q->lock_ptr)) {
                         spin_unlock(lock_ptr);
                   }

                   ...
         }

   If without the READ_ONCE, the compiler _might_ optimize the load of q->lock_ptr out of the
   test satement, which will also lead to the same pointer aliasing problem.

   For why do compiler might do this, here is a snippet quoted from Documentation/memory-barriers.txt:

   |(*) The compiler is within its rights to reload a variable, for example,
   |    in cases where high register pressure prevents the compiler from
   |    keeping all data of interest in registers.  The compiler might
   |    therefore optimize the variable 'tmp' out of our previous example:
   |
   |     while (tmp = a)
   |         do_something_with(tmp);
   |
   |    This could result in the following code, which is perfectly safe in
   |    single-threaded code, but can be fatal in concurrent code:
   |
   |     while (a)
   |         do_something_with(a);
   |
   |    For example, the optimized version of this code could result in
   |    passing a zero to do_something_with() in the case where the variable
   |    a was modified by some other CPU between the "while" statement and
   |    the call to do_something_with().
   |
   |    Again, use READ_ONCE() to prevent the compiler from doing this:
   |
   |    while (tmp = READ_ONCE(a))
   |         do_something_with(tmp);


Suggested-by: Darren Hart <dvhart@infradead.org>
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
---
 v1->v2: 
   According to suggestion by Darren Hart, revise the commit log to justify the replacement.

 I also cc'ed the s390 maintainers, if they could help provide assembly code after this patch
 to prove the correctness, that would be better.

 kernel/futex.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 5d6ce64..6796084 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1927,8 +1927,13 @@ static int unqueue_me(struct futex_q *q)
 
 	/* In the common case we don't take the spinlock, which is nice. */
 retry:
-	lock_ptr = q->lock_ptr;
-	barrier();
+	/*
+	 *  On s390x, it was observed that compiler generates such code that spin_lock() will operate on
+	 *  another load of q->lock_ptr, instead of on @lock_ptr, and since q->lock_ptr might change between
+	 *  the test of non-nullness and the spin_lock(), which leads to problem. So use READ_ONCE() here to
+	 *  prevent this compiler "optimization".
+	 */
+	lock_ptr = READ_ONCE(q->lock_ptr);
 	if (lock_ptr != NULL) {
 		spin_lock(lock_ptr);
 		/*
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] futex: replace bare barrier() with a READ_ONCE()
  2016-03-02 13:08 [PATCH v2] futex: replace bare barrier() with a READ_ONCE() Jianyu Zhan
@ 2016-03-02 13:35 ` Christian Borntraeger
  2016-03-02 13:56   ` Jianyu Zhan
  2016-03-02 16:48 ` Darren Hart
  1 sibling, 1 reply; 5+ messages in thread
From: Christian Borntraeger @ 2016-03-02 13:35 UTC (permalink / raw)
  To: Jianyu Zhan, linux-kernel, paulmck, peterz, tglx, dave, akpm,
	mingo, linux, dvhart, bigeasy
  Cc: schwidefsky, heiko.carstens

On 03/02/2016 02:08 PM, Jianyu Zhan wrote:
> Commit e91467ecd1ef ("bug in futex unqueue_me") introduces a barrier()
> in unqueue_me(), to address below problem.
> 
> The scenario is like this:
> 
> ====================
> original code:
> 
> retry:
>        lock_ptr = q->lock_ptr;
>        if (lock_ptr != 0)  {
>                   spin_lock(lock_ptr)
>                   if (unlikely(lock_ptr != q->lock_ptr)) {
>                         spin_unlock(lock_ptr);
>                          goto retry;
>                   }
>                    ...
>        }
> 
> ====================
> s390x generates code that is equivalent to:
> 
> retry:
>        if (q->lock_ptr != 0)  {
>                   spin_lock(q->lock_ptr)
>                   if (unlikely(lock_ptr != q->lock_ptr)) {
>                         spin_unlock(lock_ptr);
>                          goto retry;
>                   }
>                    ...
>        }
> 
> since q->lock_ptr might change between test of non-nullness and spin_lock(),
> the double load may lead to problem. So that commit use a barrier() to prevent this.
> 
> This patch replaces this bare barrier() with a READ_ONCE().
> 
> The reasons are:
> 
> 1) READ_ONCE() is a more weak form of barrier() that affect only the specific
>    accesses, while barrier() is a more general compiler level memroy barrier.
> 
> 2) READ_ONCE() which could be more informative by its name, while a bare barrier()
>    without comment leads to quite a bit of perplexity.
> 
> 3) READ_ONCE() _might_ prevent more _theoretical_ "optimizations" by the compiler:
> 
>    The above retry logic is effectively the same as:
> 
> 	while (lock_ptr = READ_ONCE(q->lock_ptr)) {
>                    spin_lock(lock_ptr)
>                    if (unlikely(lock_ptr != q->lock_ptr)) {
>                          spin_unlock(lock_ptr);
>                    }
> 
>                    ...
>          }
> 
>    If without the READ_ONCE, the compiler _might_ optimize the load of q->lock_ptr out of the
>    test satement, which will also lead to the same pointer aliasing problem.
> 
>    For why do compiler might do this, here is a snippet quoted from Documentation/memory-barriers.txt:
> 
>    |(*) The compiler is within its rights to reload a variable, for example,
>    |    in cases where high register pressure prevents the compiler from
>    |    keeping all data of interest in registers.  The compiler might
>    |    therefore optimize the variable 'tmp' out of our previous example:
>    |
>    |     while (tmp = a)
>    |         do_something_with(tmp);
>    |
>    |    This could result in the following code, which is perfectly safe in
>    |    single-threaded code, but can be fatal in concurrent code:
>    |
>    |     while (a)
>    |         do_something_with(a);
>    |
>    |    For example, the optimized version of this code could result in
>    |    passing a zero to do_something_with() in the case where the variable
>    |    a was modified by some other CPU between the "while" statement and
>    |    the call to do_something_with().
>    |
>    |    Again, use READ_ONCE() to prevent the compiler from doing this:
>    |
>    |    while (tmp = READ_ONCE(a))
>    |         do_something_with(tmp);


Not sure if you need that huge patch description, but it does not hurt either.


> 
> 
> Suggested-by: Darren Hart <dvhart@infradead.org>
> Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
> ---
>  v1->v2: 
>    According to suggestion by Darren Hart, revise the commit log to justify the replacement.
> 
>  I also cc'ed the s390 maintainers, if they could help provide assembly code after this patch
>  to prove the correctness, that would be better.
> 
>  kernel/futex.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 5d6ce64..6796084 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -1927,8 +1927,13 @@ static int unqueue_me(struct futex_q *q)
> 
>  	/* In the common case we don't take the spinlock, which is nice. */
>  retry:
> -	lock_ptr = q->lock_ptr;
> -	barrier();
> +	/*
> +	 *  On s390x, it was observed that compiler generates such code that spin_lock() will operate on
> +	 *  another load of q->lock_ptr, instead of on @lock_ptr, and since q->lock_ptr might change between
> +	 *  the test of non-nullness and the spin_lock(), which leads to problem. So use READ_ONCE() here to
> +	 *  prevent this compiler "optimization".
> +	 */

That change makes sense. I did the original barrier back in 2006 (could you cc me next time?)
ACCESS_ONCE or READ_ONCE was not available at that time and its now the better way.

This is not an s390 specific problem, it was just triggered there as the gcc cost model 
considered the memory read as cheap as a register read.

Maybe simplify the comment to something like

	/* Prevent the compiler to read lock_ptr twice (if and spin_lock) */


> +	lock_ptr = READ_ONCE(q->lock_ptr);
>  	if (lock_ptr != NULL) {
>  		spin_lock(lock_ptr);
>  		/*
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] futex: replace bare barrier() with a READ_ONCE()
  2016-03-02 13:35 ` Christian Borntraeger
@ 2016-03-02 13:56   ` Jianyu Zhan
  2016-03-02 16:41     ` Darren Hart
  0 siblings, 1 reply; 5+ messages in thread
From: Jianyu Zhan @ 2016-03-02 13:56 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: LKML, Paul McKenney, Peter Zijlstra, Thomas Gleixner, dave,
	Andrew Morton, Ingo Molnar, linux, dvhart, bigeasy, schwidefsky,
	heiko.carstens

On Wed, Mar 2, 2016 at 9:35 PM, Christian Borntraeger
<borntraeger@de.ibm.com> wrote:
> That change makes sense. I did the original barrier back in 2006 (could you cc me next time?)

Sorry for that,  I thought get_maintainer.pl would had spit the
original author's email, but apparently it didn't :(

> ACCESS_ONCE or READ_ONCE was not available at that time and its now the better way.
>
> This is not an s390 specific problem, it was just triggered there as the gcc cost model
> considered the memory read as cheap as a register read.
>

Thanks for the clarification.

> Maybe simplify the comment to something like
>
>         /* Prevent the compiler to read lock_ptr twice (if and spin_lock) */

Will do.


Regards,
Jianyu Zhan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] futex: replace bare barrier() with a READ_ONCE()
  2016-03-02 13:56   ` Jianyu Zhan
@ 2016-03-02 16:41     ` Darren Hart
  0 siblings, 0 replies; 5+ messages in thread
From: Darren Hart @ 2016-03-02 16:41 UTC (permalink / raw)
  To: Jianyu Zhan
  Cc: Christian Borntraeger, LKML, Paul McKenney, Peter Zijlstra,
	Thomas Gleixner, dave, Andrew Morton, Ingo Molnar, linux, dvhart,
	bigeasy, schwidefsky, heiko.carstens

On Wed, Mar 02, 2016 at 09:56:32PM +0800, Jianyu Zhan wrote:
> On Wed, Mar 2, 2016 at 9:35 PM, Christian Borntraeger
> <borntraeger@de.ibm.com> wrote:
> > That change makes sense. I did the original barrier back in 2006 (could you cc me next time?)
> 
> Sorry for that,  I thought get_maintainer.pl would had spit the
> original author's email, but apparently it didn't :(

get_maintainer gives you the documented maintainers per the MAINTAINERS file, as
well as frequent committers. This patch was in 2008. Always include the author
of patches you are referencing (and that's manual).

> 
> > ACCESS_ONCE or READ_ONCE was not available at that time and its now the better way.
> >

Noting this specifically in the commit message helps the argument and why the
change is being made now.

> > This is not an s390 specific problem, it was just triggered there as the gcc cost model
> > considered the memory read as cheap as a register read.
> >
> 
> Thanks for the clarification.
> 

And as with the original patch (e91467ecd1ef), we should be able to justify the
change with an assembly dump, as Christian did in his commit.

-- 
Darren Hart
Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] futex:  replace bare barrier() with a READ_ONCE()
  2016-03-02 13:08 [PATCH v2] futex: replace bare barrier() with a READ_ONCE() Jianyu Zhan
  2016-03-02 13:35 ` Christian Borntraeger
@ 2016-03-02 16:48 ` Darren Hart
  1 sibling, 0 replies; 5+ messages in thread
From: Darren Hart @ 2016-03-02 16:48 UTC (permalink / raw)
  To: Jianyu Zhan
  Cc: linux-kernel, paulmck, peterz, tglx, dave, akpm, mingo, linux,
	dvhart, bigeasy, schwidefsky, heiko.carstens

On Wed, Mar 02, 2016 at 09:08:31PM +0800, Jianyu Zhan wrote:

...

> 3) READ_ONCE() _might_ prevent more _theoretical_ "optimizations" by the compiler:
> 
>    The above retry logic is effectively the same as:
> 
> 	while (lock_ptr = READ_ONCE(q->lock_ptr)) {
>                    spin_lock(lock_ptr)


The spin_lock() is memory barrier, and therefor a general compiler barrier. The
READ_ONCE would be redundant in this case.

Unless you can demonstrate a failure mode in disassembly, or can point out how
the spin_lock barrier is insufficient that I have missed, this third point is
already covered.

-- 
Darren Hart
Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-03-02 16:48 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-02 13:08 [PATCH v2] futex: replace bare barrier() with a READ_ONCE() Jianyu Zhan
2016-03-02 13:35 ` Christian Borntraeger
2016-03-02 13:56   ` Jianyu Zhan
2016-03-02 16:41     ` Darren Hart
2016-03-02 16:48 ` Darren Hart

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.